I have a script that compares, in some way, each line from file1 and file2, and outputs the lines if there is a difference. I want to make it faster. Right now it's in Python. I could use threads, but I would like to know whether there is some easier way to improve it.
Since each test is independent, it could run in parallel - I just need to make sure that each line from file1 is compared with each line from file2.
EDIT: The bottleneck so far is the processor (the comparison itself); disk usage isn't that high, but the core running the program is at 100%. Note that the files are "large" (e.g. over 20 MB), so I understand that it takes some time to process them.