An O(N^2) algorithm? What if the file is 100 KB?
Even if it compared 10,240 bytes per second, it would still take about 12 days to run just once.
print((100 * 1024) ** 2 / 10240.0 / 3600.0 / 24.0)  # days for N^2 comparisons, roughly 11.85
That's one of the worst implementations for such a simple problem I've ever come across.
An O(N) algorithm, such as one using an associative array, is just as simple to implement and would take about 10 seconds under the exact same circumstances (102,400 bytes at 10,240 per second).
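
To make the contrast concrete, here is a minimal sketch. The problem under discussion isn't restated in this comment, so the task used here (listing the values that occur more than once) is only a hypothetical stand-in, and the function names are made up for illustration. The point is just the shape of the two approaches: a nested loop versus a single pass over a dict, which is Python's associative array.

```python
def duplicates_quadratic(items):
    """O(N^2) shape: compare every item against every later item."""
    found = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j] and items[i] not in found:
                found.append(items[i])
    return sorted(found)


def duplicates_linear(items):
    """O(N) shape: one pass, counting occurrences in a dict."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return sorted(item for item, n in counts.items() if n > 1)


# Hypothetical input: the byte values of a short string.
data = list(b"an example input")
print(duplicates_quadratic(data) == duplicates_linear(data))
```

Both functions return the same answer; only the number of comparisons differs, and that difference is what turns days into seconds on a 100 KB input.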