First of all, if you're new to programming and the real world, you almost certainly have the wrong ideas about application development, performance tuning, efficient coding, and the whys and wherefores thereof. That isn't intended as a mortal insult, but the probability of it being true is very high. I would recommend that you visit this link. The paths that you then choose to think-walk might be more productive.
As an example, I am currently messing around with porting some image-analysis code from C/C++ to Python. On my first shot, it took 5-10 seconds to load the file and 10-15 seconds to process it. Those relative times have no bearing on where performance issues need to be addressed. The file-loading process is: pick the File menu; pick the Open option; pick the directory; pick the file; pick my nose; pick OK. Honing those actions, especially picking my nose, is not the path to success. The file is picked once. The image, on the other hand, is processed with a function call FOR EACH PIXEL, and each call may in turn access several neighboring pixels. Obviously, it can get out of hand.
When you have an idea that some section of code is critical, you may be right. Find out. Profile the code. Here is a sample profile (made AFTER I cut the time in half by switching image libraries).
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.691    1.691   11.567   11.567 Filter.py:332(SobelEdge3)
    15876    1.562    0.000    2.268    0.000 Filter.py:38(gradient)
    15876    1.035    0.000    2.978    0.000 Image.py:670(crop)
    16384    0.775    0.000    2.009    0.000 ImageDraw.py:133(_getink)
    16384    0.747    0.000    3.138    0.000 ImageDraw.py:226(point)
    15880    0.732    0.000    0.943    0.000 ImageFile.py:115(load)
    15876    0.701    0.000    1.001    0.000 Image.py:1546(__init__)
    15876    0.546    0.000    0.546    0.000 :0(crop)
    16390    0.524    0.000    0.923    0.000 Image.py:79(isStringType)
    15876    0.488    0.000    1.490    0.000 Image.py:793(getdata)
    15876    0.456    0.000    1.002    0.000 Image.py:1563(load)
    15879    0.407    0.000    0.407    0.000 :0(range)
    32788    0.399    0.000    0.399    0.000 :0(isinstance)
    16384    0.381    0.000    0.381    0.000 :0(draw_points)
    16385    0.312    0.000    0.312    0.000 :0(draw_ink)
    15894    0.301    0.000    0.301    0.000 Image.py:392(__init__)
    15876    0.300    0.000    0.300    0.000 :0(sqrt)
    15892    0.211    0.000    0.211    0.000 Image.py:525(load)
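For anyone who has not produced such a listing before, here is a minimal sketch using the standard-library cProfile and pstats modules. I am not claiming this is the profiler or the code behind the numbers above: sobel_edge, gradient and profile_report are hypothetical stand-ins, and the 128x128 test image is only a guess based on the call counts (16384 is 128 squared, 15876 is 126 squared).

```python
import cProfile
import io
import math
import pstats


def gradient(pixels, width, x, y):
    """Hypothetical per-pixel kernel: Sobel gradient magnitude at (x, y)."""
    i = y * width + x
    up, down = i - width, i + width
    gx = (pixels[up + 1] + 2 * pixels[i + 1] + pixels[down + 1]
          - pixels[up - 1] - 2 * pixels[i - 1] - pixels[down - 1])
    gy = (pixels[down - 1] + 2 * pixels[down] + pixels[down + 1]
          - pixels[up - 1] - 2 * pixels[up] - pixels[up + 1])
    return math.sqrt(gx * gx + gy * gy)


def sobel_edge(pixels, width, height):
    """Call gradient() once for every interior pixel."""
    return [gradient(pixels, width, x, y)
            for y in range(1, height - 1)
            for x in range(1, width - 1)]


def profile_report(func, *args, limit=10):
    """Run func under cProfile; return (result, text report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("tottime").print_stats(limit)
    return result, stream.getvalue()


if __name__ == "__main__":
    width = height = 128  # assumed size, see the note above
    image = [(x * y) % 256 for y in range(height) for x in range(width)]
    _, report = profile_report(sobel_edge, image, width, height)
    print(report)
```

Profiling only the processing call, rather than the whole program, keeps the file-picking (and nose-picking) out of the report.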
It is important to view profile results as relative, particularly if you don't know your profiler intimately or how your system may behave from time to time. The next profile is for processing the same image with the same code, yet the total comes out about two seconds lower (9.612 seconds against 11.567):
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.398    1.398    9.612    9.612 Filter.py:363(SobelEdge)
    15876    1.394    0.000    1.961    0.000 Filter.py:56(gradient)
    15876    0.843    0.000    2.419    0.000 Image.py:670(crop)
    16384    0.685    0.000    1.765    0.000 ImageDraw.py:133(_getink)
    16384    0.625    0.000    2.736    0.000 ImageDraw.py:226(point)
    15883    0.594    0.000    0.774    0.000 ImageFile.py:115(load)
    15876    0.571    0.000    0.803    0.000 Image.py:1546(__init__)
    16390    0.468    0.000    0.841    0.000 Image.py:79(isStringType)
    15876    0.409    0.000    0.409    0.000 :0(crop)
    32794    0.373    0.000    0.373    0.000 :0(isinstance)
    15876    0.354    0.000    0.764    0.000 Image.py:1563(load)
    16384    0.346    0.000    0.346    0.000 :0(draw_points)
    15876    0.333    0.000    1.096    0.000 Image.py:793(getdata)
    15879    0.309    0.000    0.309    0.000 :0(range)
    15876    0.258    0.000    0.258    0.000 :0(sqrt)
    16385    0.240    0.000    0.240    0.000 :0(draw_ink)
    15900    0.232    0.000    0.232    0.000 Image.py:392(__init__)
    15904    0.180    0.000    0.180    0.000 Image.py:525(load)
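Since two runs of the same code can differ this much, one way to get a feel for the noise is to time the call several times and look at the spread. The following is a rough sketch using the standard-library timeit module; time_runs and process are hypothetical names of mine, and process is just a stand-in workload rather than the real image code.

```python
import timeit


def time_runs(func, *args, repeats=5):
    """Time func(*args) `repeats` times; return (best, worst) in seconds."""
    timings = timeit.repeat(lambda: func(*args), number=1, repeat=repeats)
    return min(timings), max(timings)


def process(n):
    """Hypothetical stand-in for the per-pixel processing loop."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    best, worst = time_runs(process, 200000)
    print("best %.4fs, worst %.4fs, spread %.4fs" % (best, worst, worst - best))
```

If the spread between best and worst is about as large as the improvement you think you made, the improvement has not been demonstrated yet.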
Before I invested any time in optimization, I decided, on Arevos' recommendation, to apply a tool called Psyco. Rather than apply it to the program as a whole (who needs to optimize the nose-picking, remember?), I applied it specifically to the processing function (SobelEdge). This also applies it automatically to the functions that SobelEdge calls. The results are as follows:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    2.329    2.329    3.448    3.448 Filter.py:363(SobelEdge)
    15876    0.666    0.000    0.892    0.000 Image.py:1546(__init__)
    15900    0.227    0.000    0.227    0.000 Image.py:392(__init__)
    15904    0.225    0.000    0.225    0.000 Image.py:525(load)
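Binding Psyco to one function rather than to the whole program can be sketched roughly as below. This is not my actual Filter.py: sobel_edge is a hypothetical stand-in and accelerate is a helper name I made up. psyco.bind is Psyco's call for compiling a single function; since Psyco only exists for old 32-bit Python 2 interpreters, the sketch falls back to plain Python when the import fails.

```python
from __future__ import print_function


def sobel_edge(pixels):
    """Hypothetical stand-in for the hot per-pixel function."""
    return [abs(b - a) for a, b in zip(pixels, pixels[1:])]


def accelerate(func):
    """Bind Psyco to func only; return True if Psyco was available."""
    try:
        import psyco  # assumption: installed only on old 32-bit Python 2
    except ImportError:
        return False  # no Psyco here, func keeps running as plain Python
    psyco.bind(func)  # compile this function and, by default, what it calls
    return True


if __name__ == "__main__":
    print("Psyco bound:", accelerate(sobel_edge))
    print(sobel_edge([0, 3, 1, 6]))
```

The rest of the program, file dialogs included, is left alone.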
Note the lack of detail in the Psyco profile: most of the Python-level calls no longer show up, presumably because Psyco generated an optimized proxy and ran that instead. The change is highly significant: cumulative time for SobelEdge fell from 9.612 to 3.448 seconds. Can you always expect such results? No. Why? Because code that is already nearly optimal has less room for improvement.
I decided to implement my own class for image manipulation. I still go through the image library to get the data when I instantiate the class, and again when I store the resulting, processed image. The profile for this new code is:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.985    0.985    3.332    3.332 Filter.py:363(SobelEdge)
    15876    1.164    0.000    1.692    0.000 Filter.py:56(gradient)
    15876    0.652    0.000    0.652    0.000 Filter.py:40(getMap)
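The idea behind the home-grown class is to pull the pixel data out of the image library once and then work on a plain Python list, so the per-pixel loop never calls back into the library. A minimal sketch of that idea follows; it is an illustration, not my real code. PixelMap, get_map and the standard 3x3 Sobel kernel are assumptions, and data is assumed to be a flat, row-major sequence of grayscale values.

```python
import math


class PixelMap:
    """Hypothetical wrapper: copy the pixels once, then index a flat list."""

    def __init__(self, width, height, data):
        self.width = width
        self.height = height
        self.data = list(data)  # the one-time copy out of the image library
        if len(self.data) != width * height:
            raise ValueError("data does not match width * height")

    def get_map(self, x, y):
        """Return the 3x3 neighborhood around interior pixel (x, y)."""
        w, d = self.width, self.data
        return [d[(y + dy) * w + x - 1:(y + dy) * w + x + 2]
                for dy in (-1, 0, 1)]


def gradient(neighborhood):
    """Sobel gradient magnitude for one 3x3 neighborhood."""
    (a, b, c), (d, _, f), (g, h, i) = neighborhood
    gx = (c + 2 * f + i) - (a + 2 * d + g)
    gy = (g + 2 * h + i) - (a + 2 * b + c)
    return math.sqrt(gx * gx + gy * gy)


def sobel_edge(image):
    """Compute the gradient for every interior pixel, row by row."""
    return [gradient(image.get_map(x, y))
            for y in range(1, image.height - 1)
            for x in range(1, image.width - 1)]


if __name__ == "__main__":
    # 3x3 image with a vertical edge in the right-hand column
    edge = PixelMap(3, 3, [0, 0, 10, 0, 0, 10, 0, 0, 10])
    print(sobel_edge(edge))
```

With this layout only two Python functions remain in the inner loop, which matches the shape of the shorter profile, though the real getMap and gradient may well do things differently.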
Applying Psyco to this new code has the following result:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.863    0.863    2.515    2.515 Filter.py:363(SobelEdge)
    15876    0.896    0.000    0.896    0.000 Filter.py:40(getMap)
    15876    0.754    0.000    0.754    0.000 Filter.py:56(gradient)
While far less staggering (3.332 seconds down to 2.515, roughly a quarter), the result is still significant. You may appreciate that if the function were scaled up to process many images in batch fashion, the savings would matter even more. One would probably even resort to writing the procedure in a lower-level language and interfacing it with Python.