Programming Forums


rsnd May 29th, 2006 6:59 AM

Which is faster?
 
Well...I know I am good at making stupid and "obvious" posts and I love making them. This is probably one of them.


Well, which is faster?

Old-fashioned C "memcpy" and StrCpyN etc., which probably use their superb algorithms or whatever.

Or

Inline assembly: unaligned "rep movsb" copying on a 32-bit Pentium 3 or whatever.

The tutorials and docs I read totally scare me...they go like "500 * ecx * an extra couple of thousand for unaligned * a couple of hundred more if it's "byte"" or something clock cycles (yeah, exaggerated) :S

DaWei May 29th, 2006 7:08 AM

Which is faster? A car going east or a car going west? Library functions in C are all over the place, depending on who wrote them. The best simple ones like copies are almost always written in assembler, but some aren't. Some copy or compare functions will copy bytes until they hit an even boundary, then copy entire words. The only difference between those functions and pure assembly is the wrapper overhead. A person familiar with assembler could drop into in-line and whup his own C-writing ass but it would be problematic as to whether or not his assembler would whup a true expert.

I don't know why the speed quibbles 'scare' you. You can move a barrel of apples much faster than you can move them one at a time.
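
For anyone wondering what "copy bytes until they hit an even boundary, then copy entire words" looks like, here is a rough C sketch. It is not taken from any real library: `copy_words` is a made-up name, a 32-bit word is assumed, and a tuned library routine would likely use wider moves and CPU-specific tricks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from any real library: copy n bytes from src to
   dst (the regions must not overlap). Bytes first until dst is word-aligned,
   then whole words, then the leftover bytes. */
void *copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: single bytes until the destination sits on a word boundary. */
    while (n > 0 && ((uintptr_t)d % sizeof(uint32_t)) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: one 32-bit word per iteration. The fixed-size memcpy calls keep
       this legal C even when src is not aligned; compilers usually turn
       them into plain load/store instructions. */
    while (n >= sizeof(uint32_t)) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Tail: whatever is left over, byte by byte. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

The word loop does a quarter of the iterations of a plain byte loop, which is the "barrel of apples" effect. Only the destination is aligned here; handling a source with a different alignment efficiently is one of the details real implementations spend their extra code on.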

Narue May 29th, 2006 8:22 AM

>Well, which is faster?
I'll answer your question with another question: Why does it matter? If you haven't profiled your code and found the library functions to be a bottleneck, you're doing nothing but spinning your wheels.

rsnd May 29th, 2006 10:47 PM

Well, sometimes I don't trust those library functions. For example...the StrCmpN function:

                  mov        edi,edi
                  push        ebp
                  mov        ebp,esp
                  sub        esp,00000018h
                  mov        eax,[L77FCD280]
                  push        esi
                  mov        esi,[ebp+08h]
                  push        edi
                  mov        edi,[ebp+0Ch]
                  mov        [ebp-04h],eax
                  lea        eax,[ebp-18h]
                  push        eax
                  push        00000000h
                  call        [KERNEL32.dll!GetCPInfo]  ; <---- wtf
                  test        eax,eax
                  jz        L77F809D7
                  cmp        byte ptr [ebp-12h],00h
                  jnz        L77F8F358
 L77F809D7:
                  xor        eax,eax
 L77F809D9:
                  push        eax
                  push        [ebp+10h]
                  push        edi
                  push        esi
                  call        SUB_L77F70E8C  ;<-- goes off somewhere
                  mov        ecx,[ebp-04h]
                  pop        edi
                  pop        esi
                  call        SUB_L77F64020 ;<--- again
                  leave  ;<-- never seen this 1 b4 lol
                  retn        000Ch

If I did it in assembly it would have been something like:

    push esi
    push edi
    push ecx
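    mov esi, [esp+16] ;first buffer (3 pushes + return address = 16 bytes)
    mov edi, [esp+20] ;second buffer
    mov ecx, [esp+24] ;byte count
    repe cmpsb        ;compare until the first mismatch or ecx = 0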
    ;do something tricky to put stuff in eax


    ;pop vals

and that's it. No calling "GetCPInfo" or other stuff. Plain, and it works.

But then, I did not do any alignment stuff and I'm accessing "bytes", which makes it really really slow? I mean...this is in an app which has to look for a given 1024 bytes of data (may not always be 1024...any arbitrary data size specified by the user...min = 1024, and then it increases on each run) in a 24156-byte buffer (dependent on data size...[another user-specified number...min 1024...increases on each run] * data size) 120 times per second in order to encode and decode instructions in time and stay synchronized with a device. You lag behind => you die. Which means it needs to make the best use of its time slot, as it will be running at the same processor priority as another app which has to do a lot of work to produce those 1024 bytes and send them to my program => that one will probably take most of the CPU time? Well, I can't feel how fast or slow each is going, probably because I have a very fast dual-core processor. But that may not be the case for everyone. They might be running one of those old 33 MHz processors or even older models lol (probably not that old...that's the oldest machine my uni has that is still in use by nerds at the microelectronics department). So it's really scary ^_^. I don't want to die before everyone else lol.
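
As a point of comparison, the search described above (a block of 1024 or more bytes inside a larger buffer) can be sketched in portable C on top of the library's `memchr` and `memcmp`. `find_block` is a hypothetical name and this is only one simple way to do it; whether it is fast enough at 120 searches per second is something only a measurement on the target machine can say.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first occurrence of the
   needle_len-byte block `needle` inside hay[0..hay_len), or NULL. */
const unsigned char *find_block(const unsigned char *hay, size_t hay_len,
                                const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return hay;
    if (needle_len > hay_len)
        return NULL;

    const unsigned char *p = hay;
    /* One past the last position where the needle could still start. */
    const unsigned char *end = hay + (hay_len - needle_len) + 1;

    while (p < end) {
        /* Let the library skip ahead to the next candidate first byte. */
        p = memchr(p, needle[0], (size_t)(end - p));
        if (p == NULL)
            return NULL;
        /* Candidate found: compare the whole block. */
        if (memcmp(p, needle, needle_len) == 0)
            return p;
        p++;
    }
    return NULL;
}
```

With the numbers from the post, one pass over the buffer touches at most 24156 bytes, so 120 passes per second is roughly 2.9 million bytes scanned per second when first-byte matches are rare. Data with many repeated bytes can make the `memcmp` step run far more often, which is the case worth testing.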

Narue May 30th, 2006 7:56 AM

>Well, sometimes I don't trust those library functions.
It's a shame that you don't think other programmers can write good code.

>For example...the StrCmpN function...
Is a mature implementation that's supposedly tuned for the compiler and system that it was written for. If it's longer than what you would have written by hand, then you have to consider optimizations (they tend to make code longer), bug fixes and workarounds (that your code would not have), and other improvements that make the function solid and professional. Your ad hoc version would lose all of that, and you wouldn't even know it. The programmers who wrote it aren't necessarily incompetent.

The only way to work for performance is to test both options thoroughly and optimize where necessary. If that means writing your own version, so be it. Personally, I wouldn't trust the judgement of strangers on a forum who aren't working on the project for any of my optimization needs. ;)
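
One way to "test both options" is a small timing harness along these lines. It is a minimal sketch, not a rigorous benchmark: `byte_copy` and `time_copy` are made-up names, `clock()` measures CPU time with coarse resolution, and results depend heavily on compiler flags, buffer size, and cache state.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Naive candidate: one byte per iteration, like an unoptimized loop. */
void *byte_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Run `fn` over the same buffers `reps` times; return elapsed CPU seconds. */
double time_copy(copy_fn fn, void *dst, const void *src, size_t n, long reps)
{
    /* Calling through a volatile pointer discourages the compiler from
       deleting or merging the repeated copies. */
    volatile copy_fn f = fn;
    clock_t start = clock();
    for (long i = 0; i < reps; i++)
        f(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_copy(memcpy, dst, src, 1024, reps)` and `time_copy(byte_copy, dst, src, 1024, reps)` with the same buffers and a `reps` count large enough to run for a second or so gives two numbers to compare. A hand-written assembly routine with the same signature can be dropped in as a third candidate.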

DaWei May 30th, 2006 8:00 AM

Quote:

Personally, I wouldn't trust the judgement of strangers on a forum who aren't working on the project for any of my optimization needs.
Well, if they don't have a tongue ring, you can prolly trust 'em :D .

rsnd Jun 4th, 2006 6:38 AM

Quote:

Originally Posted by Narue
>Well, sometimes I don't trust those library functions.
It's a shame that you don't think other programmers can write good code.

It's not that. The outcome of someone's code is a result of their motivation. Their motivation comes from their need. Their need may not be relevant to mine. You said it yourself: "they tend to make code longer." They tend to look at increasing processor speed and RAM size and ignore 'performance' factors. Even my teacher is like "should use the Integer object in Java instead of its primitive version." That's right...he said "primitive" version. I don't see how they can keep going like this. Processor speeds seem to have stabilized around 3.something GHz and now they are going for multi cores. What's next? A hardware operating system? I reckon things are going in all the wrong directions. Anyways...my point is...they don't die if their stuff is slow.

DaWei Jun 4th, 2006 7:02 AM

I think you have some misperceptions. Those who write library functions for reliable compilers are not writing code to their needs -- they're writing code to YOUR needs. You also may not understand the implications of "longer" code. There is very often a tradeoff between code size and speed. That's why there's inlining. That's why your good compiler offers you the choice between speed and size optimization. I seriously doubt that you could write a library function that came within 20% of the effectiveness of the function that ships with your compiler. I know what resource starvation is (as in memory that costs $1000 per megabyte or more). There is obviously a maximum bang for minimum bucks. Just as obviously, it moves as the technology changes. A tiny minority of programmers will be working on systems that require the ultimate in performance and efficiency. One cannot afford to dedicate expensive programmers to tasks that aren't cost-effective. One way to achieve higher performance IS through hardware. Why do you think they have separate graphics devices? Cache? Numeric co-processors? You have no real earthly idea about the wrong direction or the right direction. For the sake of your mental outlook, I hope you get one of those tiny percentage of the jobs that require cutting-edge performance regardless of cost.

rsnd Jun 4th, 2006 7:13 AM

lol
I ain't no programmer lol so I won't need to look for those "tiny percentage of the jobs" :P
My point was...programs don't need to be big (as in doing extra redundant stuff), and just because technology is advancing fast, we don't need to slack off.

Narue Jun 4th, 2006 9:28 AM

Quote:

It's not that. The outcome of someone's code is a result of their motivation. Their motivation comes from their need. Their need may not be relevant to mine. You said it yourself: "they tend to make code longer." They tend to look at increasing processor speed and RAM size and ignore 'performance' factors. Even my teacher is like "should use the Integer object in Java instead of its primitive version." That's right...he said "primitive" version. I don't see how they can keep going like this. Processor speeds seem to have stabilized around 3.something GHz and now they are going for multi cores. What's next? A hardware operating system? I reckon things are going in all the wrong directions. Anyways...my point is...they don't die if their stuff is slow.
I develop for a popular commercial compiler. Even if I'm never going to use a library function that I write, I write it to the best of my ability. Anything less would be an insult to end-users, such as yourself. Your argument is that I'm willfully ignorant of performance considerations because I think hardware will keep up the slack, which couldn't be further from the truth. Performance is second only to correctness when I'm writing code for you. All false modesty aside, I'm probably a much better programmer than you are, with significantly more experience. Saying that you don't trust your libraries (and your reasoning for saying that) is tantamount to saying that you think I'm incompetent.

Now, your library is written to be generic. That's an unbreakable requirement that affects performance, so I won't tell you that you can't write faster code with an ad hoc approach than I would with a generic approach. What I will tell you is what I've told you before. Prefer the library, and if it's not fast or small enough for your needs after profiling, then you can take the time to write a customized solution. 9 times out of 10, the library will be sufficient, so that process is a much more efficient use of your time.


All times are GMT -5.
