![]() |
|
![]() |
|
|
Thread Tools | Display Modes |
|
|
#1 |
|
Programming Guru
![]() ![]() |
Hashing In C
I'm wondering what the best way to hash in C is, without using anyone else's code?
Is it acceptable to use the standard associative containers from C++, and then make the rest of the code C? Or can that end with bad results? Or is it a better idea to write my own naive implementation in C? Note: In most scenarios where I need the hash table, the hashing function is quite clear, and could be implemented without a hash table (IE, just a quick key -> array index conversion). However, there are sometimes occassional collisions due to unexpected input, that I need to handle. That's what brings me to this post. |
|
|
|
|
|
#2 |
|
Professional Programmer
![]() Join Date: Sep 2005
Posts: 419
Rep Power: 4
![]() |
Re: Hashing In C
>I'm wondering what the best way to hash in C is
It depends on the needs of your program. Like sorting there are tons of ways to hash, and they're all valid solutions. >without using anyone else's code? Could you be more specific about this requirement?
__________________
Even if the voices aren't real, they have some pretty good ideas. |
|
|
|
|
|
#3 |
|
Programming Guru
![]() ![]() |
Re: Hashing In C
It's for a contest, so written code can be brought with us, but only if it's our own. And if it's too long, I can't spend 20+ minutes typing it all out from my notes, so it has to be something compact.
This is the current solution I have. I tried to keep it as compact as possible, but capable of insertion and retrieval. This current example is meant to be specific to input where the key is "A-Z", but there could often be unexpected input such as "a", or "z", or "az". Therefore, the only reason a hash table is being used instead of an array of 26 elements, is to handle those exceptions. Performance of the buckets is not important, so building a linked list for collisions isn't tragic. C Syntax (Toggle Plain Text)
Last edited by Sane; Jan 8th, 2008 at 9:40 PM. |
|
|
|
|
|
#4 |
|
Programmer
Join Date: Nov 2007
Posts: 86
Rep Power: 1
![]() |
Re: Hashing In C
if this is for the ccc which allows c or c++ and your problem requires a hashtable, you should just use c++ and take advantage of the stl.
|
|
|
|
|
|
#5 |
|
Professional Programmer
![]() Join Date: Sep 2005
Posts: 419
Rep Power: 4
![]() |
Re: Hashing In C
>you should just use c++ and take advantage of the stl
C++0x hasn't been ratified yet, so any use of hash tables (unordered_list) or anything like it would be risking no support by the compiler. >but only if it's our own The problem with hashing is that if you make something up, it's highly likely to suck. That's why most wise programmers use existing algorithms. You can still use the algorithm without using someone else's code, but with the one I'm suggesting it's pretty hard to "personalize" something so simple. >so it has to be something compact A good, fast, and small algorithm is Bernstein's hash. It's so short that it's nearly impossible to forget, and it has surprisingly good properties: c Syntax (Toggle Plain Text)
__________________
Even if the voices aren't real, they have some pretty good ideas. |
|
|
|
|
|
#6 |
|
Programming Guru
![]() ![]() |
Re: Hashing In C
Well it's almost always in the circumstance where the hash function is known. Such as there are only 26 keys between "A" and "Z" ... thus the hash function is key[0]-'A'.
The point of using the hash table is to handle any unexpected input, such as "a", "z", or "az". That's where the hash table, along with my solution, comes in. I just need the best way to handle the scenario where the hash function is of small importance, and performance of the buckets is unimportant... whether it be something from the standard library, or a smaller version of the code I posted. Are associative containers in C++ not usable like Hash Tables, and a part of the standard library? P.S. I recall reading up on Bernstein's algorithm in an article you wrote, over a year ago. Good stuff. ![]() |
|
|
|
|
|
#7 |
|
Professional Programmer
![]() Join Date: Sep 2005
Posts: 419
Rep Power: 4
![]() |
Re: Hashing In C
>I just need the best way to handle the scenario where
>the hash function is of small importance, and >performance of the buckets is unimportant... Then any collision resolution strategy will work. All of the basic ones are painfully simple to remember and implement in a short time. Hell, hash tables are one of the few data structures that I can write and get working on the first try. >Are associative containers in C++ not usable like Hash Tables They're like hash tables, but may not be implemented as such. std::map, for example, is usually implemented as a red black tree. If one of those classes solves your problem, it'll be a lot easier than writing your own hash table.
__________________
Even if the voices aren't real, they have some pretty good ideas. |
|
|
|
|
|
#8 | |
|
Programming Guru
![]() ![]() |
Re: Hashing In C
Hmm.
So let's look at this objectively now. This isn't quite the same as the requirements I mentioned in the first post, but I'll go ahead anyways. Here is an imaginary assignment: Quote:
Would you recommend any changes? Could this be easier if I used the STL? Thanks for all your help. C Syntax (Toggle Plain Text)
Edit: Uploaded a file of half a million randomly generated integers for performance testing. It's amazing how much a bad hashing function affects the performance. But I feel that an execution time of ~2s, as is, is perfect. Post Edit: The file upload won't work... Last edited by Sane; Jan 9th, 2008 at 6:34 PM. |
|
|
|
|
|
|
#9 |
|
Expert Programmer
Join Date: Jun 2005
Posts: 881
Rep Power: 4
![]() |
Re: Hashing In C
Quickly done STL version (using your input loop) - untested - you probably would need to fix up the includes
c++ Syntax (Toggle Plain Text)
|
|
|
|
|
|
#10 |
|
Programming Guru
![]() ![]() |
Re: Hashing In C
Interesting! That's much, much easier to implement. That could be written in 2 minutes, whereas mine took about 20. I now feel like I'll be wasting so much time on this contest by not using the STL. I'll have to ask the judges what they want to see...
It's interesting to note that mine is about 200% faster on the file with half a million integers. I guess that's because the stl map isn't a hash table. And your code worked, but you had a mismatched bracket before the greater than sign. Cpp Syntax (Toggle Plain Text)
|
|
|
|
![]() |
| Bookmarks |
| Currently Active Users Viewing This Thread: 1 (0 members and 1 guests) | |
| Thread Tools | |
| Display Modes | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Custom Hashing Algorithms | Sane | C | 21 | Oct 16th, 2006 10:05 PM |
| MD5 hashing... | Un4given | C++ | 3 | May 31st, 2005 7:51 PM |
| Hashing Collashing | mtanti | Coder's Corner Lounge | 0 | Apr 10th, 2005 12:29 PM |