Programming Forums
Feb 27th, 2006, 6:32 PM   #1
hoffmandirt
Word Frequency Regular Expression

I have been working on a word frequency application that works as follows:

1. Retrieves a line from the text file.
2. Splits the line on spaces.
3. Iterates through the words. If a word is not already in the hash table, it is stored there; if it is, the corresponding value is updated by adding 1.
4. Repeats with the next line.
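
For reference, the steps above can be sketched roughly like this. The post doesn't name a language, so Python is an assumption here, and `count_words` is a made-up name for illustration. A plain dict stands in for the hash table.

```python
def count_words(lines):
    """Count how often each whitespace-separated token appears."""
    counts = {}
    for line in lines:              # steps 1 and 4: take one line at a time
        # Step 2: split() with no arguments splits on any run of
        # whitespace, which is slightly broader than "spaces" only.
        for word in line.split():
            # Step 3: store new words, add 1 to words already seen.
            if word in counts:
                counts[word] += 1
            else:
                counts[word] = 1
    return counts


if __name__ == "__main__":
    print(count_words(["testing, testing one", "one two"]))
```

With this input, "testing," and "testing" end up as two separate keys, because splitting on whitespace leaves the comma attached to the word.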

My problem is that I don't have much experience with text processing or regular expressions, and I am getting words such as "testing," with the punctuation still attached. I'm having trouble coming up with a regular expression that verifies whether the current token is a word. I guess what I'm getting at is that I need a regular expression that allows some punctuation, but not periods, commas, exclamation points, and so on. Any input on text processing and regular expressions is also appreciated. Thanks.
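
One possible approach, sketched in Python (an assumption, since the post doesn't name a language): rather than splitting on spaces and then checking each token, pull the words out of the line directly with a pattern. The pattern below is only a guess at which punctuation should be allowed. It accepts apostrophes and hyphens inside a word and drops everything else. `WORD_RE`, `extract_words`, and `word_frequencies` are hypothetical names.

```python
import re
from collections import Counter

# One or more ASCII letters, optionally followed by groups of
# "apostrophe or hyphen, then more letters" (e.g. don't, well-known).
# Assumption: digits and non-ASCII letters are not treated as word characters.
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def extract_words(line):
    """Return the words in a line, without surrounding punctuation."""
    return WORD_RE.findall(line)


def word_frequencies(lines):
    """Count words across all lines; Counter is a dict keyed by word."""
    counts = Counter()
    for line in lines:
        counts.update(extract_words(line))
    return counts


if __name__ == "__main__":
    print(extract_words("Testing, testing! Don't stop."))
    print(word_frequencies(["testing, testing!", "Don't stop."]))
```

Here "testing," and "testing!" both count as "testing", while "Don't" keeps its apostrophe. Case is left as-is, so "Testing" and "testing" would still be counted separately unless each line is lowercased first.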
DaniWeb IT Discussion Community