Hi,
I'm currently building a text classification program which will eventually take a txt file in, tokenise it, remove common words like 'the', 'and' etc., store the remaining words in a HashMap, then count the number of occurrences of each word. I've built a stream tokeniser, which works OK, but I would like some advice on how to count the number of occurrences of a specific word, e.g. airplane 6, tricky 1. Eventually I will need to do something with the number and the word, which is why I am storing them in a HashMap. Here's my stream tokeniser anyway:
import java.io.*;
In this code I have a counting mechanism which queries the HashMap to see if it already contains the token as a key. If it does, the corresponding Counter object is incremented to indicate that another instance of this word has been found. If not, a new Counter is created; since the Counter constructor initializes its value to one, this also acts to count the word.
Is this right? If so, how would I get the output to look like my example? When I just print the HashMap, it comes out across the screen like this: Counter@f4a24a
Does anyone have any advice or tips? Thanks :banana:
PS. The smilies on this forum are better than on any other forum I post to. |
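For reference, the counting mechanism described in the question could look roughly like the sketch below. The original code is not shown in full, so the `Counter` class here is a guess based on the description (constructor starts at one, `increment()` adds one), and `CounterSketch` and `count` are hypothetical names. The `toString()` override is the part that affects printing: without it, Java falls back to the default `Object.toString()`, which is where text like `Counter@f4a24a` comes from.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Counter, reconstructed from the description in the post.
class Counter {
    private int value = 1; // a new Counter already counts the first sighting

    void increment() { value++; }

    int get() { return value; }

    // Without this override, printing a Counter shows "Counter@<hex hash code>".
    @Override
    public String toString() { return Integer.toString(value); }
}

class CounterSketch {
    // Counts how often each token occurs in the given array.
    static Map<String, Counter> count(String[] tokens) {
        Map<String, Counter> counts = new HashMap<>();
        for (String token : tokens) {
            Counter c = counts.get(token);
            if (c == null) {
                counts.put(token, new Counter()); // first occurrence
            } else {
                c.increment();                    // repeat occurrence
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] tokens = {"airplane", "tricky", "airplane"};
        // Prints the map with readable counts; HashMap order is not guaranteed.
        System.out.println(count(tokens));
    }
}
```

With the override in place, printing the map shows each word next to its count instead of an object identifier.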
As far as I can tell, it is doing exactly what you asked it to do, which is print each Counter object using Java's default toString(): the class name followed by the object's hash code, not the count stored inside it.
If you want to print out the actual contents of the HashMap, you are going to have to create a loop that pulls elements out one at a time, then assemble them in a readable fashion. Unless I am totally misunderstanding what it is that you are trying to do :huh: |
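A minimal sketch of such a loop, assuming the counts are held as plain `Integer` values rather than the poster's `Counter` objects (a simplification), and using the hypothetical names `MapLoopSketch` and `toLines`:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapLoopSketch {
    // Pulls entries out of the map one at a time and builds a "word count" line for each.
    static List<String> toLines(Map<String, Integer> counts) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            lines.add(entry.getKey() + " " + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("airplane", 6);
        counts.put("tricky", 1);
        // HashMap does not keep insertion order, so the lines may appear in any order.
        for (String line : toLines(counts)) {
            System.out.println(line);
        }
    }
}
```

Iterating over `entrySet()` gives the key and the value together, which avoids a second lookup per word.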
So I need a loop to pull the tokens out.
I need output like this:
Token | Number Of Occurrences
airplane | 4
tricky | 3
How do I make it print the count of specific words like above? Any ideas? Thanks :mellow: |
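One way to get a two-column layout like the one requested is to format each entry with `String.format`. This is only a sketch: the column width of 12, the alphabetical ordering via `TreeMap`, and the names `CountTableSketch` and `render` are all assumptions, not anything from the original program.

```java
import java.util.Map;
import java.util.TreeMap;

class CountTableSketch {
    // Renders a header row followed by one "token | count" row per word.
    static String render(Map<String, Integer> counts) {
        // Copy into a TreeMap so the rows come out in alphabetical order.
        Map<String, Integer> sorted = new TreeMap<>(counts);
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%-12s | %s\n", "Token", "Number Of Occurrences"));
        for (Map.Entry<String, Integer> e : sorted.entrySet()) {
            // %-12s left-justifies the token in a 12-character column.
            sb.append(String.format("%-12s | %d\n", e.getKey(), e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("tricky", 3);
        counts.put("airplane", 4);
        System.out.print(render(counts));
    }
}
```

Tokens longer than 12 characters will push the separator to the right, so the width would need adjusting for real input.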
Just off the top of my head, I would use an array that stores your tokens, along with the number of times each particular token appeared.
Good luck. Now I have my own assignment to work on. :) |