Programming Forums
Old Feb 14th, 2007, 2:04 AM   #1
rwm
Professional Programmer
 
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2 rwm is on a distinguished road
problem with tokenizing

I guess I am pretty rusty at the moment. No, sorry, make that VERY rusty...

I'm having a problem with this program:

Given an input file that contains:

this is my loverly file

its very interesting innit!

hello?

I only get:

this is my loverly file

its very interesting innit!

The last word is being missed?

Here's the code:

#include <iostream>
#include <fstream>
#include <conio.h> //for kbhit(); non-standard, Windows-only
using namespace std;

const char *inputFile = "c:\\testFile.txt";

int main() {

	ifstream in(inputFile,ios::in);
	if(!in) {

		cout << "error: could not get input file";

		return 1;

	}

	char ch;
	char token[100];
	unsigned int index=0;

	while(in.get(ch)) {

		//get tokens
		if(ch == ' ') {

			//got a token
			token[index] = '\0';

			cout << ' ' << token;

			token[0] = '\0';
			index = 0;

		} else {

			token[index] = ch;

			index++;

		}

	}

	cout << "\n\npress a key to exit...";
	while(!kbhit()) ;

	return 0;

}

I can't exactly remember how to store the token using a pointer; it's very embarrassing, I know.

I tried a test like this:

char *ptr = new char[100];

for(char ch='a'; ch<'f'; ch++) {

	*ptr = ch;

	ptr++;

}

*ptr = '\0';

But when I try to print it:

cout << ptr;

I get garbage... I'm doing something very wrong, I know!
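
On the pointer test above: the loop advances `ptr` itself, so by the time it is printed it no longer points at the start of the buffer but just past the last character written. A minimal sketch of one way around that, keeping the original pointer and walking a second one. The names `makeLetters`, `buffer` and `cursor` are illustrative only and do not come from the post.

```cpp
#include <string>

// Fill a heap buffer with 'a'..'e' through a moving cursor, keeping the
// original pointer so the buffer can still be read from its start and freed.
std::string makeLetters() {
    char *buffer = new char[100];   // start of the allocation; never moved
    char *cursor = buffer;          // walking pointer used for writing

    for (char ch = 'a'; ch < 'f'; ch++) {
        *cursor = ch;
        cursor++;
    }
    *cursor = '\0';                 // terminate the C string

    std::string result(buffer);     // read from the start, not from cursor
    delete[] buffer;                // delete[] also needs the original pointer
    return result;
}
```

With this, `std::cout << makeLetters();` prints `abcde`. Indexing (`buffer[i] = ch;`) would avoid the second pointer altogether.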

My cheeks are burning!

Anyway, I hope someone can help me out.

Thx

PS: reminder to myself to sit down and go over some "fundamentals" this weekend... hehe!
Old Feb 14th, 2007, 2:27 AM   #2
rwm
Well, I just realised that all I had to do was:

		//get tokens
		if(ch == ' ' || ch == '\n' || ch == '\r') {

		...

But any help with the embarrassing pointer problem would be greatly appreciated!

thx!
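
One caveat with the delimiter fix: the loop only emits a token when it reads a delimiter, so a final word with nothing after it (no trailing space or newline) is still dropped. A minimal sketch of the same character-by-character loop with two changes: any whitespace ends a token, and whatever is left in the buffer at end of input is flushed. Assumptions: it reads from a generic `std::istream` and returns the tokens in a `std::vector` instead of printing them, and `splitTokens` is a made-up name.

```cpp
#include <cctype>
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>
#include <vector>

// Split a stream into whitespace-separated tokens, one character at a time.
std::vector<std::string> splitTokens(std::istream &in) {
    std::vector<std::string> tokens;
    std::string token;
    char ch;

    while (in.get(ch)) {
        if (std::isspace(static_cast<unsigned char>(ch))) {
            // A delimiter ends the current token; empty tokens (from runs
            // of spaces or blank lines) are skipped.
            if (!token.empty()) {
                tokens.push_back(token);
                token.clear();
            }
        } else {
            token.push_back(ch);
        }
    }

    // End of input: the last word has no delimiter after it, so flush it.
    if (!token.empty()) {
        tokens.push_back(token);
    }
    return tokens;
}
```

It can be tried without a file by wrapping a string, e.g. `std::istringstream in("hello world"); splitTokens(in);`. Using `std::string` for the token also removes the fixed 100-character limit of the original buffer.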
Old Feb 14th, 2007, 3:30 AM   #3
rwm
Hi,

Well, I decided that writing a GetToken function would be a much better way to parse the file:

#include <iostream>
#include <fstream>
#include <cstring> //for strcmp
#include <conio.h> //for kbhit(); non-standard, Windows-only
using namespace std;

enum TokenType {NUL,DEFORMER,COLON,VALUE};

//get token: copy characters from p into token until a space, newline or end of string
TokenType GetToken(const char *p,char token[100]) {

	unsigned int index = 0;

	//stop at the terminator too, and leave room for the '\0' in token
	while(*p != '\0' && *p != ' ' && *p != '\n' && index < 99) {

		token[index] = *p;

		p++; index++;

	}

	token[index] = '\0';

	//get token type
	if(!strcmp("deformer",token)) {

		//got a deformer
		return DEFORMER;

	}

	return NUL;

}

int main() {

	const char *p = "deformer is me";
	char token[100];

	if(GetToken(p,token) == DEFORMER) {

		cout << "got a deformer";

	}

	cout << "\n\npress a key to exit...";
	while(!kbhit()) ;

	return 0;

}

It works, but the problem is that I don't know how to get a pointer to the file contents.

Any suggestions?

Hope someone can help me out!

thx!
Old Feb 14th, 2007, 3:32 AM   #4
rwm
Something along the lines of:

ifstream in("myfile",ios::in);

char *ptr;

ptr = in.get(); //doesn't work because in.get() returns an integer, not a pointer

Then I could do this:

GetToken(ptr,token);
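
One way to get a pointer to the file contents is to read the whole stream into a `std::string` first and then take a pointer into that string. A minimal sketch, assuming the file is small enough to hold in memory; `readAll` is a hypothetical helper name, not something from the thread.

```cpp
#include <istream>
#include <iterator>
#include <sstream>   // only needed to try it on an in-memory string
#include <string>

// Read everything remaining in a stream into one string.
// istreambuf_iterator reads raw characters, so whitespace is preserved.
std::string readAll(std::istream &in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Usage would look like `ifstream in("myfile"); string contents = readAll(in); GetToken(contents.c_str(), token);`. Note that `c_str()` yields a `const char *`, so this assumes `GetToken` is declared to take `const char *p`, which is possible because it never writes through `p`. The pointer stays valid only while `contents` is alive and unmodified.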
Old Feb 14th, 2007, 5:23 AM   #5
pegasus001
Hobbyist Programmer
 
Join Date: Nov 2006
Location: 163H
Posts: 213
Rep Power: 2 pegasus001 is on a distinguished road
Just try tellg(). This member function takes no parameters and returns a value of type pos_type (convertible to an integer offset) that represents the current position of the stream's get pointer.
__________________
You never test the depth of a river with both feet.
The believer is happy. The doubter is wise.
Free speech carries with it some freedom to listen.
The next generation will always surpass the previous one. It`s one of the never ending cycles of life.
Old Feb 14th, 2007, 5:38 AM   #6
pegasus001
Before I forget: to move the get pointer of a file to a new location, use seekg(off_type off, ios_base::seekdir dir).

off (offset) is an integer offset: how many positions to move, counted from dir.
dir (direction) is an enumeration (ios::beg, ios::cur, ios::end) and specifies where to start counting from before moving the pointer.
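
A short sketch of how tellg() and seekg() can be combined to measure a stream and load it in one read. Assumptions: the stream is seekable, and `loadWithSeek` is a made-up name for illustration.

```cpp
#include <ios>
#include <istream>
#include <sstream>   // only needed to try it on an in-memory string
#include <string>

// Load a whole stream into a string by measuring it with seekg/tellg.
// For a file on Windows, open it with std::ios::binary so the byte count
// reported by tellg matches what read() delivers.
std::string loadWithSeek(std::istream &in) {
    in.seekg(0, std::ios::end);                 // jump to the end
    std::streamoff size = in.tellg();           // position there == length
    in.seekg(0, std::ios::beg);                 // rewind to the start

    if (size <= 0) {
        return std::string();                   // empty or unseekable stream
    }

    std::string contents(static_cast<std::string::size_type>(size), '\0');
    in.read(&contents[0], static_cast<std::streamsize>(size));
    // Keep only what was actually read, in case fewer bytes arrived.
    contents.resize(static_cast<std::string::size_type>(in.gcount()));
    return contents;
}
```

For a file this would be called as `ifstream in("myfile", ios::in | ios::binary); string contents = loadWithSeek(in);`.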
Old Feb 14th, 2007, 6:32 AM   #7
rwm
hey thx,

I already started using seekg and peek...

Jeez, I can't believe how out of touch I am!

lol

ta for help!
Old Feb 14th, 2007, 7:03 AM   #8
DaWei
Resident Grouch
 
Join Date: Jun 2005
Posts: 6,453
Rep Power: 10 DaWei is on a distinguished road
You do realize the extraction operator (>>) tokenizes on whitespace, right? Also, if your OS is Windows, open the file in binary mode or seeks and tells won't work properly. The use of "conio" is non-standard and blows your portability, if that's of any concern to you.
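
A minimal sketch of tokenizing with the extraction operator: `in >> token` skips leading whitespace (spaces, tabs, newlines) and reads up to the next whitespace character or end of input. The function name `extractTokens` is illustrative only.

```cpp
#include <istream>
#include <sstream>   // only needed to try it on an in-memory string
#include <string>
#include <vector>

// Collect whitespace-separated tokens using operator>>.
std::vector<std::string> extractTokens(std::istream &in) {
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) {           // false once no further token can be read
        tokens.push_back(token);
    }
    return tokens;
}
```

The same loop works directly on an `ifstream`, and the final word is returned even when the file does not end with a newline.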
__________________
Abstraction doesn't make it impossible to write bad code; it makes it possible to write superior code.
Contributor's Corner: Grumpy on C++ Exceptions DaWei on Pointers
Old Feb 15th, 2007, 4:08 AM   #9
rwm
Hey DaWei, no, I didn't know there was a standard tokenizer. Could you give a bit more detail?

I'm only testing this on Windows at the moment, and I'm only using conio to keep the console open while I test.

I'm trying to write a tokenizer class, but it doesn't seem to be working properly. Here it is:

class Tokenizer {

	public:
		//constructors
		Tokenizer(const char *file);
		Tokenizer(const char *file,istream in);

		//destructor
		~Tokenizer();

		//operators
		string operator++();
		string operator--();

	private:
		//member data
		ifstream mIn;
		char *mFile;

	protected:
		//

};

//std dependencies
#include <iostream>
using namespace std;

//local dependencies
#include "Tokenizer.h"

//constructor 1
Tokenizer::Tokenizer(const char *file) {

	//read in passed file
	mIn.open(file,ios::in);

	if(!mIn) {

		cerr << "error: could not open file for reading!";
		exit(1);

	}

}

//constructor2 - use own ifstream object
Tokenizer::Tokenizer(const char *file,istream in) {

	//

}

//destructor
Tokenizer::~Tokenizer() {

	//close stream
	mIn.close();

}

//get next token from stream
string Tokenizer::operator++() {

	//data
	string token;

	//skip all whitespace
	while(mIn.peek() == ' ') {

		mIn.seekg(1,ios::cur);

	}

	//get token
	char ch;
	while(mIn.get(ch)) {

		//get token
		if(ch == ' ' || ch == '\n') {

			//got token
			//check if it is a valid token
			if(token == "") {

				return "kNull";

			}

			return token;

		} else {

			token.push_back(ch);

		}

	}

	//didnt get a token
	return "kNull";

}

//get previous token from the stream
string Tokenizer::operator--() {

	//

	return "";

}

It doesn't seem to print the last character. I'm testing by running this:


int main() {

	//
	Tokenizer tokenizer(inputFile);

	cout << endl << tokenizer++;
	cout << endl << tokenizer++;
	cout << endl << tokenizer++;
	cout << endl << tokenizer++;

	...

It returns kNull for the last token... I'm really battling...

I want to be able to do something like this:

	while(tokenizer++ != "atoken") {

		cout << endl << tokenizer++;

	}

but it's not working...

Damn, I need way more practice!

If you have any better suggestions or ideas, please shout out! I'm really starting to understand the importance of practical programming. I've never really done any, and it all makes sense now that "practice makes perfect"...
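
A few things in the class as posted are likely behind these symptoms. When `get` fails at end of file, the loop in `operator++` exits and returns "kNull" without checking whether a token was already collected, so a last word with no trailing delimiter is lost. The `while(tokenizer++ != "atoken")` loop also calls `++` twice per pass, so every other token is consumed by the comparison and never printed. Separately, `operator++()` declares the prefix form while `tokenizer++` is the postfix form (`operator++(int)`), and a stream cannot be passed by value as in the second constructor. A minimal sketch of one possible redesign, not the original class: it borrows a stream by reference and uses a hypothetical `next` member instead of overloading `++`.

```cpp
#include <istream>
#include <sstream>   // only needed to try it on an in-memory string
#include <string>

// Minimal tokenizer over any input stream (hypothetical redesign).
class Tokenizer {
public:
    // Streams cannot be copied, so hold a reference to the caller's stream.
    explicit Tokenizer(std::istream &in) : mIn(in) {}

    // Fetch the next token; returns false once the input is exhausted.
    bool next(std::string &token) {
        token.clear();
        char ch;
        while (mIn.get(ch)) {
            if (ch == ' ' || ch == '\n' || ch == '\r' || ch == '\t') {
                if (!token.empty()) {
                    return true;    // delimiter after a token: done
                }
                // otherwise still skipping leading whitespace
            } else {
                token.push_back(ch);
            }
        }
        // Hit end of input: a token with no trailing delimiter still counts.
        return !token.empty();
    }

private:
    std::istream &mIn;
};
```

The loop from the post then reads each token exactly once: `string tok; while(tokenizer.next(tok) && tok != "atoken") { cout << endl << tok; }`. Returning a `bool` also avoids reserving a magic string such as "kNull", which could otherwise collide with a real token.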
Old Feb 15th, 2007, 4:10 AM   #10
rwm
It would be nice to be able to extract the n'th token from the stream, for example:

	string tok = tokenizer+=5; //get the 5th token from this one

But first I'm trying to figure out how to extract tokens properly...
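
For the n'th-token idea, a rough sketch that simply reads and discards tokens with `operator>>` until the n'th one is reached. It is written as a free function over a stream rather than as `operator+=` on the class; `nthToken` is a made-up name, and returning an empty string on failure is an assumption about the desired behaviour.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // only needed to try it on an in-memory string
#include <string>

// Advance n tokens from the current position and return the last one read.
// Returns an empty string if n is 0 or the stream runs out first.
std::string nthToken(std::istream &in, std::size_t n) {
    std::string token;
    for (std::size_t i = 0; i < n; ++i) {
        if (!(in >> token)) {
            return std::string();   // fewer than n tokens remained
        }
    }
    return token;
}
```

Because the stream keeps its position between calls, `nthToken(in, 5)` followed by `nthToken(in, 2)` returns the 5th and then the 7th token of the input.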

Any help/suggestions would be greatly appreciated!

thx!
DaniWeb IT Discussion Community
All times are GMT -5.
Copyright ©2007 DaniWeb® LLC