#1 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
problem with tokenizing
I guess I'm pretty rusty at the moment. No, sorry, make that VERY rusty... I'm having a problem with this program. Given an input file that contains:
this is my loverly file its very interesting innit! hello?
I only get:
this is my loverly file its very interesting innit!
The last word is being missed. Here's the code:
#include <iostream>
#include <fstream>
#include <conio.h> //for kbhit(); non-standard, Windows only
using namespace std;
const char *inputFile = "c:\\testFile.txt";
int main() {
    ifstream in(inputFile,ios::in);
    if(!in) {
        cout << "error: could not get input file";
        return 1;
    }
    char ch;
    char token[100];
    unsigned int index=0;
    while(in.get(ch)) {
        //get tokens
        if(ch == ' ') {
            //got a token
            token[index] = '\0';
            cout << ' ' << token;
            token[0] = '\0';
            index = 0;
        } else {
            token[index] = ch;
            index++;
        }
    }
    cout << "\n\npress a key to exit...";
    while(!kbhit()) ;
    return 0;
}
I can't exactly remember how to store the token using a pointer. It's very embarrassing, I know. I tried a test like this:
char *ptr = new char[100];
for(char ch='a'; ch<'f'; ch++) {
    *ptr = ch;
    ptr++;
}
*ptr = '\0';
but when I try to print it with cout << ptr; I get garbage... I'm doing something very wrong, I know! My cheeks are burning! Anyway, hope someone can help me out. Thx
PS: reminder to myself to sit down and go over some "fundamentals" this weekend... hehe!
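On the pointer question, here is a minimal sketch, assuming the goal is just to fill a heap buffer and print it. The test loop advances ptr itself, so by the time it is printed, ptr no longer points at the start of the buffer. Keeping a second pointer to the start avoids that. The name fillLetters is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: fills a heap buffer with 'a'..'e' through a moving
// cursor and returns the result. The key point is that `start` never moves.
std::string fillLetters() {
    char *start = new char[100];   // remember where the buffer begins
    char *cursor = start;          // advance this pointer instead
    for (char ch = 'a'; ch < 'f'; ch++) {
        *cursor = ch;
        cursor++;
    }
    *cursor = '\0';                // terminate the C string
    assert(cursor - start == 5);   // five letters were written
    std::string result(start);     // read from the start, not the cursor
    delete[] start;                // delete[] also needs the original pointer
    return result;
}
```

Printing with cout << start (rather than the advanced pointer) would show the letters the same way.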
#2 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
Well, I just realised that all I had to do was:
//get tokens
if(ch == ' ' || ch == '\n' || ch == '\r') {
...
but any help with the embarrassing pointer problem would be greatly appreciated! Thx!
#3 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
Hi,
Well, I decided that writing a GetToken function would be a much better way to parse the file:
#include <iostream>
#include <fstream>
#include <cstring> //for strcmp()
#include <conio.h>
using namespace std;
enum TokenType {NUL,DEFORMER,COLON,VALUE};
//get token
TokenType GetToken(const char *p,char token[100]) {
    unsigned int index = 0;
    //copy up to the next delimiter, the end of the string, or a full buffer
    while(*p != ' ' && *p != '\n' && *p != '\0' && index < 99) {
        token[index] = *p;
        p++; index++;
    }
    token[index] = '\0';
    //get token type
    if(!strcmp("deformer",token)) {
        //got a deformer
        return DEFORMER;
    }
    return NUL;
}
int main() {
    const char *p = "deformer is me";
    char token[100];
    if(GetToken(p,token) == DEFORMER) {
        cout << "got a deformer";
    }
    cout << "\n\npress a key to exit...";
    while(!kbhit()) ;
    return 0;
}
It works, but the problem is I don't know how I can get a pointer to the file contents. Any suggestions? Hope someone can help me out! Thx!
#4 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
Something along the lines of:
ifstream in("myfile",ios::in);
char *ptr;
ptr = in.get(); //doesn't work because in.get() returns an integer
then I can do this:
GetToken(ptr,token);
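One way to get a pointer to the file contents is to read the whole file into a std::string first and then take c_str() from it. A minimal sketch, assuming the file is small enough to hold in memory; slurp and slurpFile are made-up names.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything left in a stream into one string.
std::string slurp(std::istream &in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();   // copy the stream's remaining contents
    return buffer.str();
}

// Hypothetical helper: same thing for a file on disk.
// Returns an empty string if the file cannot be opened.
std::string slurpFile(const char *path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        return std::string();
    }
    return slurp(in);
}
```

With that, std::string text = slurpFile("myfile"); gives a string whose text.c_str() is a const char* to the first character, so a GetToken-style function would need to take a const char* parameter. The pointer is only valid while text is alive and unmodified.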
#5 |
|
Hobbyist Programmer
Join Date: Nov 2006
Location: 163H
Posts: 213
Rep Power: 2
Just try tellg(). This member function takes no parameters and returns a value of type pos_type, an integer-like value that represents the current position of the stream's get pointer.
__________________
You never test the depth of a river with both feet. The believer is happy. The doubter is wise. Free speech carries with it some freedom to listen. The next generation will always surpass the previous one. It`s one of the never ending cycles of life. |
#6 |
|
Hobbyist Programmer
Join Date: Nov 2006
Location: 163H
Posts: 213
Rep Power: 2
Before I forget: to move the get pointer of a file to a new location, use seekg(off_type offs, seekdir direc).
offs (offset) is an integer and means how many positions to move from direc. direc (direction) is an enumeration (ios::beg, ios::cur, ios::end) and specifies where to start counting before moving the pointer.
#7 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
Hey, thx,
I already started using seekg and peek... jeez, can't believe how out of touch I am! lol, ta for the help!
#8 |
|
Resident Grouch
Join Date: Jun 2005
Posts: 6,453
Rep Power: 10
You do realize the extraction operator (operator>>) already tokenizes on whitespace when it reads into a string, right? Also, if your OS is Windows, open the file in binary mode or seeks and tells won't work properly. The use of "conio" is non-standard and blows your portability, if that's of any concern to you.
__________________
Abstraction doesn't make it impossible to write bad code; it makes it possible to write superior code. Contributor's Corner: Grumpy on C++ Exceptions DaWei on Pointers |
#9 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
Hey DaWei, no, I didn't know there was a standard tokenizer. Could you give a bit more detail?
Well, I'm only testing this on Windows at the moment. I'm only using conio for keeping the console open while I test. I'm trying to write a tokenizer class, but it doesn't seem to be working properly. Here it is:
class Tokenizer {
public:
    //constructors
    Tokenizer(const char *file);
    Tokenizer(const char *file,istream in);
    //destructor
    ~Tokenizer();
    //operators
    string operator++();
    string operator--();
private:
    //member data
    ifstream mIn;
    char *mFile;
protected:
    //
};
//std dependencies
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib> //for exit()
using namespace std;
//local dependencies
#include "Tokenizer.h"
//constructor 1
Tokenizer::Tokenizer(const char *file) {
    //read in passed file
    mIn.open(file,ios::in);
    if(!mIn) {
        cerr << "error: could not open file for reading!";
        exit(1);
    }
}
//constructor2 - use own ifstream object
Tokenizer::Tokenizer(const char *file,istream in) {
    //
}
//destructor
Tokenizer::~Tokenizer() {
    //close stream
    mIn.close();
}
//get next token from stream
string Tokenizer::operator++() {
    //data
    string token;
    //skip all whitespace
    while(mIn.peek() == ' ') {
        mIn.seekg(1,ios::cur);
    }
    //get token
    char ch;
    while(mIn.get(ch)) {
        //get token
        if(ch == ' ' || ch == '\n') {
            //got token
            //check if it is a valid token
            if(token == "") {
                return "kNull";
            }
            return token;
        } else {
            token.push_back(ch);
        }
    }
    //didnt get a token
    return "kNull";
}
//get previous token from the stream
string Tokenizer::operator--() {
    //
    return "";
}
It doesn't seem to print the last character. I'm testing by running this:
int main() {
    //
    Tokenizer tokenizer(inputFile);
    cout << endl << tokenizer++;
    cout << endl << tokenizer++;
    cout << endl << tokenizer++;
    cout << endl << tokenizer++;
...
It returns kNull for the last token... I'm really battling. I want to be able to do something like this:
while(tokenizer++ != "atoken") {
    cout << endl << tokenizer++;
}
but it's not working... damn, I need way more practice! If anyone has better suggestions or ideas, please shout out! I'm really starting to understand the importance of practical programming. I've never really done any practical programming, and it all makes sense that "practice makes perfect"...
#10 |
|
Professional Programmer
Join Date: Jan 2007
Location: Cape Town
Posts: 291
Rep Power: 2
It would be nice to be able to extract the n'th token from the stream, for example:
string tok = tokenizer+=5; //get the 5th token from this one
But first I'm trying to figure out how to extract tokens properly... Any help/suggestions would be greatly appreciated! Thx!