Programming Forums
Old Oct 10th, 2006, 2:47 AM   #1
iradic
Programmer
 
Join Date: Feb 2006
Posts: 36
Rep Power: 0 iradic is on a distinguished road
(new) strtox function?

Here is a strtox function. It's a variant of the string tokenizing functions. I couldn't find anything similar on the net, so I wrote one.

Args:
source - string to be tokenized
token - buffer that receives the token
limit - size of the token buffer
tox - delimiters that are returned as tokens themselves
sep - delimiters as in strtok
str - string delimiter
eol - collect all chars to end of line, inclusive (like // in C)
esc - escape char (honored only inside a delimited string)

There is a lot of room for improvement...

Questions:
1) How should I use static variables in strtox so that it remembers the args between calls (tox, sep, str, eol and esc)? Do I need module-level static vars or function-level ones?
2) Do you know of any similar function out there?

What do you think...

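/* strtox() */
#include <assert.h>             /* assert */
#include <stdlib.h>             /* malloc, free */
#include <string.h>             /* strlen and friends */
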
char *
strtox (char **source, char **token, size_t limit,
        char *tox, char *sep, char *str, char *eol, char esc)
{
  int flag;
  char *lim, *a, *all, *tx, *sp, *sr, *el;

  char *string = *source;
  char *buf = *token;

  if (!*string)
    return NULL;

  tx = tox;
  sp = sep;
  sr = str;
  el = eol;

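  all =
    malloc (strlen (tox) + strlen (sep) + strlen (str) + strlen (eol) + 1);
  assert (all);

  /* build the combined delimiter set; standard calls are used here in
     place of xstrcat, a helper that is not shown in this post */
  strcpy (all, tox);
  strcat (all, sep);
  strcat (all, str);
  strcat (all, eol);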

  flag = 0;
  lim = buf + limit - 1;

  while (*string && buf < lim) {
    for (a = all; *a && *string; a++) { /* for all */
      if (*string == *a) {      /* we have something... */
        if (flag == 1) {        /* already have token? (return it) */
          for (sp = sep; *sp && *string; sp++) {        /* for sep advance pointer */
            if (*string == *sp) {
              ++string;
              sp = sep;         /* reset */
            }
          }
          free (all);
          *buf = 0;
          return string;        /* return token */
        }
        for (sp = sep; *sp && *string; sp++) {  /* for sep advance pointer */
          if (*string == *sp) {
            ++string;
            sp = sep;           /* reset */
          }
        }
        for (tx = tox; *tx && *string; tx++) {  /* for tox */
          if (*string == *tx) {
            *buf++ = *string++;
            for (sp = sep; *sp && *string; sp++) {      /* for sep advance pointer */
              if (*string == *sp) {
                ++string;
                sp = sep;
              }
            }
            *buf = 0;
            free (all);
            return string;      /* return if tox */
          }
        }
        for (sr = str; *sr && *string; sr++) {  /* for str */
          if (*string == *sr) {
            while (*string && buf < lim) {      /* collect until str ends */
              *buf++ = *string++;
              /* closing delimiter; buf < lim keeps room for the final 0 */
              if (*string == *sr && *(string - 1) != esc && buf < lim) {
                *buf++ = *string++;
                for (sp = sep; *sp && *string; sp++) {  /* for sep advance pointer */
                  if (*string == *sp) {
                    ++string;
                    sp = sep;
                  }
                }
                *buf = 0;
                free (all);
                return string;
              }
            }
            sr = str;           /* reset */
          }
        }
        for (el = eol; *el && *string; el++) {  /* for eol */
          if (*string == *el) {
            while (*string && buf < lim) {      /* collect until end of line */
              if (*string == '\n' || *string == '\r') {
                ++string;       /* step past the line terminator */
                for (sp = sep; *sp && *string; sp++) {  /* for sep advance pointer */
                  if (*string == *sp) {
                    ++string;
                    sp = sep;
                  }
                }
                free (all);
                *buf = 0;
                return string;  /* return eol */
              }
              *buf++ = *string++;       /* collect eol */
            }
            el = eol;           /* reset (if multiple) */
          }
        }
        a = all;
      }
    }
    *buf++ = *string++;         /* the rest... */
    flag = 1;                   /* set token flag */
  }

  *buf = 0;
  free (all);
  return string;
}
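
A note on the repeated "for sep advance pointer" loops above: because the loop's sp++ runs immediately after sp = sep, the first character of sep does not appear to be rechecked after a match, so a tab followed by a space can leave the space unskipped until the next call. As a minimal sketch, assuming a helper named skip_sep (hypothetical, not part of the original post), the same step could be written once with the standard strspn:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not from the original post: return a pointer to
   the first character of s that is not listed in sep. */
static const char *
skip_sep (const char *s, const char *sep)
{
  assert (s && sep);
  /* strspn returns the length of the leading run of characters from sep */
  return s + strspn (s, sep);
}
```

Each of the six separator loops could then collapse to a single call, at the cost of making the string pointer const-correct throughout.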

Here is the sample input / output:
rules:
  static char *tox = "'().";
  static char *sep = " \t\n\r";
  static char *str = "\"";
  static char *eol = ";";
  static char esc = '\\';

input:
(I) you "he" 
	tab ; she
	; then
	we they.

output:
-- buf: <(>
-- buf: <I>
-- buf: <)>
-- buf: <you>
-- buf: <"he">
-- buf: <tab>
-- buf: <; she>
-- buf: <; then>
-- buf: <we>
-- buf: <they>
-- buf: <.>

EDIT: In attachment is the source for the test program.
Thanks, bye
Attached Files
File Type: txt attach.c.txt (5.8 KB, 10 views)
Old Oct 10th, 2006, 10:48 AM   #2
Narue
Professional Programmer
 
Narue's Avatar
 
Join Date: Sep 2005
Posts: 419
Rep Power: 4 Narue is on a distinguished road
>Here is strtox function. It's a variant of tokenize string functions.
You need to change the name. strtox suggests a naive attempt to convert a string to hexadecimal, so you're inadvertently misleading people. Also, any function beginning with str and followed by a lower case letter is reserved by the implementation.

>1) How to use static variables in strtox... so that it remembers args between calls (tox sep str eol and esc).
Don't. One of the biggest problems with strtok is that it destroys any shred of reentrance by using a static variable internally. That's why you see things like strsep and ad hoc tokenizers all over the place.
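
One way to keep the arguments between calls without static variables is to let the caller own them. The sketch below is only an illustration under stated assumptions: the names tok_rules, tok_state, tok_init and tok_next are hypothetical, and tok_next is deliberately simplified to honor only the sep rule rather than the full strtox semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule set; the caller owns it, so nothing is static. */
struct tok_rules {
  const char *tox, *sep, *str, *eol;
  char esc;
};

/* Hypothetical per-scan state: one of these per string being tokenized. */
struct tok_state {
  const struct tok_rules *rules;
  const char *pos;              /* next unread character */
};

static void
tok_init (struct tok_state *st, const struct tok_rules *rules,
          const char *source)
{
  assert (st && rules && source);
  st->rules = rules;
  st->pos = source;
}

/* Copy the next sep-delimited token into buf (at most limit - 1 chars).
   Returns 1 if a token was produced, 0 at end of input.  An over-long
   token is cut to fit and its remainder is returned by the next call. */
static int
tok_next (struct tok_state *st, char *buf, size_t limit)
{
  size_t len;

  assert (st && st->rules && buf && limit > 0);
  st->pos += strspn (st->pos, st->rules->sep);  /* skip leading separators */
  if (*st->pos == '\0')
    return 0;
  len = strcspn (st->pos, st->rules->sep);      /* length of the token */
  if (len > limit - 1)
    len = limit - 1;
  memcpy (buf, st->pos, len);
  buf[len] = '\0';
  st->pos += len;
  return 1;
}
```

Because all state lives in the tok_state the caller passes in, two scans can be interleaved without disturbing each other, which is exactly what strtok's hidden static pointer prevents.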

>2) Do you know any similar function out there?
Nothing with the exact semantics you seem to want. But strsep is a nice, mature implementation that you can derive from.
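
For reference, here is a rough sketch of how strsep is typically used. Note that strsep is a BSD/glibc extension rather than standard C, that it writes terminators into the string it scans, and that the wrapper name split_fields is hypothetical.

```c
#define _DEFAULT_SOURCE         /* asks glibc to declare strsep; BSD and
                                   macOS declare it by default */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: split line in place on any character in delims
   and store up to max non-empty fields.  Returns the number stored. */
static size_t
split_fields (char *line, const char *delims, char **fields, size_t max)
{
  size_t n = 0;
  char *field;

  assert (line && delims && fields);
  /* strsep keeps its position in the caller's pointer (line here),
     so there is no hidden static state */
  while (n < max && (field = strsep (&line, delims)) != NULL) {
    if (*field != '\0')         /* adjacent delimiters yield empty fields */
      fields[n++] = field;
  }
  return n;
}
```

Unlike strtok, strsep reports empty fields between adjacent delimiters, so the wrapper filters them out to get strtok-like results.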
__________________
Even if the voices aren't real, they have some pretty good ideas.
Old Oct 10th, 2006, 2:30 PM   #3
iradic
Programmer
 
Join Date: Feb 2006
Posts: 36
Rep Power: 0 iradic is on a distinguished road
OK, thanks for the input.

Could you (or someone else) tell me the difference between using "const char" and "char" in args?
I only know that it should prevent changing an arg passed as "const char". Is this only at compile time? It doesn't seem really helpful, but I see it in a lot of functions.

Bye
Old Oct 10th, 2006, 3:07 PM   #4
DaWei
Resident Grouch
 
DaWei's Avatar
 
Join Date: Jun 2005
Posts: 6,453
Rep Power: 10 DaWei is on a distinguished road
It tells the compiler that the const object should not be changed in the body of the function. It's an aid to you in preventing slip-ups. I don't know about you, but I need all the help I can get; particularly since I formed a lot of habits before there were standards.
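
A small illustration of that point, using a hypothetical function count_char that is not from this thread: the check happens at compile time only, and nothing is verified while the program runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: count how often c occurs in s.  The const says
   this function promises not to write through s. */
static size_t
count_char (const char *s, char c)
{
  size_t n = 0;

  assert (s);
  for (; *s; s++) {             /* the pointer itself may still move */
    if (*s == c)
      n++;
    /* *s = '_';  <- uncommenting this makes the compiler reject the
       function, because it writes through a pointer to const */
  }
  return n;
}
```

The const also documents the interface: a caller can pass a string literal or a buffer it still needs and know from the prototype alone that the function will leave it untouched.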
__________________
Abstraction doesn't make it impossible to write bad code; it makes it possible to write superior code.
Contributor's Corner: Grumpy on C++ Exceptions DaWei on Pointers