Programming Forums
Old Mar 19th, 2006, 12:33 AM   #1
programmingnoob
Hobbyist Programmer
 
Join Date: Feb 2006
Posts: 154
Rep Power: 3 programmingnoob is on a distinguished road
writing a scanner (lexical analysis)

This assignment is to write a scanner for a simple programming
language to do the lexical analysis of statements in the language. The set
of tokens for the language is essentially given by the declaration

   enum Token {program, var, procedure, // both token and
               begin, end, integer,    // token type
               read, writeln, then,
               If,                 // if is the token
               Else,               // else is the token
               scln,               // ; is the token
               cln,                // : is the token
               cma,                // , is the token
               asgn,               // := is the token
               plus,               // + is the token
               minus,              // - is the token
               mult,               // * is the token
               Div,                // div is the token
               eql,                // = is the token
               neq,                // <> is the token
               lss,                // < is the token
               gtr,                // > is the token
               lp,                 // ( is the token
               rp,                 // ) is the token
               id, Int, String, error, eof};

When the scanner finds the name of a token in the source file, the name is
entered into a symbol table, and the item pointer returned is then used to
represent the token. In this assignment the type of the attributes of a
symbol is

struct Attributes {
    Token type;
    int value;
};

Each type of token is assigned a unique code as described in the
textbook; these codes are the constants defined in the enumerated type
Token given above. Your symbol table must be initialized so that all of the
predefined tokens (keywords and symbols like , and + ) and their codes are
stored in it before scanning. Of course, the keywords listed in Token
above are not considered to be identifiers because they are different kinds
of tokens. An identifier, whose type is id, is a string of letters and
digits whose first character is a letter. String constants are any number
of characters enclosed in double quotes ("); the string is the name of the
token and its type is String. There are two kinds of special tokens. Error
tokens, whose type is error, are generated by scanning characters that are
not legal tokens. When encountering an end of file a special token
is generated whose type is eof. Note that EOF is a predefined C++ constant
whose value in most implementations is -1. An integer constant such as 3
is turned into a token (item pointer) and the value of the integer, in this
case 3, is stored as the value of the value attribute in its item.
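
The lexical rules in the paragraph above (identifiers, integer constants, string constants, and two-character symbols such as := and <>) can be sketched as a small hand-written routine with one character of lookahead. This is only an illustration under stated assumptions, not the assignment's required design: the Lexeme struct and the nextLexeme function are hypothetical names, the Token enumeration is cut down to a few members, string text is stored without its quotes, and an unterminated string is reported as an error token.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Reduced subset of the assignment's Token enumeration (illustration only).
enum Token { cln, asgn, neq, lss, id, Int, String, error, eof };

// Hypothetical helper type: the token type plus the characters that formed it.
struct Lexeme {
    Token type;
    std::string text;
};

// Reads one lexeme from `in`, peeking one character ahead where needed.
Lexeme nextLexeme(std::istream& in) {
    typedef std::char_traits<char> Traits;

    // Skip whitespace; isspace is false for the end-of-file value.
    while (std::isspace(in.peek())) in.get();

    int c = in.get();
    if (c == Traits::eof()) return Lexeme{eof, ""};

    std::string text(1, static_cast<char>(c));

    if (std::isalpha(c)) {  // letter followed by letters or digits
        while (std::isalnum(in.peek())) text += static_cast<char>(in.get());
        // A full scanner would now look `text` up in the symbol table,
        // so that keywords such as "begin" keep their own token type.
        return Lexeme{id, text};
    }
    if (std::isdigit(c)) {  // one or more digits
        while (std::isdigit(in.peek())) text += static_cast<char>(in.get());
        return Lexeme{Int, text};
    }
    if (c == '"') {  // string constant: everything up to the closing quote
        text.clear();
        int ch;
        while ((ch = in.get()) != Traits::eof() && ch != '"')
            text += static_cast<char>(ch);
        return Lexeme{ch == '"' ? String : error, text};
    }
    if (c == ':') {  // ":" alone is cln, ":=" is asgn
        if (in.peek() == '=') { text += static_cast<char>(in.get()); return Lexeme{asgn, text}; }
        return Lexeme{cln, text};
    }
    if (c == '<') {  // "<" alone is lss, "<>" is neq
        if (in.peek() == '>') { text += static_cast<char>(in.get()); return Lexeme{neq, text}; }
        return Lexeme{lss, text};
    }
    return Lexeme{error, text};  // any character this sketch does not know
}
```

Calling nextLexeme repeatedly on a std::istringstream (or on the ifstream fin) yields one lexeme per call until the type is eof. The remaining single-character symbols would follow the same pattern as the ':' and '<' branches.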

Your scanner is an instance of a class whose declaration is of the
form

class Scanner {
              public:
                   Scanner(string s); // s is the name of the input file
                   item<Attributes> * get();
              private:
                   ifstream fin; // opened by the Scanner constructor
                   ...          };

The constructor Scanner opens the file named s for the input file stream
fin. The function get finds the next token in the input file and returns
its item pointer. This function also prints the characters in the input
file as they are scanned, including comments. Comments are like the C++ //
comments.

Test your scanner on the source code in file test4 on the 337
Blackboard site. After scanning this program (and printing the source code,
which get does), print out the sequence of tokens produced by your scanner.
For each token in this sequence print its name and its type. For integer
constants (tokens of type Int), also print the integer value. Of course,
other tokens do not have such a value.
Old Mar 19th, 2006, 12:33 AM   #2
programmingnoob
Hobbyist Programmer
 
Join Date: Feb 2006
Posts: 154
Rep Power: 3 programmingnoob is on a distinguished road
as my name suggests, i'm a programming noob ...

item<Attributes> * get();
^^ what does the above statement mean?
Old Mar 19th, 2006, 12:59 AM   #3
programmingnoob
Hobbyist Programmer
 
Join Date: Feb 2006
Posts: 154
Rep Power: 3 programmingnoob is on a distinguished road
oh also ... based on the project description, am i supposed to hard-code the scanner or use finite-automata theory and regular expressions and all?
Old Mar 19th, 2006, 1:06 AM   #4
grumpy
Programming Guru
 
Join Date: Jun 2005
Location: Adelaide, South Australia
Posts: 1,221
Rep Power: 5 grumpy is on a distinguished road
Firstly, there is little point in posting your assignment questions here. If you don't understand the basic intent of an assignment, ask the person who gave it to you; we can't guess. People here won't help you with assignments, as the purpose of assignments is that you learn by doing them. They will only help you if you ask particular questions about specific problems (e.g. if you're doing the assignment and run into something you don't understand).

Second, have a look at the sticky thread at the top of the C++ forum entitled "How to post a question" (or something similar). It will give you tips on how to ask questions in a way that increases your chances of getting a useful answer.

class Scanner {
              public:
                   Scanner(string s); // s is the name of the input file
                   item<Attributes> * get();
              private:
                   ifstream fin; // opened by the Scanner constructor
                   ...          };
item<Attributes> * get() is a declaration of a member function named get() which returns a pointer to an object of type item<Attributes>.

item<Attributes> would be a particular instantiation of a template class named item, which would be declared as something like:
template<class T> class item
{
     // whatever item is, in terms of type T
};

Note that item is not a standard class in the C++ library, so I can't tell you what it does. It is presumably something specific to your assignment.
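
To make the declaration concrete, here is one guess at how such a template and a pointer to it could be used. Everything about item below is an assumption made for illustration (the name and attributes members, the constructor, and the valueOf helper are hypothetical); the real item comes from the assignment's own materials and may look quite different. The Token enumeration is also cut down to a few members.

```cpp
#include <string>

// Reduced subset of the assignment's Token enumeration (illustration only).
enum Token { id, Int, String, error, eof };

struct Attributes {
    Token type;
    int value;
};

// Hypothetical shape of item: a symbol-table entry that pairs a name
// with attributes of some type T.
template <class T>
class item {
public:
    item(const std::string& name, const T& attrs)
        : name_(name), attrs_(attrs) {}

    const std::string& name() const { return name_; }
    T& attributes() { return attrs_; }

private:
    std::string name_;
    T attrs_;
};

// Takes the same kind of pointer that Scanner::get() is declared to return
// and reads the value attribute through it.
int valueOf(item<Attributes>* p) {
    return p->attributes().value;
}
```

With this sketch, an integer constant 3 could be stored as item<Attributes> three("3", Attributes{Int, 3}); and valueOf(&three) would read back the stored value through the pointer.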

One little quibble I picked up in your assignment question: the line
Quote:
Note that EOF is a predefined C++ constant whose value in most implementations is -1.
is not quite right. EOF is a macro inherited from the C library; in C++ it is available through <cstdio>. Neither the C nor the C++ standard requires EOF to have a value of -1: it only has to be a negative int. So test against EOF itself (or std::char_traits<char>::eof() when reading from an iostream) rather than hard-coding -1.
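
A minimal sketch of an end-of-file test that does not depend on the numeric value of EOF. The function name countChars is made up for the example; the point is only that the result of get() is kept in an integer type and compared against the end-of-file value by name.

```cpp
#include <cstddef>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>

// Counts the characters in a stream by reading until end of input.
std::size_t countChars(std::istream& in) {
    typedef std::char_traits<char> Traits;
    std::size_t n = 0;
    // get() returns an int_type rather than a char, so the end-of-file
    // marker cannot be confused with any real character value.
    for (Traits::int_type c = in.get(); c != Traits::eof(); c = in.get())
        ++n;
    return n;
}
```

The same loop works unchanged on an ifstream such as the scanner's fin. Storing the result of get() in a plain char instead would risk mistaking a byte such as 0xFF for end of file on platforms where char is signed.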
Old Mar 19th, 2006, 1:37 AM   #5
programmingnoob
Hobbyist Programmer
 
Join Date: Feb 2006
Posts: 154
Rep Power: 3 programmingnoob is on a distinguished road
thanks a lot!




so the assignment gives the token declaration ...the enum token ...

now i have to feed it into symbol table ... how do i do that?
i dont wanna do it manually.
i hope there is a better way of inserting enum token in the symbol table
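
One common way to avoid a long run of hand-written insert calls is a table-driven setup: C++ enumerators do not carry their spellings, so the spellings have to be written down once, but they can live in a single array that a loop feeds into the symbol table. The sketch below is only an illustration under assumptions: std::map stands in for whatever symbol table class the course provides, the names Predefined, kPredefined, and makeSymbolTable are hypothetical, and the lowercase keyword spellings are inferred from the comments in the enum. The namespace just keeps enumerators such as read, begin, and end away from library names.

```cpp
#include <cstddef>
#include <map>
#include <string>

namespace sketch {

enum Token { program, var, procedure, begin, end, integer,
             read, writeln, then, If, Else, scln, cln, cma, asgn,
             plus, minus, mult, Div, eql, neq, lss, gtr, lp, rp,
             id, Int, String, error, eof };

struct Attributes {
    Token type;
    int value;
};

// One row per predefined token: its spelling and its code.
struct Predefined {
    const char* name;
    Token type;
};

// The only place where the spellings are written out.
const Predefined kPredefined[] = {
    {"program", program}, {"var", var}, {"procedure", procedure},
    {"begin", begin}, {"end", end}, {"integer", integer},
    {"read", read}, {"writeln", writeln}, {"then", then},
    {"if", If}, {"else", Else}, {"div", Div},
    {";", scln}, {":", cln}, {",", cma}, {":=", asgn},
    {"+", plus}, {"-", minus}, {"*", mult}, {"=", eql},
    {"<>", neq}, {"<", lss}, {">", gtr}, {"(", lp}, {")", rp}
};

// Stand-in for the course's symbol table (an assumption).
typedef std::map<std::string, Attributes> SymbolTable;

// Builds a table that already contains every predefined token.
SymbolTable makeSymbolTable() {
    SymbolTable table;
    const std::size_t n = sizeof kPredefined / sizeof kPredefined[0];
    for (std::size_t i = 0; i < n; ++i) {
        Attributes a = { kPredefined[i].type, 0 };  // value is unused here
        table[kPredefined[i].name] = a;
    }
    return table;
}

}  // namespace sketch
```

After sketch::makeSymbolTable() returns, looking up a scanned word tells the scanner whether it is a keyword or a new identifier; anything not found would be inserted with type id.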
Old Mar 19th, 2006, 1:35 PM   #6
mikaoj
Programmer
 
Join Date: Aug 2005
Location: Norway
Posts: 56
Rep Power: 0 mikaoj is an unknown quantity at this point
Symbol table, do you mean an intermediate representation?
__________________
Heh.
Old Mar 19th, 2006, 4:12 PM   #7
programmingnoob
Hobbyist Programmer
 
Join Date: Feb 2006
Posts: 154
Rep Power: 3 programmingnoob is on a distinguished road
Quote:
Originally Posted by prog master
Symbol table, do you mean an intermediate representation?
hmmm yeah you may think so
DaniWeb IT Discussion Community
All times are GMT -5. The time now is 11:54 PM.

Powered by vBulletin® Version 3.7.0, Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Copyright ©2007 DaniWeb® LLC