#1
Hobbyist Programmer
Join Date: Feb 2006
Posts: 154
Rep Power: 3
writing a scanner (lexical analysis)
This assignment is to write a scanner for a simple programming language to do the lexical analysis of statements in the language. The set of tokens for the language is essentially given by the declaration

enum Token {program, var, procedure, // both token and
            begin, end, integer, // token type
            read, writeln, then,
            If, // if is the token
            Else, // else is the token
            scln, // ; is the token
            cln, // : is the token
            cma, // , is the token
            asgn, // := is the token
            plus, // + is the token
            minus, // - is the token
            mult, // * is the token
            Div, // div is the token
            eql, // = is the token
            neq, // <> is the token
            lss, // < is the token
            gtr, // > is the token
            lp, // ( is the token
            rp, // ) is the token
            id, Int, String, error, eof};

When the scanner finds the name of a token in the source file, it is entered into a symbol table and the item pointer returned is then used to represent the token. In this assignment the type of the attributes of a symbol is

struct Attributes {
    Token type;
    int value;
};

Each type of token is assigned a unique code as described in the textbook; these codes are the constants defined in the enumerated type Token given above. Your symbol table must be initialized so that all of the predefined tokens (keywords and symbols like , and + ) and their codes are stored in it before scanning. Of course, the keywords listed in Token above are not considered to be identifiers because they are different kinds of tokens.

An identifier, whose type is id, is a string of letters and digits whose first character is a letter. String constants are any number of characters enclosed in double quotes ("); the string is the name of the token and its type is String.

There are two kinds of special tokens. Error tokens are generated by scanning characters that are not legal tokens; i.e., their type is error. When encountering an end of file, a special token is generated whose type is eof. Note that EOF is a predefined C++ constant whose value in most implementations is -1.

An integer constant such as 3 is turned into a token (item pointer) and the value of the integer, in this case 3, is stored as the value of the value attribute in its item.

Your scanner is an instance of a class whose declaration is of the form

class Scanner {
public:
    Scanner(string s); // s is the name of the input file
    item<Attributes> * get();
private:
    ifstream fin; // opened by the constructor Scanner
    ...
};

The constructor Scanner opens the file named s for the input file stream fin. The function get finds the next token in the input file and returns its item pointer. This function also prints the characters in the input file as they are scanned, including comments. Comments are like the C++ // comments.

Test your scanner on the source code in file test4 on the 337 Blackboard site. After scanning this program (and printing the source code, which get does), print out the sequence of tokens produced by your scanner. For each token in this sequence print its name and its type. For integer-constant tokens, also print the integer value; other tokens do not have such a value.
#2
Hobbyist Programmer
Join Date: Feb 2006
Posts: 154
Rep Power: 3
as my name suggests, i'm a programming noob ...
item<Attributes> * get();
#3
Hobbyist Programmer
Join Date: Feb 2006
Posts: 154
Rep Power: 3
oh also ... based on the project description, am i supposed to hard-code the scanner or use finite-automata theory and regular expressions and all?
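The assignment text does not say which approach is expected, so that is a question for whoever set it. For illustration only, here is a minimal sketch of the hand-coded style, assuming a hypothetical next function and Lexeme struct (neither is from the assignment) that read one character at a time with peek and get. It only separates identifiers, integer constants, and single characters; keywords, two-character symbols like := and <>, strings, and comments are left out.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

namespace scanner_sketch {

// Hypothetical lexeme categories for this sketch only.
enum Kind { Identifier, Integer, Other, End };

struct Lexeme {
    Kind kind;
    std::string text;
};

// Reads the next lexeme from the stream, deciding by its first character.
Lexeme next(std::istream& in) {
    // Skip blanks, tabs, and newlines between lexemes.
    while (std::isspace(in.peek())) in.get();

    int c = in.peek();
    if (c == std::char_traits<char>::eof()) return {End, ""};

    std::string text;
    if (std::isalpha(c)) {
        // Identifier: a letter followed by any letters or digits.
        while (std::isalnum(in.peek())) text += static_cast<char>(in.get());
        return {Identifier, text};
    }
    if (std::isdigit(c)) {
        // Integer constant: one or more digits.
        while (std::isdigit(in.peek())) text += static_cast<char>(in.get());
        return {Integer, text};
    }
    // Anything else is returned as a single character.
    text += static_cast<char>(in.get());
    return {Other, text};
}

} // namespace scanner_sketch
```

Called repeatedly on the input x1 := 42, this yields the identifier x1, then : and = as separate single characters, then the integer 42. A finite-automaton design would encode the same decisions as states and transitions rather than as nested loops.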
#4
Programming Guru
Join Date: Jun 2005
Location: Adelaide, South Australia
Posts: 1,221
Rep Power: 5
Firstly, there is little point in posting your assignment questions here. If you don't understand the basic intent of an assignment, ask the person who gave it to you; we can't guess. People here won't help you with assignments, as the purpose of assignments is that you learn by doing them. They will only help you if you ask particular questions about specific problems (e.g. if you're doing the assignment and run into something you don't understand).

Second, have a look at the sticky thread at the top of the C++ forum entitled "How to post a question" (or something similar). It will give you tips on how to ask questions in a way that increases your chances of getting a useful answer.
class Scanner {
public:
    Scanner(string s); // s is the name of the input file
    item<Attributes> * get();
private:
    ifstream fin; // opened by the constructor Scanner
    ...
};

item<Attributes> would be a particular instantiation of a template class named item, which would be declared as something like:

template<class T> class item
{
    // whatever item is, in terms of type T
};

Note that item is not a standard class in the C++ library, so I can't tell you what it does. It is presumably something specific to your assignment.

One little quibble I picked up in your assignment question: the line; Quote:
#5
Hobbyist Programmer
Join Date: Feb 2006
Posts: 154
Rep Power: 3
thanks a lot!
so the assignment gives the token declaration ... the enum Token ... now i have to feed it into the symbol table ... how do i do that? i don't wanna do it manually. i hope there is a better way of inserting the enum tokens into the symbol table
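One way to avoid writing a separate insert call for every token is to list the spellings and codes in one array and loop over it. This is a minimal sketch, assuming the symbol table can be modeled as a std::map and that item is a simple struct; the real item template and symbol table come from the course materials and will differ. The names Predefined, predefined, SymbolTable, and initialize are made up for this example, and the namespace is only there so enumerators such as read and end cannot clash with library names.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace scanner_sketch {

// Token codes as given in the assignment.
enum Token { program, var, procedure, begin, end, integer,
             read, writeln, then, If, Else,
             scln, cln, cma, asgn, plus, minus, mult, Div,
             eql, neq, lss, gtr, lp, rp,
             id, Int, String, error, eof };

struct Attributes {
    Token type;
    int value;
};

// Hypothetical stand-in for the course's item template.
template <class T>
struct item {
    std::string name;
    T attributes;
};

// Assumption: the symbol table maps a token's name to its item.
using SymbolTable = std::map<std::string, item<Attributes>>;

// One row per predefined token: its spelling and its code.
struct Predefined {
    const char* name;
    Token type;
};

const Predefined predefined[] = {
    {"program", program}, {"var", var}, {"procedure", procedure},
    {"begin", begin}, {"end", end}, {"integer", integer},
    {"read", read}, {"writeln", writeln}, {"then", then},
    {"if", If}, {"else", Else}, {"div", Div},
    {";", scln}, {":", cln}, {",", cma}, {":=", asgn},
    {"+", plus}, {"-", minus}, {"*", mult},
    {"=", eql}, {"<>", neq}, {"<", lss}, {">", gtr},
    {"(", lp}, {")", rp},
};

// Stores every predefined token in the table before scanning starts.
void initialize(SymbolTable& table) {
    for (const Predefined& p : predefined) {
        table[p.name] = item<Attributes>{p.name, Attributes{p.type, 0}};
    }
}

} // namespace scanner_sketch
```

C++ has no built-in way to turn an enumerator into its spelling, so the spellings still have to be written out once; the array just keeps them in a single place next to their codes.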
#6
Programmer
Join Date: Aug 2005
Location: Norway
Posts: 56
Rep Power: 0
Symbol table, do you mean an intermediate representation?
__________________
Heh.
#7
Hobbyist Programmer
Join Date: Feb 2006
Posts: 154
Rep Power: 3
Quote: