writing a scanner (lexical analysis)
This assignment is to write a scanner for a simple programming language to do the lexical analysis of statements in the language. The set of tokens for the language is essentially given by the declaration:
enum token {program, var, procedure, // both token and
When the scanner finds the name of a token in the source file, the name is entered into a symbol table, and the item pointer returned is then used to represent the token. In this assignment the type of the attributes of a symbol is:
struct Attributes {
Each type of token is assigned a unique code as described in the textbook; these codes are the constants defined in the enumerated type Token given above. Your symbol table must be initialized so that all of the predefined tokens (keywords and symbols like , and + ) and their codes are stored in it before scanning. Of course, the keywords listed in Token above are not considered to be identifiers because they are different kinds of tokens.
An identifier, whose type is id, is a string of letters and digits whose first character is a letter. String constants are any number of characters enclosed in double quotes ("); the string is the name of the token and its type is String.
There are two kinds of special tokens. Error tokens are generated by scanning characters that are not legal tokens; i.e., their type is error. When the end of the file is encountered, a special token is generated whose type is eof. Note that EOF is a macro predefined by the C++ standard library (in <cstdio>) whose value in most implementations is -1.
An integer constant such as 3 is turned into a token (item pointer), and the value of the integer, in this case 3, is stored in the value attribute of its item.
Your scanner is an instance of a class whose declaration is of the form:
class Scanner {
The constructor Scanner opens the file named s for the input file stream fin. The function get finds the next token in the input file and returns its item pointer. This function also prints the characters in the input file as they are scanned, including comments. Comments are like the C++ // comments.
Test your scanner on the source code in file test4 on the 337 Blackboard site. After scanning this program (and printing the source code, which get does), print out the sequence of tokens produced by your scanner. For each token in this sequence, print its name and its type. For tokens of type integer, also print the integer value; other tokens do not have such a value. |
as my name suggests, i'm a programming noob ...
:
item<Attributes> * get(); |
oh also ... based on the project description, am i supposed to hard-code the scanner or use finite-automata theory and regular expressions and all?
|
Firstly, there is little point in posting your assignment questions here. If you don't understand the basic intent of an assignment, ask the person who gave it to you; we can't guess. People here won't help you with assignments, as the purpose of an assignment is that you learn by doing it. They will only help you if you ask particular questions about specific problems (e.g. if you're doing the assignment and run into something you don't understand). Secondly, have a look at the sticky thread at the top of the C++ forum entitled "How to post a question" (or something similar). It will give you tips on how to ask questions in a way that increases your chances of getting a useful answer.
:
class Scanner {
item<Attributes> would be a particular instantiation of a template class named item, which would be declared as something like:
template<class T> class item
Note that item is not a standard class in the C++ library, so I can't tell you what it does. It is presumably something specific to your assignment. One little quibble I picked up in your assignment question: the line; Quote:
|
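For illustration only, here is a guess at what such a template and its instantiation could look like. Since item is specific to the assignment, everything below is an assumption: the members of item, the accessor names, and the two fields of Attributes (type and value are inferred from the assignment text, which mentions a token type and a value attribute).

```cpp
#include <string>

// Hypothetical stand-in for the assignment's Attributes struct; the real fields are not shown in the thread.
struct Attributes {
    int type;   // token code
    int value;  // integer value, used for integer constants
};

// Hypothetical symbol-table entry: a name plus a payload of some type T.
template <class T>
class item {
public:
    item(const std::string& name, const T& data) : name_(name), data_(data) {}
    const std::string& name() const { return name_; }
    T& data() { return data_; }
    const T& data() const { return data_; }
private:
    std::string name_;
    T data_;
};

// item<Attributes> is the class the compiler generates from the template with T = Attributes.
// A function declared like the assignment's get() hands back a pointer to such an entry.
int valueOf(const item<Attributes>* entry) {
    return entry->data().value;
}
```

So a declaration such as item<Attributes> * get(); reads as "get takes no arguments and returns a pointer to an item whose payload is an Attributes". What the real item class offers beyond that depends on the code supplied with the assignment.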
thanks a lot!
so the assignment gives the token declaration ...the enum token ... now i have to feed it into symbol table ... how do i do that? i dont wanna do it manually. i hope there is a better way of inserting enum token in the symbol table |
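One possible way to avoid a long run of hand-written insert calls is to keep the keyword names in an array, in the same order as the enumerators, and load the table in a loop. C++ cannot produce the enumerator names as strings by itself, so the names still have to be listed once. This is only a sketch under assumptions: std::map stands in for the assignment's symbol table (whose interface is not shown), only the first three enumerators are taken from the assignment, and the position of id in the enum is made up.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Only program, var and procedure come from the assignment; the rest of its list is not shown.
// id is placed right after them purely for this sketch.
enum Token { program, var, procedure, id };

// Names in the same order as the enumerators, so entry i names Token(i).
const char* const kKeywordNames[] = { "program", "var", "procedure" };

// Stand-in for the real symbol table: maps a token name to its code.
using SymbolTable = std::map<std::string, Token>;

// Builds a table that already contains every predefined keyword.
SymbolTable makePreloadedTable() {
    SymbolTable table;
    const std::size_t count = sizeof(kKeywordNames) / sizeof(kKeywordNames[0]);
    for (std::size_t i = 0; i < count; ++i) {
        table[kKeywordNames[i]] = static_cast<Token>(i);
    }
    return table;
}

// Keywords are found in the table; any other word is treated as an ordinary identifier.
Token classify(const SymbolTable& table, const std::string& word) {
    SymbolTable::const_iterator it = table.find(word);
    return it == table.end() ? id : it->second;
}
```

The array and the enum have to be kept in the same order by hand; if that feels fragile, an array of explicit name/code pairs does the same job at the cost of a little more typing.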
Symbol table, do you mean an intermediate representation?
|
Copyright ©2007 DaniWeb® LLC