
(Solved) : Write Program C Please Help Remainder Semester Building Programs Interpret Small Language Q35561158


For the remainder of the semester we will be building programs that interpret a small language. The language will have constants, a small number of keywords, and some operators.

The remainder of the semester will be broken into three pieces:

Program 2 – Lexical analyzer

Program 3 – Parser

Program 4 – Interpreter

For Program 2, the lexical analyzer, you will be provided with a description of the lexical syntax of the language. You will produce a lexical analysis function and a program to test it. The lexical analyzer function must have the following calling signature:

Token getNextToken(istream& in, int& linenumber);

The first argument to getNextToken is a reference to an istream that the function should read from. The second argument to getNextToken is a reference to an integer that contains the current line number. getNextToken will update this integer every time it reads a newline. getNextToken returns a Token. A Token is a class that contains a TokenType, a string for the lexeme, and the line number that the token was found on.

A header file, tokens.h, will be provided for you. It contains a declaration for the Token class, and a declaration for all of the TokenType values. You MUST use the header file that is provided. You may NOT change it.

The lexical rules of the language are as follows:

1. The language has identifiers, which are defined to be a letter followed by zero or more letters or numbers. This will be the TokenType ID.

2. The language has integer constants, which are defined to be one or more digits. This will be the TokenType ICONST.

3. The language has string constants, which are a double-quoted sequence of characters, all on the same line. This will be the TokenType SCONST.

4. A string constant can include escape sequences: a backslash followed by a character. The sequence \n should be interpreted as a newline. The sequence \\ should be interpreted as a backslash. All other escapes should simply be interpreted as the character after the backslash.

5. The language has reserved the keywords print, set, if, loop, begin, end. They will be TokenTypes PRINT SET IF LOOP BEGIN END.

6. The language has several operators. They are + - * / ( ), which will be TokenTypes PLUS MINUS STAR SLASH LPAREN RPAREN.

7. The language recognizes a semicolon as the token SC

8. The language recognizes a newline as the token NL

9. A comment is all characters from a # to the end of the line; it is ignored and is not returned as a token. NOTE that a # in the middle of an SCONST is NOT a comment!

10. Whitespace between tokens can be used for readability. It serves to delimit tokens.

11. An error will be denoted by the ERR token.

12. End of file will be denoted by the DONE token.

Note that any error detected by the lexical analyzer should result in the ERR token, with the lexeme value equal to the string recognized when the error was detected. Note also that both ERR and DONE are unrecoverable. Once the getNextToken function returns a Token for either of these token types, you shouldn’t call getNextToken again.
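The lexical rules above can be sketched as a single function. This is a minimal illustration, not the reference solution: the real tokens.h is not shown here, so the TokenType enum and Token struct below are hypothetical stand-ins for it. Other assumptions: the SCONST lexeme holds the string contents without the quotes and with escapes already interpreted, an unterminated string stops before the newline, and the tokenize helper (also hypothetical) starts the line counter at 1.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the provided tokens.h, whose contents are not shown.
enum TokenType { ID, ICONST, SCONST, PRINT, SET, IF, LOOP, BEGIN, END,
                 PLUS, MINUS, STAR, SLASH, LPAREN, RPAREN, SC, NL, ERR, DONE };
struct Token { TokenType type; std::string lexeme; int line; };

Token getNextToken(std::istream& in, int& linenumber) {
    static const std::map<std::string, TokenType> keywords = {
        {"print", PRINT}, {"set", SET}, {"if", IF},
        {"loop", LOOP}, {"begin", BEGIN}, {"end", END}};
    static const std::map<char, TokenType> singles = {
        {'+', PLUS}, {'-', MINUS}, {'*', STAR}, {'/', SLASH},
        {'(', LPAREN}, {')', RPAREN}, {';', SC}};

    char ch;
    while (in.get(ch)) {
        if (ch == '\n') return {NL, "\n", linenumber++};              // rule 8
        if (std::isspace(static_cast<unsigned char>(ch))) continue;   // rule 10
        if (ch == '#') {                                              // rule 9
            while (in.peek() != '\n' && in.get(ch)) {}  // leave the newline unread
            continue;
        }
        std::string lexeme(1, ch);
        if (std::isalpha(static_cast<unsigned char>(ch))) {           // rules 1 and 5
            while (std::isalnum(in.peek())) lexeme += static_cast<char>(in.get());
            auto kw = keywords.find(lexeme);
            return {kw != keywords.end() ? kw->second : ID, lexeme, linenumber};
        }
        if (std::isdigit(static_cast<unsigned char>(ch))) {           // rule 2
            while (std::isdigit(in.peek())) lexeme += static_cast<char>(in.get());
            return {ICONST, lexeme, linenumber};
        }
        if (ch == '"') {                                              // rules 3 and 4
            std::string text;  // interpreted contents (assumed SCONST lexeme)
            while (in.peek() != '\n' && in.get(ch)) {
                lexeme += ch;  // raw characters, reported if the string is bad
                if (ch == '"') return {SCONST, text, linenumber};
                if (ch == '\\') {
                    if (in.peek() == '\n' || !in.get(ch)) break;
                    lexeme += ch;
                    text += (ch == 'n') ? '\n' : ch;  // \n is newline, else the char itself
                } else {
                    text += ch;
                }
            }
            return {ERR, lexeme, linenumber};  // newline or end of file inside a string
        }
        auto op = singles.find(ch);                                   // rules 6 and 7
        if (op != singles.end()) return {op->second, lexeme, linenumber};
        return {ERR, lexeme, linenumber};                             // rule 11
    }
    return {DONE, "", linenumber};                                    // rule 12
}

// Hypothetical helper for experiments: lexes a string up to and including DONE or ERR.
std::vector<Token> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<Token> tokens;
    int line = 1;  // assumed starting value for the line counter
    do {
        tokens.push_back(getNextToken(in, line));
    } while (tokens.back().type != DONE && tokens.back().type != ERR);
    return tokens;
}
```

Using peek() to look one character ahead keeps the function from consuming the character that ends an identifier or number, so no explicit putback is needed. The comment branch leaves the newline unread so that the NL token is still returned for that line.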

The assignment is to write the lexical analyzer function and some test code around it. It is a good idea to implement the lexical analyzer in one source file, and the main test program in another source file.

The test code is a main() program that takes several command line arguments:

-v (optional) if present, every token is printed when it is seen

-strings (optional) if present, print out all the string constants in alphabetical order

-ids (optional) if present, print out all of the identifiers in alphabetical order

filename (optional) if present, read from the filename; otherwise read from standard input

Note that no other flags (arguments that begin with a dash) are permitted. If an unrecognized flag is present, the program should print “UNRECOGNIZED FLAG {arg}”, where {arg} is whatever flag was given, and it should stop running. At most one filename can be provided, and it must be the last command line argument. If more than one filename is provided, the program should print “ONLY ONE FILE NAME ALLOWED” and it should stop running.
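One possible way to handle these argument rules is sketched below. The Options struct and parseArgs function are hypothetical names, not part of the assignment. The sketch also assumes that any non-flag argument that is not last counts as an extra filename, which the text does not state explicitly.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the parsed command line.
struct Options {
    bool verbose = false;   // -v
    bool strings = false;   // -strings
    bool ids = false;       // -ids
    std::string filename;   // empty means read standard input
    std::string error;      // non-empty means print this and stop
};

Options parseArgs(const std::vector<std::string>& args) {
    Options opt;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg == "-v") {
            opt.verbose = true;
        } else if (arg == "-strings") {
            opt.strings = true;
        } else if (arg == "-ids") {
            opt.ids = true;
        } else if (!arg.empty() && arg[0] == '-') {
            opt.error = "UNRECOGNIZED FLAG " + arg;
            return opt;
        } else if (i + 1 != args.size()) {
            // Assumption: a filename that is not the last argument is an extra filename.
            opt.error = "ONLY ONE FILE NAME ALLOWED";
            return opt;
        } else {
            opt.filename = arg;
        }
    }
    return opt;
}
```

From main(), the arguments could be passed as `parseArgs(std::vector<std::string>(argv + 1, argv + argc))`, which skips the program name.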

If the program cannot open a filename that is given, the program should print “CANNOT OPEN {arg}”, where {arg} is the filename given, and it should stop running. The program should repeatedly call getNextToken until it returns DONE or ERR. If it returns DONE, the program proceeds to handling the -strings and -ids options, in that order. It should then print summary information and exit.

If getNextToken returns ERR, the program should print “Error on line N ({lexeme})”, where N is the line number for the token and lexeme is the lexeme from the token, and it should stop running.
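The read-until-DONE-or-ERR loop might look like the sketch below. The function name drain is hypothetical, and the enum is an abbreviated stand-in for the TokenType values in tokens.h. Instead of printing, the function returns the error text so the caller decides where it goes. It assumes the ERR token itself is not counted.

```cpp
#include <functional>
#include <string>

// Abbreviated, hypothetical stand-in for the types declared in tokens.h.
enum TokenType { ID, ERR, DONE };
struct Token { TokenType type; std::string lexeme; int line; };

// Calls `next` until it yields DONE or ERR and never calls it again afterwards.
// Returns an empty string on DONE, or the error message on ERR.
// tokenCount is incremented once per ordinary token.
std::string drain(const std::function<Token()>& next, int& tokenCount) {
    for (;;) {
        Token t = next();
        if (t.type == DONE) return "";
        if (t.type == ERR) {
            return "Error on line " + std::to_string(t.line) + " (" + t.lexeme + ")";
        }
        ++tokenCount;
    }
}
```

In the real program the callable would wrap the lexer, for example `[&] { return getNextToken(in, linenumber); }`.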

If the -v option is present, the program should print each token as it is read and recognized, one token per line. The output format for the token is the token name in all capital letters (for example, the token LPAREN should be printed out as the string LPAREN). In the case of token ID, ICONST, and SCONST, the token name should be followed by a space and the lexeme in parens. For example, if the identifier “hello” is recognized, the -v output for it would be ID (hello).
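A small formatting function is one way to produce the -v output. The name describe is hypothetical, and the enum and struct below are stand-ins for tokens.h. The name table must list the names in the same order as the enum, so the real header's order would need to be checked.

```cpp
#include <string>

// Hypothetical stand-in for the provided tokens.h.
enum TokenType { ID, ICONST, SCONST, PRINT, SET, IF, LOOP, BEGIN, END,
                 PLUS, MINUS, STAR, SLASH, LPAREN, RPAREN, SC, NL, ERR, DONE };
struct Token { TokenType type; std::string lexeme; int line; };

// Returns the -v text for one token, without a trailing newline.
std::string describe(const Token& t) {
    // Names listed in the same order as the enum above.
    static const char* const names[] = {
        "ID", "ICONST", "SCONST", "PRINT", "SET", "IF", "LOOP", "BEGIN", "END",
        "PLUS", "MINUS", "STAR", "SLASH", "LPAREN", "RPAREN", "SC", "NL", "ERR", "DONE"};
    std::string text = names[t.type];
    if (t.type == ID || t.type == ICONST || t.type == SCONST) {
        text += " (" + t.lexeme + ")";  // only these three show their lexeme
    }
    return text;
}
```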

The -strings option should cause the program to print STRINGS: on a line by itself, followed by every string constant found, one string per line, in alphabetical order. If there are no SCONSTs in the input, then nothing is printed.

The -ids option should cause the program to print IDENTIFIERS: followed by a comma-separated list of every identifier found, in alphabetical order. If there are no IDs in the input, then nothing is printed.
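The two reports could be built from sorted containers, as in the sketch below. The function names are hypothetical. Three points are assumptions, since the text does not settle them: each distinct string or identifier is listed once, the separator is a comma followed by a space, and "alphabetical order" means the byte-wise order that std::set uses (uppercase letters sort before lowercase).

```cpp
#include <set>
#include <string>

// Returns the -strings report, or an empty string when there are no SCONSTs.
std::string listStrings(const std::set<std::string>& strings) {
    if (strings.empty()) return "";
    std::string text = "STRINGS:\n";
    for (const std::string& s : strings) text += s + "\n";
    return text;
}

// Returns the -ids report, or an empty string when there are no IDs.
std::string listIds(const std::set<std::string>& ids) {
    if (ids.empty()) return "";
    std::string text = "IDENTIFIERS: ";
    bool first = true;
    for (const std::string& id : ids) {
        if (!first) text += ", ";  // assumed separator
        text += id;
        first = false;
    }
    return text + "\n";
}
```

While lexing, the driver would insert each SCONST or ID lexeme into the matching set. If duplicates turn out to be required in the output, a std::multiset would keep them while still sorting.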

The summary information is as follows:

Total lines: L

Total tokens: N

Where L is the number of input lines and N is the number of tokens (not counting DONE).

If L is zero, no further lines are printed.
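The summary rule can be sketched as a tiny function. The name summary is hypothetical. The sketch reads "no further lines are printed" as: still print the "Total lines" line, then stop. That reading is an assumption.

```cpp
#include <string>

// Builds the summary text. When lines is zero, only the first line is produced.
std::string summary(int lines, int tokens) {
    std::string text = "Total lines: " + std::to_string(lines) + "\n";
    if (lines > 0) {
        text += "Total tokens: " + std::to_string(tokens) + "\n";
    }
    return text;
}
```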


● Compiles
● Argument error cases
● Files that cannot be opened
● Too many filenames
● Properly handles a zero length file
● Recognizes keywords and identifiers
● Summary information
● -v mode


● Recognizes all remaining tokens
● Recognizes string with a newline in it as an error
● Recognizes string with a # in it as a string, not a comment
● Recognizes single character token types
● Supports -strings and -ids
