JFlex Lecture 16 Section 3.5, JFlex Manual Robb T. Koether
by user
Comments
Transcript
JFlex Lecture 16 Section 3.5, JFlex Manual Robb T. Koether
JFlex Lecture 16 Section 3.5, JFlex Manual Robb T. Koether Hampden-Sydney College Mon, Feb 23, 2015 Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 1 / 30 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 2 / 30 Outline 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 3 / 30 JFlex JFlex is a lexical analyzer generator in Java. It is based on the lexical analyzer generator “JLex,” which is based on the lexical analyzer generator “lex.” The gnu lexical analyzer “flex” is also based on lex. JFlex reads a description of a set of tokens and outputs a Java program that will process those tokens. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 4 / 30 JFlex Overview JFlex will create a Java program Yylex.java, equivalent to our program MyLexer.java. The Yylex class contains a function yylex(), which will return the next token from the input file, equivalent to our function next_token(). Each token is described by a regular expression. We provide actions to be taken when Yylex matches a token. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 5 / 30 JFlex Overview In fact, we can tell JFlex to Rename Yylex.java as MyLexer.java. Rename yylex()as next_token(). Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 6 / 30 The JFlex Input File We will use the extension .flex for the input files to JFlex. The file is divided into three parts. User code Directives Rules These three sections are separated by %%. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 7 / 30 Outline 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 8 / 30 JFlex User Code Any code written in the user-code section is copied directly into the Java source file created by JFlex. This code is included before the definition of the Yylex class. Typically, this section is used for import statements. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 9 / 30 Outline 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 10 / 30 JFlex Directives Any code bracketed within %{ and %} is copied directly into the Yylex class, at the beginning of the class. Although this code is incorporated into the Yylex class, it is not incorporated into any Yylex member function. Thus, we may define Yylex class variables or additional member functions. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 11 / 30 The init and eof Directives Code bracketed within %init{ and %init} is copied into the Yylex constructor. Code bracketed within %eof{ and %eof} is copied into the Yylex function yy_do_eof(), which is called exactly once upon end of file. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 12 / 30 JFlex Token Types Unless we specify otherwise, the data type of the returned tokens is Yytoken. This class is not created automatically. We may change the return type to int by typing one of the directives %integer or %int. We may change the return type to Integer by typing the directive %intwrap. We may set the return type to any other type by using the directive %type type. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 13 / 30 JFlex Token Types and EOF If the return type is Yytoken or Integer, then the EOF token is null. If the return type is int, then the EOF token is -1. For any other type, we need to specify the EOF value. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 14 / 30 JFlex Token Types and EOF By using the %eofval directive, we may indicate what value to return upon EOF. We write %eofval{ return new type(eof-value); %eofval} Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 15 / 30 JFlex debug Directive If we include the %debug directive, then JFlex will create a main() function in the Yylex class, allowing us to run the lexer without an attached parser. The main() function will receive an input filename from the command line. It will process the tokens in that file and report information about each token. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 16 / 30 Outline 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 17 / 30 JFlex Rules Each JFlex rule consists of a regular expression and an action to be taken when the expression is matched. The associated action is a segment of Java code, enclosed in braces { }. Typically, the action will be to return the appropriate token. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 18 / 30 JFlex Regular Expressions Regular expressions are expressed using ASCII characters (32 127). The following characters are metacharacters. ? * + | ( ) ˆ $ . [ ] { } " \ Metacharacters have special meaning; they do not represent themselves. All other characters represent themselves. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 19 / 30 JFlex Regular Expressions Regular Expression r r? r* r+ r |s rs Matches One occurrence of r Zero or one occurrence of r Zero or more occurrences of r One or more occurrences of r r or s r concatenated with s r and s are regular expressions. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 20 / 30 JFlex Regular Expressions Parentheses are used for grouping. The expression ("+"|"-")? represents an optional plus or minus sign. If a regular expression begins with ˆ, then it is matched only at the beginning of a line. If a regular expression ends with $, then it is matched only at the end of a line. The dot . matches any non-newline character. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 21 / 30 JFlex Regular Expressions Brackets [ ] match any single character listed within the brackets. For example, [abc] matches a or b or c. [A-Za-z] matches any letter. If the first character after [ is ˆ, then the brackets match any character except those listed. [ˆA-Za-z] matches any nonletter. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 22 / 30 JFlex Regular Expressions A single character within double quotes " " or after \ represents itself, except for n, r, b, t, and f. Metacharacters lose their special meaning and represent themselves when they stand alone within single quotes or follow \. "?" and \? match ?. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 23 / 30 JFlex Escape Sequences Escape Sequence \n \r \b \t \f Matches newline (LF) carriage return (CR) backspace (BS) tab (TB) form feed (FF) If a character c is not a special escape-sequence character, then \c matches c. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 24 / 30 Outline 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 25 / 30 The JAR Files JFlex uses a number of Java class. These classes have been compiled and are stored in the Java archive file flex-1.6.0.jar. Assignment 6 will have instructions on how to download and install this file. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 26 / 30 Running JFlex The lexical analyzer generator is the Main class in the JFlex folder. To create a lexical analyzer from the file filename.flex, type java jflex.Main filename.flex This produces a file Yylex.java (or whatever we named it), which must be compiled to create the lexical analyzer. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 27 / 30 Running the Lexical Analyzer Example (Using the Yylex Class) InputStreamReader isr = new InputStreamReader(System.in); BufferedReader br = new BufferedReader(isr); Yylex lexer = new Yylex(br); token = lexer.yylex(); To run the lexical analyzer, a Yylex object must first be created. The Yylex constructor has one parameter, specifying a Reader. We will convert standard input, which is an InputStream, to a buffered reader. Then call the function yylex() to get the next token. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 28 / 30 Outline 1 Introduction 2 JFlex User Code 3 JFlex Directives 4 JFlex Rules 5 Running JFlex 6 Assignment Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 29 / 30 Assignment Assignment Read Section 3.5, which is about lex, not JFlex, but they are very similar. Robb T. Koether (Hampden-Sydney College) JFlex Mon, Feb 23, 2015 30 / 30