...

JFlex Lecture 16 Section 3.5, JFlex Manual Robb T. Koether

by user

on
Category: Documents
11

views

Report

Comments

Transcript

JFlex Lecture 16 Section 3.5, JFlex Manual Robb T. Koether
JFlex
Lecture 16
Section 3.5, JFlex Manual
Robb T. Koether
Hampden-Sydney College
Mon, Feb 23, 2015
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
1 / 30
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
2 / 30
Outline
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
3 / 30
JFlex
JFlex is a lexical analyzer generator in Java.
It is based on the lexical analyzer generator “JLex,” which is based
on the lexical analyzer generator “lex.”
The gnu lexical analyzer “flex” is also based on lex.
JFlex reads a description of a set of tokens and outputs a Java
program that will process those tokens.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
4 / 30
JFlex Overview
JFlex will create a Java program Yylex.java, equivalent to our
program MyLexer.java.
The Yylex class contains a function yylex(), which will return
the next token from the input file, equivalent to our function
next_token().
Each token is described by a regular expression.
We provide actions to be taken when Yylex matches a token.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
5 / 30
JFlex Overview
In fact, we can tell JFlex to
Rename Yylex.java as MyLexer.java.
Rename yylex()as next_token().
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
6 / 30
The JFlex Input File
We will use the extension .flex for the input files to JFlex.
The file is divided into three parts.
User code
Directives
Rules
These three sections are separated by %%.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
7 / 30
Outline
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
8 / 30
JFlex User Code
Any code written in the user-code section is copied directly into
the Java source file created by JFlex.
This code is included before the definition of the Yylex class.
Typically, this section is used for import statements.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
9 / 30
Outline
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
10 / 30
JFlex Directives
Any code bracketed within %{ and %} is copied directly into the
Yylex class, at the beginning of the class.
Although this code is incorporated into the Yylex class, it is not
incorporated into any Yylex member function.
Thus, we may define Yylex class variables or additional member
functions.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
11 / 30
The init and eof Directives
Code bracketed within %init{ and %init} is copied into the
Yylex constructor.
Code bracketed within %eof{ and %eof} is copied into the
Yylex function yy_do_eof(), which is called exactly once upon
end of file.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
12 / 30
JFlex Token Types
Unless we specify otherwise, the data type of the returned tokens
is Yytoken.
This class is not created automatically.
We may change the return type to int by typing one of the
directives %integer or %int.
We may change the return type to Integer by typing the
directive %intwrap.
We may set the return type to any other type by using the directive
%type type.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
13 / 30
JFlex Token Types and EOF
If the return type is Yytoken or Integer, then the EOF token is
null.
If the return type is int, then the EOF token is -1.
For any other type, we need to specify the EOF value.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
14 / 30
JFlex Token Types and EOF
By using the %eofval directive, we may indicate what value to
return upon EOF.
We write
%eofval{
return new type(eof-value);
%eofval}
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
15 / 30
JFlex debug Directive
If we include the %debug directive, then JFlex will create a
main() function in the Yylex class, allowing us to run the lexer
without an attached parser.
The main() function will receive an input filename from the
command line.
It will process the tokens in that file and report information about
each token.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
16 / 30
Outline
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
17 / 30
JFlex Rules
Each JFlex rule consists of a regular expression and an action to
be taken when the expression is matched.
The associated action is a segment of Java code, enclosed in
braces { }.
Typically, the action will be to return the appropriate token.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
18 / 30
JFlex Regular Expressions
Regular expressions are expressed using ASCII characters (32 127).
The following characters are metacharacters.
? * + | ( ) ˆ $ . [ ] { } " \
Metacharacters have special meaning; they do not represent
themselves.
All other characters represent themselves.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
19 / 30
JFlex Regular Expressions
Regular
Expression
r
r?
r*
r+
r |s
rs
Matches
One occurrence of r
Zero or one occurrence of r
Zero or more occurrences of r
One or more occurrences of r
r or s
r concatenated with s
r and s are regular expressions.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
20 / 30
JFlex Regular Expressions
Parentheses are used for grouping.
The expression
("+"|"-")?
represents an optional plus or minus sign.
If a regular expression begins with ˆ, then it is matched only at the
beginning of a line.
If a regular expression ends with $, then it is matched only at the
end of a line.
The dot . matches any non-newline character.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
21 / 30
JFlex Regular Expressions
Brackets [ ] match any single character listed within the
brackets.
For example,
[abc] matches a or b or c.
[A-Za-z] matches any letter.
If the first character after [ is ˆ, then the brackets match any
character except those listed.
[ˆA-Za-z] matches any nonletter.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
22 / 30
JFlex Regular Expressions
A single character within double quotes " " or after \ represents
itself, except for n, r, b, t, and f.
Metacharacters lose their special meaning and represent
themselves when they stand alone within single quotes or follow \.
"?" and \? match ?.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
23 / 30
JFlex Escape Sequences
Escape
Sequence
\n
\r
\b
\t
\f
Matches
newline (LF)
carriage return (CR)
backspace (BS)
tab (TB)
form feed (FF)
If a character c is not a special escape-sequence character, then
\c matches c.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
24 / 30
Outline
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
25 / 30
The JAR Files
JFlex uses a number of Java class.
These classes have been compiled and are stored in the Java
archive file flex-1.6.0.jar.
Assignment 6 will have instructions on how to download and
install this file.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
26 / 30
Running JFlex
The lexical analyzer generator is the Main class in the JFlex
folder.
To create a lexical analyzer from the file filename.flex, type
java jflex.Main filename.flex
This produces a file Yylex.java (or whatever we named it),
which must be compiled to create the lexical analyzer.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
27 / 30
Running the Lexical Analyzer
Example (Using the Yylex Class)
InputStreamReader isr = new InputStreamReader(System.in);
BufferedReader br = new BufferedReader(isr);
Yylex lexer = new Yylex(br);
token = lexer.yylex();
To run the lexical analyzer, a Yylex object must first be created.
The Yylex constructor has one parameter, specifying a Reader.
We will convert standard input, which is an InputStream, to a
buffered reader.
Then call the function yylex() to get the next token.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
28 / 30
Outline
1
Introduction
2
JFlex User Code
3
JFlex Directives
4
JFlex Rules
5
Running JFlex
6
Assignment
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
29 / 30
Assignment
Assignment
Read Section 3.5, which is about lex, not JFlex, but they are very
similar.
Robb T. Koether (Hampden-Sydney College)
JFlex
Mon, Feb 23, 2015
30 / 30
Fly UP