This course is an introduction to development tools for embedded systems. Our focus will be on the compiler construction, from the most basic to some introduction to optimization, object code format, and some topics related to embedded systems.
In the introduction to the design and implementation of compilers for an imperative language, we will study various components in a compiler, including scanners, parsers, symbol tables, semantic analyzers, and code generators, and the overall organization of a compiler. In the scanners, we will study regular expressions and the transformation of a regular expression into a deterministic finite state machine. In the parsers, we will study LL and LR parsing methods and the construction of various parsing tables. There are many tools that can generate scanners and parsers from formal specifications. We will also study these tools (lex and yacc in particular) and use them to generate the scanner and parser for our project.
We will study and implement the code generation components of a compiler. The symbol tables are a data structure shared by most components in the compiler. We will study how to process variables, types, expressions, records, arrays, pointers, and simple control structures. Procedures and packages will also be studied. Finally, we will study techniques for code generation and local optimization. If time permits, we will study global optimization techniques in compilers.
This course will include an integrated programming project. We will give you a detailed specification of a simplified Pascal language. It will be our target language. We will also supply the documents of a scanner generator (lex) and of a parser generator (yacc). You will implement a scanner, a parser, and other components used to generate code for a machine of your choice. You can use X86 assembly, MIPS assembly, Java bytecode, ARM code, etc. To help you make progress, we will set three check points during the semester and a final demonstration at the end of the semester. You will need to finish a scanner, a parser, and a primitive code generator at the checkpoints, respectively. At the end of the semester, you will need to show (off) a complete compiler that generate code. In the project, you will need to implement a minimum set of features, which will be announced later in the class. If you implement extra features, you will earn extra points.
C.N. Fischer, R.K. Cytron, and R.J. LeBlanc, Jr., Crafting a Compiler, Addison-Wesley, 2010. (table of contents)
(reference only) A.V. Aho, M.S. Lam, R. Sethi, and J.D. Ullman, Compilers, Principles, Techniques, and Tools, 2nd Ed., Addison-Wesley, 2007. Chapters 9-10.(table of contents)
(reference only) C.N. Fischer and R.L. LeBlanc, Jr., Crafting a Compiler with C, Benjamin/Cummings, 1991. (table of contents)
(reference only) A.V. Aho, R. Sethi, and J.D. Ullman, Compilers, Principles, Techniques, and Tools, Addison-Wesley, 1985. (table of contents)
(reference only) Andrew W. Appel, Modern Compiler Implementation In Java, Cambridge Univ. Press, 1997.
(reference only) K. Dowd, High-performance Computing (chapters 2 and 4), O'Relly, 1992 ???. RISC vs CISC, what an optimizing compiler does
(reference only) David H. D. Warren, Logic Programming and Compiler Writing, Software Practice and Experience, Vol. 10, pp. 97-125, 1980.
(reference only) J. R. Levine, Linkers and Loaders, Morgan Kaufmann Publishers, San Francisco, 2000.
The compiler design course is closely associated with a project. Three checkpoints are set up to help you make progress. At the final demonstration, you are required to show off a complete compiler. We will provide you with a grading sheet that details the grading policy. We urge to you to make continual progress in your project since this is one of the best ways to learn the course thoroughly.
Class material is available on www. See below. There is a README file explaining how to use the class material.
First Checkpoint: The scanner. Due date: October 5, 2011, Wednesday, beginning of class (4th week)At the first checkpoint, you will need to finish the scanner, using the lex tool. You need to demonstrate your scanner with all the reserved words and symbols plus 10 correct names and comments. In addition, you need to use 10 incorrect symbols to demonstrate that your scanner can handle the incorrect names properly (e.g., issue some error messages). The output of your scanner should list each token in the input, together with the line number and the position of the first character of the token. In case the token is an identifier or a number, the string representing the token must also be printed. You do NOT need to print a copy of the output nor the program in order to save some paper. We will use on-line demonstration in the first checkpoint.
If the input is "354+126*(920-27/190)", the output from the scanner should be the sequence of tokens: "34", "+", "126", "*", "(", "920", "-", "27", "/", "190" and ")".
There are several test cases with which you may try your scanner. You may also implement a pragma (pragma ListOn and pragma ListOff) to turn on or off the listing of the source program.
You may use lex (or flex) to implement the scanner. There is a file for token definitions. You may modify the token definitions for your project. A flex manual is also included in this web page.
The following issues should be addressed:
NOTE: It seems inhumane to ask the students to build parse trees for such a large language as Ada/CS. Therefore, no parse trees need be built in this checkpoint.
At the second checkpoint, you will need to finish the parser. You need to demonstrate your scanner with the test programs we will provide. As in the first checkpoint, you do NOT need to print a copy of the output nor the program. We will use on-line demonstration.
You may use yacc or bison to implement the parser. There is a file for syntax definitions. You may modify the syntax definitions. A bison manual is also included in this web page.
For this checkpoint, your parser should answer accept for correct test file and reject for incorrect test files. For incorrect input, the positions of the errors should be indicated. The parser should try to parse to the end of the input and catch as many errors as possible. Your parser should generate a parse tree or abstract syntax tree for correct input.
You need to address the following issues:
At the third checkpoint, you will need to finish the symbol table and related routines. You need to show that your symbol table can correctly handle block entry and exit, procedure names, and parameters. As usual, do not print a copy of the program or the output. On-line demonstration is enough.
You will need about 5 days to implement a primitive working symbol table.
You may use C++ STL 'map' . The class is essentially a red-black tree.
You need to address the following issues:
At the last checkpoint, you will have the chance to show off your complete compiler. We will schedule an interview with each students and test the features that you have implemented. You need to bring your grade sheet to the interview. Please do not print your program or output. We will give you a grade for the project during the interview. In the project, you will need to implement a minimum set of features, which will be announced later in the class. If you implement extra features, you will earn extra points.
Here is a list of semantic checks:
The target machine may be X86, MIPS, ARM, Java Virtual Machine, etc. A manual of Java virtual machine is included in the next section, and use any web browser to run it.
IT IS RECOMMENDED THAT YOU USE CHINESE AND ENGLISH TO ANSWER QUESTIONS IN THE EXAMS. DO NOT USE ANY OTHER LANGUAGES.
Midterm: November 9, 2011, Wednesday.
Final: November 11, 2012, Wednesday.
Here is material for the proejct: (1) Additional explanation of Mini-Pascal, (2) Evaluation criteria, and (3) Some test cases. Here is the material for the 1st (scanner) project. Here is the bison and flex binary for Windows: ZIP version and RAR version. Here is the material for the 2nd (parser) project. HERE is a TINY, BUT CMPLETE PARSER IN YACC (for the 2nd project). Here is the material for the 3rd (symbol table and declaration) project. Here is the material for the final (code generation) project.
Here is the CLASSNOTES FOR CHAPTER 1. Here is the CLASSNOTES FOR CHAPTER 2. Here is the CLASSNOTES FOR CHAPTER 3. Here is the CLASSNOTES FOR CHAPTER 4. Here is the CLASSNOTES FOR CHAPTER 5. Here is the CLASSNOTES FOR CHAPTER 6. Here is the CLASSNOTES FOR CHAPTER 7. Here is the DOUBLE-DISPATCH IN JAVA. Here is the CLASSNOTES FOR CHAPTER 8. Here is the CLASSNOTES FOR JAVA EXCEPTIONS. Here is the CLASSNOTES FOR CHAPTER 9. Here is the CLASSNOTES FOR JAVA BYTE CODE GENERATION. Here is the CLASSNOTES FOR CHAPTER 10. Here is the CLASSNOTES FOR CHAPTER 11. Here is the CLASSNOTES FOR CHAPTER 12.
Here is the 教育改進計畫課程講義-Java Virtual Machine-2009-03-09.