qccompiler
Design and implementation of a compiler from scratch, including:
- Lexical analysis
- Syntactic analysis
- Abstract syntax tree (AST) construction
- Semantic analysis
- Code generation
It’s written in C and it compiles input files written in qC, a small subset of the language ANSI C (C89/C90). The generated code is in C, but very close to Assembly.
Features of qC
- use variables and literals of types character and integer (both with signal)
- function declarations/calls, with recursion support
- pointers to variables and literals and to other pointers
- unidimensional arrays for integers, characters or pointers
- literals of type string
- arithmetic and logic expressions (check language grammar)
- simple relational operations
- pointer operations
- assign operations
- control operations (if-else and while)
- output operations (simplified version of printf)
- conversion between integers and strings - operations itoa and atoi
- comments of type /* … */
Tokens
| Token | Meaning |
|---|---|
| ID | alphameric case sensitive sequences beginning with a letter where ‘_’ is also allowed |
| INTLIT | sequence of digits without unnecessary left pad zeros |
| CHRLIT | single character (except newline or single quote) or escape sequence (\n, \t, , ', " and \0) between single quotes |
| STRLIT | sequence of characters (except newline or single quote) and/or escape sequences between double quotes |
| AMP | & |
| AND | && |
| ASSIGN | = |
| AST | * |
| ATOI | atoi |
| CHAR | char |
| COMMA | , |
| DIV | / |
| ELSE | else |
| EQ | == |
| GE | >= |
| GT | > |
| IF | if |
| INT | int |
| ITOA | itoa |
| LBRACE | { |
| LE | <= |
| LPAR | ( |
| LSQ | [ |
| LT | < |
| MINUS | - |
| MOD | % |
| NE | != |
| NOT | ! |
| OR | || |
| PLUS | + |
| PRINTF | printf |
| RBRACE | } |
| RETURN | return |
| RPAR | ) |
| RSQ | ] |
| SEMI | ; |
| WHILE | while |
| RESERVED | C keywords not used in qC |
Grammar (EBNF notation)
Start → (FunctionDefinition | FunctionDeclaration | Declaration) {FunctionDefinition | FunctionDeclaration | Declaration}
FunctionDefinition → TypeSpecifier FunctionDeclarator LBRACE {Declaration} {Statement} RBRACE
FunctionDeclaration → TypeSpecifier FunctionDeclarator SEMI
FunctionDeclarator → {AST} ID LPAR [ParameterList] RPAR
ParameterList → ParameterDeclaration {COMMA ParameterDeclaration}
ParameterDeclaration → TypeSpecifier {AST} ID
Declaration → TypeSpecifier Declarator {COMMA Declarator} SEMI
TypeSpecifier → CHAR | INT
Declarator → {AST} ID [LSQ INTLIT RSQ]
Statement → [Expression] SEMI
Statement → LBRACE {Statement} RBRACE
Statement → IF LPAR Expression RPAR Statement [ELSE Statement]
Statement → WHILE LPAR Expression RPAR Statement
Statement → RETURN Expression SEMI
Expression → Expression ASSIGN Expression
Expression → Expression (AND | OR) Expression
Expression → Expression (EQ | NE | LT | GT | LE | GE) Expression
Expression → Expression (PLUS | MINUS | AST | DIV | MOD) Expression
Expression → (AMP | AST | PLUS | MINUS | NOT) Expression
Expression → Expression LSQ Expression RSQ
Expression → ID LPAR [Expression {COMMA Expression}] RPAR
Expression → (PRINTF | ATOI) LPAR Expression RPAR
Expression → ITOA LPAR Expression COMMA Expression RPAR
Expression → ID | INTLIT | CHRLIT | STRLIT | LPAR Expression RPAR
Usage
$ make
$ ./qccompiler [OPTIONS] < input.qc
Options:
-t print the abstract syntax tree and stop after syntatic analysis.
-s print the symbol table and stop after semantic analisys.
-c allways compile the program (unless errors occur).
-o allways compile the program (unless errors occur) and print compiled program to file.
If both flags -s and -t are set, the proccess stops after the semantic analysis.
The result of the compilation is written to the file `result.c
Dependencies
- flex
- yacc
- gcc