qccompiler
Design and implementation of a compiler from scratch, including:
- Lexical analysis
- Syntactic analysis
- Abstract syntax tree (AST) construction
- Semantic analysis
- Code generation
It’s written in C and it compiles input files written in qC, a small subset of the language ANSI C (C89/C90). The generated code is in C, but very close to Assembly.
Features of qC
- use variables and literals of types character and integer (both with signal)
- function declarations/calls, with recursion support
- pointers to variables and literals and to other pointers
- unidimensional arrays for integers, characters or pointers
- literals of type string
- arithmetic and logic expressions (check language grammar)
- simple relational operations
- pointer operations
- assign operations
- control operations (if-else and while)
- output operations (simplified version of printf)
- conversion between integers and strings - operations itoa and atoi
- comments of type /* … */
Tokens
Token | Meaning |
---|---|
ID | alphameric case sensitive sequences beginning with a letter where ‘_’ is also allowed |
INTLIT | sequence of digits without unnecessary left pad zeros |
CHRLIT | single character (except newline or single quote) or escape sequence (\n, \t, , ', " and \0) between single quotes |
STRLIT | sequence of characters (except newline or single quote) and/or escape sequences between double quotes |
AMP | & |
AND | && |
ASSIGN | = |
AST | * |
ATOI | atoi |
CHAR | char |
COMMA | , |
DIV | / |
ELSE | else |
EQ | == |
GE | >= |
GT | > |
IF | if |
INT | int |
ITOA | itoa |
LBRACE | { |
LE | <= |
LPAR | ( |
LSQ | [ |
LT | < |
MINUS | - |
MOD | % |
NE | != |
NOT | ! |
OR | || |
PLUS | + |
PRINTF | printf |
RBRACE | } |
RETURN | return |
RPAR | ) |
RSQ | ] |
SEMI | ; |
WHILE | while |
RESERVED | C keywords not used in qC |
Grammar (EBNF notation)
Start → (FunctionDefinition | FunctionDeclaration | Declaration) {FunctionDefinition | FunctionDeclaration | Declaration}
FunctionDefinition → TypeSpecifier FunctionDeclarator LBRACE {Declaration} {Statement} RBRACE
FunctionDeclaration → TypeSpecifier FunctionDeclarator SEMI
FunctionDeclarator → {AST} ID LPAR [ParameterList] RPAR
ParameterList → ParameterDeclaration {COMMA ParameterDeclaration}
ParameterDeclaration → TypeSpecifier {AST} ID
Declaration → TypeSpecifier Declarator {COMMA Declarator} SEMI
TypeSpecifier → CHAR | INT
Declarator → {AST} ID [LSQ INTLIT RSQ]
Statement → [Expression] SEMI
Statement → LBRACE {Statement} RBRACE
Statement → IF LPAR Expression RPAR Statement [ELSE Statement]
Statement → WHILE LPAR Expression RPAR Statement
Statement → RETURN Expression SEMI
Expression → Expression ASSIGN Expression
Expression → Expression (AND | OR) Expression
Expression → Expression (EQ | NE | LT | GT | LE | GE) Expression
Expression → Expression (PLUS | MINUS | AST | DIV | MOD) Expression
Expression → (AMP | AST | PLUS | MINUS | NOT) Expression
Expression → Expression LSQ Expression RSQ
Expression → ID LPAR [Expression {COMMA Expression}] RPAR
Expression → (PRINTF | ATOI) LPAR Expression RPAR
Expression → ITOA LPAR Expression COMMA Expression RPAR
Expression → ID | INTLIT | CHRLIT | STRLIT | LPAR Expression RPAR
Usage
$ make
$ ./qccompiler [OPTIONS] < input.qc
Options:
-t
print the abstract syntax tree and stop after syntatic analysis.
-s
print the symbol table and stop after semantic analisys.
-c
allways compile the program (unless errors occur).
-o
allways compile the program (unless errors occur) and print compiled program to file.
If both flags -s
and -t
are set, the proccess stops after the semantic analysis.
The result of the compilation is written to the file `result.c
Dependencies
- flex
- yacc
- gcc