author | Tristan Gingold <tgingold@free.fr> | 2020-01-04 17:59:44 +0100 |
---|---|---|
committer | Tristan Gingold <tgingold@free.fr> | 2020-01-06 18:20:28 +0100 |
commit | 09af03505bbd72f676394415c16c14bea5154513 (patch) | |
tree | 8444da62bd105b00aa2321acc3785357c550b0cb /doc | |
parent | 61c0e71793576646cb8374cd462bcda7cf6e410e (diff) | |
doc: add internals/ (WIP). Add a part for index.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/index.rst | 23 |
-rw-r--r-- | doc/internals/AST.rst | 95 |
-rw-r--r-- | doc/internals/Frontend.rst | 24 |
-rw-r--r-- | doc/internals/Overview.rst | 18 |
4 files changed, 159 insertions, 1 deletions
diff --git a/doc/index.rst b/doc/index.rst
index 7357f11ba..4229cc4ef 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -102,8 +102,29 @@
    :caption: Development
    :hidden:
 
-   genindex
    development/Synthesis
    development/Debugging
    development/CodingStyle
    development/Roadmap
+
+.. raw:: latex
+
+   \part{Internals}
+
+.. toctree::
+   :caption: Internals
+   :hidden:
+
+   internals/Overview
+   internals/Frontend
+   internals/AST
+
+.. raw:: latex
+
+   \part{Index}
+
+.. toctree::
+   :caption: Index
+   :hidden:
+
+   genindex
diff --git a/doc/internals/AST.rst b/doc/internals/AST.rst
new file mode 100644
index 000000000..4e77f8f3c
--- /dev/null
+++ b/doc/internals/AST.rst
@@ -0,0 +1,95 @@
+.. _INT:AST:
+
+AST
+###
+
+Introduction
+************
+
+The AST is the main data structure of the front-end and is created by the parser.
+
+AST stands for Abstract Syntax Tree.
+
+This is a tree because it is a graph with nodes and links between nodes, the graph
+is acyclic, and each node but the root has only one parent (the link that points to it).
+In the front-end there is only one root, which represents the set of libraries.
+
+The tree is a syntax tree because it follows the grammar of the VHDL language: there
+is for example a node per operation (like `or`, `and` or `+`), a node per declaration,
+a node per statement, a node per design unit (like entity or architecture). The front-end needs to represent the source file using the grammar because most of the
+VHDL rules are defined according to the grammar.
+
+Finally, the tree is abstract because it is an abstraction of the source file. Comments and layout aren't kept in the syntax tree. Furthermore, if you rename a
+declaration or change the value of a literal, the tree will have exactly the same
+shape.
+
+But we can also say that the tree is neither abstract, nor syntactic, nor a tree.
+
+It is not abstract because all the information from the source file
+(except comments) is available in the AST, including the location. So the source
+file can be reprinted (the term unparsed is also used) from the AST. If a mechanism
+is also added to deal with comments, the source file can even be pretty-printed from
+the AST.
+
+It is not purely syntactic because the semantic analysis pass decorates the tree
+with semantic information. For example the type of each expression and sub-expression
+is computed. This is necessary to detect semantic errors such as assigning an array
+to an integer.
+
+Finally, it is no longer a tree because new links are added during semantic
+analysis. Simple names are linked to their declaration.
+
+The AST in GHDL
+***************
+
+The GHDL AST is described in file :file:`vhdl-nodes.ads`.
+
+An interesting feature of the AST is the presence of a
+meta-model. The meta-model is not formally described (so, there is no
+meta-meta-model), but it is very simple: there are 3 kinds of vertices:
+
+* Variable lists of nodes (`List`). These are like vectors as the
+  length can be changed.
+
+* Fixed lists of nodes (`Flist`). The length of a fixed list is defined at creation.
+
+* Nodes. A node has a kind (`Iir_Kind` which is also defined in the file), and fields.
+  The kind is set at creation and cannot be changed, while fields can be.
+
+The meta-model describes the type of the fields: most of them are
+either a node reference, a boolean flag or an enumerated type (like
+`Iir_Staticness`). But there are various kinds of node references. A node can either own
+another node, which means this is the main reference to the node; or a node can
+reference another node without owning it.
+
+Why a meta-model?
+*****************
+
+Having a meta-model makes it possible to build algorithms that deal with any
+node. The dumper (in file :file:`vhdl-disp_tree.ad[sb]`) is used to
+dump a node and possibly its sub-nodes. This is very useful while
+debugging GHDL. It is written using the meta-model, so it knows how to display
+a boolean and the various other enumerated types, and how to display a list. To
+display a node, it just gets the kind of the node, prints the kind name and queries
+all the fields of the node. There is nothing particular to a specific kind, so you
+don't need to modify the dumper if you add a node.
+
+The dumper wouldn't be a strong enough reason by itself to have a meta-model. But
+the pass to create instances is a good one. When a VHDL-2008 package is instantiated,
+at least the package declaration is created in the AST (this is needed because there
+are possibly new types). And creating an instance using the meta-model is much
+simpler (and much more generic) than creating the instance using the nodes directly.
+The code to create instances is in files :file:`vhdl-sem_inst.ad[sb]`.
+
+The meta-model also structures the tree. We know that each node is owned by only one node, and that each node is owned (except the top-level one). So it is possible to
+free a sub-tree. It is also possible to check that the tree is well-formed.
+
+Dealing with ownership
+**********************
+
+TBC: two fields, Is_Ref, Second_XXX; Rust & Scripts.
+
+Node Type
+*********
+
+TBC: 32-bit, extensions.
diff --git a/doc/internals/Frontend.rst b/doc/internals/Frontend.rst
new file mode 100644
index 000000000..6d5e1da5c
--- /dev/null
+++ b/doc/internals/Frontend.rst
@@ -0,0 +1,24 @@
+.. _INT:Frontend:
+
+Front-end
+#########
+
+Input files (or source files) are read by `files_map.ad[sb]`. Only regular files can be read, because they are read entirely before being scanned. This simplifies the scanner, and it also makes it possible to have a unique index for each character in any file. Therefore the source location is a simple 32-bit integer whose type is `Location_Type`. From the location, `files_map` can deduce the source file (type is `Source_File_Entry`) and then the offset in the source file. There is a line table for each source file in order to speed up the conversion from file offset to line number and column number.
+
+The scanner (file :file:`vhdl-scanner.ad[sb]`) reads the source files and creates tokens
+from them. The tokens are defined in file :file:`vhdl-tokens.ads`. Tokens are scanned
+one by one, so the scanner doesn't keep the previous token in memory. Integer or
+floating point numbers are special tokens because besides the token itself there is
+also a variable for the value of the number.
+
+For identifiers there is a table containing all identifiers. This is implemented by
+file :file:`name_table.ad[sb]`. Each identifier is associated with a 32-bit number
+(they are interned). So the number is used to reference an identifier. About
+one thousand identifiers are predefined (by :file:`std_names.ad[sb]`). Most of
+them are reserved identifiers (or keywords). When the scanner finds an identifier, it
+checks if it is a keyword. In that case it changes the token to the keyword token.
+
+The procedure `scan` is called to get the next token. The location of the token and
+the location after the token are available so that they can be stored in the parse tree.
+
+The main client of the scanner is the parser.
diff --git a/doc/internals/Overview.rst b/doc/internals/Overview.rst
new file mode 100644
index 000000000..3be8772b4
--- /dev/null
+++ b/doc/internals/Overview.rst
@@ -0,0 +1,18 @@
+.. _INT:Overview:
+
+Overview
+########
+
+`GHDL` is structured like a traditional compiler. It has:
+
+* a driver (sources in :file:`src/ghdldrv`) to call the programs (compiler, assembler, linker) if needed.
+
+* a library (sources in :file:`src/grt`) to help execution at run-time.
+
+* a front-end (sources in :file:`src/vhdl`) to parse and analyse VHDL.
+
+* a back-end (in fact many, sources are in :file:`src/ortho`) to generate code.
+
+The architecture is modular. For example, it is possible to use the front-end in the `libghdl` library for the language server or to do synthesis (sources in :file:`src/synth`) instead of code generation.
+
+The main work is performed by the front-end, which is documented in the next chapter.
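
Note on the location scheme described in Frontend.rst: because every file is read entirely, each character can be given one global index, and a per-file line table turns an offset back into a line and column. The following is only a minimal Python sketch of that idea, not GHDL's Ada implementation. The class `SourceMap`, its methods, and the choice to reserve location 0 and one end-of-file slot per file are all assumptions made for illustration.

```python
import bisect


class SourceMap:
    """Hypothetical stand-in for the role files_map plays; not GHDL's API."""

    def __init__(self):
        self._starts = []  # first global location of each file, ascending
        self._files = []   # (name, text, line_table) for each file
        self._next = 1     # assumption: location 0 means "no location"

    def add_file(self, name, text):
        """Register a fully read file and return its first location."""
        # line_table[i] is the offset of the first character of line i + 1.
        line_table = [0] + [i + 1 for i, c in enumerate(text) if c == "\n"]
        first = self._next
        self._starts.append(first)
        self._files.append((name, text, line_table))
        # Assumption: one extra slot per file stands for end-of-file.
        self._next += len(text) + 1
        if self._next >= 2**32:
            raise OverflowError("locations must fit in 32 bits")
        return first

    def resolve(self, location):
        """Map a global location to (file name, line, column), all 1-based."""
        idx = bisect.bisect_right(self._starts, location) - 1
        if idx < 0:
            raise ValueError("not a valid location")
        name, text, line_table = self._files[idx]
        offset = location - self._starts[idx]
        if offset > len(text):
            raise ValueError("not a valid location")
        # Binary search in the line table instead of rescanning the text.
        line = bisect.bisect_right(line_table, offset) - 1
        return name, line + 1, offset - line_table[line] + 1
```

With this layout a node only needs to store one integer per position; the file, line and column are recovered on demand, for example when an error message is printed.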