aboutsummaryrefslogtreecommitdiffstats
path: root/docs/source/CHAPTER_Optimize.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/source/CHAPTER_Optimize.rst')
-rw-r--r--docs/source/CHAPTER_Optimize.rst330
1 files changed, 330 insertions, 0 deletions
diff --git a/docs/source/CHAPTER_Optimize.rst b/docs/source/CHAPTER_Optimize.rst
new file mode 100644
index 000000000..53e0a67aa
--- /dev/null
+++ b/docs/source/CHAPTER_Optimize.rst
@@ -0,0 +1,330 @@
+.. _chapter:opt:
+
+Optimizations
+=============
+
+Yosys employs a number of optimizations to generate better and cleaner results.
+This chapter outlines these optimizations.
+
+Simple optimizations
+--------------------
+
+The Yosys pass opt runs a number of simple optimizations. This includes removing
+unused signals and cells and const folding. It is recommended to run this pass
+after each major step in the synthesis script. At the time of this writing the
+opt pass executes the following passes that each perform a simple optimization:
+
+- Once at the beginning of opt:
+
+ - opt_expr
+ - opt_merge -nomux
+
+- Repeat until result is stable:
+
+ - opt_muxtree
+ - opt_reduce
+ - opt_merge
+ - opt_rmdff
+ - opt_clean
+ - opt_expr
+
+The following section describes each of the opt\_ passes.
+
+The opt_expr pass
+~~~~~~~~~~~~~~~~~
+
+This pass performs const folding on the internal combinational cell types
+described in :numref:`Chap. %s <chapter:celllib>`. This means a cell with all
+constant inputs is replaced with the constant value this cell drives. In some
+cases this pass can also optimize cells with some constant inputs.
+
+.. table:: Const folding rules for $_AND\_ cells as used in opt_expr.
+ :name: tab:opt_expr_and
+ :align: center
+
+ ========= ========= ===========
+ A-Input B-Input Replacement
+ ========= ========= ===========
+ any 0 0
+ 0 any 0
+ 1 1 1
+ --------- --------- -----------
+ X/Z X/Z X
+ 1 X/Z X
+ X/Z 1 X
+ --------- --------- -----------
+ any X/Z 0
+ X/Z any 0
+ --------- --------- -----------
+ :math:`a` 1 :math:`a`
+ 1 :math:`b` :math:`b`
+ ========= ========= ===========
+
+.. How to format table?
+
+:numref:`Table %s <tab:opt_expr_and>` shows the replacement rules used for
+optimizing an $_AND\_ gate. The first three rules implement the obvious const
+folding rules. Note that ‘any' might include dynamic values calculated by other
+parts of the circuit. The following three lines propagate undef (X) states.
+These are the only three cases in which it is allowed to propagate an undef
+according to Sec. 5.1.10 of IEEE Std. 1364-2005 :cite:p:`Verilog2005`.
+
+The next two lines assume the value 0 for undef states. These two rules are only
+used if no other substitutions are possible in the current module. If other
+substitutions are possible they are performed first, in the hope that the ‘any'
+will change to an undef value or a 1 and therefore the output can be set to
+undef.
+
+The last two lines simply replace an $_AND\_ gate with one constant-1 input with
+a buffer.
+
+Besides this basic const folding the opt_expr pass can replace 1-bit wide $eq
+and $ne cells with buffers or not-gates if one input is constant.
+
+The opt_expr pass is very conservative regarding optimizing $mux cells, as these
+cells are often used to model decision-trees and breaking these trees can
+interfere with other optimizations.
+
+The opt_muxtree pass
+~~~~~~~~~~~~~~~~~~~~
+
+This pass optimizes trees of multiplexer cells by analyzing the select inputs.
+Consider the following simple example:
+
+.. code:: verilog
+ :number-lines:
+
+ module uut(a, y); input a; output [1:0] y = a ? (a ? 1 : 2) : 3; endmodule
+
+The output can never be 2, as this would require ``a`` to be 1 for the outer
+multiplexer and 0 for the inner multiplexer. The opt_muxtree pass detects this
+contradiction and replaces the inner multiplexer with a constant 1, yielding the
+logic for ``y = a ? 1 : 3``.
+
+The opt_reduce pass
+~~~~~~~~~~~~~~~~~~~
+
+This is a simple optimization pass that identifies and consolidates identical
+input bits to $reduce_and and $reduce_or cells. It also sorts the input bits to
+ease identification of shareable $reduce_and and $reduce_or cells in other
+passes.
+
+This pass also identifies and consolidates identical inputs to multiplexer
+cells. In this case the new shared select bit is driven using a $reduce_or cell
+that combines the original select bits.
+
+Lastly this pass consolidates trees of $reduce_and cells and trees of $reduce_or
+cells to single large $reduce_and or $reduce_or cells.
+
+These three simple optimizations are performed in a loop until a stable result
+is produced.
+
+The opt_rmdff pass
+~~~~~~~~~~~~~~~~~~
+
+This pass identifies single-bit d-type flip-flops ($_DFF\_, $dff, and $adff
+cells) with a constant data input and replaces them with a constant driver.
+
+The opt_clean pass
+~~~~~~~~~~~~~~~~~~
+
+This pass identifies unused signals and cells and removes them from the design.
+It also creates an ``\unused_bits`` attribute on wires with unused bits. This
+attribute can be used for debugging or by other optimization passes.
+
+The opt_merge pass
+~~~~~~~~~~~~~~~~~~
+
+This pass performs trivial resource sharing. This means that this pass
+identifies cells with identical inputs and replaces them with a single instance
+of the cell.
+
+The option -nomux can be used to disable resource sharing for multiplexer cells
+($mux and $pmux. This can be useful as it prevents multiplexer trees to be
+merged, which might prevent opt_muxtree to identify possible optimizations.
+
+FSM extraction and encoding
+---------------------------
+
+The fsm pass performs finite-state-machine (FSM) extraction and recoding. The
+fsm pass simply executes the following other passes:
+
+- Identify and extract FSMs:
+
+ - fsm_detect
+ - fsm_extract
+
+- Basic optimizations:
+
+ - fsm_opt
+ - opt_clean
+ - fsm_opt
+
+- Expanding to nearby gate-logic (if called with -expand):
+
+ - fsm_expand
+ - opt_clean
+ - fsm_opt
+
+- Re-code FSM states (unless called with -norecode):
+
+ - fsm_recode
+
+- Print information about FSMs:
+
+ - fsm_info
+
+- Export FSMs in KISS2 file format (if called with -export):
+
+ - fsm_export
+
+- Map FSMs to RTL cells (unless called with -nomap):
+
+ - fsm_map
+
+The fsm_detect pass identifies FSM state registers and marks them using the
+``\fsm_encoding = "auto"`` attribute. The fsm_extract extracts all FSMs marked
+using the ``\fsm_encoding`` attribute (unless ``\fsm_encoding`` is set to
+"none") and replaces the corresponding RTL cells with a $fsm cell. All other
+fsm\_ passes operate on these $fsm cells. The fsm_map call finally replaces the
+$fsm cells with RTL cells.
+
+Note that these optimizations operate on an RTL netlist. I.e. the fsm pass
+should be executed after the proc pass has transformed all RTLIL::Process
+objects to RTL cells.
+
+The algorithms used for FSM detection and extraction are influenced by a more
+general reported technique :cite:p:`fsmextract`.
+
+FSM detection
+~~~~~~~~~~~~~
+
+The fsm_detect pass identifies FSM state registers. It sets the ``\fsm_encoding
+= "auto"`` attribute on any (multi-bit) wire that matches the following
+description:
+
+- Does not already have the ``\fsm_encoding`` attribute.
+- Is not an output of the containing module.
+- Is driven by single $dff or $adff cell.
+- The ``\D``-Input of this $dff or $adff cell is driven by a multiplexer tree
+ that only has constants or the old state value on its leaves.
+- The state value is only used in the said multiplexer tree or by simple
+ relational cells that compare the state value to a constant (usually $eq
+ cells).
+
+This heuristic has proven to work very well. It is possible to overwrite it by
+setting ``\fsm_encoding = "auto"`` on registers that should be considered FSM
+state registers and setting ``\fsm_encoding = "none"`` on registers that match
+the above criteria but should not be considered FSM state registers.
+
+Note however that marking state registers with ``\fsm_encoding`` that are not
+suitable for FSM recoding can cause synthesis to fail or produce invalid
+results.
+
+FSM extraction
+~~~~~~~~~~~~~~
+
+The fsm_extract pass operates on all state signals marked with the
+(``\fsm_encoding != "none"``) attribute. For each state signal the following
+information is determined:
+
+- The state registers
+
+- The asynchronous reset state if the state registers use asynchronous reset
+
+- All states and the control input signals used in the state transition
+ functions
+
+- The control output signals calculated from the state signals and control
+ inputs
+
+- A table of all state transitions and corresponding control inputs- and
+ outputs
+
+The state registers (and asynchronous reset state, if applicable) is simply
+determined by identifying the driver for the state signal.
+
+From there the $mux-tree driving the state register inputs is recursively
+traversed. All select inputs are control signals and the leaves of the $mux-tree
+are the states. The algorithm fails if a non-constant leaf that is not the state
+signal itself is found.
+
+The list of control outputs is initialized with the bits from the state signal.
+It is then extended by adding all values that are calculated by cells that
+compare the state signal with a constant value.
+
+In most cases this will cover all uses of the state register, thus rendering the
+state encoding arbitrary. If however a design uses e.g. a single bit of the
+state value to drive a control output directly, this bit of the state signal
+will be transformed to a control output of the same value.
+
+Finally, a transition table for the FSM is generated. This is done by using the
+ConstEval C++ helper class (defined in kernel/consteval.h) that can be used to
+evaluate parts of the design. The ConstEval class can be asked to calculate a
+given set of result signals using a set of signal-value assignments. It can also
+be passed a list of stop-signals that abort the ConstEval algorithm if the value
+of a stop-signal is needed in order to calculate the result signals.
+
+The fsm_extract pass uses the ConstEval class in the following way to create a
+transition table. For each state:
+
+1. Create a ConstEval object for the module containing the FSM
+2. Add all control inputs to the list of stop signals
+3. Set the state signal to the current state
+4. Try to evaluate the next state and control output
+5. If step 4 was not successful:
+
+ - Recursively goto step 4 with the offending stop-signal set to 0.
+ - Recursively goto step 4 with the offending stop-signal set to 1.
+
+6. If step 4 was successful: Emit transition
+
+Finally a $fsm cell is created with the generated transition table and added to
+the module. This new cell is connected to the control signals and the old
+drivers for the control outputs are disconnected.
+
+FSM optimization
+~~~~~~~~~~~~~~~~
+
+The fsm_opt pass performs basic optimizations on $fsm cells (not including state
+recoding). The following optimizations are performed (in this order):
+
+- Unused control outputs are removed from the $fsm cell. The attribute
+ ``\unused_bits`` (that is usually set by the opt_clean pass) is used to
+ determine which control outputs are unused.
+
+- Control inputs that are connected to the same driver are merged.
+
+- When a control input is driven by a control output, the control input is
+ removed and the transition table altered to give the same performance without
+ the external feedback path.
+
+- Entries in the transition table that yield the same output and only differ in
+ the value of a single control input bit are merged and the different bit is
+ removed from the sensitivity list (turned into a don't-care bit).
+
+- Constant inputs are removed and the transition table is altered to give an
+ unchanged behaviour.
+
+- Unused inputs are removed.
+
+FSM recoding
+~~~~~~~~~~~~
+
+The fsm_recode pass assigns new bit pattern to the states. Usually this also
+implies a change in the width of the state signal. At the moment of this writing
+only one-hot encoding with all-zero for the reset state is supported.
+
+The fsm_recode pass can also write a text file with the changes performed by it
+that can be used when verifying designs synthesized by Yosys using Synopsys
+Formality .
+
+Logic optimization
+------------------
+
+Yosys can perform multi-level combinational logic optimization on gate-level
+netlists using the external program ABC . The abc pass extracts the
+combinational gate-level parts of the design, passes it through ABC, and
+re-integrates the results. The abc pass can also be used to perform other
+operations using ABC, such as technology mapping (see :numref:`Sec %s
+<sec:techmap_extern>` for details).