apparmor_re.h - control flags for hfa generation
expr-tree.{h,cc} - abstract syntax tree (ast) built from a regex parse
parse.{h,y} - code to parse a regex into an ast
hfa.{h,cc} - code to build and manipulate a hybrid finite automaton (state
             machine).
flex-tables.h - basic defines used by chfa
chfa.{h,cc} - code to build a highly compressed runtime readonly version
              of an hfa.
aare_rules.{h,cc} - code that binds parse -> expr-tree -> hfa generation
                    -> chfa generation into a basic interface for converting
                    rules to a runtime ready state machine.

Notes on the compiler pipeline order
============================================

Front End: Program driver logic and policy text parsing into an
           abstract syntax tree.
Middle Layer: Transforms and operations on the abstract syntax tree.
              Converts syntax tree into expression tree for back end.
Back End: transforms of syntax tree, and creation of policy HFA from
          expression trees and HFAs.


Basic order of the backend of the compiler pipeline and where the
dump information occurs in the pipeline.

===== Front End (parse -> AST) ================
       |
       v
    yyparse
       |
+--->--+-->-+
|           |
|  +-->---- +---------------------------<-----------------------+
|  |        |                                                   |
|  |        v                                                   |
|  |      yylex                                                 |
|  |        |                                                   |
|  ^   token match                                              |
|  |        |                                                   |
|  |        +----------------------------+                      |
|  |        |                            |                      ^
|  |        v                            v                      |
|  +-<- rule match?                  preprocess                 |
|           |                            |                      |
|    early var expansion      +----------+-----------+          |
|           |                 |          |           |          |
^           v                 v          v           v          |
|   new rule() / new ent   include   variable   conditional     |
|           |                 |          |           |          |
|           v                 +---->-----+----->-----+----->----+
| new rule semantic check
|           |
+-----<-----+
            |
----------- | ------ End of Parse --------------------
            |
            v
post_parse_profile semantic check
       |
       v
  post_process
           |
           v
     add implied rules()
           |
           v
  process_profile_variables()
                    |
                    v
                  rule->expand_variables()
                    |
           +--------+
           |
           v
   replace aliases (to be moved to backend rewrite)
           |
           v
      merge rules
           |
           v
       profile->merge_rules()
                        |
                        v
                +-->--rule->is_mergeable()
                |                |
                ^                v
                |           add to table
                |                |
                +-------+--------+
                        |
                        v
                      sort->cmp()/oper<()
                        |
                      rule->merge()
                        |
           +------------+
           |
           v
  process_profile_rules
                   |
                   v
                 rule->gen_policy_re()
                            |
                            v
===== Mid layer (AST -> expr tree) =================
                            |
                     +-> add_rule()			(aare_rules.{h,cc})
                     |      |
                     |      v
                     |  rule parse			(parse.y)
                     |      |    |
                     |      |    v
                     |      |   expr tree		(expr-tree.{h,cc})
                     |      |       |
                     |      v       |
                     | unique perms |		(aare_rules.{h,cc})
                     |      |       |
                     |      +------ +
                     |      |
                     |      v
                     |  add to rules expr tree	(aare_rules.{h,cc})
                     |      |
                     +------+
                            |
         +------------------+
         |
         v
    create_dfablob()
         |
         v
      expr tree
         |
         v
  create_chfa()		(aare_rules.cc)
         |
         v
  expr normalization	(expr-tree.{h,cc})
         |
         v
  expr simplification	(expr-tree.{h,cc})
         |
         +- D expr-tree
         |
         +- D expr-simplified
         |
==== Back End - Create cHFA out of expr tree and other HFAs ====
         v
    hfa creation	(hfa.{h,cc})
         |
         +- D dfa-node-map
         |
         +- D dfa-uniq-perms
         |
         +- D dfa-states-initial
         |
         v
    hfa rewrite		(not yet implemented)
         |
         v
    filter deny		(hfa.{h,cc})
         |
         +- D dfa-states-post-filter
         |
         v
    minimization	(hfa.{h,cc})
         |
         +- D dfa-minimize-partitions
         |
         +- D dfa-minimize-uniq-perms
         |
         +- D dfa-states-post-minimize
         |
         v
   unreachable state removal	(hfa.{h,cc})
         |
         +- D dfa-states-post-unreachable
         |
         +- D dfa-states	constructed hfa
         |
         +- D dfa-graph
         |
         v
equivalence class construction
         |
         +- D equiv
         |
     diff encode		(hfa.{h,cc})
         |
         +- D diff-encode
         |
compute perms table
         |
         +- D compressed-dfa == perm table dump
         |
   compressed hfa			(chfa.{h,cc})
         |
         +- D compressed-dfa == transition tables
         |
         +- D dfa-compressed-states    - compressed HFA in state form
         |
         v
  Return to Mid Layer


Notes on the compressed hfa file format (chfa)
==============================================

The file format used is based on the GNU flex table file format
(--tables-file option; see Table File Format in the flex info pages and
the flex sources for documentation). However, the magic number used in
the header is set to 0x1B5E783D instead of 0xF13C57B1, which is meant to
indicate that the file format logically is not the same: the YY_ID_CHK
(check), YY_ID_DEF (default), and YY_ID_BASE (base) tables are used
differently.

The YY_ID_ACCEPTX tables either encode permissions directly or are an
index into an external table.

There are two DFA table formats to support different size state machines:
DFA16
  default/next/check - are 16 bit tables
DFA32
  default/next/check - are 32 bit tables

  DFA32 is limited to 2^24 states, due to the upper 8 bits being used
  as flags in the base table, unless the flags table is defined. When
  the flags table is defined, DFA32 can have a full 2^32 states.

In both DFA16 and DFA32
   base and accept are 32 bit tables.

State 0 is always used as the trap state. Its accept, base and default
fields should be 0.

State 1 is the default start state. Alternate start states are stored
external to the state machine.

If the flags table is not defined, the base table uses the lower 24
bits as index into the next/check tables, and the upper 8 bits are used
as flags.

The currently defined flags are
#define MATCH_FLAG_DIFF_ENCODE 0x80000000
#define MARK_DIFF_ENCODE 0x40000000
#define MATCH_FLAG_OOB_TRANSITION 0x20000000
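
As an illustration of the 24/8 bit split described above, here is a minimal
C sketch of how a base entry could be taken apart when no flags table is
defined. The mask and helper names (BASE_IDX_MASK, BASE_FLAG_MASK,
base_index, base_flags) are hypothetical and chosen for this example; only
the three flag values come from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as listed above (with an unsigned suffix added here). */
#define MATCH_FLAG_DIFF_ENCODE    0x80000000u
#define MARK_DIFF_ENCODE          0x40000000u
#define MATCH_FLAG_OOB_TRANSITION 0x20000000u

/* Hypothetical names for the lower 24 bit / upper 8 bit split. */
#define BASE_IDX_MASK  0x00FFFFFFu
#define BASE_FLAG_MASK 0xFF000000u

/* The two masks must cover the 32 bit entry without overlapping. */
static_assert((BASE_IDX_MASK ^ BASE_FLAG_MASK) == 0xFFFFFFFFu,
	      "index and flag masks must partition the base entry");

/* Index into the next/check tables. */
static inline uint32_t base_index(uint32_t base_entry)
{
	return base_entry & BASE_IDX_MASK;
}

/* Flag bits stored in the upper 8 bits of the entry. */
static inline uint32_t base_flags(uint32_t base_entry)
{
	return base_entry & BASE_FLAG_MASK;
}
```

For example, an entry written as "D 256" in the tables below would have
base_index() equal to 256 and MATCH_FLAG_DIFF_ENCODE set in base_flags().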

Note that default[state] is used in two different ways.

1. When diff_encode is set, the state stores the difference to another
   state defined by default. The next field will only store the
   transitions that are unique to this state. Those transitions may mask
   transitions in the state that the current state is relative to. Also
   note that the state this state is relative to might itself be
   relative to another state. Cycles are forbidden and checked for by
   the verifier. The exact algorithm used to build these state
   differences is discussed in the "Diff Encode & Spanning Tree"
   section.


States and transitions on specific characters to next states
------------------------------------------------------------
 1: ('a' => 2, 'b' => 3, 'c' => 4)
 2: ('a' => 2, 'b' => 3, 'd' => 5)

Table format - where D in base represents the diff encode flag
----------------------
index: (default, base)
    0: (      0,    0)  <== dummy state (nonmatching)
    1: (      0,    0)
    2: (      1, D  256)

  index: (next, check)
      0: (   0,     0)  <== unused entry
	 (   0,     1)  <== ord('a') identical entries
  0+'a': (   2,     1)
  0+'b': (   3,     1)
  0+'c': (   4,     1)
	 (   0,     1)  <== (255 - ord('c')) identical entries
256+'c': (   0,     2)
256+'d': (   5,     2)

Here, state 2 is described as ('c' => 0, 'd' => 5), and everything else
as in state 1. The matching algorithm is as follows.

Scanner algorithm
---------------------------
  /* current state is in <state>, input character <c> */

  while (check[base[state] + c] != state) {
      diff = (FLAGS(base[state]) & diff_encode);
      state = default[state];
      if (!diff)
         goto done;
  }
  state = next[base[state] + c];
  done:

  /* continue with the next input character */

2. When diff_encode is NOT set, the default state is used to represent
   all non-matching transitions (i.e. check[base[state] + c] != state).
   The dfa build will compute the target state with the most
   transitions and use that for the default state, i.e.

   if we have
       1: ('a' => 2)
          ("[^a]" => 0)
   then 0 will be used as the default state

   if we have
       1: ("[^a]" => 2)
          ('a' => 0)
   then 2 will be used as the default state, and the only transition
   encoded in the next/check tables will be for 'a'
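
The default state selection in case 2 can be sketched in C as follows. This
is only an illustration, assuming the state's transitions are available as a
fully expanded 256 entry array; the function name pick_default and the
tie-breaking rule (first target seen in byte order) are assumptions, not
taken from the dfa build code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * trans[c] is the next state for input byte c. Return the target state
 * reached by the largest number of input bytes; that state is the best
 * choice for the default, since none of its bytes need a next/check
 * entry.
 */
static uint32_t pick_default(const uint32_t trans[256])
{
	uint32_t best;
	int best_count = 0;

	assert(trans);
	best = trans[0];
	for (int i = 0; i < 256; i++) {
		int n = 0;

		/* count how many bytes share the target of byte i */
		for (int j = 0; j < 256; j++)
			if (trans[j] == trans[i])
				n++;
		if (n > best_count) {
			best_count = n;
			best = trans[i];
		}
	}
	return best;
}
```

With the two states shown above, this returns 0 for the first and 2 for the
second.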

The combination of the diff-encoded and non-diff encoded states performs
well even when there are many inverted or wildcard matches ("[^x]", ".").


Simplified Regexp scanner algorithm for non-diff encoded state (note
diff encode algorithm above works as well)

------------------------
  /* current state is in <state>, matching character <c> */
  if (check[base[state] + c] == state)
    state = next[base[state] + c];
  else
    state = default[state];
  /* continue with the next input character */


Each input character may cause several iterations in the while loop,
but due to guarantees in the build at most 2n states will be
transitioned for n input characters.  The expected number of states
walked is much closer to n and in practice due to cache locality the
diff encoded state machine is usually faster than a non-diff encoded
state machine with a strict n state for n input walk.
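
The scanner loop above can be written out as a small C sketch. It assumes
no flags table is defined, so the flags live in the upper 8 bits of each
base entry. The struct chfa_tables container, the function names and
BASE_IDX_MASK are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MATCH_FLAG_DIFF_ENCODE 0x80000000u
#define BASE_IDX_MASK          0x00FFFFFFu	/* hypothetical name */

/* Hypothetical container for the four tables discussed above. */
struct chfa_tables {
	const uint32_t *base;	/* low 24 bits: index, high 8 bits: flags */
	const uint32_t *def;	/* default[state] */
	const uint32_t *next;
	const uint32_t *check;
};

/* Advance one input byte, following diff encode chains when needed. */
static uint32_t chfa_step(const struct chfa_tables *t, uint32_t state,
			  unsigned char c)
{
	for (;;) {
		uint32_t pos = (t->base[state] & BASE_IDX_MASK) + c;
		int diff;

		if (t->check[pos] == state)
			return t->next[pos];	/* explicit transition */
		diff = (t->base[state] & MATCH_FLAG_DIFF_ENCODE) != 0;
		state = t->def[state];
		if (!diff)
			return state;		/* plain default transition */
		/* diff encoded: retry the same byte in the default state */
	}
}

/* Run a whole input through the tables, starting from <start>. */
static uint32_t chfa_match(const struct chfa_tables *t, uint32_t start,
			   const unsigned char *in, size_t len)
{
	uint32_t state = start;

	assert(t && in);
	for (size_t i = 0; i < len; i++)
		state = chfa_step(t, state, in[i]);
	return state;
}
```

Loaded with the diff encoded example tables shown earlier, stepping state 2
on 'a' misses in state 2, falls back to state 1 and yields state 2, while
'c' and 'd' are answered by state 2 directly.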


Comb Compression
-----------------

The next/check tables of states are only used to encode transitions
not covered by the default transition. The input byte is indexed off
the base value, covering 256 positions within the next/check
tables. However a state may only encode a few transitions within that
range, leaving holes.  These holes are filled by other states'
transitions whose ranges overlap.

   1: ('a' => 2, 'b' => 3, 'c' => 4)
   2: ('a' => 2, 'b' => 3, 'd' => 5)
   3: ('a' => 0, everything else => 5)

Regexp tables
-------------
index: (default, base)
    0: (      0,    0)  <== dummy state (nonmatching)
    1: (      0,    0)
    2: (      1,    3)
    3: (      5,    7)

  index: (next, check)
      0: (   0,     0)  <== unused entry
	 (   0,     0)  <== ord('a') identical, unused entries
  0+'a': (   2,     1)
  0+'b': (   3,     1)
  0+'c': (   4,     1)
  3+'a': (   2,     2)
  3+'b': (   3,     2)
  3+'c': (   0,     0)  <== entry is unused, hole that could be filled
  3+'d': (   5,     2)
  7+'a': (   0,     3)
	 (   0,     0)  <== (255 - ord('a')) identical, unused entries


Regexp tables comb compressed
-------------
index: (default, base)
    0: (      0,    0)
    1: (      0,    0)
    2: (      1,    3)
    3: (      5,    5)

  index: (next, check)
      0: (   0,     0)
	 (   0,     0)
  0+'a': (   2,     1)
  0+'b': (   3,     1)
  0+'c': (   4,     1)
  3+'a': (   2,     2)
  3+'b': (   3,     2)
  5+'a': (   0,     3)  <== entry was previously at 7+'a'
  3+'d': (   5,     2)
	 (   0,     0)  <== (255 - ord('a')) identical, unused entries
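
One simple way to fill holes is a first-fit search for the lowest base at
which all of a state's explicit transitions land on free slots. The C sketch
below illustrates that idea only; it is not taken from chfa.cc, and the
function name comb_insert and the separate used[] occupancy array are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Place the n explicit transitions of <state> (bytes[i] => targets[i])
 * at the lowest base where every slot base + bytes[i] is still free.
 * Returns the chosen base, or SIZE_MAX if the tables are too small.
 */
static size_t comb_insert(uint32_t *next, uint32_t *check,
			  unsigned char *used, size_t table_len,
			  uint32_t state, const unsigned char *bytes,
			  const uint32_t *targets, size_t n)
{
	assert(next && check && used && bytes && targets);
	for (size_t base = 0; base + 256 <= table_len; base++) {
		size_t i;

		/* does any transition collide with an occupied slot? */
		for (i = 0; i < n; i++)
			if (used[base + bytes[i]])
				break;
		if (i < n)
			continue;
		/* all slots free: claim them for this state */
		for (i = 0; i < n; i++) {
			size_t pos = base + bytes[i];

			used[pos] = 1;
			next[pos] = targets[i];
			check[pos] = state;
		}
		return base;
	}
	return SIZE_MAX;
}
```

Inserting the explicit transitions of states 1, 2 and 3 from the example in
that order gives bases 0, 3 and 5, which matches the comb compressed tables
above: state 3's single entry lands in the hole at 3+'c'.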


Out of Band Transitions (oobs)
---------------------------------

Out of band transitions (oobs) allow for a state to have transitions
that can not be triggered by input. Any state that has oobs must have
the OOB flag set on the state. An oob is triggered by subtracting the
oob number from the base index value, to find the next and check
value. Currently only a single oob is supported.

  if ((FLAGS(base[state]) & OOB) && check[base[state] - oob] == state)
    state = next[base[state] - oob]

oobs might be expressed as a negative number eg. -1 for the first
oob. In which case the oob transition above uses a + oob instead.

If more oobs are needed a second oob flag can be allocated, and if
used in combination with the original, would allow a state to have
up to 3 oobs:

  00 - none
  01 - 1
  10 - 2
  11 - 3
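
The single oob lookup described above can be sketched in C as follows,
assuming the flags are stored in the upper 8 bits of the base entry and the
oob is given as a positive number. The function name chfa_oob_step and
BASE_IDX_MASK are hypothetical. The text does not say what happens when the
oob entry does not belong to the state, so this sketch simply leaves the
state unchanged in that case; that choice is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MATCH_FLAG_OOB_TRANSITION 0x20000000u
#define BASE_IDX_MASK             0x00FFFFFFu	/* hypothetical name */

/* Try to take out of band transition number <oob> (1 for the first). */
static uint32_t chfa_oob_step(const uint32_t *base, const uint32_t *next,
			      const uint32_t *check, uint32_t state,
			      uint32_t oob)
{
	uint32_t idx;

	assert(base && next && check);
	idx = base[state] & BASE_IDX_MASK;
	/* no oob flag, or the slot would fall before the table start */
	if (!(base[state] & MATCH_FLAG_OOB_TRANSITION) || idx < oob)
		return state;
	/* the oob slot sits <oob> entries below the state's base index */
	if (check[idx - oob] == state)
		return next[idx - oob];
	return state;
}
```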


Diff Encode & Spanning Tree
============================================

State differential encoding is a technique to compress the dfa by
reducing the number of transitions that need to be stored for a
given state.  Finding an optimal differential encoding is an
NP-complete problem; instead AppArmor uses an algorithm that is
expected to be O(n log n) but does have a worst case of O(n^2).

When the differential encoding approximates O(n log n), and if it
removes enough transitions, it can speed up the compile time because it
reduces the number of transitions that need to be considered by the
O(n^2) comb compression. In the worst case, when differential encoding
approximates O(n^2) and doesn't eliminate many transitions, it
can slow the backend of the state machine build down by approximately
2x.

The algorithm used to do the differential encoding makes sure the
encode is done in such a way as to provide the runtime guarantee that
no more than 2n states will be traversed for an input of length n.

Understanding the Diff Encode
-----------------------------

To reduce the number of transitions a state has to encode, states can
be made to encode their transitions as a differential to a "default"
state. If a transition is not represented in the current state, the
default state is entered and the process is repeated until the
transition is found or the state is not marked to be encoded relative
to its default.

An example: Let's say we have two states A and B with the following
transitions.

State A				State B
default -> NULL			default -> NULL
'a' -> X   			'a' -> X
'b' -> Y			'b' -> Y
       				'c' -> Z

diff encoding will not change state A but changes state B

State A	      	       	      	State B (diff*)
default -> NULL			default -> A
'a' -> X   			'c' -> Z
'b' -> Y

Matches for state A do not change, but performing a match for state B
might. Matching input 'c' for state B makes no difference: the match
finds that Z is the transition for input 'c' and transitions just as it
would for the state machine not using diff encoding.

However the match for inputs 'a' and 'b' does change. For inputs 'a' and
'b' the match will not find a matching transition, so the default
transition will be used, which has become state A instead of null. In
addition the state is detected to be diff encoded, so instead of
stopping the match on the state transition, the match is rerun
against State A. For all inputs besides 'c' (which does not get passed
to state A), the transitions are the same as in the original state B, so
after the match is run against state A, the match is now in the same
state it would have been in without diff encoding.

The diff encode reduced the number of transitions stored in the state
machine by 2, at the cost of requiring a run time cost of transitioning
through 2 states to find the state for a single input.


Chaining diff encoded states

The diff encode is not limited to a single state transition. Diff encoded
states can be chained to obtain further savings.


Eg. 2

State A			State B			State C
default -> NULL		default -> NULL		default -> NULL
'a' -> W   		'a' -> W   		'a' -> W
'b' -> X		'b' -> X		'b' -> X
       			'c' -> Y		'c' -> Y
       			    			'd' -> Z
becomes

State A			State B	(diff*)		State C (diff*)
default -> NULL		default -> A		default -> B
'a' -> W   		'c' -> Y   		'd' -> Z
'b' -> X

The match for State B is similar to example 1, transitioning to State A
to find the rest of its transitions. State C will match input 'd' but
for other inputs transition to state B, which can directly match input
'c' but will transition to state A for the rest of its transitions.

This time the diff encode has managed to remove 5 transitions from memory
at the cost of having to match against up to 2 states from state B, and
up to 3 states from State C.



Masking transitions

The diff encode is not limited to just encoding against states with
a strict subset of transitions as was done in example 1 and 2. It can
also add transitions to mask transitions down the chain, allowing
a broader set of states to be used when setting up diff encode chains.


Eg. 3

State A				State B
default -> NULL			default -> NULL
'a' -> W   			'a' -> X
'b' -> Y			'b' -> Y
       				'c' -> Z

can be diff encoded as
State A				State B (diff*)
default -> NULL			default -> A
'a' -> W   			'a' -> X
'b' -> Y			'c' -> Z

In this case only 1 transition is removed because the 'a' transition
is kept to override/mask the 'a' transition in state A. When doing
a transition from state B, its transitions will be consulted first, so
the correct transition to state X for input 'a' will be done.

Eg. 4.
State A				State B
default -> X			default -> X
'a' -> W   			'b' -> Y
'b' -> Y

can be encoded as

State A				State B (diff*)
default -> X			default -> A
'a' -> W   			'a' -> X
'b' -> Y

in this case the diff encode doesn't save any transitions so it is not
worth doing, but a similar situation with transitions could result in
savings.
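
The masking examples can be checked with a small C sketch that computes
which transitions a state must keep when it is diff encoded against a
candidate. It assumes both states are given as fully resolved 256 entry
arrays (defaults already expanded), which is a simplification made for this
example; the function name diff_stored is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * cur[c] and cand[c] give the resolved next state for input byte c.
 * A byte must stay in the diff encoded state exactly when the candidate
 * would answer it differently, either because the candidate lacks the
 * transition or because its transition must be masked.
 * Writes those bytes to keep[] and returns how many there are.
 */
static int diff_stored(const uint32_t cur[256], const uint32_t cand[256],
		       unsigned char keep[256])
{
	int n = 0;

	assert(cur && cand && keep);
	for (int c = 0; c < 256; c++)
		if (cur[c] != cand[c])
			keep[n++] = (unsigned char)c;
	return n;
}
```

For Eg. 3 this reports that state B keeps 'a' and 'c'; for Eg. 4 it reports
that state B keeps only 'a', which is the masking entry 'a' -> X.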


Loops and diff encode termination

The diff encode, if done wrong, could create an infinite loop for a
single input. To avoid this the diff-encode must be built and verified
so every diff encoded state chain ends on a non-diff-encoded state.
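
A verification pass for this property could look like the following C
sketch. It is an illustration under the assumption that the diff encode
flag is stored in the base entry; the function name diff_chains_terminate
is hypothetical and this is not the verifier's actual code. It is written
for clarity, so it is O(n^2) in the worst case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MATCH_FLAG_DIFF_ENCODE 0x80000000u

/*
 * Return 1 if, from every state, following default[] through diff
 * encoded states reaches a state that is not diff encoded; return 0 on
 * a cycle or on a default that points outside the table.
 */
static int diff_chains_terminate(const uint32_t *base, const uint32_t *def,
				 size_t nstates)
{
	assert(base && def);
	for (size_t s = 0; s < nstates; s++) {
		size_t cur = s, steps = 0;

		while (base[cur] & MATCH_FLAG_DIFF_ENCODE) {
			cur = def[cur];
			/*
			 * A chain that terminates visits each state at
			 * most once, so nstates steps means a cycle.
			 */
			if (cur >= nstates || ++steps >= nstates)
				return 0;
		}
	}
	return 1;
}
```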



Building the Diff Encode
-------------------------

State differential encoding can be quite effective in reducing the
number of stored transitions, but it can increase both compile and
matching time (as multiple states must be traversed for a single
match). However if the states to encode against are chosen carefully,
then both the encoding time and matching time can be bounded.
Differential encoding can even result in a faster HFA as it can reduce
the data needed to be cached, resulting in improved cache line reuse.

The requirements AppArmor uses to choose the states to encode against are
    • The state must have been previously matched while walking the
      dfa (it will be hot in the cache then)
    • or the state must be at the same level in a DAG decomposition of
      the dfa, sharing a common ancestor (more on this below)

The first requirement was primarily for performance concerns but in
practice works out well for compression too, as states that are close
to each other often have similar transitions. The second allows
expanding the reach of the compression to a few more likely options
while keeping a potentially common hot path, without breaking
the other property of only referencing previously matched states.

In practice requirement 1 can not be met as each match string takes a
different path through the dfa. It can however be approximated by
converting the dfa into a directed acyclic graph (DAG) with the start
state as the root. The DAG provides a good approximation for
requirement 1 and at the same time it limits how many states have to be
considered for compression (only backwards in the DAG). It also
provides guarantees on how many states will be walked at run time (at
most 2n).

Eg.

TODO: ??? simple dfa to spanning tree graph


Converting the HFA into a DAG for compression does have a limitation in
that it removes many of a state's immediate neighbours from
consideration. In a DAG a state's neighbours can be broken into five
classes: immediate predecessor, predecessor on another branch, sibling
on another branch, immediate successor, and successor on another
branch. The immediate predecessor and immediate successor cases are
covered by the predecessor differential compression scheme described
above (the successor case is covered because the current state is the
successor state's predecessor, and thus will be considered when the
successor is differentially encoded). However the successor and
predecessor on another branch and sibling cases are not covered, and
they may be the more optimal path for encoding, and may be the hot path
the match came through.

TODO: ??? diagram showing the 5 classes

To help account for this, AppArmor also compares to the immediate
successors of the state being considered if there are transitions
between the states. Sibling states are also considered if there are
transitions between the states and the sibling is differentially
encoded against a predecessor (not another sibling), or not
differentially encoded. This broadens the set of states considered but
limits it to states that were potentially matched against and thus in
the cache. It also has the property of looking backwards in the DAG,
thus keeping the maximum number of states that are required to be
transitioned to in a match to a constant multiple of the input length.
If only branch predecessors were used then the limit could be kept at
2n, but because immediate siblings can be used iff they transition to a
predecessor the limit is bounded to a slightly higher value of 5n/2.

When considering which state to differentially encode against AppArmor
computes a weighted value and chooses the best one. The value is
computed as follows.

    • For each defined transition in the state
    
        +0 to candidate state weight - if the transition is
           undefined in the candidate state (the transition must be
           represented in the current state)
	
        +1 to candidate state weight - if the candidate state has
           the same transition to the same state (the transition can be
           eliminated from the current state)
	
        -1 to candidate state weight - if the transition is defined
           in the candidate state and it is not the same transition
           (current state must add a transition entry to override
           candidate transition)
	
    • For each undefined transition in the state
    
        +0 to candidate state weight - if the transition is
           undefined in the candidate state
	
        -1 to candidate state weight - if the transition is defined
           in candidate state (current state must add an entry to
           override the candidate transition)

The current state will be differentially encoded against the candidate
state with the largest weight > 0. If there is no weighting > 0 then no
differential encoding for the state will be done as there is no benefit
to doing so.
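
The weighting rules above can be sketched in C as follows. The
representation is an assumption made for the example: each state is a 256
entry array in which the hypothetical marker NO_TRANS means the transition
is undefined. The function names are likewise hypothetical, and ties between
candidates are broken in favour of the first one seen.

```c
#include <assert.h>
#include <stdint.h>

#define NO_TRANS UINT32_MAX	/* hypothetical marker: undefined */

/* Weight of encoding <state> against candidate <cand>. */
static int diff_weight(const uint32_t state[256], const uint32_t cand[256])
{
	int w = 0;

	assert(state && cand);
	for (int c = 0; c < 256; c++) {
		if (cand[c] == NO_TRANS)
			continue;	/* +0: nothing to share or mask */
		if (state[c] == cand[c])
			w += 1;		/* entry can be eliminated */
		else
			w -= 1;		/* entry needed to mask candidate */
	}
	return w;
}

/* Index of the candidate with the largest weight > 0, or -1 if none. */
static int pick_diff_target(const uint32_t state[256],
			    const uint32_t *const *cands, int ncands)
{
	int best = -1, best_w = 0;

	for (int i = 0; i < ncands; i++) {
		int w = diff_weight(state, cands[i]);

		if (w > best_w) {
			best_w = w;
			best = i;
		}
	}
	return best;
}
```

Both -1 cases in the list collapse into the single "differs" branch: whether
or not the current state defines the transition, it has to store an entry
that overrides the candidate's.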

Note: differential encoding can reduce a state to 0 stored
transitions. This can happen when two states have the exact same
transitions but belong in different partitions when minimized. This
would happen for example when one state was an accept state and the
other a non-accepting state. Otherwise if states have the same
transitions they are redundant and removed during state minimization.


DFA WELDING
-----------
* TODO *


DFA SET OPERATIONS
------------------

If two DFAs can be aligned on a partitioning scheme, the set operations
of union, intersection and difference can be used to create a new
DFA that represents the results of the operation.

* TODO *


DFA ALIAS REWRITE
-----------------
* TODO *
