Difference between revisions of "SCI/Specifications/SCI in action/Parser"

Jump to navigation Jump to search
Merging of the SCI documentation. Work in progress.
m (→‎The Parse tree: added missing ;)
(Merging of the SCI documentation. Work in progress.)
Line 149: Line 149:


==The Parse tree==
==The Parse tree==
This and the two following sections borrow some ideas and structures from abstract language theory. Readers might want to consider related literature.
This and the two following sections borrow some ideas and structures from abstract language theory. Readers might want to consider related literature.


Line 217: Line 216:
|}
|}


With the notable exception of the first rule, these rules constitute  <span style="white-space:nowrap"><i>P. V := {&chi;|&exist;R &isin;  P.&chi; &exist; R}</i></span>; typically, <span style="white-space:nowrap"><i> V = {0&chi;12&fnof;&hellip;0&chi;13&fnof;&sdot; s = m<sub>0</sub></i></span> of the first rule encountered; in all games observed, it was set to 0x13c. &Sigma; contains all word groups and class masks. For the sake of simplicity, we will consider rules matching composite class masks to be several rules. Here is a simplified example of what such a grammar might look like (the hexadecimal prefix '0x' is omitted for brevity):
With the notable exception of the first rule, these rules constitute  <span style="white-space:nowrap"><i>P. V := {&chi;|&exist;R &isin;  P.&chi; &exist R}</i></span>; typically, <span style="white-space:nowrap"><i> V = {0&chi;12&fnof;&hellip;0&chi;13&fnof;&sdot; s = m<sub>0</sub></i></span> of the first rule encountered; in all games observed, it was set to 0x13c. &Sigma; contains all word groups and class masks. For the sake of simplicity, we will consider rules matching composite class masks to be several rules. Here is a simplified example of what such a grammar might look like (the hexadecimal prefix '0x' is omitted for brevity):


In addition to this grammar, each right-hand non-terminal <i>m<sub>i</sub></i> carries its semantic value <i>&rho;<sub>i</sub></i> , which is not relevant for constructing a syntax tree, but must be considered for the semantic tree <i>T</i><sub>&pi;</sub>. These values were omitted in the example above. As in the example above, the grammar is a context-free (type 2) grammar, almost in Chomsky Normal Form (CNF) in SCI; constructing a grammar with CNF rules from it would be trivial.<ref>FreeSCI constructs a GNF (Greibach Normal Form) representation from these rules for parsing.</ref>
In addition to this grammar, each right-hand non-terminal <i>m<sub>i</sub></i> carries its semantic value <i>&rho;<sub>i</sub></i> , which is not relevant for constructing a syntax tree, but must be considered for the semantic tree <i>T</i><sub>&pi;</sub>. These values were omitted in the example above. As in the example above, the grammar is a context-free (type 2) grammar, almost in Chomsky Normal Form (CNF) in SCI; constructing a grammar with CNF rules from it would be trivial.<ref>FreeSCI constructs a GNF (Greibach Normal Form) representation from these rules for parsing.</ref>


Obviously, G is an ambiguous grammar. In SCI, rule precedence is implied by rule order, so the resulting left derivation tree is well-defined (in the example, it would be defined by <i>D<sub>0</sub>.<ref>In FreeSCI, you can use the ”parse” console command to retreive all possible left derivation trees.</ref>
Obviously, G is an ambiguous grammar. In SCI, rule precedence is implied by rule order, so the resulting left derivation tree is well-defined (in the example, it would be defined by <i>D<sub>0</sub></i>.<ref>In FreeSCI, you can use the ”parse” console command to retreive all possible left derivation trees.</ref>
<br>
 
====Parser grammar example====
<span style="white-space:nowrap">G = &lang;{&rang;12&fnof;&hellip;13e,(C<sub>1</sub>,C<sub>2</sub>,C<sub>4</sub>,&hellip;,C<sub>1</sub>00},P,13c)</span>
{| <i>
|----
|
|13c
|&#8614;
|13b 134
|----
|
|13c
|&#8614;
|13b 13d 133
|----
|
|13c
|&#8614;
|13b 13d
|----
|
|13c
|&#8614;
|13c
|----
|
|13c
|&#8614;
|13b 13d 13b 13d
|----
|
|13b
|&#8614;
|131 134
|----
|
|13b
|&#8614;
|131 13d  13d
|----
|rowspan=2| P = {
|13b
|&#8614;
|131
|----
|13d
|&#8614;
|134
|----
|
|131
|&#8614;
|C<sub>8</sub>0
|----
|
|133
|&#8614;
|C<sub>2</sub>0
|----
|
|134
|&#8614;
|C<sub>1</sub>0}
|----
|}</i>
 
===Semantics===
This is important, since the parser does much more than just accept or discard input. Using the semantic tags applied to each non-terminal on the right-hand side of a rule, it constructs what I will call the semantic parse tree <i>T</i><sub>&pi;</sub> , which attempts to describe what the input means. For each non-terminal rule
 
<i>r = &upsilon;<sub>0</sub> &#8614; &upsilon;<sub>1</sub>&upsilon;<sub>2</sub>&hellip;&upsilon;<sub>n</sub></i>
 
there are semantic tags <i>&sigma;<sub>r,1</sub>,&sigma;<sub>r,2</sub>&hellip;&sigma;<sub>r,n</sub> &isin; S</i> , as explained above. <i>T</i><sub>&pi;</sub> is now constructed from the resulting derivation and the semantic tags assiociated with each non-terminal of the rule used. The construction algorithm is explained below with <i>T</i><sub>&pi;</sub> being constructed from nodes, which have the following structure:
 
<i>NODE = {&#9671;} &cup; S &#215; V &#215; (NODE &cup; &Gamma;);*</i>
 
Where S is the set of possible semantic values, and V is the set of non-terminals as defined in the grammar. We will also use the sequence <i>&gamma;<sub>0</sub>,&gamma<sub>1</sub>,&gamma<sub>2</sub> &hellip; &gamma<sub>k-1</sub></i> which will represent the word groups the input tokens belonged to (in the exact order they were accepted), and the sequence <i>r<sub>0</sub>,r<sub>1</sub>,r<sub>2</sub> &hellip; r<sub>l-1</sub></i> , which will be the list of rules used to create the left derivation tree as described in the previous section.
 
 
<syntax type="?">
              Helper function sci_said_recursive: S \times V \times (V \cup \Sigma)* \to \Node
              Parameters: s \in S, Rule r \in V \times (V \cup \Sigma): v0 \to v1 v2 ... vi
              cnmr = cnr
              \Node n := s, v0
              FOR j := 1 TO i
                        IF (vj \in \Sigma) THEN
                                n := n, \gammacn\gamma
                                cn\gamma := cn\gamma + 1
                        ELSE
                                cnoldr := cnr
                                cnr := cnr + 1
                                n := n, sci_said_recursive(\sigmarmr,j, rcnoldr)
                        FI
              ROF
              RETURN (n)
 
 
              Helper function get_children: \Node \to \Node*
                              get_children((s, v, n0, n1 ... nm)) := n0, n1 ... nm
 
 
              Algorithm SCI-SAID-TREE
              cn\gamma := 0;
              cnr := 1;
              ntemp := ntemp, SCI-SAID-RECURSIVE(0, r0)
              root(T\Pi) := (141, 13f, get_children(ntemp))
</syntax>
 
Here is an example, based on the previous one:
 
====Example: Semantic tree example====
<i>
* k = 2
* &gamma;<sub>0</sub> = 842
* &gamma;<sub>1</sub> = 917
* l = 4
* r<sub>0</sub> = 13c &#8614; 13b 134
* &sigma;<sub>r<sub>0</sub>,1</sub> = 141
* &sigma;<sub>r<sub>0</sub>,2</sub> = 142
* r<sub>1</sub> = 13b &#8614; 131
* &sigma;<sub>1<sub>1</sub>,1</sub> = 141
* r<sub>2</sub> = 131 &#8614; C<sub>80</sub>
* r<sub>3</sub> = 134 &#8614; C<sub>10</sub>
</i>
 
The resulting tree would look like this:
<tt><pre>
(141 13f
&#09;(141 13b
&#09;&#09;(141 131 842)
&#09;)
&#09;(142 134 917)
)
</tt></pre>
 
 
==Notes==
<references/>
245

edits

Navigation menu