Difference between revisions of "SCI/Specifications/SCI in action/Parser"

SCI/Specifications/SCI in action/Parser (view source)

Revision as of 13:27, 9 February 2010

48 bytes removed , 13:27, 9 February 2010

→‎The Parse tree: Fixing lots of mistakes and errors (partially introduced during transition from LaTeX to wiki, partially older)

Fingolfin

1,079

edits

@@ Line 154: / Line 154: @@
 After tokenizing, looking up, and finally aliasing the data found in the parsed input string, the interpreter proceeds to build a parse tree <!-- <math>T_\pi</math> --> <i>T</i><sub>&pi;</sub> from the input tokens
-<br><i>I :=</i> &omega;<sub>0</sub>, &omega;<sub>1</sub>,&omega;<sub>2</sub> &hellip; &omega;<sub>n-1</sub> <!-- Mediawiki LaTeX <math>I:=~\omega_0,\omega_1,\omega_2 \ldots \omega_{n-1}</math> --><br>
+:<i>I :=</i> &omega;<sub>0</sub>, &omega;<sub>1</sub>, &omega;<sub>2</sub> &hellip; &omega;<sub>n-1</sub> <!-- Mediawiki LaTeX <math>I:=~\omega_0,\omega_1,\omega_2 \ldots \omega_{n-1}</math> --><br>
 where
@@ Line 161: / Line 161: @@
 * <i>&gamma;<sub>j</sub> &isin; &Gamma;</i>
 * <i>&mu;<sub>j</sub> &isin; 2<sup>c</sup></i>
-* <i>&omega;<sub>j</sub> = (&gamma;<sub>j</sub>, &mu;<sub>j</sub></i>
+* <i>&omega;<sub>j</sub> = (&gamma;<sub>j</sub>, &mu;<sub>j</sub>)</i>
 <!-- Math formulas
 * <math>\omega_j \in W</math>
@@ Line 172: / Line 172: @@
 For the following sections, we define
-* group <i>W &rarr; &Gamma;.group : (&gamma;,&mu;) (&gamma;,&mu;) &#8614; &gamma;</i>
+* group: <i>W &rarr; &Gamma;  : (&gamma;,&mu;)  &#8614; &gamma;</i>
-* classes: <i>W &rarr; C</i>.classes: (&gamma;,&mu;) &#8614; &mu;
+* classes: <i>W &rarr; C : (&gamma;,&mu;) &#8614; &mu;</i>
-* <i>C<sub>x</sub> = {&omega;|&omega; &isin; class(&omega;)}</i>
+* <i>C<sub>x</sub> = {&omega; &isin; W | x &isin; classes(&omega;)}</i>
 To do that, it uses the class masks <i>M</i> as input for a pushdown automaton (PDA) A built from a parser grammar; if <i>M</i> was accepted by <i>A</i>, the parse tree <i>T</i><sub>&pi;</sub> will be built from the matching syntax tree to represent the semantics.
@@ Line 207: / Line 207: @@
 |0x14d
 |Terminal
-|Match on word group: Matches if <span style="white-space:nowrap"><i>m<sub>i</sub> &#61; group(&omega;<sub>j</sub>)</i></span>
+|Match on word group: Matches if <span style="white-space:nowrap"><i>m<sub>i</sub> = group(&omega;<sub>j</sub>)</i></span>
 |----
 |0x154
@@ Line 215: / Line 215: @@
 |}
-With the notable exception of the first rule, these rules constitute  <span style="white-space:nowrap"><i>P. V := {&chi;|&exist;R &isin;  P.&chi; &exist R}</i></span>; typically, <span style="white-space:nowrap"><i> V = {0&chi;12&fnof;&hellip;0&chi;13&fnof;}&sdot; s = m<sub>0</sub></i></span> of the first rule encountered; in all games observed, it was set to 0x13c. &Sigma; contains all word groups and class masks. For the sake of simplicity, we will consider rules matching composite class masks to be several rules. Here is a simplified example of what such a grammar might look like (the hexadecimal prefix '0x' is omitted for brevity):
+With the notable exception of the first rule, these rules constitute  <span style="white-space:nowrap"><i>P. V := {x | &exist;R &isin; P. x &isin; R}</i></span>; typically, <span style="white-space:nowrap"><i> V = {0x12f,&hellip;,0x13f}&sdot; s = m<sub>0</sub></i></span> of the first rule encountered; in all games observed, it was set to 0x13c. &Sigma; contains all word groups and class masks. For the sake of simplicity, we will consider rules matching composite class masks to be several rules. Here is a simplified example of what such a grammar might look like (the hexadecimal prefix '0x' is omitted for brevity):
 In addition to this grammar, each right-hand non-terminal <i>m<sub>i</sub></i> carries its semantic value <i>&rho;<sub>i</sub></i> , which is not relevant for constructing a syntax tree, but must be considered for the semantic tree <i>T</i><sub>&pi;</sub>. These values were omitted in the example above. As in the example above, the grammar is a context-free (type 2) grammar, almost in Chomsky Normal Form (CNF) in SCI; constructing a grammar with CNF rules from it would be trivial.<ref>FreeSCI constructs a GNF (Greibach Normal Form) representation from these rules for parsing.</ref>
@@ Line 223: / Line 223: @@
 ====Parser grammar example====
-<span style="white-space:nowrap">G = &lang;{12&fnof;&hellip;13e},{C<sub>1</sub>,C<sub>2</sub>,C<sub>4</sub>,&hellip;,C<sub>100</sub>},P,13c&rang;</span>
+<span style="white-space:nowrap">G = &lang;{12f,&hellip;,13e},{C<sub>1</sub>,C<sub>2</sub>,C<sub>4</sub>,&hellip;,C<sub>100</sub>},P,13c&rang;</span>
 {| <i>
 |----
@@ Line 295: / Line 295: @@
 there are semantic tags <i>&sigma;<sub>r,1</sub>,&sigma;<sub>r,2</sub>&hellip;&sigma;<sub>r,n</sub> &isin; S</i> , as explained above. <i>T</i><sub>&pi;</sub> is now constructed from the resulting derivation and the semantic tags assiociated with each non-terminal of the rule used. The construction algorithm is explained below with <i>T</i><sub>&pi;</sub> being constructed from nodes, which have the following structure:
-<i>NODE = {&#9671;} &cup; S &#215; V &#215; (NODE &cup; &Gamma;);*</i>
+<i>NODE = {&#9671;} &cup; S &#215; V &#215; (NODE &cup; &Gamma;)*;</i>
 Where S is the set of possible semantic values, and V is the set of non-terminals as defined in the grammar. We will also use the sequence <i>&gamma;<sub>0</sub>,&gamma<sub>1</sub>,&gamma<sub>2</sub> &hellip; &gamma<sub>k-1</sub></i> which will represent the word groups the input tokens belonged to (in the exact order they were accepted), and the sequence <i>r<sub>0</sub>,r<sub>1</sub>,r<sub>2</sub> &hellip; r<sub>l-1</sub></i> , which will be the list of rules used to create the left derivation tree as described in the previous section.