1,079
edits
(One intermediate revision by the same user not shown) | |||
Line 218: | Line 218: | ||
With the notable exception of the first rule, these rules constitute the set of production rules <math>P</math>. Furthermore we have <math>V := \{x \mid \exists R \in P : x \in R\}</math>; typically, <math>V = \{ \mathrm{0x12f}, \ldots, \mathrm{0x13f}\}</math> and <math>s = m_0</math> of the first rule encountered; in all games observed, it was set to 0x13c. <math>\Sigma</math> contains all word groups and class masks. For the sake of simplicity, we will consider rules matching composite class masks to be several rules. Here is a simplified example of what such a grammar might look like (the hexadecimal prefix '0x' is omitted for brevity): | With the notable exception of the first rule, these rules constitute the set of production rules <math>P</math>. Furthermore we have <math>V := \{x \mid \exists R \in P : x \in R\}</math>; typically, <math>V = \{ \mathrm{0x12f}, \ldots, \mathrm{0x13f}\}</math> and <math>s = m_0</math> of the first rule encountered; in all games observed, it was set to 0x13c. <math>\Sigma</math> contains all word groups and class masks. For the sake of simplicity, we will consider rules matching composite class masks to be several rules. Here is a simplified example of what such a grammar might look like (the hexadecimal prefix '0x' is omitted for brevity): | ||
In addition to this grammar, each right-hand non-terminal < | In addition to this grammar, each right-hand non-terminal <math>m_i</math> carries its semantic value <math>\rho_i</math> , which is not relevant for constructing a syntax tree, but must be considered for the semantic tree <math>T_\pi</math>. These values were omitted in the example above. As in the example above, the grammar is a context-free (type 2) grammar, almost in Chomsky Normal Form (CNF) in SCI; constructing a grammar with CNF rules from it would be trivial.<ref>FreeSCI constructs a GNF (Greibach Normal Form) representation from these rules for parsing.</ref> | ||
Obviously, G is an ambiguous grammar. In SCI, rule precedence is implied by rule order, so the resulting left derivation tree is well-defined (in the example, it would be defined by < | Obviously, <math>G</math> is an ambiguous grammar. In SCI, rule precedence is implied by rule order, so the resulting left derivation tree is well-defined (in the Parser example further down, it would be defined by <math>D_0</math>.<ref>In FreeSCI, you can use the ”parse” console command to retrieve all possible left derivation trees.</ref> | ||
<br> | <br> | ||
====Parser grammar example==== | ====Parser grammar example==== | ||
< | <math>G = \langle \{ \mathrm{12f}, \ldots, \mathrm{13e} \}, \{ C_1, C_2, C_4, \ldots, C_{100} \}, P, \mathrm{13c} \rangle</math> | ||
{| <i> | {| <i> | ||
|---- | |---- | ||
Line 290: | Line 290: | ||
===Semantics=== | ===Semantics=== | ||
This is important, since the parser does much more than just accept or discard input. Using the semantic tags applied to each non-terminal on the right-hand side of a rule, it constructs what I will call the semantic parse tree < | This is important, since the parser does much more than just accept or discard input. Using the semantic tags applied to each non-terminal on the right-hand side of a rule, it constructs what I will call the semantic parse tree <math>T_\pi</math> , which attempts to describe what the input means. For each non-terminal rule | ||
< | <math>r = v_0 \to v_1 v_2 \ldots v_n</math> | ||
there are semantic tags < | there are semantic tags <math>\sigma_{r,1}, \sigma_{r,2} \ldots \sigma_{r,n} \in S</math>, as explained above. <math>T_\pi</math> is now constructed from the resulting derivation and the semantic tags assiociated with each non-terminal of the rule used. The construction algorithm is explained below with <math>T_\pi</math> being constructed from nodes, which have the following recursive structure: | ||
< | <math> | ||
\newcommand{\Node}{\textrm{NODE}} | |||
\Node = \{\diamond\} \cup S \times V \times (\Node \cup \Gamma)*;</math> | |||
Where S is the set of possible semantic values, and V is the set of non-terminals as defined in the grammar. We will also use the sequence < | Where <math>S</math> is the set of possible semantic values, and <math>V</math> is the set of non-terminals as defined in the grammar. We will also use the sequence <math>\gamma_{0}, \gamma_{1}, \gamma_{2} \ldots \gamma_{k-1}</math> which will represent the word groups the input tokens belonged to (in the exact order they were accepted), and the sequence <math>r_{0}, r_{1}, r_{2} \ldots r_{l-1}</math>, which will be the list of rules used to create the left derivation tree as described in the previous section. | ||
<pre> | <pre> |
edits