
As far as I know (from what I have read) about generating LR parsing tables, the columns (= tokens) in which a reduce action is written for a given state depend on the terminals that are in the FOLLOW set of the reduced symbol.

Is that correct?*

If so, the next question that comes to mind is: what determines the next state to transition into after a reduction?

E.g., in state 5 an r6 means to reduce the symbol and then transition into state 6 (and there perhaps consult the goto table to see which state to transition into next).

The states of the parse table, or of the DFA that an LR parser is, are paths in a graph representation. As a bottom-up parser, an LR parser works by finding a path back to the start symbol / accepting state. The parse table tries to find every such path.

And it seems complicated to me to consistently choose the right reduction states.

Because that depends intimately on the states / ItemSets that preceded the current state, as well as on the states that potentially succeed the current state depending on the yet unread input tokens.

In comparison with reduce actions, shift and goto actions seem easy, as they are just the transitions that appear when moving the dot position.

Thanks

PS: *If it is correct, then isn't it "double trouble" to generate LR1Items with the FOLLOW-Set as additional feature?

von spotz
  • Haven't I already answered this? The state used for the transition is the one revealed by popping the production's right-hand side off of the stack. (The stack is precisely how the parser keeps track of the parse history.) – rici Nov 28 '20 at 17:51
  • Do you employ a value stack? The example configuration diagrams only use states on the stack like in the parsing-table. Other two possibilities: 1) Making use of a combination of both. 2) Using a value stack when creating the parsing table. Yours sincerely PS: Would you answer a question in our chat? – von spotz Nov 28 '20 at 20:30
  • Values can be placed on the stack or a parallel stack but that has little to do with parsing. My comment referred to states. Because the current state is pushed onto the stack every time a transition is taken (by a shift or a goto), a rhs symbol naturally corresponds to a state (the state just before the symbol was shifted/goto'd). And that's the state popped when the rhs is reduced. – rici Nov 28 '20 at 21:00
  • Popping the states corresponding to the rhs symbols in a reduction backs the state machine up to where it was before it shifted/goto'd the first symbol on the right hand side. In other words, the state in which it can advance over the reduced left-hand side. – rici Nov 28 '20 at 21:01
  • Yes. So no value stack needed. – von spotz Nov 28 '20 at 21:08
  • Once you get the similarity between shift and goto, the lr parser algo becomes much clearer. States with the dot before a non-terminal have epsilon-transitions to the state which will eventually recognise that non-terminal. Once the non-terminal is recognised, it's reduced and the reduction puts the machine back into the state which took the epsilon transition. But now, instead of taking the epsilon transition, the machine "shifts" the non-terminal using the goto action. Which is just like shifting a terminal. So it moves to a state with the dot just after the non-terminal. – rici Nov 28 '20 at 21:35
  • Hello rici. The first comment however doesn't seem to me to answer my question on how to choose the next state when reducing **when generating the parse table.** Let's stick with the Wikipedia example (https://en.wikipedia.org/wiki/LR_parser#Parse_table_for_the_example_grammar) After having reduced `int` or `id` to `Value` in states 8/9, one could further reduce `Value` to `Products` in state 7 `Products -> Value`, **but** the parsing table chooses otherwise. Namely to reduce ... (p. 1/2) – von spotz Nov 29 '20 at 17:52
  • ... `Value` to `Products` in state 6 (`Products -> Products * Value`). Why so? How so? You write "The state used for the transition is the one revealed by popping the production's right-hand side off of the stack." **as if there was only one item** where a further reduction is possible, as the wiki example demonstrates. Further, there's the question of what state to choose **after** the reduction to the lhs has happened! In state 6 of the example there is no goto-instruction! Thanks and best wishes! (p. 2/2) – von spotz Nov 29 '20 at 17:52
  • I insist, `r6` does not have anything to do with state 6. It means reduce using **production 6**. The state used is the one revealed on the stack (since there are several possibilities, you cannot know without running the parse on a particular input). – rici Nov 29 '20 at 19:34
  • As to how to figure out the transition table during parser construction: there is no difference between shift and goto. You figure out the next state after the shift of `a` by taking just the items where the dot is just before an `a` and shifting the dot one symbol to the right. You figure out the next state after GOTO(`A`) by taking just the items where the dot is just before an `A` and shifting the dot one symbol to the right. – rici Nov 29 '20 at 19:38
  • If you understand how a predictive (LL) parser works, you can see the prediction action in the GOTO (if you squint a bit). The difference is that the parser didn't know in advance which prediction to make. But once it figures it out (i.e. reaching a reduction), it pops the stack just like a recursive descent parser would, by returning to where the prediction started. At that point, the parsing of the reduced non-terminal is done, and the parser can continue with the next symbol. – rici Nov 29 '20 at 19:44
  • In summary, when the parser does a reduction: the static table cannot tell it which state to use: it has to look back in the stack to figure that out. But the static table can tell it which of the possible productions has been identified. So that's the number in the table entry. – rici Nov 29 '20 at 19:46
  • I think I know how to employ the parsing table in a configuration diagram. You are in state/itemset n; in the row for state n, the cell for the column with token *'+'* contains rm, which means that you have to reduce the item in the current state/itemset where the dot is in right-most position to the lhs. Then you transition to state/itemset m and check if there is a goto instruction for the reduced lhs NonTerminal. ... (p. 1/2) – von spotz Nov 29 '20 at 20:02
  • ... If so, you transition to the state instructed by the goto-entry. If in that state there is a shift for the next input token, then you shift and transition to the state named in that shift entry. But that was not my question??? (p. 2/2) – von spotz Nov 29 '20 at 20:05
  • But you don't transition to state m. – rici Nov 29 '20 at 20:18
  • You see how many symbols are on the right-hand side of **production m**, pop that many states off the stack, and look up the non-terminal (of production m) in the goto table of the state which is now on top of the stack. – rici Nov 29 '20 at 20:21
  • There is guaranteed to be an entry in the goto table for that non-terminal in that state. If that weren't the case, you could never have gotten to the state with a reduce action. – rici Nov 29 '20 at 20:23
  • Both in the Wikipedia example and in the dragon book there is hardly a goto entry in every row where there is a reduce-action. I would post the table from the dragon book as proof if I knew this was legal. The related configuration diagram shows how the parser jumps into the state that is numbered in the reduce action. r5? $05; in state 5: no goto, only r6; in state 6: goto 3, $03; in state 3: no goto, only r4; and so on. What are you talking about, rici? – von spotz Nov 30 '20 at 05:32
  • It's not the state which reduces. It's not the state whose number is in the reduce action. It's the state whose number is in the stack underneath the states popped by the reduction. The only states which have goto actions are the ones in which some item has the dot before a non-terminal (and that's the non-terminal for which there is a goto action). That's almost exactly the same as a shift action: a state can only have a shift action for a terminal if it contains an item with a dot before that terminal. – rici Nov 30 '20 at 05:54
  • The only difference is the possibility of a conflict. Goto actions don't have conflicts. – rici Nov 30 '20 at 05:56
  • Hello rici, I don't understand what you mean by "It's not the state which reduces. It's not the state whose number is in the reduce action. It's the state whose number is in the stack underneath the states popped by the reduction." But of course you pop the current state on top of the stack. And then you transition to the state which is coded after the "r", like "r6": you reduce by popping the top of the stack, which is the item `Foo -> Bar baz`, and then you push state 6 on the stack, or you push the state number that is in the goto in state 6 for the NonTerminal "Foo", which may be 2. – von spotz Dec 01 '20 at 17:12
  • the rest of your last two posts is obvious. cheers. – von spotz Dec 01 '20 at 17:12
  • The 6 in r6 is not a state number. – rici Dec 01 '20 at 17:22
  • First: Thanks for your patience with me. Second: Thanks for saying it so clearly. But then you really puzzle me: What is it? Because the conjecture would fit all configuration/parsing diagrams I know. And how is it then that you know what state to transition into next after a reduction? Because, as ascertained: there are not goto-entries in every row/item where there is a dot in a right-most position. Yours sincerely! – von spotz Dec 06 '20 at 00:15
  • Popping the stack automatically does a state transition, because the stack is a stack of states. Rather than think of a state machine and a stack, think of a state machine which can also move backwards through its history. The stack just implements this feature. When the state machine shifts a token or goto's a non-terminal, the transition is recorded so that it can later be reversed. When the machine reaches a reduce action, it starts moving backwards. Since it already moved forward over each symbol in the rhs, it needs to move backwards the same number of steps... – rici Dec 06 '20 at 01:19
  • ...once it does that, it will return to the state in which it shifted the first rhs of the production. That state must have a goto action for the lhs of the production, because the transition which was undone to get there was added to the state as an ε-closure of the lhs non-terminal. In short, to do a reduction the machine needs to know which production is being reduced: to know how far to backup and to know which non-terminal to goto. But it doesn't need to be told a state. That's already in its history. – rici Dec 06 '20 at 01:29
  • I have understood now what you mean by popping off as many states as there are symbols on the rhs. BUT: What do you mean by "But you don't transition to state m."??? What else than "reduce and then transition to state m" does rm, e.g. r6, mean? Best wishes my friend! – von spotz Dec 06 '20 at 18:16
  • The last post before mine is also clear. What is unclear is how to construct reduce-actions that take into consideration the history and the potential "hýsteron" in the path from `[S eof ]` . – von spotz Dec 06 '20 at 19:59
  • `r6` means "reduce using production 6". That is, pop as many states off the stack as there are symbols on the right hand side of production 6, and then look up the left hand side of production 6 in the goto table of whatever state is now at the top of the stack to find the new state to transition to. You have noticed that the productions are numbered, right? – rici Dec 06 '20 at 21:09
  • The history is on the stack because every time you shift or goto, you push the new state onto the stack. So no heavy lifting is needed to reduce. Just a few pops. That is the entire purpose of the stack. – rici Dec 06 '20 at 21:11
  • Here we can find the example from the dragon book: https://cs.nyu.edu/~gottlieb/courses/2000s/2008-09-fall/compilers/lectures/lecture-07.html In the configuration diagram it is obvious that it is state5 that is first reduced `F -> id` in step 1 (step index 0) `stack: $05 | input: id | rest of input: *id+id$ | action: reduce by F→id` still in state 5, and in step 2 the parser has transitioned to state 6, has looked up in the goto table there and pushed state 3 on the stack. – von spotz Dec 07 '20 at 05:57
  • Just a coincidence. The reduce action reduces production 6 (F->id) whose rhs has a single symbol. So one state is popped off the stack, and the machine returns to state 0. Now it looks up F in state 0's goto table, and follows that transition. It should be clear why it returns to state 0. It was in state 0 when it started on production 6 (an F) by shifting an id. When it returns to state 0, it has finished the F so it can now leave the state with a goto action instead of a shift action. – rici Dec 07 '20 at 06:13
  • Holy sh*t I'm sorry for being so blind. Must've gotten on your nerves pretty much. I'm sorry, but look at the example configuration diagram. It absolutely fits my conjecture as well. You come to the same states. Only that with your interpretation it's easy to make ends meet, that is, generate a parsing table, and still get the transitions right. (p. 1/2) – von spotz Dec 07 '20 at 12:57
  • Now there is nothing mysterious or arcane anymore about "the choice of the state to transition to after we reduced the item in the state in which we are with the dot in right-most position, such that it produces a part of a bottom up rightmost derivation to the accept state `S -> $ `" haha – von spotz Dec 07 '20 at 13:04
  • Can we proceed as usual? You formulate your answer like "The number that accompanies the "r" for "reduce" is the ordinal number of a production of the grammar. CAVE: The number is not the ordinal number of a state of the machine, even though circumstances can produce a coincidental semantic isomorphism, which we must be careful of." Then I accept this as the "accepted answer"? – von spotz Dec 07 '20 at 13:07
  • Ok, I'll make an answer out of some of these comments. – rici Dec 07 '20 at 13:12
  • Could you, in chat maybe, if you are still patient with me, explain to me what role the FIRST and the FOLLOW sets play in the generation of the parsing table? I conjecture that a reduce action is placed in a field where the column is an input symbol taken from the alphabet which is in the combined FOLLOW sets of all lhs-NonTerminals of the state. Yours sincerely, von Spotz – von spotz Dec 07 '20 at 13:13
  • @rici You forgot(?) to formulate the answer. This answer could be useful to many. – Sourav Kannantha B Jun 16 '23 at 12:12
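
The reduce step that rici describes in the comments above can be sketched in Python. This is only an illustrative sketch, not rici's own code: the tables are hand-copied from the Dragon Book expression grammar linked in the comments, and the names `PRODUCTIONS`, `ACTION`, `GOTO` and `parse` are made up here. The point it tries to show is that the number in `r6` selects a production, while the state used for the goto lookup is whatever the pops expose on the stack.

```python
# Productions numbered as in the Dragon Book example:
# number -> (left-hand side, number of right-hand-side symbols).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# ACTION[state][terminal]: "sN" = shift and go to state N,
# "rN" = reduce by production N (N is NOT a state), "acc" = accept.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# GOTO[state][non-terminal]: only states with a dot before a non-terminal have rows.
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def parse(tokens):
    """Run the LR driver; return (production, exposed state, new state) per reduction."""
    stack = [0]                      # a stack of states only, no values needed
    tokens = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":
            stack.append(number)     # shift: here the number IS a state
            pos += 1
        else:
            lhs, rhs_len = PRODUCTIONS[number]   # reduce: the number is a production
            if rhs_len:
                del stack[-rhs_len:]             # pop one state per rhs symbol
            exposed = stack[-1]                  # the state the pops reveal
            target = GOTO[exposed][lhs]          # "shift" the non-terminal from there
            stack.append(target)
            reductions.append((number, exposed, target))
```

For the input `id * id + id`, the first entry returned is `(6, 0, 3)`: production 6 is reduced, the pop exposes state 0, and state 0's goto entry for `F` leads to state 3. State 6 is never consulted at that point, which is the coincidence discussed in the comments.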

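
On the FOLLOW-set question from the post, which the comments leave open: in SLR(1) construction, a state that contains a complete item A → α· gets the reduce action for that production in exactly the columns of the terminals in FOLLOW(A). Canonical LR(1) and LALR(1) tables instead use the lookahead stored with each item, which is always a subset of FOLLOW(A), so the two mechanisms are alternatives rather than something to stack on top of each other. The following is a minimal sketch of the SLR rule only, assuming a grammar without ε-productions (the Dragon Book expression grammar); `GRAMMAR`, `first_sets`, `follow_sets` and `slr_reduce_columns` are names invented for this illustration.

```python
# Dragon Book expression grammar: number -> (lhs, rhs symbols).
# Assumption: no production has an empty right-hand side.
GRAMMAR = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}
NONTERMINALS = {lhs for lhs, _ in GRAMMAR.values()}


def first_sets():
    """FIRST for each non-terminal, by fixed-point iteration (no epsilon handling)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR.values():
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(start="E"):
    """FOLLOW for each non-terminal; '$' marks the end of input."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR.values():
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):
                    nxt = rhs[i + 1]      # something follows sym in this rhs
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    new = follow[lhs]     # sym ends the rhs: inherit FOLLOW(lhs)
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def slr_reduce_columns(production_number):
    """Columns in which SLR(1) writes 'r<production_number>' for a complete item."""
    lhs, _ = GRAMMAR[production_number]
    return follow_sets()[lhs]
```

With this grammar, `slr_reduce_columns(6)` gives the columns `+`, `*`, `)` and `$`, which are the cells holding `r6` in the row of the state containing `F -> id ·`. Which state follows the reduction is not decided here at all; that comes from the stack at parse time.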