Is backreference available in Parslet?

Question

Is there a way to backreference a previously matched string in Parslet, similar to the \1 functionality in typical regular expressions?

I want to extract the characters within a block such as:

Marker SomeName
 some random text, numbers123
 and symbols !#%
SomeName

in which "Marker" is a known string but "SomeName" is not known a priori, so I believe I need something like:

rule(:name) { ( match('\w') >> match('\w\d') ).repeat(1) } 
rule(:text_within_the_block) {
 str('Marker') >>  name >> any.repeat.as(:text_block) >> backreference_to_name 
}

What I don't know is how to write the backreference_to_name rule using Parslet and/or Ruby language.
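
For comparison, the \1 behaviour referred to above can be sketched in plain Ruby, without Parslet. This is only an illustration of the regular-expression side: the helper name `extract_block` and the exact pattern are assumptions made for this sketch, not part of the question's code, and it assumes the closing name sits alone on its own line.

```ruby
# Hypothetical helper (not from the question): returns [name, body] or nil.
def extract_block(text)
  # (\w+) captures the unknown name; \1 requires the same text again,
  # alone on a later line. In Ruby, /m lets "." match newlines.
  match = text.match(/^Marker (\w+)\n(.*?)^\1$/m)
  match && [match[1], match[2]]
end

sample = "Marker SomeName\n some random text, numbers123\n and symbols !#%\nSomeName\n"
name, body = extract_block(sample)
puts name
puts body
```

The lazy `(.*?)` stops at the first line that repeats the captured name, which is the behaviour the Parslet rule would need to reproduce.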

Why do you need `backreference_to_name`? Can't you just have another `name`? — sawa, Jan 09 '14 at 14:18
The `rule(:name)` matches any string with numbers such as "Name123" or "Something456". The issue here is that I want to match the **same** string a second time, since that is what defines the end of the text block in my case. I don't know the string _a priori_, so my idea was to use `rule(:name)` to match any possible string in the first place and then (with some command I still don't know) backreference the string that was previously matched. — zml, Jan 09 '14 at 15:33
I see. But that is a typical parenthetical situation. You should look into how balanced parentheses are to be handled in that system. I guess it is probably done within syntax, not by lexical parsing. — sawa, Jan 09 '14 at 15:44

Nigel Thorne · Accepted Answer · 2014-04-18T11:37:46.743

From http://kschiess.github.io/parslet/parser.html

Capturing input

Sometimes a parser needs to match against something that was already matched against. Think about Ruby heredocs for example:
  str = <<-HERE
    This is part of the heredoc.
  HERE
The key to matching this kind of document is to capture part of the input first and then construct the rest of the parser based on the captured part. This is what it looks like in its simplest form:
  match['ab'].capture(:capt) >>               # create the capture
    dynamic { |s,c| str(c.captures[:capt]) }  # and match using the capture

The key here is that the dynamic block returns a lazy parser. It is only evaluated at the point where it is used, and it is passed its current context, so it can reference the captures available at the point of execution.

-- Updated : To add a worked example --

So for your example:

require 'parslet'
require 'parslet/convenience'

class Mini < Parslet::Parser
  # A name is a letter followed by any number of word characters.
  rule(:name) { match("[a-zA-Z]") >> match('\\w').repeat }

  rule(:text_within_the_block) {
    str('Marker ') >>
    name.capture(:namez).as(:name) >>        # remember the name as :namez
    str(" ") >>
    dynamic { |_, scope|
      # consume any character that does not start the captured name
      (str(scope.captures[:namez]).absent? >> any).repeat
    }.as(:text_block) >>
    dynamic { |_, scope| str(scope.captures[:namez]) }  # the closing name
  }

  root(:text_within_the_block)
end

puts Mini.new.parse_with_debug("Marker BOB some text BOB").inspect
 #=> {:name=>"BOB"@7, :text_block=>"some text "@11}

This required a couple of changes.

I changed rule(:name) to match a single word and added a str(" ") to detect that word had ended. (Note: \w is short for [A-Za-z0-9_] so it includes digits)
I changed the "any" match to be conditional on the upcoming text not being the :name text. (Otherwise any.repeat is greedy: it consumes the closing 'BOB' as well, and the final match of the name then fails.)

The documentation for Parslet is really, really good. It's very short, but deceptively packed with detail. I've repeatedly hit problems when learning Parslet and eventually solved it through digging into the source etc., only to find there is a passage in the documentation that covers exactly the case I was looking for. Also... you get faster replies if you email the mailing list. — Nigel Thorne, Feb 11 '14 at 02:31

score 0 · Answer 2 · answered Jan 10 '14 at 10:07

I don't exactly want to support stackoverflow, but as you seem to be a parslet user, here goes: Try asking on the mailing list for a real nice answer. (http://dir.gmane.org/gmane.comp.lang.ruby.parslet)

What you call a back-reference here is called a 'capture' in parslet. Please see the example 'capture.rb' in parslet's source tree.
