I'm learning Racket, and as an exercise I'm rewriting a Python filter in it. I have the following pair of functions in my Python code:

import re

def dlv(text):
    """
    Returns True if the given text corresponds to the output of DLV
    and False otherwise.
    """
    return text.startswith("DLV") or \
           text.startswith("{") or \
           text.startswith("Best model")

def answer_sets(text):
    """
    Returns a list comprised of all of the answer sets in the given text.
    """
    if dlv(text):
        # In the case where we are processing the output of DLV, each
        # answer set is a comma-delimited sequence of literals enclosed
        # in {}
        regex = re.compile(r'\{(.*?)\}', re.MULTILINE)
    else:
        # Otherwise we assume that the answer sets were generated by
        # one of the Potassco solvers. In this case, each answer set
        # is presented as a comma-delimited sequence of literals,
        # terminated by a period, and prefixed by a string of the form
        # "Answer: #" where "#" denotes the number of the answer set.
        regex = re.compile(r'Answer: \d+\n(.*)', re.MULTILINE)
    return regex.findall(text)
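
To make the target behavior concrete, here is a minimal sketch of what `findall` returns for each of the two patterns. The sample strings are invented for illustration and are not real solver output; the point is only that, with a single capture group, `findall` returns the group contents rather than the whole match.

```python
import re

# Hypothetical sample inputs, made up for illustration only.
dlv_output = "DLV\n{a, b}\n{c}\n"
clasp_output = "Answer: 1\na b\nAnswer: 2\nc\nSATISFIABLE\n"

# With exactly one capture group, findall returns the text of that group,
# so the surrounding braces and the "Answer: N" prefix are not included.
dlv_sets = re.findall(r'\{(.*?)\}', dlv_output)
clasp_sets = re.findall(r'Answer: \d+\n(.*)', clasp_output)

print(dlv_sets)
print(clasp_sets)
```

This is the behavior the Racket version needs to reproduce: a list of the captured substrings only.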

From what I can tell the implementation of the first function in Racket would be something along the following lines:

(define (dlv-input? text)
    (regexp-match? #rx"^DLV|^{|^Best model" text))

This appears to work correctly. For the second function, I have the following so far:

(define (answer-sets text)
    (cond
        [(dlv-input? text) (regexp-match* #rx"{(.*?)}" text)]))

This is not correct: regexp-match* returns a list of the strings that match the whole regular expression, curly braces included, whereas the Python version returns only the contents of the capture group. Does anyone know how to get the same behavior as in the Python implementation? Also, any suggestions on how to make the regular expressions "better" would be much appreciated.

C. K. Young
ggelfond
  • 1
    May we ask the cause for migration? – Jon Clements Nov 28 '12 at 18:09
  • 1
    I wrote the original filter to learn Python, and am taking a similar approach to learning Racket. So it's just a personal "because I'd like to" reason. – ggelfond Nov 28 '12 at 18:17
  • 2
    Just to note: in order to match behavior precisely, the Python version would have needed to lift up the compiled regular expression values. Python may cache the result of `re.compile` for simple inputs, but you may not want to depend on this behavior for performance-critical code. – dyoo Nov 29 '12 at 00:04
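
One way to read the last comment is that the compiled patterns could be lifted to module level so they are built once rather than on every call. The following is only a sketch of that idea; the constant names `DLV_SET` and `POTASSCO_SET` are hypothetical and do not come from the original filter.

```python
import re

# Hypothetical module-level names: each pattern is compiled once at import time.
DLV_SET = re.compile(r'\{(.*?)\}')
POTASSCO_SET = re.compile(r'Answer: \d+\n(.*)')


def dlv(text):
    """Return True if the text looks like DLV output, False otherwise."""
    # str.startswith accepts a tuple of prefixes.
    return text.startswith(("DLV", "{", "Best model"))


def answer_sets(text):
    """Return a list of all answer sets found in the given text."""
    regex = DLV_SET if dlv(text) else POTASSCO_SET
    return regex.findall(text)
```

The behavior is intended to match the version in the question; only the place where the patterns are compiled changes.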

1 Answer

You are very close. You simply need to add #:match-select cadr to your regexp-match* call:

(regexp-match* #rx"{(.*?)}" text #:match-select cadr)

By default, #:match-select is car, which returns the whole matched string. cadr selects the first group, caddr selects the second group, etc. See the regexp-match* documentation for more details.
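
For comparison with the Python original, the selectors line up roughly with group indices on a Python match object: car plays the role of group 0 (the whole match) and cadr the role of group 1. A small sketch of that correspondence, using a made-up sample string:

```python
import re

text = "{a, b} {c}"  # invented sample input, for illustration only
pattern = re.compile(r'\{(.*?)\}')

# group(0) is the whole match, comparable to selecting with car.
whole = [m.group(0) for m in pattern.finditer(text)]

# group(1) is the first capture group, comparable to selecting with cadr.
inner = [m.group(1) for m in pattern.finditer(text)]

print(whole)
print(inner)
```

The second list is what `findall` returns in the original filter, which is why cadr is the selector that matches the Python behavior.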

C. K. Young