I'm trying to parse some string input but I'm struggling to see the solution. This must be a well-known pattern; it's just one I don't encounter frequently.

Background: I have a short list of string keywords ("HEAD", "GET", "POST", "PUT"), each of which is followed by additional string data. The keyword clauses can repeat any number of times, in any order ("KEYWORD blah blah blah KEYWORD blah blah blah"). There are no termination characters or closing keywords as XML would have: a clause ends at the next occurrence of a keyword or at the end of the input. Sample:

    str: {HEAD stuff here GET more stuff here POST other stuff here GET even more stuff here PUT still more stuff here POST random stuff}

The output I'd like to achieve:

    results: [
        "HEAD" ["stuff here"] 
        "GET"  ["more stuff here" "even more stuff here"] 
        "POST" ["other stuff here" "random stuff"] 
        "PUT"  ["still more stuff here"]
    ]

My poor attempt at this is:

    results: ["head" [] "get" [] "post" [] "put" []]
    rule1: ["HEAD" (r: "head") | "GET" (r: "get") | "POST" (r: "post") | "PUT" (r: "put")]
    rule2: [to "HEAD" | to "GET" | to "POST" | to "PUT" | to end]

    parse/all str [
        some [
            start: rule1 rule2 ending: 
            (offs: offset? start ending 
            append select results r trim copy/part start offs
            ) :ending 
        | skip]
    ]

I know that rule2 is the clunker: chaining alternatives of the "to" operator is not the right way to think about this pattern. It skips to the next occurrence of the first keyword in that rule block that appears anywhere later in the input, when I want it to stop at the nearest occurrence of any of the keywords.
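
To make the problem concrete, here is a minimal sketch of that behavior. The input string `demo` and the word `mark` are made up for illustration; only the shape of `rule2` comes from my attempt above.

```rebol
demo: {POST one GET two HEAD three}
rule2: [to "HEAD" | to "GET" | to "POST" | to "PUT" | to end]

; After matching "POST", the nearest keyword is GET, but the first
; alternative (to "HEAD") already succeeds, so the GET clause is skipped.
parse/all demo ["POST" rule2 mark: to end]
print mark   ; prints: HEAD three
```

So the alternatives are tried in the order they are written, not by which keyword is closest in the input.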

Any tips would be appreciated.

draegtun
Edoc

3 Answers

How about this...

    ;; parse rules
    keyword: [{HEAD} | {GET} | {POST} | {PUT}]
    content: [not keyword skip]

    ;; prep results block... ["HEAD" [] "GET" [] "POST" [] "PUT" []]
    results: []
    forskip keyword 2 [append results reduce [keyword/1 make block! 0]]

    parse/case str [
        any [
            copy k keyword copy c some content (
                append results/:k trim c
            )
        ]
    ]

Using your str, results will then hold what you wanted:

    ["HEAD" ["stuff here"] "GET" ["more stuff here" "even more stuff here"] "POST" ["other stuff here" "random stuff"] "PUT" ["still more stuff here"]]

draegtun
Thanks -- and this goes to draegtun and sqlab -- I realized that an easy cheat would be for me to add my own terminator to the input, but I'm grateful to see the more parse-appropriate approaches. – Edoc Mar 15 '16 at 12:59

Maybe not as elegant, but this works even with Rebol 2:

    results: ["HEAD" [] "GET" [] "POST" [] "PUT" []]
    keyword: [{HEAD} | {GET} | {POST} | {PUT}]
    parse/case str [
        any [
           [copy k keyword c1: ] | [skip c2:] 
           [[keyword | end]  (
               append results/:k trim copy/part c1 c2
             ) :c2 |
           ] 
        ]
    ]

sqlab

Here is another variant.

    str: {HEAD stuff here GET more stuff here POST other stuff here GET even more stuff here PUT still more stuff here POST random stuff}
    results: ["HEAD" [] "GET" [] "POST" [] "PUT" []]
    possible-verbs: [ "HEAD" | "GET" | "POST" | "PUT" | end ]
    parse/all str [
        some [
            to possible-verbs
            verb-start: (verb: first split verb-start " ")
            possible-verbs
            copy text to possible-verbs
            (if not none? verb [ append results/:verb trim text ])
        ]
    ]
    probe results

Again, not perfect in terms of elegance and similar in approach.

johnk