1

I have this string below:

"\n  - MyLibrary1 (from ‘repo_name’, branch ‘master’)\n  - AFNetworking (= 1.1.0)\n  - MyLibrary2 (from ‘repo_name’, branch ‘master’)\n  - Objective-C-HMTL-Parser (= 0.0.1)\n\n"

I want to extract the data from it and build JSON like this:

{
  "MyLibrary1": "master",
  "AFNetworking": "1.1.0",
  "MyLibrary2": "master",
  "Objective-C-HMTL-Parser": "0.0.1"
}

With the help of my previous post (Regex for huge string), I was able to get the data after '=' in the string.

I am working on modifying the same regex to get the word 'master'. With whatever I tried, in my match object I get first part as "MyLibrary1" and second part as "from ‘repo_name’, branch ‘master’".

Question: Can a regex contain a literal word? Can I add the word 'branch' to the pattern so that it captures only the word 'master' from the string?

Regex I tried: -\s*(.*?)\s*\(\s*(.*?)\s*\)

Rubular link - http://rubular.com/r/gPLIa0xqRC

tech_human
  • Sure, you can do this by modifying the regex. But after this modification the form will be pretty much set, so there is not much room for further additions. At that point you would need to capture the contents inside the parentheses and parse them separately. –  Sep 17 '14 at 23:48
  • Yeah, I was actually thinking to use 2 regex, one for getting the values after '=' and other for getting the values after 'from'. But with hwnd's answer below, looks like it can be achieved using one regex. But it's definitely true that my regex is overloaded and will start getting complex if I try to add anything more in it. – tech_human Sep 18 '14 at 12:32

2 Answers

2

Yes, a regex can contain a literal word. You can use the alternation operator inside a group so that the pattern matches either an equal sign, or any character except ) "zero or more" times followed by the word "branch".

-\s*(\S+)\s*\(\s*(?:=|[^)]*\bbranch)\s*(\S+)\s*\)

Rubular
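As a rough illustration of how the pattern above behaves on the string from the question, here is a minimal sketch. Python is an assumption (the thread only uses Rubular), and the names `text`, `pattern`, `raw` and `cleaned` are made up for the example. Because `(\S+)` takes every non-space character before the closing parenthesis, the branch name comes back with its quotes attached, so the sketch strips them in a separate step.

```python
import json
import re

# Input string from the question, with the curly quotes kept as posted.
text = (
    "\n  - MyLibrary1 (from ‘repo_name’, branch ‘master’)"
    "\n  - AFNetworking (= 1.1.0)"
    "\n  - MyLibrary2 (from ‘repo_name’, branch ‘master’)"
    "\n  - Objective-C-HMTL-Parser (= 0.0.1)\n\n"
)

# The pattern from this answer, unchanged.
pattern = re.compile(r"-\s*(\S+)\s*\(\s*(?:=|[^)]*\bbranch)\s*(\S+)\s*\)")

# findall returns one (name, value) tuple per matched line.
raw = dict(pattern.findall(text))

# Strip curly or straight quotes from the captured values.
cleaned = {name: value.strip("‘’'\"") for name, value in raw.items()}

print(json.dumps(cleaned, indent=2))
```

Stripping the quotes after matching keeps the regex itself unchanged; folding the quotes into the pattern is the other option.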

hwnd
  • +1 I like this part `[^)]*\bbranch` which searches from `)` backwards. –  Sep 18 '14 at 00:27
  • @sln Where have you been hiding? – hwnd Sep 18 '14 at 00:28
  • Been doing a software release. –  Sep 18 '14 at 00:31
  • Thanks for the reply and also for posting the Regular expressions site link. I just started with regex and am looking to learn it in depth this weekend. For now Rubular rocks! Your reply does return what I need, except that the word 'master' has quotes around it. I modified it to obtain just the word master, but I am still getting the closing quote. Am I doing something wrong here? Rubular link - http://rubular.com/r/4xKhief3hX – tech_human Sep 18 '14 at 12:49
  • Also, since the regex is getting bigger and more complex, do you think I should break it down into two, or are regexes this big and complex normal? Just trying to learn. :) – tech_human Sep 18 '14 at 12:50
  • Okay I was able to get it. Rubular link - http://rubular.com/r/tNnmBJZt5i Is the regex correct? – tech_human Sep 18 '14 at 13:32
0

I think you want:

-\s*(.*?)\s*\(\s*.*branch ‘(.*?)’\s*\)

Be careful with the literal quotes, though, because they look like auto-formatted (curly) ones. Your real input may have plain single quotes (') around master. This is a rubular example.
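One way to hedge against the quote issue is to accept either quote style with a character class. The sketch below assumes Python, and the variant pattern `branch_pattern` is an illustration rather than the exact regex from this answer. Note that a pattern of this shape matches only the "branch" lines, so the `= version` entries still need to be handled separately.

```python
import re

# Input string from the question, with the curly quotes kept as posted.
text = (
    "\n  - MyLibrary1 (from ‘repo_name’, branch ‘master’)"
    "\n  - AFNetworking (= 1.1.0)"
    "\n  - MyLibrary2 (from ‘repo_name’, branch ‘master’)"
    "\n  - Objective-C-HMTL-Parser (= 0.0.1)\n\n"
)

# Same shape as the regex above, but the literal quotes are replaced by
# character classes that accept a curly or a straight single quote.
branch_pattern = re.compile(r"-\s*(.*?)\s*\(\s*.*branch [‘'](.*?)[’']\s*\)")

branches = dict(branch_pattern.findall(text))
print(branches)

# The same pattern also works if the real input uses straight quotes.
straight = text.replace("‘", "'").replace("’", "'")
branches_straight = dict(branch_pattern.findall(straight))
print(branches_straight)
```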

zerodiff