I have a HOCON configuration that was generated from a JSON file. I need to parse the following HOCON and extract values from it.

Sample HOCON file (sample.json):

    nodes=[
    {
        host=myhostname
        name=myhostname
        ports {
            # debug port
            debug=9384
            # http Port on which app running
            http=9380
            # https Port on which app running
            https=9381
            # JMX port
            jmx=9383
        }
        type=app
        vm-args=[
            "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram",
            "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC ",
            "-XX:+UseTLAB -XX:CMSInitiatingOccupancyFraction=80 -XX:+ExplicitGCInvokesConcurrent -verbose:gc",
            "-XX:SurvivorRatio=8 -XX:+UseNUMA -XX:TargetSurvivorRatio=80 -XX:MaxTenuringThreshold=15",
            "-Xmx3200m -Xms3200m -XX:NewSize=1664m -XX:MaxNewSize=1664m -Xss1024k",
            "-server"
        ]
    }
]
profile=java-dev
resources {
cfg-repository {
    branch-name=master
    commit-id=HEAD
    password=sigma123
    url="http://localhost:9890/gitcontainer/demo-cfg"
    username=sadmin
}
databases=[
    {
        connection-string="oracle03:1522:si12c"
        name=cm
        password=coresmp601
        username=coresmp601cm
    },
    {
        connection-string="oracle03:1522:si12c"
        name=am
        password=coresmp601
        username=coresmp601am
    }
]
idp {
    url="https://sohanb:8097/idp"
}
keystores=[
    {
        file-location="/home/smp/runtime/ssl"
        name=identity
        passphrase=kspass
    }
]
admin {
    password=sigma123
    url="http://punws-sohanb.net:9002/"
    username=sadmin
}
}

From this HOCON file I want to extract the vm-args values. I have tried various bash tools and sed/awk commands, but with no luck.

Please suggest!

dawg
Sohan

2 Answers

awk to the rescue!

 $ awk 'p&&$0~/"/{gsub("\"","");print} /vm-args/{p=1} ' hoconfile

            -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram,
            -XX:+UseConcMarkSweepGC -XX:+UseParNewGC ,
            -XX:+UseTLAB -XX:CMSInitiatingOccupancyFraction=80 -XX:+ExplicitGCInvokesConcurrent -verbose:gc,
            -XX:SurvivorRatio=8 -XX:+UseNUMA -XX:TargetSurvivorRatio=80 -XX:MaxTenuringThreshold=15,
            -Xmx3200m -Xms3200m -XX:NewSize=1664m -XX:MaxNewSize=1664m -Xss1024k,
            -server

From there you can format the output as desired.

UPDATE: based on the updated input file, printing has to be terminated by additional logic. Add /]/{p=0} between the two blocks, as in:

$ awk 'p&&$0~/"/{gsub("\"","");print} /]/{p=0} /vm-args/{p=1}' file

You can pipe the output to tr -d ',' | tr -s ' ' to remove commas and squeeze spaces, or do the same in the awk script.

Explanation: a line matching "vm-args" sets the flag (p=1). While the flag is set, every line that contains a double quote is printed. A line containing a closing square bracket (]) clears the flag (p=0), so printing stops unless a later line matches "vm-args" again.

UPDATE: I changed the code slightly. It now concatenates the lines into one and only starts looking for vm-args after it has seen name=myhostname; the extra characters are trimmed with tr and sed.
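The cleanup can also be done inside awk itself rather than with tr. The following is a minimal sketch, not the answer's exact command: the function name `vm_args_oneline` is hypothetical, and it assumes each array element sits on its own line, as in the sample file.

```shell
# Hypothetical helper: print the first vm-args array as one space-separated line.
# Usage: vm_args_oneline FILE
vm_args_oneline() {
    awk '
        p && /"/ {
            gsub(/[",]/, "")                     # drop quotes and commas
            gsub(/^[ \t]+|[ \t]+$/, "")          # trim leading and trailing blanks
            out = (out == "" ? $0 : out " " $0)  # join elements with single spaces
            next
        }
        p && /]/  { print out; exit }            # closing bracket: emit and stop
        /vm-args/ { p = 1 }                      # start collecting after this line
    ' "$1"
}
```

Calling `vm_args_oneline hoconfile` prints the arguments on a single line, ready to pass to another script.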

$ awk 'p && $0~/"/ {args=args $0 FS} 
       p && $0~/]/ {print args; exit} 
 /name=myhostname/ {h=1} 
    h && /vm-args/ {p=1}' file | 
 tr -d '",' | 
 tr -s ' ' | 
 sed 's/^ //'

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseTLAB -XX:CMSInitiatingOccupancyFraction=80 -XX:+ExplicitGCInvokesConcurrent -verbose:gc -XX:SurvivorRatio=8 -XX:+UseNUMA -XX:TargetSurvivorRatio=80 -XX:MaxTenuringThreshold=15 -Xmx3200m -Xms3200m -XX:NewSize=1664m -XX:MaxNewSize=1664m -Xss1024k -server
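If the node name should not be hard-coded, it can be passed in with `awk -v`. This is only a sketch under a few assumptions: the function name `vm_args_for_node` is hypothetical, the name is written unquoted (`name=myhostname`) as in the sample, and every node lists its vm-args after its name.

```shell
# Hypothetical helper: print the vm-args of the node with the given name.
# Usage: vm_args_for_node FILE NODE_NAME
vm_args_for_node() {
    awk -v want="name=$2" '
        { line = $0; gsub(/^[ \t]+|[ \t]+$/, "", line) }   # trimmed copy of the line
        p && line ~ /^"/       { args = args line " "; next }  # quoted element: collect it
        p && line ~ /^]/       { print args; exit }            # end of the array: print, stop
        line == want           { h = 1 }                       # exact match on name=...
        h && line ~ /^vm-args/ { p = 1 }                       # array of the wanted node starts
    ' "$1" | tr -d '",' | tr -s ' ' | sed 's/ $//'
}
```

For example, `vm_args_for_node file myhostname`. Comparing the trimmed line with `==` instead of a regex keeps a node called `myhostname2` from matching by accident.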
karakfa
  • Looks good now, the command is working fine. The last thing I am missing is to replace the comma-separated values with whitespace, as I have to pass this as vmargs to another script. – Sohan Mar 07 '16 at 15:34
  • I made that work with a simple sed command. Works fine, thanks! It would be great if you could explain the parts of your awk command in the answer. – Sohan Mar 07 '16 at 15:43
  • My nodes is a kind of array. Can I read the specific vm-args based on `name=myhostname`? So in nodes, if `name=myhostname`, then I get only that vm-args. – Sohan Mar 07 '16 at 16:03
  • 1
    if you're extracting one at a time, without complicating too much you can add another flag which will be set for the pattern `/name=myhostname/` and add that flag. It's difficult to write here, I'll update the answer. – karakfa Mar 07 '16 at 16:32
HOCON looks like JSON, but it's a wolf in sheep's clothing. In fact, HOCON configuration syntax is quite tricky: one can include multiple files, replace variables multiple times, merge variables, use environment variables, etc.

For this particular file you could get what you want with an awk/shell script. However, if the file becomes more complex, or if in the future you need to parse a more complex file, then you'd be better off with a tool that specializes in parsing HOCON syntax. Fortunately, such a tool exists.

Use this tool: Hocon Config Printer

This tool fully parses HOCON syntax and outputs regular JSON.

For your specific example, you can use:

hocon-config-printer sample.hocon.conf  | jq  '.nodes[0]."vm-args"'

Output:

[
  "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram",
  "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC ",
  "-XX:+UseTLAB -XX:CMSInitiatingOccupancyFraction=80 -XX:+ExplicitGCInvokesConcurrent -verbose:gc",
  "-XX:SurvivorRatio=8 -XX:+UseNUMA -XX:TargetSurvivorRatio=80 -XX:MaxTenuringThreshold=15",
  "-Xmx3200m -Xms3200m -XX:NewSize=1664m -XX:MaxNewSize=1664m -Xss1024k",
  "-server"
]
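Once the configuration is plain JSON, jq can also pick the node by name and join the array into a single line. A minimal sketch, assuming JSON shaped like the sample arrives on standard input; the function name `node_vm_args` is hypothetical.

```shell
# Hypothetical helper: read JSON on stdin, print the vm-args of the named node
# as one space-separated line.
# Usage: ... | node_vm_args NODE_NAME
node_vm_args() {
    jq -r --arg name "$1" \
        '.nodes[] | select(.name == $name) | ."vm-args" | join(" ")' |
        tr -s ' '    # squeeze double spaces left by elements ending in a blank
}
```

It could be combined with the command above as `hocon-config-printer sample.hocon.conf | node_vm_args myhostname`.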
Vlad Dinulescu