I have this non-trivial JSON retrieved from a Spark server (shortened here by removing many lines):

{
  "spark.worker.cleanup.enabled": true,
  "spark.worker.ui.retainedDrivers": 50,
  "spark.worker.cleanup.appDataTtl": 7200,
  "fusion.spark.worker.webui.port": 8082,
  "fusion.spark.worker.memory": "4g",
  "fusion.spark.worker.port": 8769,
  "spark.worker.timeout": 30
}

I am trying to read fusion.spark.worker.memory but cannot. My debug statements show that the information is there:

msg: "Spark memory: {{spark_worker_cfg.json}}" shows this:

ok: [process1] => {
    "msg": "Spark memory: {u'spark.worker.ui.retainedDrivers': 50, u'spark.worker.cleanup.enabled': True, u'fusion.spark.worker.port': 8769, u'spark.worker.cleanup.appDataTtl': 7200, u'spark.worker.timeout': 30, u'fusion.spark.worker.memory': u'4g', u'fusion.spark.worker.webui.port': 8082}"
}

The dump using var: spark_worker_cfg shows this:

ok: [process1] => {
    "spark_worker_cfg": {
        "changed": false,
        "connection": "close",
        "content_length": "279",
        "content_type": "application/json",
        "cookies": {},
        "cookies_string": "",
        "failed": false,
        "fusion_request_id": "Pj2zeWThLw",
        "json": {
            "fusion.spark.worker.memory": "4g",
            "fusion.spark.worker.port": 8769,
            "fusion.spark.worker.webui.port": 8082,
            "spark.worker.cleanup.appDataTtl": 7200,
            "spark.worker.cleanup.enabled": true,
            "spark.worker.timeout": 30,
            "spark.worker.ui.retainedDrivers": 50
        },
        "msg": "OK (279 bytes)",
        "redirected": false,
        "server": "Jetty(9.4.12.v20180830)",
        "status": 200,
        "url": "http://localhost:8765/api/v1/configurations?prefix=spark.worker"
    }
}

I can't access the value using {{spark_worker_cfg.json.fusion.spark.worker.memory}}. My problem seems to be caused by the names containing dots:

The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'fusion'

I have had a look at two SO posts (1 and 2) that look like duplicates of my question but could not derive from them how to solve my current issue.

10465355
Marged
  • Looking again, I think the problem is the naming of the keys in the 'json' element is misleading. Does `{{ spark_worker_cfg['json']['fusion.spark.worker.memory'] }}` work correctly? – clockworknet Jan 15 '19 at 09:02
  • Yes, this works and returns `"msg": "4g"` – Marged Jan 15 '19 at 10:30
  • OK - hopefully that fixes your problem. It was not clear that in the keys within the 'json' element, the dots were literal. Referring to sub-elements of a data structure with dots looks cleaner, but in cases like these leads to problems. Using square bracket notation instead is slightly uglier, but resolves these kinds of ambiguity. – clockworknet Jan 15 '19 at 10:37

2 Answers

The keys in the 'json' element of the data structure contain literal dots; the dots do not represent nesting. This causes issues with dotted notation, because Ansible treats each dot as a separator between nested elements rather than as part of the key name. Therefore, use square bracket notation to reference these keys:

- debug:
    msg: "{{ spark_worker_cfg['json']['fusion.spark.worker.memory'] }}"

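(At first glance this looked like an issue with a JSON-encoded string that needed decoding, which could have been handled with "{{ spark_worker_cfg.json | from_json }}". As the comment below shows, that is not the case here: the value is already a parsed dictionary, so from_json fails.)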
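
As a hedged sketch of how the bracket lookup could be reused later in a play, the tasks below store the value in a fact and supply a fallback in case the key is missing. The fact name spark_worker_memory and the fallback value 'unknown' are illustrative assumptions, not part of the original playbook; spark_worker_cfg is the registered result shown in the question.

```yaml
# Sketch only: spark_worker_memory and the 'unknown' fallback are hypothetical.
- name: Store the worker memory setting in a fact
  set_fact:
    # Dotted access is fine for 'json' (that key contains no dots);
    # the dotted key itself must be looked up with square brackets.
    spark_worker_memory: "{{ spark_worker_cfg.json['fusion.spark.worker.memory'] | default('unknown') }}"

- name: Show the stored value
  debug:
    var: spark_worker_memory
```

Mixing the two notations this way keeps the expression short while still treating the dotted name as a single key.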

clockworknet
  • Your first suggestion returns `Unexpected templating type error occurred on ({{ spark_worker_cfg.json | from_json }}): expected string or buffer"}` – Marged Jan 15 '19 at 08:53
  • Do you want to update your answer so I can accept it ? – Marged Jan 16 '19 at 08:44

You could use the json_query filter to get your results. https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html

msg="{{ spark_worker_cfg.json | json_query('fusion.spark.worker.memory') }}"

edit: In response to your comment, the fact that an empty string is returned leads me to believe that the query isn't correct. It can be frustrating to find the exact query for the json_query filter, so I usually try it in a JSON path tool beforehand. I've linked one in my comment below, but I personally use the jsonUtils add-on in IntelliJ to find my path (which still needs adjustment, because paths are handled a bit differently between the two).

If your json looked like this:

{
  "value": "theValue"
}

then

json_query('value')

would work.

The path you're passing to json_query isn't correct for what you're trying to do.

If your top-level key were named fusion_spark_worker_memory (without the periods), your query would work. I believe the dots are throwing things off: json_query reads each dot as a step into a nested object. There may be a way to escape those in the query...
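
On escaping the dots: JMESPath, the query language behind json_query, treats a double-quoted identifier as one key name, so quoting the whole name should select the dotted key directly. The task below is a minimal sketch, assuming the jmespath Python library is installed on the control node. memory_query is a hypothetical variable name, introduced only to avoid escaping nested quotes inside the template.

```yaml
# Sketch only: memory_query is a hypothetical helper variable.
- debug:
    msg: "{{ spark_worker_cfg.json | json_query(memory_query) }}"
  vars:
    # The outer single quotes belong to YAML; the inner double quotes
    # are the JMESPath quoted identifier, so the dots stay part of the key.
    memory_query: '"fusion.spark.worker.memory"'
```

In more recent Ansible releases the filter is, as far as I know, shipped in the community.general collection, so it may need to be referenced as community.general.json_query. For a single known key, the square bracket lookup from the other answer remains the simpler option.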

edit 2: clockworknet for the win! He beat me to it both times. :bow:

Old Schooled
  • Unfortunately this returns `"msg": ""` – Marged Jan 15 '19 at 08:51
  • I think this is because the query isn't quite right due to extra dots. Because of the way that the field is named (fusion.spark.worker.memory), the json_query filter thinks that it is looking for a memory object within a worker object within a spark object within a fusion object. The json query is very finicky, you have to get it exactly right or it will scream. I would suggest using a jquery pathfinder like https://jqplay.org/ to help you find the exact path you need before using the json_query. – Old Schooled Jan 15 '19 at 11:07