How to do an unpretty print on pretty JSON file in shell >> serial string JSON >> ES _bulk?

Question

While working with Elasticsearch on AWS EC2, I hit an issue with bulk indexing. The ES _bulk endpoint requires its input to be serialized JSON strings, one per line, each terminated with \n. What I have built using various web APIs and file pre/processing is pretty-printed JSON, i.e., easily human-readable.

Is there a simple shell script method to get all the pretty JSON simply concatenated into strings, without loading up some Java libraries or whatever? I can add tokens to the basic file during pre-processing to tag the desired \n breaks if that helps parsing, but if anyone has a tip on the toolset I would be grateful. I have a feeling there are scripts out there and I know there are some libraries, but I have not found any simple command line tools to do the unpretty printing so far.

score 63 · Answer 1 · answered Oct 17 '15 at 00:31

You can try the great jq tool for parsing JSON in the shell. To de-pretty print with jq, you can use either method below:

cat pretty-printed.json | jq -c .
jq -c . pretty-printed.json

The -c (or --compact-output) option tells jq not to pretty-print, which is its default. The "." filter returns the JSON content as is, unmodified apart from the reformatting. The result goes to stdout, so you can redirect it or pipe it to something else.
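For the Elasticsearch case in the question, the compact output can be paired with the action line that _bulk expects before each document. The following is only a minimal sketch, assuming each input file holds one JSON document and that an empty index action is acceptable (so the target index would have to come from the request URL). The function name to_bulk and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: turn pretty-printed JSON files (one document per file) into
# newline-delimited JSON for the _bulk endpoint.
# Assumption: every document is indexed with an empty {"index":{}} action.
to_bulk() {
    local f
    for f in "$@"; do
        # The filter emits two values per input document: the action line,
        # then the document itself. -c prints each value on its own line.
        jq -c '{"index": {}}, .' "$f" || return 1
    done
}

# Usage (hypothetical file names):
#   to_bulk doc1.json doc2.json > bulk.ndjson
```

When posting the result with curl, --data-binary is the safer choice, since plain -d strips the newlines that _bulk depends on.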

P.S. I was looking to address the same problem and came to this option.

answered Oct 17 '15 at 00:31

David


and to redirect to another file...? – myol Oct 11 '16 at 09:01
you can redirect to file with either command, appending `> pathTo/file.json` to the command – David Jun 03 '22 at 00:15
and if you wish for the fields to be ordered alphabetically when unpretty printing, add `-S` to the jq command. – David Jun 03 '22 at 00:16

score 4 · Answer 2 · edited Nov 27 '21 at 21:41

The answer from D_S_toowhite was not a direct answer but it set me thinking in the right way i.e., the problem is to remove all the white space. I found a very simple way to remove all white space using command line tool tr:

tr -d '[:space:]' < inputfile

The [:space:] character class matches all whitespace: spaces, tabs, newlines, vertical tabs and so on. Note that tr reads only from stdin, so the file has to be redirected in. A pretty JSON input like this:

{
    "version" : "4.0",
    "success" : true,
    "result" :
    {
            "Focus" : 0.000590008,
            "Arc" : 12
    }
}

becomes this JSON serial string:

{"version":"4.0","success":true,"result":{"Focus":0.000590008,"Arc":12}}

I still have to solve the \n terminator, but I think that is trivial now, at least in my special case: just append it after the closing bracket pair using sed.
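The terminator can also be added without sed: tr emits the stripped text with no trailing newline, so printing one afterwards is enough. A rough sketch, assuming one JSON document per file and no meaningful whitespace inside string values (the function name squash and the file names are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: strip all whitespace from each file and end each result with \n.
# Caveat: this also deletes spaces inside string values.
squash() {
    local f
    for f in "$@"; do
        tr -d '[:space:]' < "$f"   # tr reads stdin, so redirect the file in
        printf '\n'                # one terminator per document
    done
}

# Usage (hypothetical file names):
#   squash a.json b.json > serial.json
```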

The problem with this approach is that if the data contains any significant whitespace (i.e., one of the values is a string that contains a sentence), it will be incorrectly stripped. — benkc, Oct 02 '18 at 19:07
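This is easy to reproduce. The sample value below is made up:

```shell
# A string value containing spaces loses them along with the formatting.
printf '{\n  "title" : "hello big world"\n}\n' | tr -d '[:space:]'
# prints {"title":"hellobigworld"} (with no trailing newline)
```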

score 1 · Answer 3 · answered Sep 16 '14 at 02:07

You can try find/replace using regexp:

Find what: "^\s{2,}", replace with ""
Find what: "\n", replace with ""

See this: https://github.com/dzhibas/SublimePrettyJson/issues/17
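The same two replacements can be approximated in the shell with sed and tr. This is a sketch rather than an exact port of the editor regexes: it assumes sed supports -E (GNU and BSD sed both do), and the function name strip_indent is hypothetical. Because it only removes leading indentation and line breaks, spaces inside string values survive, although the spaces around ":" stay too.

```shell
#!/usr/bin/env bash
# Sketch: remove leading indentation of two or more whitespace characters,
# then join all lines and finish with a single newline.
strip_indent() {
    sed -E 's/^[[:space:]]{2,}//' "$1" | tr -d '\n'
    printf '\n'
}

# Usage (hypothetical file name):
#   strip_indent pretty.json >> serial.json
```

The output is still valid JSON but is not fully minimal, so a JSON-aware tool is the better choice when byte-for-byte compact output matters.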

answered Sep 16 '14 at 02:07

D_S_toowhite


score 1 · Answer 4 · answered Nov 13 '15 at 17:26

jsonlint is easy to get up and running in the command line with the help of npm, and a simple way to print out 'no fluff' JSON is to give it an indentation character of "".

jsonlint -t ""

As a bonus for command line users, I use this all the time to take paste buffers (on a Mac) and convert them into something else, for instance:

Swap buffer contents for a JSON linted 'compressed' format:

pbpaste | jsonlint -t "" | pbcopy

Swap buffer contents for a pretty printed JSON linted format:

pbpaste | jsonlint | pbcopy

You could also pipe file contents to an ugly (and JSON linted) version of the file:

cat data-pretty.json | jsonlint -t "" > data-ugly.json
