
How do I test the JSON files below for correctness?

Using basex on the command line:

thufir@dur:~/json$ 
thufir@dur:~/json$ ls
formatted.json  raw.json
thufir@dur:~/json$ 
thufir@dur:~/json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
> 
> CREATE DATABASE db raw.json
"/home/thufir/json/raw.json" (Line 1): Content is not allowed in prolog.
> 
> CREATE DATABASE db formatted.json
"/home/thufir/json/formatted.json" (Line 1): Content is not allowed in prolog.
> 
> exit
Have fun.
thufir@dur:~/json$ 

I ran the raw data through a formatter to make it more readable:

thufir@dur:~/json$ 
thufir@dur:~/json$ cat formatted.json 
{
  "1224083010015956992": {
    "metadata": {
      "result_type": "recent",
      "iso_language_code": "en"
    },
    "in_reply_to_status_id_str": null,
    "in_reply_to_status_id": null,
    "created_at": "Sun Feb 02 21:31:46 +0000 2020",
    "in_reply_to_user_id_str": null,
    "source": "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Web App<\/a>",
    "retweeted_status": {
      "metadata": {
        "result_type": "recent",
        "iso_language_code": "en"
      },
      "in_reply_to_status_id_str": null,
      "in_reply_to_status_id": null,
      "created_at": "Sun Feb 02 20:53:32 +0000 2020",
      "in_reply_to_user_id_str": null,
      "source": "<a href=\"https://about.twitter.com/products/tweetdeck\" rel=\"nofollow\">TweetDeck<\/a>",
      "retweet_count": 3,
      "retweeted": false,
      "geo": null,
      "in_reply_to_screen_name": null,
      "is_quote_status": false,
      "id_str": "1224073388706189312",
      "in_reply_to_user_id": null,
      "favorite_count": 6,
      "id": 1224073388706189312,
      "text": "Myth of the 10x programmer:\n\nh......... particularly like the list of productivity improvement \"tools\" at the end.",
      "place": null,
      "lang": "en",
      "favorited": false,
      "possibly_sensitive": false,

Given that an online parser displays the data and lets me explore the nodes, I can't see what the problem would be.

Full data:

https://gist.github.com/THUFIR/ab9e1f77af92d4d984b268434afc01dd.js

Thufir
  • There is a great all-in-one JSON formatter and validator [here](https://jsonformatter.curiousconcept.com/). – Jesse Feb 03 '20 at 02:51
  • Thanks, it seems fine to me per that validator. At least, it didn't report any errors and it displays the data. – Thufir Feb 03 '20 at 02:55
  • Also, I believe you added the wrong gist link to your question. Is [this](https://gist.github.com/THUFIR/ab9e1f77af92d4d984b268434afc01dd) what you meant? – Jesse Feb 03 '20 at 02:55
  • "Content is not allowed in prolog" is an error for malformed XML files; it's not an error for JSON files. I don't know BaseX, but it looks like your database is interpreting the data as XML, not as JSON. – Erwin Bolwidt Feb 03 '20 at 03:18
  • @ErwinBolwidt Good observation, especially so given that *"BaseX is a robust, high-performance **XML** database engine"* (quoting the first line of the [BaseX home page](http://basex.org/)). – Andreas Feb 03 '20 at 03:27
  • Yes, @Jesse, that's the data. – Thufir Feb 27 '20 at 11:22

2 Answers


Quoting documentation for CREATE DATABASE:

Syntax CREATE DB [name] ([input])

The input can be a file or directory path to XML documents, a remote URL, or a string containing XML

As you can see, the command expects an XML file, not a JSON file.
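To see the mismatch independently of BaseX, here is a minimal sketch in Python (not part of the original setup) that feeds the same text to a JSON parser and to an XML parser. The `sample` string is a hypothetical stand-in shaped like the start of formatted.json, and the helper names are made up for illustration. It only checks well-formedness, nothing about the content.

```python
import json
import xml.etree.ElementTree as ET


def parses_as_json(text: str) -> bool:
    """Return True if text is well-formed JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def parses_as_xml(text: str) -> bool:
    """Return True if text is well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample, shaped like the start of formatted.json.
sample = '{"metadata": {"result_type": "recent", "iso_language_code": "en"}}'

print("JSON:", parses_as_json(sample))
print("XML: ", parses_as_xml(sample))
```

Text that starts with `{` is rejected by the XML parser before it reaches any element, which is consistent with an XML-oriented command reporting an error on line 1 of a valid JSON file.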

Andreas

Adding the files through an XQuery script that calls json:parse seems to work:

thufir@dur:~/json$ 
thufir@dur:~/json$ 
thufir@dur:~/json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
> 
> list
Name                 Resources  Size  Input Path                               
-----------------------------------------------------------------------------
com.w3schools.books  1          6290  https://www.w3schools.com/xml/books.xml  
w3school_data        1          5209  https://www.w3schools.com/xml/note.xml   

2 database(s).
> 
> exit
See you.
thufir@dur:~/json$ 
thufir@dur:~/json$ ls
createDB.xquery  formatted.json  raw.json
thufir@dur:~/json$ 
thufir@dur:~/json$ cat createDB.xquery 
let $database := "db"
for $name in file:list('.', false(), '*.json')
let $file := file:read-text($name)
let $json := json:parse($file)
return db:add($database, $json, $name) 
thufir@dur:~/json$ 
thufir@dur:~/json$ basex createDB.xquery 
Stopped at /home/thufir/json/createDB.xquery, 5/14:
[db:open] Database 'db' was not found.
thufir@dur:~/json$ 
thufir@dur:~/json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
> 
> create database db
Database 'db' created in 269.32 ms.
> 
> list
Name                 Resources  Size  Input Path                               
-----------------------------------------------------------------------------
com.w3schools.books  1          6290  https://www.w3schools.com/xml/books.xml  
db                   0          4570                                           
w3school_data        1          5209  https://www.w3schools.com/xml/note.xml   

3 database(s).
> 
> exit
See you.
thufir@dur:~/json$ 
thufir@dur:~/json$ basex createDB.xquery 
thufir@dur:~/json$ 
thufir@dur:~/json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
> 
> list
Name                 Resources  Size    Input Path                               
-------------------------------------------------------------------------------
com.w3schools.books  1          6290    https://www.w3schools.com/xml/books.xml  
db                   2          196469                                           
w3school_data        1          5209    https://www.w3schools.com/xml/note.xml   

3 database(s).
> 
> 

Querying the database at that point returns a very large document, too large to paste here.

I'm just looking to do the above with Java. The comment about JSON versus XML was spot on. (The XQuery file is from the docs.)

With only the formatted JSON in the directory, it seems to work better:

thufir@dur:~/json$ 
thufir@dur:~/json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
> 
> list
Name                 Resources  Size    Input Path                               
-------------------------------------------------------------------------------
com.w3schools.books  1          6290    https://www.w3schools.com/xml/books.xml  
db                   1          101838                                           
w3school_data        1          5209    https://www.w3schools.com/xml/note.xml   

3 database(s).
> 
> open db
Database 'db' was opened in 66.85 ms.
> 
> xquery /

That query then returns the expected result (the JSON in full).
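For the original question of testing the files for correctness, the per-file loop in createDB.xquery can be mirrored outside BaseX. The following is a rough sketch in Python rather than XQuery or Java, so it is an illustration and not the BaseX way of doing it; the function name `check_json_files` is hypothetical. It only checks that each `*.json` file in a directory is well-formed.

```python
import json
from pathlib import Path


def check_json_files(directory: str) -> dict:
    """Map each *.json file name to None (well-formed) or an error message."""
    results = {}
    for path in sorted(Path(directory).glob("*.json")):
        try:
            with path.open(encoding="utf-8") as handle:
                json.load(handle)
            results[path.name] = None
        except (json.JSONDecodeError, UnicodeDecodeError) as error:
            # Keep the parser's message; it includes line and column.
            results[path.name] = str(error)
    return results


if __name__ == "__main__":
    # Check the current directory, e.g. ~/json in the transcripts above.
    for name, error in check_json_files(".").items():
        print(f"{name}: {'ok' if error is None else error}")
```

Run from the directory that holds the files, this prints one line per file, so a file that fails to parse can be fixed or moved away before the database is populated.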

Thufir