24

Since GeoJSON is actual JSON, I thought I could use mongoimport to load data into my MongoDB database from a .geojson file.

But I'm getting the following error:

exception:BSON representation of supplied JSON is too large: code FailedToParse: FailedToParse: Expecting '{': offset:0

The file is 25MB and this is a fragment of it:

{
"type": "FeatureCollection",
"features": [
{
    "type": "Feature",
    "id": "node/2661561690",
    "properties": {
        "timestamp": "2014-02-08T17:58:24Z",
        "version": "1",
        "changeset": "20451306",
        "user": "Schandlers",
        "uid": "51690",
        "natural": "tree",
        "id": "node/2661561690"
    },
    "geometry": {
        "type": "Point",
        "coordinates": [
            -66.9162255,
            10.5056439
        ]
    }
},
// ... Omitted data
{
    "type": "Feature",
    "id": "node/2664472516",
    "properties": {
        "timestamp": "2014-02-10T04:27:30Z",
        "version": "2",
        "changeset": "20477473",
        "user": "albertoq",
        "uid": "527105",
        "name": "Distribuidora Brithijos (Aceites)",
        "shop": "car_parts",
        "id": "node/2664472516"
    },
    "geometry": {
        "type": "Point",
        "coordinates": [
            -66.9388903,
            10.4833647
        ]
    }
}
]
}
OscarVGG
  • Need more information: How big is the file? How big is each record in the file? Can you share the command you ran to get that error? mongoimport expects one JSON object per line, if I remember correctly. – Rob Moore Feb 26 '14 at 02:39
  • @RobMoore the size of the file is 25MB. The command I ran was `mongoimport --db driversec --collection geomaps --file map.geojson`. The file doesn't have one JSON object per line; I would say it's pretty-printed, so that might be the problem... Can you recommend any tool to shape the file properly for mongoimport? – OscarVGG Feb 26 '14 at 04:34
  • @RobMoore I edited the question to show a fragment of the file I'm trying to import – OscarVGG Feb 26 '14 at 04:40
  • 1
    It looks like 1 large document. MongoDB has a 16MB document size limit. That matches the error you are seeing. Do you want it loaded as 1 document or each "Feature" to be a separate document? You will need to write something to break the document up either way. – Rob Moore Feb 27 '14 at 03:02

6 Answers

25

Download jq (it's a sed-like program, but for JSON).

Then run:

jq --compact-output ".features" input.geojson > output.geojson

then

mongoimport --db dbname -c collectionname --file "output.geojson" --jsonArray
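
If jq is not available, the same extraction can be approximated with the Python standard library. This is only a minimal sketch and not part of the original answer: the function name `write_features_array` is made up for illustration, and it assumes the whole file fits in memory (which should hold for the 25 MB file in the question).

```python
import json


def write_features_array(src_path, dst_path):
    """Write the "features" array of a GeoJSON FeatureCollection as compact JSON.

    Rough equivalent of: jq --compact-output ".features" src > dst
    Returns the number of features written.
    """
    with open(src_path, encoding="utf-8") as src:
        collection = json.load(src)

    features = collection["features"]

    with open(dst_path, "w", encoding="utf-8") as dst:
        # separators=(",", ":") drops the whitespace, like jq's --compact-output.
        json.dump(features, dst, separators=(",", ":"), ensure_ascii=False)
        dst.write("\n")

    return len(features)
```

Calling `write_features_array("input.geojson", "output.geojson")` should leave a file holding a single JSON array, which is the shape `--jsonArray` expects.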

ParoX
  • 1
    It's much better to have a process that can be automated rather than being told the specific tweaks you have to make in a text editor (which is creaking under the massive weight of the file). Thanks for your advice – Forbesmyester May 24 '16 at 09:23
  • 3
    one could pipe the result directly, without using a temporary file: `jq --compact-output ".features" input.geojson | mongoimport --db dbname -c collectionname --jsonArray` – Constantin Galbenu Nov 08 '18 at 12:27
10

Right now you have an array of features. MongoDB will consider this to be one document. Try deleting the following from the beginning of your geojson:

{
"type": "FeatureCollection",
"features": [

Also, delete the following from the end of your geojson:

]
}

EDIT - Also, mongo expects one document per line. So make sure that your only \n is between documents! e.g.

...    
},\n
    {
        "type": "Feature",
        "id": "node/2664472516",
...
Adam
4

ParoX's idea works great, but it has a 16 MB limit.

From the MongoDB documentation:

--jsonArray Accepts the import of data expressed with multiple MongoDB documents within a single JSON array. Limited to imports of 16 MB or smaller.

If the file is larger than 16 MB, you could do this instead:

jq --compact-output ".features[]" input.geojson > output.geojson

This gives you exactly one object per line, with no trailing comma:

{.....}
{.......}
{...}

{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.87088507656375,35.21515162500578]},"properties":{"name":"ABBOTT NEIGHBORHOOD PARK","address":"1300  SPRUCE ST"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83775386582222,35.24980190252168]},"properties":{"name":"DOUBLE OAKS CENTER","address":"1326 WOODWARD AV"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83827000459532,35.25674709224663]},"properties":{"name":"DOUBLE OAKS NEIGHBORHOOD PARK","address":"2605  DOUBLE OAKS RD"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83697759172735,35.25751734669229]},"properties":{"name":"DOUBLE OAKS POOL","address":"1200 NEWLAND RD"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.81647652154736,35.40148708491418]},"properties":{"name":"DAVID B. WAYMER FLYING REGIONAL PARK","address":"15401 HOLBROOKS RD"}}
{"type":"Feature","geometry":{"type":"Point","coordinates":[-80.83556459443902,35.39917224760999]},"properties":{"name":"DAVID B. WAYMER COMMUNITY PARK","address":"302 HOLBROOKS RD"}}
{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[-80.72487831115721,35.26545403190955],[-80.72135925292969,35.26727607954368],[-80.71517944335938,35.26769654625573],[-80.7125186920166,35.27035945142482],[-80.70857048034668,35.268257165144064],[-80.70479393005371,35.268397319259996],[-80.70324897766113,35.26503355355979],[-80.71088790893555,35.2553619492954],[-80.71681022644043,35.2553619492954],[-80.7150936126709,35.26054831539319],[-80.71869850158691,35.26026797976481],[-80.72032928466797,35.26061839914875],[-80.72264671325684,35.26033806376283],[-80.72487831115721,35.26545403190955]]]},"properties":{"name":"Plaza Road Park"}}

Then import it without --jsonArray, since the output is no longer a single JSON array:

mongoimport --db dbname -c collectionname --file "output.geojson"
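
For anyone without jq, the one-object-per-line layout can also be produced with a few lines of Python. This is a rough sketch under assumptions: the function name `write_features_per_line` is hypothetical, and the file is loaded fully into memory before being written back out.

```python
import json


def write_features_per_line(src_path, dst_path):
    """Write each feature of a FeatureCollection on its own line.

    Rough equivalent of: jq --compact-output ".features[]" src > dst
    Returns the number of lines written.
    """
    with open(src_path, encoding="utf-8") as src:
        features = json.load(src)["features"]

    with open(dst_path, "w", encoding="utf-8") as dst:
        for feature in features:
            # One compact JSON document per line, with no trailing comma.
            dst.write(json.dumps(feature, separators=(",", ":"), ensure_ascii=False))
            dst.write("\n")

    return len(features)
```

Because `json.dumps` escapes any newline inside a string value, each feature stays on a single physical line.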

hoogw
3

This Python script is designed to import GeoJSON files into MongoDB in one step: https://github.com/rtbigdata/geojson-mongo-import.py
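
To give an idea of what a one-step import can look like, here is a minimal sketch. It is not the code of the linked script; the function name `import_features`, the batch size, and the connection details in the comments are all assumptions made for illustration. The function only needs an object with an `insert_many` method, such as a pymongo collection.

```python
import json


def import_features(path, collection, batch_size=1000):
    """Insert every feature of a GeoJSON FeatureCollection as its own document.

    `collection` is anything with an insert_many(list_of_dicts) method,
    for example a pymongo Collection. Returns the number of features inserted.
    """
    with open(path, encoding="utf-8") as src:
        features = json.load(src)["features"]

    inserted = 0
    for start in range(0, len(features), batch_size):
        # Send the features in slices so that no single request grows too large.
        batch = features[start:start + batch_size]
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted


# Example wiring with pymongo (assumed to be installed; the connection string,
# database name and collection name are placeholders):
#
#   from pymongo import MongoClient, GEOSPHERE
#   coll = MongoClient("mongodb://localhost:27017")["dbname"]["collectionname"]
#   import_features("input.geojson", coll)
#   coll.create_index([("geometry", GEOSPHERE)])  # 2dsphere index for geo queries
```

Storing each feature as a separate document keeps every document far below the 16 MB limit mentioned in the comments on the question.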

1

First of all, to verify that your GeoJSON file is valid, you could use Geojsonlint, QGIS and so on.

After that, to import your data into your collection, use mongoimport:

mongoimport --db MY_DATABASE_NAME -c MY_COLLECTION_NAME --type json --file "MY_GEOJSON_FILENAME"

Replace the 3 placeholders above with your own names. Obviously, make sure that your current directory contains the file.
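
The validation step mentioned at the start of this answer can also be roughly approximated in code when a linter is not at hand. The sketch below is only a shallow structural check, not a replacement for Geojsonlint or QGIS; the function names `check_feature_collection` and `_valid_position` are made up for illustration, and only Point geometries have their coordinates checked.

```python
import json
from numbers import Real


def check_feature_collection(path):
    """Return a list of human-readable problems found in a GeoJSON file.

    An empty list does not prove the file is valid GeoJSON; it only means
    none of these basic problems were found.
    """
    with open(path, encoding="utf-8") as src:
        data = json.load(src)

    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    if data.get("type") != "FeatureCollection":
        problems.append('top-level "type" is not "FeatureCollection"')

    features = data.get("features")
    if not isinstance(features, list):
        problems.append('"features" is missing or not an array')
        return problems

    for index, feature in enumerate(features):
        if not isinstance(feature, dict) or feature.get("type") != "Feature":
            problems.append(f'feature {index}: not an object with "type": "Feature"')
            continue
        geometry = feature.get("geometry")
        if not isinstance(geometry, dict):
            problems.append(f"feature {index}: no geometry object")
            continue
        if geometry.get("type") == "Point" and not _valid_position(geometry.get("coordinates")):
            problems.append(f"feature {index}: Point coordinates missing or out of range")
    return problems


def _valid_position(coords):
    """True if coords looks like [longitude, latitude] within the usual ranges."""
    if not isinstance(coords, list) or len(coords) < 2:
        return False
    lon, lat = coords[0], coords[1]
    if not (isinstance(lon, Real) and isinstance(lat, Real)):
        return False
    return -180 <= lon <= 180 and -90 <= lat <= 90
```

A common mistake this catches is swapped coordinates: GeoJSON stores positions as [longitude, latitude], so a latitude-first pair can end up outside the valid latitude range.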

Menelaos Kotsollaris
0

If the problem is that your set of documents is larger than 16 MB, you can use the batchSize option, which sets the number of documents in a batch. For instance:

mongoimport -d mydb -c mycol data.json -j 4 --batchSize=100

Note the -j option, which increases throughput to the database by using several insertion workers.

Strangely, the batchSize option is not documented in the '--help' output of 'mongoimport', go figure!

Kartoch