I'm building a site that scrapes The List, a collection of upcoming concerts in the SF Bay Area, in order to power an application that serves the listings in a modern web GUI.
Right now, I have a web worker that writes its results to disk, with one folder for each stage of the scraping process: grabbing the raw HTML, scraping the HTML, and transforming the scraped results into structured data. The final output is a JSON file containing a list of objects that look like this:
{
    "band": "Willie Nelson And Family",
    "date": "2018-10-17T00:00:00-07:00",
    "numberOfShows": "1 show",
    "venue": "Graton Resort, 288 Golf course Dr., Rohnert Park",
    "time": "8pm",
    "soldOut": false,
    "pit": false,
    "multiDay": false,
    "ages": "21+",
    "price": "$250"
}
I want to import this file into Graphcool on a recurring basis. There, I plan to have three entities:
- an artist, which is just the name of the band
- a venue, which is the name of the venue and possibly an address
- a show, which is one or many artists at a given venue on a given date and time
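To make the target shape concrete, here is a rough Python sketch of the split I have in mind. It rests on assumptions I haven't settled: that acts sharing a bill show up as separate objects with the same venue, date, and time, and that the text before the first comma in "venue" is the venue's name while the rest is its address. The function name `normalize` and the output keys are placeholders of my own, not anything Graphcool requires.

```python
def normalize(listings):
    """Split flat listing dicts into artist, venue, and show records."""
    artists = {}  # band name -> artist record
    venues = {}   # raw venue string -> venue record
    shows = {}    # (raw venue, date, time) -> show record

    for item in listings:
        band = item["band"].strip()
        artists.setdefault(band, {"name": band})

        # Assumption: "Name, street, city" -> name before the first comma,
        # everything after it is treated as the address.
        raw_venue = item["venue"].strip()
        name, _, address = raw_venue.partition(",")
        venue_name = name.strip()
        venues.setdefault(
            raw_venue,
            {"name": venue_name, "address": address.strip() or None},
        )

        # Assumption: listings with the same venue, date, and time are
        # one show with several artists.
        key = (raw_venue, item["date"], item.get("time"))
        show = shows.setdefault(
            key,
            {
                "venue": venue_name,
                "date": item["date"],
                "time": item.get("time"),
                "artists": [],
            },
        )
        if band not in show["artists"]:
            show["artists"].append(band)

    return {
        "artists": list(artists.values()),
        "venues": list(venues.values()),
        "shows": list(shows.values()),
    }
```

Calling `normalize` on the parsed JSON list would give three flat lists, one per entity, with shows referring to artists and venues by name rather than by any database ID.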
My question is twofold:
- How do I restructure this JSON file into a shape that Graphcool will accept?
- How do I upload the contents of this file to Graphcool on a recurring basis?
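For the upload, my rough idea so far is to turn each record into a GraphQL mutation and POST it as JSON, run from something like a scheduled job. The sketch below only builds the request bodies and does not send anything. The mutation name `createArtist` and its `name` argument are my guesses at what the schema would expose, not something I've confirmed, and the helper names are my own.

```python
import json

# Hypothetical mutation: `createArtist` and its `name` argument are
# assumptions about the schema, not a confirmed Graphcool API.
CREATE_ARTIST = """
mutation CreateArtist($name: String!) {
  createArtist(name: $name) {
    id
  }
}
"""


def build_request_body(query, variables):
    """Serialize a GraphQL operation as the JSON body of an HTTP POST."""
    return json.dumps({"query": query, "variables": variables})


def artist_requests(listings):
    """Yield one request body per distinct band name in the listings."""
    seen = set()
    for item in listings:
        name = item["band"].strip()
        if name not in seen:
            seen.add(name)
            yield build_request_body(CREATE_ARTIST, {"name": name})
```

This only deduplicates within a single file. Since the import would run repeatedly, I assume I'd also need some way to avoid re-creating artists, venues, and shows that were uploaded on an earlier run.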