I am working with sample JSON data from a dummy API (JSONPlaceholder) that looks like this:
[
    {
        "postId": 1,
        "id": 1,
        "name": "id labore ex et quam laborum",
        "email": "Eliseo@gardner.biz",
        "body": "laudantium enim quasi est quidem magnam voluptate ipsam eos\ntempora quo necessitatibus\ndolor quam autem quasi\nreiciendis et nam sapiente accusantium"
    },
    {
        "postId": 1,
        "id": 2,
        "name": "quo vero reiciendis velit similique earum",
        "email": "Jayne_Kuhic@sydney.com",
        "body": "est natus enim nihil est dolore omnis voluptatem numquam\net omnis occaecati quod ullam at\nvoluptatem error expedita pariatur\nnihil sint nostrum voluptatem reiciendis et"
    }
]
I read in the API data with the requests library and then send it into an Exasol DB table. See the code below:
import requests
import pyexasol

def get_api_data():
    # Pull the full comments collection from the dummy API
    r = requests.get("http://jsonplaceholder.typicode.com/comments")
    data = r.json()
    return data

def connection():
    session = pyexasol.connect_local_config('my_exasol')
    return session

def send_api_data():
    s = connection()
    data = get_api_data()
    # One INSERT statement (and one network round trip) per row
    for row in data:
        s.execute(
            """INSERT INTO TESTBED.TEST_API (postId, id, name, email, body)
               VALUES ({postId}, {id}, {name}, {email}, {body})""",
            {'postId': row['postId'], 'id': row['id'], 'name': row['name'],
             'email': row['email'], 'body': row['body']},
        )

send_api_data()
This works, but it is incredibly slow: it takes almost two minutes to insert 500 records. I know there must be a more efficient way to do this. In reality I will be pulling data from an API with thousands of records and sending it into a DB table.
Any ideas on a better approach to take?
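For what it's worth, my guess is that the bottleneck is the one-INSERT-per-row loop, since every execute() call is a separate round trip to the database. Below is a rough sketch of the direction I'm considering, assuming pyexasol's import_from_iterable (which streams rows through Exasol's bulk IMPORT interface) is the right tool for this; the function name send_api_data_bulk is just a placeholder of mine.

import requests
import pyexasol

def get_api_data():
    r = requests.get("http://jsonplaceholder.typicode.com/comments")
    r.raise_for_status()  # assumption: fail fast on a bad HTTP response
    return r.json()

def send_api_data_bulk():
    session = pyexasol.connect_local_config('my_exasol')
    data = get_api_data()
    # Build tuples in the table's column order (postId, id, name, email, body)
    rows = (
        (row['postId'], row['id'], row['name'], row['email'], row['body'])
        for row in data
    )
    # One bulk IMPORT for all rows instead of one INSERT per row
    session.import_from_iterable(rows, ('TESTBED', 'TEST_API'))
    session.close()

send_api_data_bulk()

Is something like this the recommended pattern, or is there a better option (e.g., batching multi-row INSERT statements) for thousands of rows?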