Kaggle BigQuery integration

Question

Kaggle provides a link to BigQuery. Is there any API documentation, with examples, for connecting to it? Below is what I tried:

from google.cloud import bigquery
from google.cloud import storage

# Set your own project id here
PROJECT_ID = 'your-google-cloud-project'
bigquery_client = bigquery.Client(project=PROJECT_ID)
storage_client = storage.Client(project=PROJECT_ID)

hn_dataset_ref = bigquery_client.dataset('DC Taxi Trips', project='bigquery-public-data')
hn_dset = bigquery_client.get_dataset(hn_dataset_ref)
[x.table_id for x in bigquery_client.list_tables(hn_dset)]

I got the following error:

/opt/conda/lib/python3.6/site-packages/google/cloud/_http.py in api_request(self, method, path, query_params, data, content_type, headers, api_base_url, api_version, expect_json, _target_object, timeout)
    421
    422         if not 200 <= response.status_code < 300:
--> 423             raise exceptions.from_http_response(response)
    424
    425         if expect_json and response.content:

BadRequest: 400 GET https://www.googleapis.com/bigquery/v2/projects/bigquery-public-data/datasets/DC%20Taxi%20Trips: Invalid dataset ID "DC Taxi Trips". Dataset IDs must be alphanumeric (plus underscores and dashes) and must be at most 1024 characters long.

The dataset I tried to access is https://www.kaggle.com/bvc5283/dc-taxi-trips/metadata

score 1 · Answer 1 · answered Mar 15 '20 at 21:49

The error message states the problem:

Dataset IDs must be alphanumeric (plus underscores and dashes) and must be at most 1024 characters long.

So if you are not certain of your dataset ID, try variants that use only letters, digits, underscores, or dashes (such as dc-taxi-trips or dc_taxi_trips).
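
As a rough sketch of that check, the rule quoted in the error can be tested locally before calling the API. The helper names is_valid_dataset_id and list_dataset_ids are hypothetical, and the pattern assumes "alphanumeric" means ASCII letters and digits. Passing this check only means the ID is well formed; it does not mean a dataset with that ID exists in bigquery-public-data.

```python
import re

# Rule quoted in the error message: alphanumeric characters plus underscores
# and dashes, at most 1024 characters long.
# Assumption: "alphanumeric" is taken to mean ASCII letters and digits.
_DATASET_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,1024}")


def is_valid_dataset_id(dataset_id):
    """Return True if dataset_id is well formed under the quoted rule."""
    return _DATASET_ID_RE.fullmatch(dataset_id) is not None


def list_dataset_ids(client, project):
    """Return the dataset IDs visible in the given project.

    client is expected to be a google.cloud.bigquery.Client; only its
    list_datasets method is used here.
    """
    return [ds.dataset_id for ds in client.list_datasets(project=project)]


# Check the ID from the question and the two candidates suggested above.
for candidate in ["DC Taxi Trips", "dc-taxi-trips", "dc_taxi_trips"]:
    print(candidate, is_valid_dataset_id(candidate))
```

With the client from the question, list_dataset_ids(bigquery_client, 'bigquery-public-data') would show which IDs actually exist in that project, which may be more reliable than guessing at variants.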
