I have a relatively large dataset (1 GB) stored as a zip file in Google Cloud Storage.

I need to use a notebook hosted in Google Cloud Datalab to access that file and the data it contains. How do I go about this?

Thank you.

jaycode

1 Answer

Can you try the following?

import pandas as pd

# Path to the object in Google Cloud Storage that you want to copy
sample_gcs_object = 'gs://path-to-gcs/Hello.txt.zip'

# Copy the file from Google Cloud Storage to Datalab
!gsutil cp $sample_gcs_object 'Hello.txt.zip'

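# Unzip the file (-o overwrites existing files without prompting,
# so re-running the cell does not wait for input)
!unzip -o 'Hello.txt.zip'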

# Read the file into a pandas DataFrame
pandas_dataframe = pd.read_csv('Hello.txt')
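
If you would rather not extract the whole archive onto the Datalab disk, one possible alternative is to read the file straight out of the zip after the `gsutil cp` step. The sketch below uses only the Python standard library. The helper name `iter_zip_csv_rows` and its parameters are made up for illustration, and it assumes the archive member is a comma-separated text file in UTF-8:

```python
import csv
import io
import zipfile


def iter_zip_csv_rows(zip_path, member=None, encoding="utf-8"):
    """Yield rows (lists of strings) from a CSV file stored inside a zip archive.

    Hypothetical helper: `member` is the file name inside the archive;
    when omitted, the first non-directory entry is used.
    """
    with zipfile.ZipFile(zip_path) as archive:
        if member is None:
            # Skip directory entries, which end with a slash.
            names = [n for n in archive.namelist() if not n.endswith("/")]
            if not names:
                raise ValueError("archive contains no files")
            member = names[0]
        # archive.open() returns a binary stream that decompresses on the fly;
        # wrap it so the csv module can read it as text.
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            yield from csv.reader(text)


# Example usage, assuming 'Hello.txt.zip' was already copied with gsutil:
# for row in iter_zip_csv_rows('Hello.txt.zip', member='Hello.txt'):
#     print(row)
```

Because the rows are yielded one at a time, the file is never fully decompressed to disk. If you still want a DataFrame, the stream returned by `archive.open(...)` is a file-like object, so it should also be possible to pass it to `pd.read_csv` instead of `csv.reader`.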
Anthonios Partheniou