I have a Jupyter notebook accessing BigQuery using pandas as the vehicle:
df = pd.io.gbq.read_gbq(query, project_id='xxxxxxx-xxxx')
This works fine from my local machine (great, in fact!), but when I load the same notebook into Cloud DataLab I get:
DistributionNotFound: google-api-python-client
This seems rather disappointing! I believe the module should be installed along with pandas, but somehow Google is not including it? For a number of reasons it would be much preferable not to have to change the code between what we develop on our local machines and what runs in Cloud DataLab; in particular, we heavily parameterize the data access...
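To make that concrete, here is a hypothetical sketch of the pattern (the helper name and defaults are invented for this example):

import pandas as pd

# Hypothetical helper illustrating how we parameterize data access;
# the function name and defaults are made up for this example.
def load_frame(query, project_id='xxxxxxx-xxxx'):
    return pd.io.gbq.read_gbq(query, project_id=project_id)

df = load_frame('SELECT word FROM [publicdata:samples.shakespeare] LIMIT 10')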
OK, so I ran:
!pip install --upgrade google-api-python-client
Now when I run the notebook I get an auth prompt that I cannot complete, since DataLab is running on a remote machine:
Your browser has been opened to visit:
>>> Browser string>>>>
If your browser is on a different machine then exit and re-run this
application with the command-line parameter
--noauth_local_webserver
I don't see an obvious answer to this.
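The closest thing I can find, when the flow object is under my control, is oauth2client's tools.run_flow with that flag (a sketch with placeholder credentials; note that pandas' internal auth doesn't expose its flow, so this only helps standalone auth code):

import argparse
from oauth2client import tools
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.file import Storage

# Placeholders: real values come from the project's API console.
flow = OAuth2WebServerFlow(client_id='<client-id>',
                           client_secret='<client-secret>',
                           scope='https://www.googleapis.com/auth/bigquery')

storage = Storage('bigquery_credentials.dat')

# Parse oauth2client's standard flags with --noauth_local_webserver set, so
# run_flow prints a URL and asks for a code instead of opening a browser.
flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args(
    ['--noauth_local_webserver'])
credentials = tools.run_flow(flow, storage, flags)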
I tried the code suggested below by @Anthonios Partheniou from within the same notebook (executing it in a cell) after updating google-api-python-client in the notebook, and I got the following traceback:
TypeError Traceback (most recent call last)
<ipython-input-3-038366843e56> in <module>()
5 scope='https://www.googleapis.com/auth/bigquery',
6 redirect_uri='urn:ietf:wg:oauth:2.0:oob')
----> 7 storage = Storage('bigquery_credentials.dat')
8 authorize_url = flow.step1_get_authorize_url()
9 print 'Go to the following link in your browser: ' + authorize_url
/usr/local/lib/python2.7/dist-packages/oauth2client/file.pyc in __init__(self, filename)
37
38 def __init__(self, filename):
---> 39 super(Storage, self).__init__(lock=threading.Lock())
40 self._filename = filename
41
TypeError: object.__init__() takes no parameters
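From what I can tell, the TypeError points to a version mismatch within oauth2client itself: pip list below shows oauth2client 1.4.12, while the file.py in the traceback expects a newer client.Storage that accepts a lock argument. Upgrading it (and restarting the kernel) seems worth a try. Here is the flow as I reconstruct it from the traceback; the client id/secret placeholders and the raw_input step are my assumptions:

# Assumes oauth2client has been upgraded first:
# !pip install --upgrade oauth2client   (then restart the kernel)
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.file import Storage

flow = OAuth2WebServerFlow(
    client_id='<client-id>',          # placeholder: from the API console
    client_secret='<client-secret>',  # placeholder
    scope='https://www.googleapis.com/auth/bigquery',
    redirect_uri='urn:ietf:wg:oauth:2.0:oob')  # out-of-band: paste code by hand

storage = Storage('bigquery_credentials.dat')
authorize_url = flow.step1_get_authorize_url()
print 'Go to the following link in your browser: ' + authorize_url
code = raw_input('Enter verification code: ').strip()
credentials = flow.step2_exchange(code)
storage.put(credentials)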
He mentions the need to execute the notebook from the same folder, yet the only way I know of to execute a DataLab notebook is via the repo?
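If the same-folder requirement is only about where Storage writes the credentials file, an absolute path should sidestep it (my assumption; Storage resolves relative paths against the notebook's working directory):

import os
from oauth2client.file import Storage

# Pin the credentials file to the home directory rather than the notebook's cwd.
storage = Storage(os.path.join(os.path.expanduser('~'),
                               'bigquery_credentials.dat'))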
While the new Jupyter DataLab module is a possible alternative (sketched below), the ability to use the full pandas BigQuery interface unchanged on both local and DataLab instances would be hugely helpful! So I'm crossing my fingers for a solution!
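For comparison, the DataLab-native route would look roughly like this (a sketch; the gcp.bigquery module name and the to_dataframe call are based on the DataLab docs of the time):

# DataLab-native alternative: returns a pandas DataFrame, but the code no
# longer matches what we run locally.
import gcp.bigquery as bq

df = bq.Query(query).to_dataframe()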
pip installed:
GCPDataLab 0.1.0
GCPData 0.1.0
wheel 0.29.0
tensorflow 0.6.0
protobuf 3.0.0a3
oauth2client 1.4.12
futures 3.0.3
pexpect 4.0.1
terminado 0.6
pyasn1 0.1.9
jsonschema 2.5.1
mistune 0.7.2
statsmodels 0.6.1
path.py 8.1.2
ipython 4.1.2
nose 1.3.7
MarkupSafe 0.23
py-dateutil 2.2
pyparsing 2.1.1
pickleshare 0.6
pandas 0.18.0
singledispatch 3.4.0.3
PyYAML 3.11
nbformat 4.0.1
certifi 2016.2.28
notebook 4.0.2
cycler 0.10.0
scipy 0.17.0
ipython-genutils 0.1.0
pyasn1-modules 0.0.8
functools32 3.2.3-2
ipykernel 4.3.1
pandocfilters 1.2.4
decorator 4.0.9
jupyter-core 4.1.0
rsa 3.4.2
mock 1.3.0
httplib2 0.9.2
pytz 2016.3
sympy 0.7.6
numpy 1.11.0
seaborn 0.6.0
pbr 1.8.1
backports.ssl-match-hostname 3.5.0.1
ggplot 0.6.5
simplegeneric 0.8.1
ptyprocess 0.5.1
funcsigs 0.4
scikit-learn 0.16.1
traitlets 4.2.1
jupyter-client 4.2.2
nbconvert 4.1.0
matplotlib 1.5.1
patsy 0.4.1
tornado 4.3
python-dateutil 2.5.2
Jinja2 2.8
backports-abc 0.4
brewer2mpl 1.4.1
Pygments 2.1.3