I would like to load a large amount of gzip-compressed (.gz) data, and I don't know how to handle it. My dataset is the pageviews data from Wikipedia.
My goal is to compute basic statistical measures in order to analyse it.
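To make the goal concrete, this is roughly what I have in mind for a single file; it is only a minimal sketch, not something I know to be the right approach. It assumes each line of the decompressed file is space-separated as `domain_code page_title count_views total_response_size`, and the function names `iter_pageviews` and `basic_statistics` are my own placeholders. It uses only the standard library (`gzip`, `statistics`).

```python
import gzip
import statistics


def iter_pageviews(path):
    """Yield (domain_code, page_title, view_count) tuples from a .gz file."""
    # Text mode ("rt") decompresses on the fly, so the whole file is
    # never held in memory at once.
    with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            parts = line.split()
            # Assumed layout: domain_code page_title count_views total_response_size
            if len(parts) < 3 or not parts[2].isdigit():
                continue  # skip lines that do not match the assumed layout
            yield parts[0], parts[1], int(parts[2])


def basic_statistics(path):
    """Return simple summary statistics of the view counts in one file."""
    # Only the integer counts are kept; this is fine for one file,
    # but would not scale to the whole dataset.
    counts = [views for _, _, views in iter_pageviews(path)]
    if not counts:
        return None
    return {
        "rows": len(counts),
        "total_views": sum(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }
```

Calling `basic_statistics("pageviews-sample.gz")` (a hypothetical file name) would return a dictionary of the measures for that one file. Keeping every count in a list is what makes the median easy here, and it is also why I doubt this works locally for the full dataset.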
I found this article, which uses the same dataset, but I don't know how to load the dataset using the Python script shown in step 1.
I assume that analysing such a large dataset on a local computer is not the right approach, hence the idea of using Google Cloud.