I want to convert a Parquet file to the Hyper file format using Python. Tableau has a sample for this: https://github.com/tableau/hyper-api-samples/blob/main/Community-Supported/parquet-to-hyper/create_hyper_file_from_parquet.py. However, that sample assumes the Parquet schema is known beforehand. What should I do if I want it to work for any Parquet file, irrespective of its schema? About me: I mostly work in analytics and data science with Python, but I took on this project to make some files accessible to Tableau. Thank you in advance, and please let me know if you need any more information.
2 Answers
2
If you do not want to define a schema when creating a .hyper file from a Parquet file, you can use the `CREATE TABLE ... AS` command instead of the `COPY` command. Hyper then reads the column names and types from the Parquet file itself, so you can skip the schema and table definition like this:
    from tableauhyperapi import Connection, CreateMode, HyperProcess, Telemetry

    # Path of the .hyper file to create (example value).
    hyper_database_path = "products.hyper"

    # Start the Hyper process.
    with HyperProcess(telemetry=Telemetry.SEND_USAGE_DATA_TO_TABLEAU) as hyper:
        # Open a connection to the Hyper process. This will also create the new Hyper file.
        # The `CREATE_AND_REPLACE` mode causes the file to be replaced if it
        # already exists.
        with Connection(endpoint=hyper.endpoint,
                        database=hyper_database_path,
                        create_mode=CreateMode.CREATE_AND_REPLACE) as connection:
            # The table schema is taken from the Parquet file; no TableDefinition is needed.
            connection.execute_command("CREATE TABLE products AS (SELECT * FROM external('products.parquet'))")
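
To make this work for any Parquet file, the same statement can be built from the file path instead of being hard-coded. The following is a minimal sketch, not part of the Tableau sample: the helper names `quote_identifier`, `quote_literal`, `build_ctas_sql`, and `parquet_to_hyper` are hypothetical, and defaulting the table name to the file's base name is an assumption made here for convenience. The quoting helpers simply double embedded quote characters; in real code, the `escape_name` and `escape_string_literal` functions that ship with `tableauhyperapi` are probably the safer choice.

```python
from pathlib import Path
from typing import Optional


def quote_identifier(name: str) -> str:
    """Wrap a SQL identifier in double quotes, doubling any embedded ones."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Wrap a SQL string literal in single quotes, doubling any embedded ones."""
    return "'" + value.replace("'", "''") + "'"


def build_ctas_sql(parquet_path: str, table_name: Optional[str] = None) -> str:
    """Build a CREATE TABLE ... AS statement that reads from a Parquet file.

    Assumption: when no table name is given, the file's base name is used.
    """
    table = table_name or Path(parquet_path).stem
    return (
        f"CREATE TABLE {quote_identifier(table)} AS "
        f"(SELECT * FROM external({quote_literal(str(parquet_path))}))"
    )


def parquet_to_hyper(parquet_path: str, hyper_path: str,
                     table_name: Optional[str] = None) -> None:
    """Create (or replace) a .hyper file containing the Parquet file's rows."""
    # Imported here so the SQL helpers above work without Hyper installed.
    from tableauhyperapi import Connection, CreateMode, HyperProcess, Telemetry

    with HyperProcess(telemetry=Telemetry.SEND_USAGE_DATA_TO_TABLEAU) as hyper:
        with Connection(endpoint=hyper.endpoint,
                        database=hyper_path,
                        create_mode=CreateMode.CREATE_AND_REPLACE) as connection:
            # No schema is declared; it comes from the Parquet file.
            connection.execute_command(build_ctas_sql(parquet_path, table_name))
```

A call such as `parquet_to_hyper("data/products.parquet", "products.hyper")` would then create a table named `products` without any schema being declared in Python.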

The Singularity
- Does this work for a regular dataframe? Is it more performant, since it seems to be a bulk operation? – mike01010 Aug 30 '23 at 01:28
- @mike01010 When you say `regular` dataframe, are you using pandas or PySpark? – The Singularity Aug 31 '23 at 06:46
- I'm referring to just a pandas dataframe. – mike01010 Aug 31 '23 at 20:03
- @mike01010 For pandas dataframes you can use PanTab. – The Singularity Sep 03 '23 at 05:52
0
From Tableau's official GitHub: https://github.com/tableau/hyper-api-samples/blob/main/Community-Supported/parquet-to-hyper/create_hyper_file_from_parquet.py

Joseph Luker
- Hi, the GitHub sample above works, but only if we know the schema of the Parquet file beforehand. What do we do if we want it to work for any Parquet file, without knowing its schema? – Ashish Padhi Aug 01 '22 at 07:33