We are trying to read huge files from Palantir Foundry, but we cannot extract the data through either a Tableau live connection or Power BI, so we are trying to connect to Palantir from Python. Can anyone suggest another way to extract huge files from Palantir, or explain how to connect to Palantir from custom Python on my local system?
I tried to find references on the internet, but I always ended up with PySpark-style code that runs inside Palantir itself. I found the Python code below to extract Palantir dataframes, but I am facing issues with it as well: first an error code 400, then "Max retries exceeded with url: /foundry-data". Our Palantir base URL looks like https://XXXX.palantirfoundry.com/. When I used our company's base URL, I got a 405 error. Can someone help?
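One thing worth ruling out before blaming the API: a base URL that ends with a slash (like the one above) gets concatenated as f"{base_url}/foundry-data-proxy/...", producing a double slash that some servers reject or redirect, which can surface as a 405. A tiny normalizer (the helper name is mine, just a sketch) avoids that:

```python
def normalize_base_url(base_url: str) -> str:
    # Strip trailing slashes so f"{base_url}/foundry-data-proxy/..."
    # does not produce "https://host.com//foundry-data-proxy/...".
    return base_url.rstrip('/')
```

A 400 usually means the request body or parameters were malformed (printing response.text before raise_for_status often shows the server's explanation), while a 405 means the method or path was wrong for that endpoint.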
import requests
import pandas as pd


def query_foundry_sql(query, token, branch='master', base_url='https://foundry-instance.com') -> (list, list):
    """
    Queries the dataproxy query API with Spark SQL.

    Example: query_foundry_sql("SELECT * FROM `/path/to/dataset` LIMIT 5000", "ey...")

    Args:
        query: the SQL query
        branch: the branch of the dataset / query

    Returns:
        (columns, data) tuple. data contains the data matrix, columns the list of columns.
        Can be converted to a pandas DataFrame:
            pd.DataFrame(data, columns=columns)
    """
    response = requests.post(f"{base_url}/foundry-data-proxy/api/dataproxy/queryWithFallbacks",
                             headers={'Authorization': f'Bearer {token}'},
                             params={'fallbackBranchIds': [branch]},
                             json={'query': query})
    response.raise_for_status()
    json = response.json()
    columns = [e['name'] for e in json['foundrySchema']['fieldSchemaList']]
    return columns, json['rows']
columns, data = query_foundry_sql("SELECT * FROM `/Global/Foundry Operations/Foundry Support/iris` LIMIT 5000",
                                  "ey...",
                                  base_url="https://foundry-instance.com")
df = pd.DataFrame(data=data, columns=columns)
df.head(5)
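If a single 5000-row query works but the full dataset is too large for one request, one workaround is to page through it. The sketch below is deliberately generic: it takes any query function with the same (sql, token) -> (columns, rows) shape as query_foundry_sql above and pages with LIMIT/OFFSET. Whether the Foundry SQL backend accepts OFFSET is an assumption you would need to verify against your instance (older Spark SQL versions do not support it); the function name fetch_all_pages is mine.

```python
import pandas as pd


def fetch_all_pages(query_fn, dataset_path, token, page_size=5000, max_pages=1000):
    """Page through a dataset with LIMIT/OFFSET and return one DataFrame.

    query_fn is expected to behave like query_foundry_sql above:
    it takes (sql, token) and returns a (columns, rows) tuple.
    """
    frames = []
    for page in range(max_pages):
        sql = (f"SELECT * FROM `{dataset_path}` "
               f"LIMIT {page_size} OFFSET {page * page_size}")
        columns, rows = query_fn(sql, token)
        if not rows:
            break  # no rows left
        frames.append(pd.DataFrame(rows, columns=columns))
        if len(rows) < page_size:
            break  # last, partial page
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

Note that without an ORDER BY, SQL gives no ordering guarantee across pages, so for a strictly correct extraction you would order by a unique key column.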