Questions tagged [palantir-foundry]

Palantir Foundry is a web-based data analytics and decision modeling SaaS platform. Use this tag for questions about building your own models in Foundry using Python, R, or SQL or working with the Foundry API.

731 questions
2
votes
1 answer

Foundry - Using transforms verbs in-memory datastore to test incremental transforms

I'm using the in-memory datastore approach to test an incremental transform and I'm receiving the error below. Any idea what I might be doing wrong? def test_transformdata_incr(spark_session): df_input = spark_session.createDataFrame([ (1,…
S J
  • 21
  • 1
2
votes
1 answer

How to add column description with transform_df in Foundry?

Normally, I can add a column description with transform like this: from transforms.api import Input, Output, transform from utils import COLUMN_DESCRIPTIONS @transform( output=Output("/Shared/output"), …
huy
  • 1,648
  • 3
  • 14
  • 40
2
votes
1 answer

How to find the list of valid arguments for phonograph aggregations?

Phonograph2 is based on Elasticsearch but has a few differences that can sometimes throw the following error message: errorCode: INVALID_ARGUMENT errorName: Conjure:UnprocessableEntity How to find out the set of valid properties that can be used…
2
votes
1 answer

Is it possible to specify the name of the output file in a Foundry transform?

I have a PySpark transform in Palantir Foundry that's outputting to a csv file for export into other systems. Currently, using the write_dataframe method the name of the file looks like…
hjones
  • 168
  • 1
  • 8
2
votes
1 answer

Best approach for geospatial indexes in Palantir Foundry

What is the recommended approach for building a pipeline that needs to find a point contained in a polygon (shape) in Palantir Foundry? In the past, this has been pretty difficult in Spark. GeoSpark has been pretty popular, but can still lag. If…
2
votes
2 answers

How to test a transformation in Palantir Foundry?

We are trying to create a test function for the whole transformation. import os from transforms.verbs.testing.TransformRunner import TransformRunner from transforms.api import Pipeline from .myproject.datasets import my_transform # This assumes your test…
2
votes
1 answer

How do I build a large incremental output dataset from an existing large incremental input dataset in Foundry?

I have an 80TB date-partitioned dataset in Palantir Foundry, which ingests 300-450GB of data in an incremental Append transaction every 3 hours. I want to create an incremental transform using this as an input. However, the dataset is too large to…
2
votes
1 answer

Equivalency Foundry profiles to AWS Glue worker types

When I am working with Foundry, there are some options to configure my job profile: num_executors, driver_memory, executor_memory, etc. I am wondering how these profiles map to the worker types in AWS. If I use AWS Glue Studio, I…
2
votes
3 answers

Slate - creating a TypeScript Function to filter and return an object with max property value

Take a look at the NEW RELATED QUESTION. I want to filter an object set to retrieve the largest value in a column. I do not know how to solve it. I tried max etc., but I think it is a skill problem. Here is my code so far: @Function() public…
Marco
  • 91
  • 4
2
votes
2 answers

Slate - Queries - Query on Query?

How can I create a query on an existing query? I tried multiple versions. SELECT * FROM {{q_....}} does not work.
Marco
  • 91
  • 4
2
votes
1 answer

How to drop duplicates in Workshop Object table

In a Workshop app, can duplicates (based on some specific columns) be eliminated from an Object Table? If so, how?
Jresearcher
  • 297
  • 3
  • 13
2
votes
1 answer

Spreadsheet uploading appropriate for business/end-users in Foundry

Does Foundry have native support for uploading and appending spreadsheets (identical schema) to one dataset, with an interface appropriate for business/end-users? I'm evaluating a user workflow that involves receiving tabular spreadsheets ad-hoc and…
L99
  • 169
  • 9
2
votes
3 answers

How to union several datasets with the same schema in Palantir Foundry?

I have several datasets I want to union in Palantir Foundry. I know what the datasets are ahead of time. The schema of all the datasets is the same (i.e. they have the same column names, and column types). What is the best way to combine (union)…
domdomegg
  • 1,498
  • 11
  • 20
2
votes
1 answer

Finding number of rows in a dataframe when previewing a large dataset (Foundry Platform)

Is there a way in a Foundry Code Repository to print the shape of a DataFrame, as one can with df.shape in pandas? I am interested in getting the correct number of rows in the dataset. I am using this function to print the shape but…
2
votes
1 answer

In Foundry, how can I Hive partition with only 1 parquet file per value?

I'm looking to improve the performance of running filtering logic. To accomplish this, the idea is to use Hive partitioning, setting the partition column to a column in the dataset (called splittable_column). I checked and the cardinality…