Highest Voted 'iceberg' Questions

2 votes

3 answers

How to actually delete files in Iceberg

I know that in Apache Iceberg I can set limits on the number and age of snapshots, and that "deleting" data from the table does not remove the underlying data; it simply masks or deletes tracking information. I would like to actually delete the…

iceberg

asked Oct 20 '22 at 14:58

zachd1_618

4,210
6
34
47

2 votes

0 answers

Registering Iceberg Day Partition Transform UDFs in Spark

I am looking to apply Iceberg's hidden day and year partitioning to a DataFrame in the same way as we apply bucket partitioning (https://iceberg.apache.org/docs/latest/spark-writes/). Iceberg provides IcebergSpark.registerBucketUDF; I'm…

apache-spark user-defined-functions iceberg

asked Oct 18 '22 at 17:29

zachd1_618

4,210
6
34
47

2 votes

2 answers

Getting an error when querying an Iceberg table via the Spark Thrift Server using the Beeline client

I am trying to query an Iceberg table (an external table with data in S3 and metadata in the Hive Metastore) using the Spark Thrift Server that ships with Spark. I am able to query non-Iceberg tables, but when I query an Iceberg table I get the error below. Can we…

apache-spark spark-thriftserver iceberg

asked Jun 14 '22 at 16:23

Bill Goldberg

1,699
5
26
50

2 votes

2 answers

What is the difference between SparkSessionCatalog and SparkCatalog in Iceberg?

As the title says. The question arose because I connect to spark-sql with an Iceberg catalog like this: bin/spark-sql \ --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \ --conf…

apache-spark iceberg

asked May 16 '22 at 07:38

ElapsedSoul

725
6
18

2 votes

2 answers

Apache Iceberg table format to ADLS / azure data lake

I am trying to find an integration that lets me use the Iceberg table format on ADLS / Azure Data Lake to perform CRUD operations. Is it possible to use it on Azure without a separate compute engine such as Spark? I think AWS S3 supports this use case. Any…

amazon-s3 azure-data-lake azure-data-lake-gen2 trino iceberg

asked Jan 19 '22 at 11:42

John

35
5

2 votes

1 answer

Writing multiple partition specs to Apache Iceberg table

I would like to write an Iceberg table with a partition spec different from the table's default, so that when I run data compaction the data is compacted according to the default spec (as is possible with the write-format config). For…

apache-spark apache-spark-sql iceberg

asked Jan 02 '22 at 13:05

Shimon Steinitz

21
3

2 votes

0 answers

Athena Iceberg Slow On Empty Table

I am looking at the new Iceberg Tables for AWS Athena. I'm hoping to move my data lake over to Iceberg so that I can significantly reduce the complexity of table partition management and hopefully get some better performance. I created a test…

amazon-athena iceberg

asked Dec 27 '21 at 14:30

micah

7,596
10
49
90

2 votes

1 answer

Unable to write data in table by Apache Iceberg using Spark

I am new to Apache Iceberg. I want to perform read and write operations using Apache Iceberg. I am using Spark 3.0.0. Code: System.setProperty("hadoop.home.dir","C:\\hadoop" ) val conf = new SparkConf() …

apache-spark apache-spark-sql iceberg

asked Jul 11 '21 at 14:17

Santlal J. Gupta

91
8

2 votes

3 answers

How to execute a Spark SQL merge statement on an Iceberg table in Databricks?

I'm trying to get Apache Iceberg set up in our Databricks environment and running into an error when executing a MERGE statement in Spark SQL. This code: CREATE TABLE iceberg.db.table (id bigint, data string) USING iceberg; INSERT INTO…

azure apache-spark apache-spark-sql databricks iceberg

asked Jun 08 '21 at 19:22

Aaron Kub

21
2

2 votes

0 answers

Flink's Hive streaming vs Iceberg/Hudi/Delta

There are some open-source data lake solutions that support CRUD, ACID, and incremental pull, such as Iceberg, Hudi, and Delta. I think they already do what Flink's Hive streaming aims to do, and do it better. So I would ask what the real power of Flink's Hive…

apache-flink delta-lake apache-hudi iceberg

asked Nov 28 '20 at 05:59

Tom

5,848
12
44
104

1 vote

1 answer

Write to Iceberg/Glue table from local PySpark session

I want to be able to read from and write to an Iceberg table hosted on AWS Glue, from my local machine, using Python. I have already: created an Iceberg table and registered it on AWS Glue; populated the Iceberg table with limited data using…

apache-spark pyspark aws-glue iceberg apache-iceberg

asked Aug 10 '23 at 10:29

Luiz Tauffer

463
6
17

1 vote

1 answer

Creating an Iceberg Table on S3 Using PyIceberg and Glue Catalog

I am attempting to create an Iceberg table on S3 using the Glue Catalog and the PyIceberg library. My goal is to define a schema and partition spec, and then create a table using PyIceberg. However, despite multiple attempts, I haven't…

python boto3 aws-glue iceberg apache-iceberg

asked Aug 08 '23 at 03:26

Lew

11
4

1 vote

0 answers

Dataproc Spark job sometimes gets java.lang.ClassNotFoundException for Iceberg jar file

The Dataproc cluster creation and Spark job submission are scheduled every hour, and the cluster is deleted after the job completes. Sometimes the job fails due to java.lang.ClassNotFoundException:…

google-cloud-dataproc iceberg

asked Jul 30 '23 at 23:59

suisen

53
9

1 vote

0 answers

Unable to install Iceberg extensions for PySpark and use MERGE INTO

I have a Python virtual environment to which I have added pyspark v3.4.1. I ran the following command to install the Iceberg package: spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.3.0\ --conf…

apache-spark pyspark iceberg

asked Jul 23 '23 at 12:59

Sharan Kumar

139
1
2
13

1 vote

0 answers

Spark job very slow at sort stage for Iceberg insert operation with local sort

I am inserting data from one Iceberg table into another Iceberg table that has a local sort defined on the destination table: alter table schema1.test_iceberg_ordered1 WRITE DISTRIBUTED BY PARTITION LOCALLY ORDERED BY example_event_cd NULLS LAST. However, if I do…

apache-spark pyspark iceberg

asked Jun 23 '23 at 12:27

Atif

2,011
9
23
