Questions tagged [spark-jdbc]

78 questions
2 votes • 2 answers

Check whether a table exists with Spark JDBC

I am reading some data into a data frame from Microsoft SQL Server using Spark JDBC. When the table does not exist (for example, it was dropped accidentally) I get an exception: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object…
Cassie • 2,941 • 8 • 44 • 92
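
A minimal sketch of one way to avoid the exception: probe the catalog over plain JDBC before handing the read to Spark, so a dropped table surfaces as a Boolean instead of a mid-job SQLServerException. All connection details below are placeholders, not the asker's values.

```scala
import java.sql.DriverManager

// Probe DatabaseMetaData for the table before calling spark.read.
def tableExists(url: String, user: String, pass: String, table: String): Boolean = {
  val conn = DriverManager.getConnection(url, user, pass)
  try {
    // getTables returns one row per matching table; an empty result means it is gone
    conn.getMetaData.getTables(null, null, table, Array("TABLE")).next()
  } finally conn.close()
}
```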
2 votes • 1 answer

Spark JDBC read ends up in one partition only

I have the below code snippet for reading data from a PostgreSQL table, from which I am pulling all available data, i.e. select * from table_name: jdbcDF = spark.read \ .format("jdbc") \ .option("url", self.var_dict['jdbc_url']) \ …
Abhi • 163 • 2 • 14
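
For context, the usual fix discussed under this question: without partitionColumn, lowerBound, upperBound, and numPartitions, the JDBC source issues a single query, so everything lands in one partition. A hedged sketch in Scala (the question itself is PySpark; URL, credentials, and the id column are placeholders):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("partitioned-read").getOrCreate()

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://host:5432/db") // placeholder
  .option("dbtable", "table_name")
  .option("user", "user")
  .option("password", "pass")
  .option("partitionColumn", "id")  // any roughly evenly distributed numeric column
  .option("lowerBound", "1")        // min(id), queried beforehand if unknown
  .option("upperBound", "1000000")  // max(id)
  .option("numPartitions", "8")     // 8 concurrent queries, 8 output partitions
  .load()
```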
2 votes • 1 answer

How to specify Trust store and trust store type for Spark JDBC connection

I am new to Spark and we are currently using the Spark Java API to create ORC files from an Oracle database. I was able to configure the connection with sqlContext.read().jdbc(url, table, props). However, I couldn't find any way in the properties to specify…
Sai Kumar • 112 • 2 • 11
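
A sketch of one plausible route, assuming the Oracle thin driver honors javax.net.ssl.* connection properties (true for recent ojdbc releases; verify for your version). Paths, credentials, and the service URL are placeholders:

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("oracle-ssl").getOrCreate()

// Every entry in props is forwarded verbatim to the JDBC driver.
val props = new Properties()
props.setProperty("user", "scott")
props.setProperty("password", "tiger")
props.setProperty("javax.net.ssl.trustStore", "/path/to/truststore.jks")
props.setProperty("javax.net.ssl.trustStoreType", "JKS")
props.setProperty("javax.net.ssl.trustStorePassword", "changeit")

val df = spark.read.jdbc("jdbc:oracle:thin:@//host:2484/service", "MY_TABLE", props)
```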
1 vote • 0 answers

Creating a partitioned table in Postgres via Spark JDBC write

I want to write a dataframe to a Postgres table via the Spark JDBC connector. The table I am writing to in Postgres needs to be partitioned by a certain column. This is currently how I am writing it. I am running Spark 3.2.3 and Postgres 11: val username…
sanchit08 • 119 • 1 • 7
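
One documented hook worth knowing here: the JDBC writer's createTableOptions option is appended verbatim to the CREATE TABLE statement Spark generates. A sketch, assuming a dataframe df already in scope (URL, table, and partition column are placeholders; in Postgres the child partitions still have to be created separately before rows can land):

```scala
df.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://host:5432/db") // placeholder
  .option("dbtable", "events")                     // placeholder
  .option("user", "user")
  .option("password", "pass")
  // appended to Spark's generated CREATE TABLE statement
  .option("createTableOptions", "PARTITION BY RANGE (event_date)")
  .mode("overwrite") // Spark creates the table when it does not exist
  .save()
```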
1 vote • 0 answers

How to write a Spark DataFrame into multiple JDBC tables based on a column

I'm working with a batch Spark pipeline written in Scala (v2.4). I would like to save a dataframe into a PostgreSQL database. However, instead of saving all rows into a single table in the database, I want to save them to multiple tables based on…
IllSc • 1,419 • 3 • 17 • 24
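
The JDBC sink writes to exactly one table per save, so the common workaround is one filtered write per distinct key. A sketch, with df and the category column as placeholders (cache df first if the key set is large, since each key triggers a scan):

```scala
import org.apache.spark.sql.functions.col

val keys = df.select("category").distinct().collect().map(_.getString(0))

keys.foreach { k =>
  df.filter(col("category") === k)
    .write
    .format("jdbc")
    .option("url", "jdbc:postgresql://host:5432/db") // placeholder
    .option("dbtable", s"table_$k")                  // one target table per key
    .option("user", "user")
    .option("password", "pass")
    .mode("append")
    .save()
}
```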
1 vote • 0 answers

Override JdbcUtils `saveTable` method

How can I extend the spark-jdbc sink and override the saveTable method? I want to use one transaction for the entire dataframe batch instead of separate transactions per…
StarScream • 223 • 2 • 12
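
JdbcUtils.saveTable is private to Spark, so rather than overriding it, the workaround usually suggested is to drive JDBC by hand from a single partition and commit once. A sketch, assuming a dataframe df in scope; connection details and the two-column target table are placeholders, and coalesce(1) deliberately serializes the write:

```scala
import java.sql.DriverManager
import org.apache.spark.sql.Row

val (url, user, pass) = ("jdbc:postgresql://host:5432/db", "user", "pass") // placeholders

df.coalesce(1).foreachPartition { (rows: Iterator[Row]) =>
  val conn = DriverManager.getConnection(url, user, pass)
  conn.setAutoCommit(false)
  val stmt = conn.prepareStatement("INSERT INTO target (a, b) VALUES (?, ?)")
  try {
    rows.foreach { r =>
      stmt.setString(1, r.getString(0))
      stmt.setInt(2, r.getInt(1))
      stmt.addBatch()
    }
    stmt.executeBatch()
    conn.commit() // one commit covering the whole dataframe
  } catch {
    case e: Exception => conn.rollback(); throw e
  } finally conn.close()
}
```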
1 vote • 1 answer

Schema capitalization (uppercase) problem when reading with Spark

Using Scala here: val df = spark.read.format("jdbc"). option("url", ""). option("dbtable", "UPPERCASE_SCHEMA.table_name"). option("user", "postgres"). option("password", ""). option("numPartitions", 50). …
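
The likely culprit: Postgres folds unquoted identifiers to lower case, so an upper-case schema generally needs explicit double quotes inside dbtable. A sketch, assuming a SparkSession named spark, with placeholder URL and credentials:

```scala
val df = spark.read.format("jdbc")
  .option("url", "jdbc:postgresql://host:5432/db")      // placeholder
  .option("dbtable", "\"UPPERCASE_SCHEMA\".table_name") // schema kept upper case by quoting
  .option("user", "postgres")
  .option("password", "pass")
  .load()
```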
1 vote • 0 answers

How to get Spark metrics for the Spark JDBC writer

Versions: Scala 2.11, Spark 2.4.4. To implement this, I have created my own implementation of SparkListener and added it while creating the Spark session. class SparkMetricListener extends SparkListener { ... override def onTaskEnd .. { .. //use…
VimalK • 65 • 1 • 8
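
A minimal version of the listener the asker describes, assuming a SparkSession named spark. One caveat: whether the v1 JDBC sink populates outputMetrics varies by Spark version, so treat recordsWritten here as something to verify, not a guarantee:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

class SparkMetricListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // per-task output counters; may read zero if the sink does not report them
    val out = taskEnd.taskMetrics.outputMetrics
    println(s"task ${taskEnd.taskInfo.taskId}: ${out.recordsWritten} records written")
  }
}

spark.sparkContext.addSparkListener(new SparkMetricListener)
```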
1 vote • 1 answer

Spark SQL: INSERT statement with JDBC does not support default values

I am trying to read/write data from other databases using JDBC, just following the doc https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html. But I found Spark SQL does not work well with DEFAULT values or AUTO_INCREMENT: CREATE TEMPORARY…
shiyuhang • 31 • 5
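
One workaround that sidesteps the limitation: the DataFrame writer only inserts the columns present in the dataframe, so dropping the auto-increment/DEFAULT column lets the database fill it in. A sketch with placeholder names, assuming a dataframe df in scope:

```scala
df.drop("id") // placeholder: the AUTO_INCREMENT / DEFAULT column
  .write
  .format("jdbc")
  .option("url", "jdbc:mysql://host:3306/db") // placeholder
  .option("dbtable", "target")
  .option("user", "user")
  .option("password", "pass")
  .mode("append")
  .save()
```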
1 vote • 1 answer

Spark JDBC "batch size" effect on insert

I wanted to know what effect the batchsize option has on an insert operation using Spark JDBC. Does it mean a single bulk INSERT command, or a batch of INSERT commands that gets committed at the end? Could someone…
justlikethat • 329 • 2 • 12
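
Short answer, per the Spark docs: batchsize controls JDBC statement batching (addBatch/executeBatch), not a database-specific bulk loader, and the commit happens per partition rather than per batch. A sketch with placeholder URL and table, assuming a dataframe df in scope:

```scala
df.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://host:5432/db") // placeholder
  .option("dbtable", "target")
  .option("user", "user")
  .option("password", "pass")
  .option("batchsize", "10000") // rows per executeBatch call; default is 1000
  .mode("append")
  .save()
```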
1 vote • 0 answers

Issue when reading Teradata table via Apache Spark

I'm reading a Teradata table using Spark. Here is my code: spark.read.format("jdbc") .option("url", "jdbc:teradata://127.0.0.1/database=test, TMODE=TERA") .option("username", "test") .option("password", "test") …
Finkelson • 2,921 • 4 • 31 • 49
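
One thing worth checking against the snippet above: Spark's JDBC source documents the credential key as user, not username; unrecognized keys are passed through to the driver and may be silently ignored. A sketch with the documented keys, assuming a SparkSession named spark (the table name is hypothetical):

```scala
val df = spark.read.format("jdbc")
  .option("url", "jdbc:teradata://127.0.0.1/database=test, TMODE=TERA")
  .option("user", "test") // Spark expects "user", not "username"
  .option("password", "test")
  .option("dbtable", "test_table") // hypothetical table name
  .option("driver", "com.teradata.jdbc.TeraDriver")
  .load()
```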
1 vote • 1 answer

Spark JDBC UpperBound

jdbc(String url, String table, String columnName, long lowerBound, long upperBound, int numPartitions, …
Akhil • 63 • 5
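
The key point behind this question: lowerBound and upperBound only shape the generated partition predicates; they never filter rows. A sketch of what Spark produces for lowerBound = 1, upperBound = 100, numPartitions = 4, assuming a SparkSession named spark (URL and table are placeholders):

```scala
import java.util.Properties

val props = new Properties()
props.setProperty("user", "user")
props.setProperty("password", "pass")

// Approximate per-partition predicates Spark generates on column `id`:
//   partition 1: WHERE id < 26 OR id IS NULL
//   partition 2: WHERE id >= 26 AND id < 51
//   partition 3: WHERE id >= 51 AND id < 76
//   partition 4: WHERE id >= 76   <- also sweeps up rows beyond upperBound
val df = spark.read.jdbc(
  "jdbc:postgresql://host:5432/db", // placeholder
  "t",                              // placeholder table
  "id", 1L, 100L, 4, props)
```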
1 vote • 1 answer

How to register a JDBC Spark dialect in Python?

I am trying to read from a Databricks table. I have used the URL from a cluster in Databricks. I am getting this error: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int. After these statements: jdbcConnUrl=…
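
For reference, dialects are a JVM-side API; from PySpark the usual route is to compile something like the following into a jar and register it through the py4j gateway (spark._jvm). The URL prefix and quoting rule below are placeholders for whatever the Simba driver actually needs:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

object MyDialect extends JdbcDialect {
  // placeholder prefix; match it to your actual JDBC URL
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:spark:")
  override def quoteIdentifier(colName: String): String = s"`$colName`"
}

JdbcDialects.registerDialect(MyDialect)
```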
1 vote • 2 answers

Why does PostgreSQL say FATAL: sorry, too many clients already when I am nowhere close to the maximum connections?

I am working with an installation of PostgreSQL 11.2 that periodically complains in its system logs FATAL: sorry, too many clients already, despite being nowhere close to its configured limit of connections. This query: SELECT…
Eddie • 53,828 • 22 • 125 • 145
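
A Spark-side contribution to this symptom, relevant to the tag: a JDBC write opens one connection per partition, so connection bursts scale with partition count rather than with the number of client applications. The documented numPartitions option caps that (Spark coalesces down to it before writing). Sketch with placeholder details, assuming a dataframe df in scope:

```scala
df.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://host:5432/db") // placeholder
  .option("dbtable", "target")
  .option("user", "user")
  .option("password", "pass")
  .option("numPartitions", "8") // caps concurrent JDBC connections at 8
  .mode("append")
  .save()
```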
1 vote • 0 answers

How to connect from PySpark running on GCP to SQL Server without using Secure Sockets Layer?

I am trying to connect to a SQL Server database using PySpark as below: from pyspark.sql import SparkSession import traceback def connect_and_read(spark: SparkSession): url =…
Metadata • 2,127 • 9 • 56 • 127
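
A sketch of the usual knob, assuming the Microsoft mssql-jdbc driver: its encrypt URL property controls SSL, and driver versions 10.x and later default it to true, so skipping SSL takes an explicit encrypt=false. The question is PySpark; this Scala sketch uses the same option keys, and host, database, and credentials are placeholders:

```scala
val df = spark.read.format("jdbc")
  .option("url", "jdbc:sqlserver://host:1433;databaseName=db;encrypt=false")
  .option("dbtable", "dbo.my_table") // hypothetical table
  .option("user", "user")
  .option("password", "pass")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .load()
```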