Questions tagged [databricks-sql]

Questions about Databricks SQL

For questions about Databricks SQL, a serverless data warehouse on the Databricks Lakehouse Platform that lets you run all your SQL and BI applications at scale with improved performance, a unified governance model, open formats and APIs, and your tools of choice.

357 questions
1
vote
1 answer

Cast string to a timestamp in Databricks

When I try to cast a string column with cast(my_value as timestamp) (e.g. values such as 4/24/2020 14:43:54 or 12/5/2020 14:43:54) in Databricks SQL, I get the following error: [CAST_INVALID_INPUT] The value '4/24/2020 14:43:54' of the type "STRING"…
alxsbn
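The sample values in this question use a month/day/year layout rather than an ISO-style layout, which is likely why a plain cast rejects them. The following is a minimal Python sketch of the general idea of parsing with an explicit pattern; the helper name `parse_us_timestamp` is hypothetical. In Databricks SQL the analogous approach would presumably be `to_timestamp` with a pattern such as `'M/d/yyyy HH:mm:ss'`, but that is an assumption and is not confirmed by the excerpt.

```python
from datetime import datetime

# Pattern for month/day/year values such as '4/24/2020 14:43:54'.
# strptime accepts months and days without a leading zero.
US_PATTERN = "%m/%d/%Y %H:%M:%S"


def parse_us_timestamp(value: str) -> datetime:
    """Parse a month/day/year timestamp string (hypothetical helper)."""
    return datetime.strptime(value, US_PATTERN)


# The two sample values quoted in the question.
samples = ["4/24/2020 14:43:54", "12/5/2020 14:43:54"]
parsed = [parse_us_timestamp(s) for s in samples]

for original, ts in zip(samples, parsed):
    print(original, "->", ts.isoformat(sep=" "))
```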
1
vote
1 answer

Writing unit test cases for functions that use spark.sql

We have Unity Catalog configured in our Databricks environment. We have some functions that connect to tables using spark.sql(" the sql code") and retrieve data. We want to write test cases for those functions. We were able to mock…
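One common way to test such functions without a cluster is to pass the session in as an argument and replace it with a mock. The sketch below uses only `unittest.mock` from the standard library; the function `count_active_customers` and the table name `main.sales.customers` are hypothetical and are not taken from the question, whose own mocking approach is cut off in the excerpt.

```python
from unittest.mock import MagicMock


def count_active_customers(spark) -> int:
    """Hypothetical function under test: runs a query through spark.sql."""
    rows = spark.sql(
        "SELECT id FROM main.sales.customers WHERE active"
    ).collect()
    return len(rows)


def test_count_active_customers():
    # Stand-in for a SparkSession; no cluster or catalog is touched.
    fake_spark = MagicMock()
    # spark.sql(...) returns a fake DataFrame whose collect() yields two rows.
    fake_spark.sql.return_value.collect.return_value = [{"id": 1}, {"id": 2}]

    assert count_active_customers(fake_spark) == 2
    # The function should issue exactly one query.
    fake_spark.sql.assert_called_once()


test_count_active_customers()
```

Passing the session as a parameter, rather than reading a global `spark`, is the design choice that makes the substitution possible here.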
1
vote
1 answer

Databricks CREATE VIEW equivalent in PySpark

Can someone let me know what the equivalent of the following CREATE VIEW in Databricks SQL is in PySpark? CREATE OR REPLACE VIEW myview as select last_day(add_months(current_date(),-1)) Can someone let me know what the equivalent of the above is in…
Patterson
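For reference, the expression in the view, last_day(add_months(current_date(),-1)), evaluates to the last day of the previous month. The plain-Python sketch below reproduces only that date arithmetic so the expected value is easy to check; it is not PySpark and does not create a view, and the function name is hypothetical.

```python
from datetime import date, timedelta


def last_day_of_previous_month(today: date) -> date:
    """Return the last calendar day of the month before `today`."""
    # The day before the first of the current month is the
    # last day of the previous month.
    return today.replace(day=1) - timedelta(days=1)


print(last_day_of_previous_month(date.today()))
```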
1
vote
1 answer

Databricks: SQL Pivot does not work - but Python does

I am trying to pivot a SQL table in Databricks on the FeatureName column using the FeatureDescription column as the value. But somehow the SQL query that I have resolves to null values being populated across the table. Here is the original…
Ronak Vachhani
1
vote
1 answer

Why am I not able to see the reserved catalog under the 'show catalogs' command in a Databricks notebook, but able to see it in the Databricks SQL editor?

A few days ago we were able to see the reserved catalog (spark_catalog) in a Databricks notebook. Not sure what happened, but our process is suddenly failing because the reserved catalog name is missing from the 'show catalogs' output in notebooks. And when I try to…
1
vote
3 answers

Spark SQL Broadcast Hint placement

I am trying to understand whether, if I use a small table alias multiple times (with different WHERE clauses) in my Spark SQL query, I have to use the broadcast hint multiple times or just once in any one place. Original query: select …
1
vote
1 answer

How do I use latest version in table_changes function for a delta table?

I have a query which gives me the latest version of a delta table in the following format (Query 1): %sql SELECT VERSION FROM (DESCRIBE HISTORY dbo.customers LIMIT 1) This gives me output like this: I have a query for table_changes which I'm using…
1
vote
0 answers

Converting a column into an IDENTITY column in Databricks Delta table while keeping old values

Here's my use case: I'm migrating out of an old DWH, into Databricks (DBR 10.4 LTS). When moving dimension tables into Databricks, I'd like old SKs (surrogate keys) to be maintained, while creating the SKs column in Databricks Delta as an IDENTITY…
1
vote
1 answer

Convert spark sql to python spark / Databricks pipeline event logs

I have the following SQL statement to query the Databricks pipeline event logs, and it works. I tried to rewrite it in Python, but I failed. Could somebody give me some advice? Many thanks!! SELECT timestamp, details:user_action:action,…
1
vote
1 answer

Failed begin transaction with Databricks warehouse

I am trying to run multiple queries in one Databricks transaction. I am using Go for that, but I am getting a "not implemented" error. When I look at the library code: // Not supported in Databricks. func (c *conn) Begin() (driver.Tx, error) { return…
Gilo
1
vote
1 answer

Object of type datetime is not JSON serializable in Airflow DatabricksSqlOperator

I am trying to fetch some data in Airflow using DatabricksSqlOperator from Databricks Delta tables using: select = DatabricksSqlOperator( databricks_conn_id=databricks_id, http_path=http_path, task_id="select_data", sql="select *…
vaibhav
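The error in the title comes from Python's `json` module, which has no built-in encoding for `datetime` values. The sketch below reproduces the error with the standard library alone and shows one general workaround, a `default=` fallback that emits ISO strings. The sample row and the helper name `encode_temporal` are made up for illustration, and whether the operator's result handling in Airflow lets you plug in such a fallback is not shown in the excerpt.

```python
import json
from datetime import date, datetime


def encode_temporal(value):
    """Fallback for json.dumps: ISO strings for dates and datetimes."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    # Keep the standard behaviour for anything else we cannot encode.
    raise TypeError(
        f"Object of type {type(value).__name__} is not JSON serializable"
    )


# Made-up row resembling a query result with a timestamp column.
row = {"id": 7, "loaded_at": datetime(2023, 5, 1, 9, 30, 0)}

try:
    json.dumps(row)  # fails: datetime has no default JSON encoding
except TypeError as exc:
    print("without default:", exc)

encoded = json.dumps(row, default=encode_temporal)
print("with default:", encoded)
```

Another option, assuming the query can be changed, is to cast timestamp columns to strings in the SQL itself so that no `datetime` objects reach the serializer.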
1
vote
1 answer

Azure Databricks JpaRepository - [Databricks][JDBC](10220) Driver does not support this optional feature

In my Java 17 Spring Boot application I am trying to connect to Azure Databricks and use a JpaRepository. I get the following error message when using the repository: CannotCreateTransactionException: Could not open JPA EntityManager for…
LDK
1
vote
1 answer

Why is Databricks insert using specified columns failing?

I'm trying to do an insert into a table using only specified columns as described here: https://spark.apache.org/docs/3.1.2/sql-ref-syntax-dml-insert-into.html If I run the queries below, I get the error shown below. What do I need to do to get…
John
1
vote
2 answers

Databricks named_struct strange behaviour with backslash in value

How is it possible to handle backslash as is with named_struct? I am trying to achieve the return value of: {"test": "DE844\/374"} a) with this: SELECT named_struct('test', "DE844\/374"); return value is: {"test": "DE844/374"} b) with this: SELECT…
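One detail that may be relevant: in JSON itself, `\/` is just an optional escape for `/`, so the two outputs quoted above denote the same value once parsed. The Python sketch below illustrates only that JSON rule with the standard library; it does not show how Databricks handles backslashes in SQL string literals, which may be the actual cause of the behaviour in the question.

```python
import json

# JSON text containing the optional "\/" escape (raw string keeps the backslash).
escaped = json.loads(r'{"test": "DE844\/374"}')
# JSON text with a plain slash.
plain = json.loads('{"test": "DE844/374"}')

# Both decode to the same Python value.
print(escaped == plain)

# A value that really contains a backslash is serialized with a doubled backslash.
literal_backslash_json = json.dumps({"test": "DE844\\/374"})
print(literal_backslash_json)
```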
1
vote
2 answers

Error while creating a database in Azure Databricks using a SQL script: RuntimeException: Unable to instantiate

I'm trying to create a database in Azure Databricks using a SQL script. %sql CREATE DATABASE DB_TEST; fails with the error below: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate…