
I have a requirement to delete duplicate records from a Delta file using Databricks SQL. Below is my query:

%sql
delete from delta.`adls_delta_file_path` where code = 'XYZ '

but it gives the error below:

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:529) at scala.None$.get(Option.scala:527) at com.privacera.spark.agent.bV.a(bV.java) at com.privacera.spark.agent.bV.a(bV.java) at com.privacera.spark.agent.bc.a(bc.java) at com.privacera.spark.agent.bc.apply(bc.java) at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:252) at com.privacera.spark.agent.bV.a(bV.java) at com.privacera.spark.base.interceptor.c.b(c.java) at com.privacera.spark.base.interceptor.c.a(c.java) at com.privacera.spark.agent.n.a(n.java) at com.privacera.spark.agent.n.apply(n.java) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$3(RuleExecutor.scala:221) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:221) at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) at scala.collection.immutable.List.foldLeft(List.scala:89) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:218) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:210) at scala.collection.immutable.List.foreach(List.scala:392) at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:210) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:188) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:109) at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:188) at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:112) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80) at 
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:134) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:180) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854) at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:180) at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:109) at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:109) at org.apache.spark.sql.execution.QueryExecution.assertOptimized(QueryExecution.scala:120) at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:139) at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:136) at org.apache.spark.sql.execution.QueryExecution.$anonfun$simpleString$2(QueryExecution.scala:199) at org.apache.spark.sql.execution.ExplainUtils$.processPlan(ExplainUtils.scala:115) at org.apache.spark.sql.execution.QueryExecution.simpleString(QueryExecution.scala:199) at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:260) at org.apache.spark.sql.execution.QueryExecution.explainStringLocal(QueryExecution.scala:226) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:123) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:273) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854) at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:223) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3823) at 
org.apache.spark.sql.Dataset.(Dataset.scala:235) at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:104) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:101) at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:689) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:854) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:684) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at com.databricks.backend.daemon.driver.SQLDriverLocal.$anonfun$executeSql$1(SQLDriverLocal.scala:91) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.TraversableLike.map(TraversableLike.scala:238) at scala.collection.TraversableLike.map$(TraversableLike.scala:231) at scala.collection.immutable.List.map(List.scala:298) at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:37) at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:145) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$11(DriverLocal.scala:529) at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261) at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:50) at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305) at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:50) at 
com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:506) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:611) at scala.util.Try$.apply(Try.scala:213) at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:603) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:522) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:557) at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:427) at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:370) at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:221) at java.lang.Thread.run(Thread.java:748) at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:130) at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:145) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$11(DriverLocal.scala:529) at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261) at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:50) at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305) at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:50) at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:506) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:611) at 
scala.util.Try$.apply(Try.scala:213) at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:603) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:522) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:557) at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:427) at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:370) at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:221) at java.lang.Thread.run(Thread.java:748)

Any suggestions?

Alex Ott
exploding_data

2 Answers


com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: java.util.NoSuchElementException: None.get at scala.None$.get(Option.scala:529)

First, convert your Delta file to a Delta table in Databricks and enable support for SQL commands by configuring the SparkSession. Then delete the duplicate records from the Delta table.
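
One possible reading of this suggestion, as a rough sketch rather than a confirmed fix: register a table over the existing Delta path, then run the DELETE against the table name instead of the path. The helper `delete_by_code` and the table name `dedup_target` are hypothetical, and the sketch assumes the path already holds Delta data and that `spark` is the session a Databricks notebook provides.

```python
def delete_by_code(spark, delta_path, table_name, code):
    """Register a table over an existing Delta path, then delete rows by code.

    `spark` is any object with a `sql(str)` method, such as the SparkSession
    available in a Databricks notebook. All names here are illustrative.
    """
    # Values are interpolated as-is, so refuse anything containing a quote
    # rather than guessing at the escaping rules of the SQL dialect.
    for value in (delta_path, code):
        if "'" in value:
            raise ValueError(f"single quote not allowed in {value!r}")

    statements = [
        # Point a metastore table at the files already stored at delta_path.
        f"CREATE TABLE IF NOT EXISTS {table_name} USING DELTA LOCATION '{delta_path}'",
        # Same predicate as the original query, now against the table name.
        f"DELETE FROM {table_name} WHERE code = '{code}'",
    ]
    for stmt in statements:
        spark.sql(stmt)
    return statements


# In a notebook (placeholder path kept from the question):
# delete_by_code(spark, "adls_delta_file_path", "dedup_target", "XYZ ")
```

Note that the predicate keeps the trailing space in 'XYZ ' exactly as the question wrote it, and that this removes every matching row, not only the duplicates.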

For more detail on converting a file to a Delta table, refer to Microsoft's Delta Lake quickstart documentation.

Pratik Lad

This issue was related to the cluster configuration. Our Databricks cluster is managed by Privacera, and Privacera blocks certain cluster configurations. We ran the same query on a cluster without Privacera and it worked. We have raised a request with Privacera to find out the actual cause.
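
The stack trace in the question already hints at this: the frames directly under `None.get` belong to `com.privacera`, not to Spark or Databricks. As a small illustration (not something we actually ran during the investigation), the sketch below scans a trace for the first frame outside a set of platform packages. The prefix list and the function name are assumptions chosen for this example.

```python
import re

# Package prefixes treated as platform code; anything else is a candidate.
PLATFORM_PREFIXES = ("scala.", "java.", "org.apache.spark.", "com.databricks.")

# Matches frames such as "at com.example.Foo.bar(Foo.java)".
FRAME_RE = re.compile(r"\bat\s+([\w.$]+)\(")


def first_third_party_frame(trace):
    """Return the first frame outside PLATFORM_PREFIXES, or None."""
    for frame in FRAME_RE.findall(trace):
        if not frame.startswith(PLATFORM_PREFIXES):
            return frame
    return None


# Shortened excerpt of the trace from the question.
sample = (
    "java.util.NoSuchElementException: None.get "
    "at scala.None$.get(Option.scala:529) "
    "at scala.None$.get(Option.scala:527) "
    "at com.privacera.spark.agent.bV.a(bV.java) "
    "at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:252)"
)
print(first_third_party_frame(sample))
```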

Thanks for your suggestion

exploding_data