I have been trying to create a custom Spark data source using the V2 API.
As per Jira, it is in master and available for use. But when I tried to use it in my project, compilation failed, reporting that no such package path exists in v3.3.1.
package closed.source.gs
import org.apache.spark.sql.sources.v2.DataSourceV2
class DefaultSource extends DataSourceV2{
}
Error:
...../gs/DefaultSource.scala:3:37:: object v2 is not a member of package org.apache.spark.sql.sources
[error] import org.apache.spark.sql.sources.v2.DataSourceV2
Any ideas on what happened to the V2 APIs? Did the Spark devs refactor this? Are there any good, recent articles that can help me here?
Links I have already read but found unhelpful: https://blog.madhukaraphatak.com/spark-datasource-v2-part-3