I am trying to load data from a CSV file into Hive using Spark's Java API. How can I load the data into Hive using Spark DataFrames?
Here is what I have so far, which reads a JSON file:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class first {
    public static void main(String[] args) {
        String inputFileName = "samples/big.txt";
        String outputDirName = "output";

        // Run Spark locally in a single JVM
        SparkConf conf = new SparkConf().setAppName("org.sparkexample.WordCount").setMaster("local");
        JavaSparkContext context = new JavaSparkContext(conf);
        SQLContext sc = new SQLContext(context);

        // Read the JSON file into a DataFrame (jsonFile is deprecated)
        @SuppressWarnings("deprecation")
        DataFrame input = sc.jsonFile(inputFileName);
        input.printSchema();
    }
}
But I don't know how to do the same with a CSV file. I know a little about the spark-csv package provided by Databricks.
Kindly let me know how I can do it.