I am very new to this whole world of "big data" tech, and recently started reading about Spark. One thing that keeps coming up is SparkSQL, yet I consistently fail to comprehend what exactly it is.
Is it supposed to convert SQL queries into MapReduce jobs that operate on the data you give it? But aren't DataFrames already essentially SQL tables in terms of functionality?
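To show what I mean, here is a small sketch based on examples I've seen floating around (the data and names are just made up by me, so apologies if I'm holding it wrong) - both snippets seem to do the same thing, which is what confuses me:

```python
from pyspark.sql import SparkSession

# Local SparkSession, just for illustration
spark = SparkSession.builder.appName("sparksql-question").getOrCreate()

# A tiny DataFrame I invented for this example
people = spark.createDataFrame(
    [("Alice", 34), ("Bob", 45)],
    ["name", "age"],
)

# DataFrame API version
people.filter(people.age > 40).select("name").show()

# SparkSQL version: register the DataFrame as a temp view, then query it with SQL
people.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people WHERE age > 40").show()
```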
Or is it some tech that allows you to connect to a SQL database and use Spark to query it? In that case, what's the point of Spark here at all - why not just use SQL directly? Or is the point that you can combine your structured SQL data with flat data?
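For that second interpretation, I mean something like the sketch below - reading a table from an external database over JDBC and then joining it with a plain file (the connection details, table, and column names are placeholders I made up):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sparksql-question").getOrCreate()

# Pull a table from an external SQL database over JDBC
# (URL, table name, and credentials are all invented for this example)
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://dbhost:5432/shop")
    .option("dbtable", "orders")
    .option("user", "spark_user")
    .option("password", "secret")
    .load()
)

# Then combine it with "flat" data, e.g. a CSV file
events = spark.read.csv("events.csv", header=True)
orders.join(events, "order_id").show()
```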
Again, I want to emphasize that I am very new to all of this and may or may not be talking out of my butt :). So please do correct me, and be forgiving if you see that I'm clearly misunderstanding something.