Questions tagged [apache-hive]

Apache Hive supports analysis of large datasets stored in Hadoop's HDFS and in compatible file systems such as the Amazon S3 filesystem. It provides an SQL-like language called HiveQL with schema on read, and it transparently converts queries to MapReduce, Apache Tez, or Spark jobs. All three execution engines can run in Hadoop YARN. To accelerate queries, it provides indexes, including bitmap indexes.

Key features:

1. Indexing to provide acceleration. Index types include compaction and, as of 0.10, bitmap indexes; more index types are planned.
2. Different storage types such as plain text, RCFile, HBase, ORC, and others.
3. Metadata storage in an RDBMS, significantly reducing the time to perform semantic checks during query execution.
4. Operating on compressed data stored in the Hadoop ecosystem using algorithms including DEFLATE, BWT, snappy, etc.
5. Built-in user-defined functions (UDFs) to manipulate dates, strings, and other data-mining tools. Hive supports extending the UDF set to handle use cases not supported by built-in functions.
6. SQL-like queries (HiveQL), which are implicitly converted into MapReduce, Tez, or Spark jobs.

96 questions
1
vote
1 answer

What is the best way to integrate SAS with Hadoop without losing the parallel processing capacity of Hadoop

I am trying to understand the integration between SAS and Hadoop. From what I understand, SAS processes like proc sql can only work against a SAS data set, I cannot issue proc sql against a text file on a hadoop node. Is it correct? If yes, then…
Victor
  • 16,609
  • 71
  • 229
  • 409
1
vote
1 answer

default.fs.name and hive.metastore.warehouse.dir do not conflict

Hi When I try to run the below command Load data Inpath '/data' into Table Tablename; in hive shell it throws following error Move from: hdfs://hadoopcluster/data to: file:/user/hive/warehouse/Tablename is not valid. Please check that values for…
wazza
  • 770
  • 5
  • 17
  • 42
1
vote
1 answer

Hive LATERAL VIEW and WHERE Clause using Sub query

I'm looking for a way to optimize my query. We have a table with events called lea, with a column app_properties, which are tags, stored as a comma separated string. I would like to select all the events that match the result of a query that select…
Bas
  • 597
  • 5
  • 10
  • 22
1
vote
1 answer

which slave we have to upload the data into hadoop cluster

We have set up a Hadoop cluster with 2 machines and are trying to use the cluster in our real-time projects. We need information about uploading data in a multi-node cluster. Suppose I have 9 data nodes, which slave node we…
srikanth
  • 61
  • 6
0
votes
1 answer

Difference in HDFS data size and Hive Data Size

I have a table in Hive. When I ran the command show tblproperties myTableName, It gives below result: numFiles 12 numRows 1688092 rawDataSize 934923162 totalSize 936611254 That means rawDataSize is 934.92 MB and totalSize…
Sandeep Singh
  • 7,790
  • 4
  • 43
  • 68
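The byte-to-megabyte arithmetic in the excerpt above can be checked with a short sketch. It assumes decimal megabytes (1 MB = 1,000,000 bytes), which is what the quoted 934.92 MB figure implies; the helper name `to_megabytes` is hypothetical and not part of Hive.

```python
def to_megabytes(size_in_bytes, binary=False):
    """Convert a byte count to megabytes.

    binary=False uses decimal MB (10**6 bytes), matching the figure
    quoted in the question; binary=True uses MiB (2**20 bytes).
    """
    divisor = 2 ** 20 if binary else 10 ** 6
    return size_in_bytes / divisor


# Values reported by `show tblproperties` in the question
raw_data_size = 934923162
total_size = 936611254

print(round(to_megabytes(raw_data_size), 2))  # 934.92
print(round(to_megabytes(total_size), 2))     # 936.61
```

Passing `binary=True` gives the mebibyte figure instead, which is a common source of small mismatches when comparing sizes reported by different tools.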
0
votes
1 answer

Hive date format matching

How can I match a particular date format in a Hive query? I have to get the rows whose date format differs from that of most rows. E.g., most of my rows have the date format MM/dd/yyyy and I have to list all rows other than above…
0
votes
1 answer

Hive command is giving error

I have downloaded the necessary jar files. I have also changed the .bashrc configuration and added CLASSPATH in hadoop-env.sh. Still it gives the error below: Exception in thread "main" java.lang.NoClassDefFoundError:…
0
votes
0 answers

When a hive query is executed,It can not be generated application on large size table

I have a problem in Hive. When a Hive query is executed against a large table, no application is generated: no app appears on the YARN monitoring web page, and beeline is still ready. But a small table works normally. I do not even know why an…
Lee. YunSu
  • 416
  • 6
  • 21
0
votes
0 answers

Using Apache Hive feature masking and filtering of rows/columns

Recently I've found out that masking and filtering of rows/columns feature was added in Hive. https://issues.apache.org/jira/browse/HIVE-13125 But still there is no documentation about it. During my research I've found out that we can use this…
Vlad Gudikov
  • 103
  • 1
  • 7
0
votes
1 answer

Using Python UDF with Hive

I am trying to learn using Python UDF's with Hive. I have a very basic python UDF here: import sys for line in sys.stdin: line = line.strip() print line Then I add the file in Hive: ADD FILE /home/hadoop/test2.py; Now I call the Hive…
Rakesh Adhikesavan
  • 11,966
  • 18
  • 51
  • 76
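The script quoted in the excerpt above uses the Python 2 `print` statement. A minimal Python 3 sketch of the same pass-through script might look like the following; it assumes the script is run through Hive's `TRANSFORM` clause, which streams rows to the script on stdin and reads result rows from stdout. The function name `transform` is illustrative only.

```python
import sys


def transform(lines):
    """Yield each input row with leading and trailing whitespace removed."""
    for line in lines:
        # Same behavior as the script in the question: strip() removes the
        # trailing newline and any other whitespace at either end of the row.
        yield line.strip()


if __name__ == "__main__":
    # Read rows from stdin and echo the stripped rows to stdout
    for row in transform(sys.stdin):
        print(row)
```

After `ADD FILE /home/hadoop/test2.py;`, a script like this would typically be invoked with something along the lines of `SELECT TRANSFORM (col) USING 'python test2.py' AS (col) FROM some_table;`, where `col` and `some_table` are placeholder names. Note that `strip()` also removes leading or trailing tabs, so empty first or last columns would be lost; `rstrip("\n")` is a safer choice if that matters.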
0
votes
1 answer

Profiling Apache Hive CLI

This link Profling Hive CLI provides an instruction on how to profile the Hive CLI using Java mission control. And the steps are Create a directory to save profiler outputs:mkdir $HOME/profiles Create an alias so that it would be easier to…
Lawan subba
  • 610
  • 3
  • 7
  • 19
0
votes
0 answers

Can the Hive float primitive support more than two digits of precision after the decimal point?

Hive supports only one digit of precision after the decimal point. Can we change the precision of float in Hive? If not, can we override the Hive float function? Ex: Hive supports float as below create table test(amount float); amount ------ …
marjun
  • 696
  • 5
  • 17
  • 30
0
votes
2 answers

Flink 1.1.3 Interact with Hive 2.1.0

Excuse me for the inconvenience but I did not find an answer in the Doc or Internet. I have a platform with : Hadoop 2.7.3 Hive 2.1.0 Hbase 1.2.4 Spark 1.6 I have integrated Flink 1.1.3 to use it on local mode and Yarn mode. I'm interested to use…
0
votes
1 answer

How to load the array of sub documents data from mongodb to hive

We are trying to use the mongodb data in hive, document has array of subdocuments.. How can I load the complex data into hive? Here is the sample json: { "_id" : ObjectId("582c8cb9913e2f21e062aaa6"), "acct" : NumberLong(12345), "history"…
Maddy
  • 109
  • 1
  • 8
0
votes
1 answer

Hive JDBC error: java.lang.NoSuchFieldError: HIVE_CLI_SERVICE_PROTOCOL_V7

I'm trying to create a connection via JDBC to Impala using the Hive2 connector. But I'm getting this error: Exception in thread "main" java.lang.NoSuchFieldError: HIVE_CLI_SERVICE_PROTOCOL_V7 at…