Questions tagged [apache-hive]

Apache Hive supports analysis of large datasets stored in Hadoop's HDFS and compatible file systems such as the Amazon S3 filesystem. It provides an SQL-like language called HiveQL with schema on read and transparently converts queries to MapReduce, Apache Tez, or Spark jobs. All three execution engines can run on Hadoop YARN. To accelerate queries, it provides indexes, including bitmap indexes.

A few features:

1. Indexing to accelerate queries. Index types as of 0.10 include compaction and bitmap indexes; more index types are planned.
2. Different storage types such as plain text, RCFile, HBase, ORC, and others.
3. Metadata storage in an RDBMS, significantly reducing the time to perform semantic checks during query execution.
4. Operating on compressed data stored in the Hadoop ecosystem using algorithms including DEFLATE, BWT, Snappy, etc.
5. Built-in user-defined functions (UDFs) to manipulate dates, strings, and other data-mining tools. Hive supports extending the UDF set to handle use cases not supported by built-in functions.
6. SQL-like queries (HiveQL), which are implicitly converted into MapReduce, Tez, or Spark jobs.
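The "schema on read" idea mentioned in the description can be illustrated with a small sketch. This is not Hive code: it is plain Python, and the sample data, the `SCHEMA` list, and the `read_with_schema` helper are all hypothetical names made up for illustration. It only shows the general principle that raw text is stored untyped and column names and types are applied when the data is read.

```python
import csv
import io

# Hypothetical raw data as it might sit in a file: delimited text, no types attached.
RAW = "1,alice,2015-01-03\n2,bob,2015-02-14\nbad,carol,2015-03-01\n"

# Hypothetical schema, supplied only at read time (not stored with the data).
SCHEMA = [("id", int), ("name", str), ("signup_date", str)]


def read_with_schema(raw_text, schema):
    """Apply column names and types to raw rows while reading them."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        row = {}
        for (name, cast), value in zip(schema, fields):
            try:
                row[name] = cast(value)
            except ValueError:
                # A value that does not fit the declared type becomes None
                # instead of failing the read (an assumption of this sketch,
                # loosely analogous to a NULL in a query result).
                row[name] = None
        rows.append(row)
    return rows


rows = read_with_schema(RAW, SCHEMA)
for row in rows:
    print(row)
```

Because the schema lives outside the data, the same raw text could be read again with a different schema without rewriting the file, which is the trade-off schema on read makes compared with validating on write.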

96 questions
1
vote
2 answers

unable to create partitions in hive

I am unable to create partitions in a new table from a table that already exists in Hive. The query that I am running in Hive after the table creation is INSERT INTO TABLE ba_data.PNR_INFO1_partitioned PARTITION(pnr_create_dt) select *…
Avinash
  • 127
  • 2
  • 13
1
vote
2 answers

Partition and Bucket ORC Tables

I understand that creating ORC tables improves speed dramatically. However, can we improve it further by partitioning and bucketing an ORC table? If so, how can we partition and bucket an existing ORC table?
iPhoneJavaDev
  • 821
  • 5
  • 33
  • 78
1
vote
1 answer

HiveQL - Query Number of Entries over fixed unit of time

I have a table that is similar to the following: LOGIN ID (STRING): TIME_STAMP (STRING HH:MM:SS) BillyJoel 10:45:00 PianoMan 10:45:30 WeDidnt 10:45:45 StartTheFire 10:46:00 AlwaysBurning …
knowads
  • 705
  • 2
  • 7
  • 24
1
vote
1 answer

Concatenation of variable value and string in apache hive

In the Apache Hive CLI or Beeline CLI, I need to concatenate the value of a variable with a string. Is it possible to do so? Example: set path_on_hdfs="/apps/hive/warehouse/my_db.db"; How can I get something like '${hivevar:path_on_hdfs}/myTableName'?
MehrdadAP
  • 417
  • 4
  • 11
1
vote
1 answer

Insert data into avro-formatted, partitioned hive table with data from HDFS

I have created a Hive table named employee (Avro-formatted) partitioned on department. I have the Avro dataset in my HDFS location. My dataset also has a department id. I would like to import the data into Hive table with the data from…
Sivakumar
  • 344
  • 3
  • 8
1
vote
0 answers

Hive 1.0 - REMOTE MySQL Metastore configuration

On EMR 4.2 (Hive 1.0), I want to connect to a remote MySQL metastore. hive.metastore.uris thrift://hive-metastore-remotemysql.aws.com:9083 JDBC connect string for a JDBC…
user3294904
  • 444
  • 8
  • 26
1
vote
1 answer

Hive job stops after processing for some time

I am running Hive on a standalone machine. Hadoop is running in pseudo-distributed mode. I am running a Hive query which joins two tables (one table has 7M records and the other has 51M, each with 8 columns). After processing for some time, Mapper…
Santhosh Tangudu
  • 759
  • 9
  • 19
1
vote
1 answer

Read Apache HIVE table from Informatica

I need to read a Hive table using Informatica and then write the data, after some transformations, to an MS SQL table. Can anyone please let me know which driver / connector is required to connect to Apache Hive from Informatica? Is there any…
Koushik Chandra
  • 1,565
  • 12
  • 37
  • 73
1
vote
2 answers

How to load Bucketed HIVE table using LOAD DATA LOCAL INPATH

Can we load a bucketed Hive table using the LOAD DATA LOCAL INPATH ... command? I have executed it for a sample file, but the data values are inserted as NULL. hduser@ubuntu:~$ cat /home/hduser/Desktop/hive_external/hive_external/emp2.csv …
Koushik Chandra
  • 1,565
  • 12
  • 37
  • 73
1
vote
1 answer

Manage reports, when our database is Cassandra ...Spark or Solr...or BOTH?

My db is Cassandra (datastax enterprise => linux). Since, by design, it doesn't support group-by, aggregates, etc., using Cassandra on its own for reporting is not a good decision. I googled about this deficit and found some…
Elnaz
  • 2,854
  • 3
  • 29
  • 41
1
vote
0 answers

DML in HIVE using the Java API

I'm writing an application which does DDL and DML on Hive tables. For DDL I use the Hive class org.apache.hadoop.hive.ql.metadata.Hive, which has been public since version 1.0. It's perfect for DDL and I think faster than JDBC and other options. But I…
Daniel
  • 1,027
  • 1
  • 8
  • 23
1
vote
2 answers

Create Table Syntax Error in Hive

I am trying to create a table in Hive using the following query: create table customers(Cust_ID INT, Cust_Name STRING, Dealer_ID INT, Country STRING, State STRING, City STRING, ZipCode INT) row format delimited fields terminated by ';' …
Mrudula
  • 11
  • 1
  • 3
1
vote
1 answer

Hive table creation using hue interface using space delimiter

While creating a table from a file in the Hue Hive interface, we have to specify a delimiter (tab, space, comma, etc.). But my file is delimited by one or more spaces. How do I specify a delimiter that matches one or more spaces?
Bruce
  • 8,609
  • 8
  • 54
  • 83
1
vote
0 answers

PIG UDF's in Hive

Apologies for no code on this, as this is a generic question: can Pig UDFs be consumed from Hive? Specifically, can the Apache DataFu (http://datafu.incubator.apache.org/) Pig UDFs be used in Hive? I saw a Jira about using HIVE UDF's in PIG -…
myloginid
  • 1,463
  • 2
  • 22
  • 37
1
vote
1 answer

How to query on a specific date and time range using hive query language taking input from the user?

I have a table in a database in Hive. The table is partitioned by year, month, and day. My query looks something like this: select entity1,entity2 from table_t INNER JOIN tab_roll.cha alias2 ON alias1.sid = alias2.sid INNER JOIN…
kRazzy R
  • 1,561
  • 1
  • 16
  • 44