I would like to know one thing about Hive dynamic partitioning. When using dynamic partitions, we have to set the following properties:
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
Without those properties we…
I am looking for a way to set the application priority for a task in Hive. When a task is committed, I want to give it a high priority (such as 100). This parameter can be seen on the page:
I am looking for a parameter like 'set mapreduce.map.memory.mb=4096;', so I can…
On Hive 2.2.0, I am filling an ORC table from another source table of size 1.34 GB, using the query:
INSERT INTO TABLE TableOrc SELECT * FROM Table; ---- (1)
The query creates the TableOrc table with 6 ORC files, which are much smaller than the block…
I often have a large block of HiveQL that I want to run multiple times with different settings for some variables.
A simple example would be:
set mindate='2015-01-01 00:00:00';
set maxdate='2015-04-01 00:00:00';
select * from my_table where the_date…
I have a new cluster built with CDH 6.3. Hive is ready now, and 3 nodes have 30 GB of memory.
I created a target Hive table stored as Parquet. I put some Parquet files downloaded from another cluster into the HDFS directory of this Hive table, and when I…
In my core-site.xml, I changed the hadoop.tmp.dir location to another large HDD (/data/hadoop_tmp); this HDD is not the Linux /tmp location. I then formatted my namenode and started my dfs and yarn, and I believe it worked.
But the default location appears in the…
How can I remove a statement that runs when the Beeline terminal starts?
I have an ADD JAR statement that runs by default when I start Beeline, and I don't have this jar, which causes an error message:
ADD JAR…
on table will generate table statistics for CBO when:
hive.cbo.enable=true
hive.stats.autogather=true
or do I have to use analyze compute statistics?
Thanks
I ran SQL scripts on Hive on Tez with hive -f xxx.sql --hiveconf hive.session.id=sessionName
but the YARN ResourceManager displays them like this:
HIVE-f4ea6c3f-f4cf-4db3-8801-da6f94e20237
HIVE-d920c434-d2e6-4c1c-a506-d69b580960f7
sometimes it displays…
I was told that count(distinct ) may result in data skew because only one reducer is used.
I ran a test with 2 queries on a table with 5 billion rows:
Query A:
select count(distinct columnA) from tableA
Query B:
select count(columnA)…
I would like to know where the hive-site.xml configuration file is located in a Cloudera distribution.
Mainly because I would like to know where I can find out properties…
I am using Hive version 3.1.1, and when I try to set hive.stats.fetch.partition.stats=true, I get the following error. Is hive.stats.fetch.partition.stats not available in this Hive version?
Query returned non-zero code: 1, cause: hive configuration…
I have a query that uses too many containers and too much memory (97% of the memory is used).
Is there a way to set the number of containers used in the query and limit the max memory?
The query is running on Tez.
Thanks in advance
We have an HDP cluster, version 2.6.4.
The cluster is installed on Red Hat machines, version 7.2.
We noticed the following issue on the JournalNode machines (master machines):
We have 3 JournalNode machines, and under the /tmp folder we have thousands…