
[There was an initial question about OOM in Tez/Hive, but after some answers and comments a new question with the new knowledge is warranted.]

I have a query with a large LATERAL VIEW. It joins 4 tables, all ORC compressed and all bucketed on the same column. It goes like:

select 
    10 fields from t
  , 80 fields from the lateral view
from
(
  select
    10 fields 
  from
              e (800M rows, 7GB of data, 1 bucket)
    LEFT JOIN m (1M rows, 20MB )
    LEFT JOIN c (2k rows, <1MB)
    LEFT JOIN contact (150M rows, 283GB, 4 buckets)
) t
LATERAL VIEW
    json_tuple (80 fields) as lv
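
For concreteness, the LATERAL VIEW part has this shape (column names below are placeholders, not the real ones; the actual query extracts about 80 fields from the JSON column):

```sql
-- Illustrative shape only; json_col, f1, f2 are placeholder names.
SELECT
    t.some_field,
    lv.f1,
    lv.f2
FROM ( /* joins as above */ ) t
LATERAL VIEW json_tuple(t.json_col, 'f1', 'f2') lv AS f1, f2;
```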

If I remove the LATERAL VIEW, the query completes. If I add the LV back, I always end up with:

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1516602562532_3606_2_03, diagnostics=[Task failed, taskId=task_1516602562532_3606_2_03_000001, diagnostics=[TaskAttempt 0 failed, info=[Container container_e113_1516602562532_3606_01_000008 finished with diagnostics set to [Container failed, exitCode=255. Exception from container-launch.
Container id: container_e113_1516602562532_3606_01_000008
Exit code: 255
Stack trace: ExitCodeException exitCode=255: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
    at org.apache.hadoop.util.Shell.run(Shell.java:844)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:237)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 255
]], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)

I have tried many things:

  • Updated all tez.grouping.* settings.
  • Added the WHERE condition in the JOIN as well.
  • set hive.auto.convert.join.noconditionaltask = false; to make sure Hive does not try to do a map join.
  • Added DISTRIBUTE BY on different columns to prevent possible skew.
  • set mapred.map.tasks=100
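
For reference, the grouping and join settings above were applied along these lines (the exact values varied between runs; the ones below are illustrative only):

```sql
-- Illustrative values only; several combinations were tried.
set tez.grouping.min-size=16777216;                  -- 16 MB lower bound per split group
set tez.grouping.max-size=134217728;                 -- 128 MB upper bound per split group
set hive.auto.convert.join.noconditionaltask=false;  -- disable automatic map-join conversion
set mapred.map.tasks=100;                            -- MapReduce-era hint; Tez may ignore it
```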

I have already maxed out all java-opts and memory settings.
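
The Tez memory knobs in question are roughly these (values are illustrative; they were set to the cluster maximums):

```sql
-- Illustrative; the actual values used were the cluster maximums.
set hive.tez.container.size=8192;      -- container size in MB
set hive.tez.java.opts=-Xmx6554m;      -- heap, roughly 80% of the container size
```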

I need to keep the LATERAL VIEW, because some of its fields might be used for filtering (i.e. I can't just do some nice string manipulation to output a CSV-like table).

Is there a way to make the LATERAL VIEW fit in memory, or to split it across multiple mappers? This is the Tez UI view:

[screenshot: Tez UI DAG view]

HDP 2.6, 8 datanodes with 32GB RAM

Guillaume
  • Did you try to use the mapred flag for memory? I am just guessing here: set mapreduce.map.memory.mb – hlagos Jan 26 '18 at 15:29
  • @hlagos `mapreduce.map.memory.mb` is set and equals `yarn.scheduler.minimum-allocation-mb`. – Guillaume Jan 27 '18 at 18:09
  • @Guillaume, can you share a picture of the DAG after you changed the property `set hive.auto.convert.join.noconditionaltask = false;`? The DAG should then look different; in the above DAG it is still doing a map join. – BalaramRaju Mar 01 '18 at 21:34
  • @BalaramRaju I just reran the query with `hive.auto.convert.join.noconditionaltask=false` (I confirmed in the Tez view that the setting is applied), but the picture is almost exactly the same: Map 1 has more tasks (though that might just be because of more data) and Reducer 2 no longer exists. – Guillaume Mar 02 '18 at 13:47

0 Answers