
I have event tables partitioned by time (year, month, day, hour). I want to join a few events in a Hive script that receives the year, month, day, and hour as variables. How can I also include, for example, events from all 6 hours prior to that time, without running 'recover all...'?

Thanks.

harelg

1 Answer


So basically, what I needed was a way to take a date that the Hive script receives as a parameter and add all partitions from 3 hours before to 3 hours after that date, without recovering all partitions and without adding the specific hours to every WHERE clause.

I didn't find a way to do this inside the Hive script, so I wrote a quick Python script that takes a date and a table name, along with how many hours to add before and after. When I tried to run it inside the Hive script with: !python script.py tablename ${hiveconf:my.date} 3 I was surprised to find that variable substitution does not take place in a line that starts with !
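For illustration, here is a minimal sketch of what such a script might look like. It is not my original code. The function name partition_statements is hypothetical, and it assumes the partition columns are the integer-valued year, month, day and hour from the question, and that the partitions live in Hive's default directory layout (otherwise each statement would also need a LOCATION clause).

```python
from datetime import datetime, timedelta


def partition_statements(table, when, hours_before, hours_after):
    """Build one ADD PARTITION statement per hour in the window.

    Assumes integer partition columns named year, month, day, hour
    (taken from the question) and default partition locations.
    """
    statements = []
    for offset in range(-hours_before, hours_after + 1):
        # timedelta handles day, month and year rollover for us.
        t = when + timedelta(hours=offset)
        statements.append(
            "ALTER TABLE {0} ADD IF NOT EXISTS PARTITION "
            "(year={1}, month={2}, day={3}, hour={4});".format(
                table, t.year, t.month, t.day, t.hour
            )
        )
    return statements
```

Calling partition_statements("events", datetime(2013, 1, 1, 1), 3, 3) yields seven statements, from hour 22 on 2012-12-31 through hour 4 on 2013-01-01. The output can be written to a file and run with hive -f before the main script.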

My workaround was to get the date that the Hive script received from the log file on the machine, using something like: 'cat /mnt/var/log/hadoop/steps/$(ls /mnt/var/log/hadoop/steps/ | sort -r | head -n 1)/stdout'. From there you can parse each Hive parameter in the Python code without passing it via Hive.
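The same lookup can be done from inside the Python code. The sketch below is only a rough outline: the helper names are hypothetical, and extract_param assumes the step log contains the parameter as a name=value token (for example my.date=...), which you should check against your own log before relying on it.

```python
import os
import re


def latest_step_stdout(steps_dir="/mnt/var/log/hadoop/steps"):
    """Return the stdout path of the newest step directory.

    Mirrors `ls | sort -r | head -n 1`: reverse lexicographic sort,
    then take the first entry.
    """
    newest = sorted(os.listdir(steps_dir), reverse=True)[0]
    return os.path.join(steps_dir, newest, "stdout")


def extract_param(log_text, name):
    """Find the first `name=value` token in the log text.

    Assumption: the parameter appears in that form in the log.
    Returns None when no such token is found.
    """
    match = re.search(re.escape(name) + r"=(\S+)", log_text)
    return match.group(1) if match else None
```

With these, the script would read the file returned by latest_step_stdout() and call extract_param(text, "my.date") instead of taking the date as a command-line argument.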

harelg