How to distinctly select a column while selecting other columns

Question

I have a Hive table that looks something like the one below. Every 3 hours, I want to run a query that looks at the unique workerUUIDs and does some manipulation on them. That is, for the window between now and 3 hours ago, I want to:

1. Capture all the unique workerUUIDs
2. Select * for these workerUUIDs

I am using Hive to run this query, and the table gets a few million new entries every three to six hours. What is the best way to write this query?

|------------|------|------|-------|-------|
| workerUUID | City | Debt | TestN | LName |
|------------|------|------|-------|-------|
| 1234       | SF   | 100k | 23    | Nil   |
| 6789       | NY   | 150k | 34    | Fa    |
| 1234       | SF   | 10k  | 45    | Na    |
| 6789       | NY   | 1k   | 13    | Nil   |
| 6789       | SF   | 150k | 34    | Nil   |
| 8999       | IN   | 10k  | 45    | Na    |
|------------|------|------|-------|-------|

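Basically, I want to do something like the following (my_table is a placeholder for the real table name):

 select City, Debt, TestN from my_table where workerUUID = '1234';
 select City, Debt, TestN from my_table where workerUUID = '6789';
 select City, Debt, TestN from my_table where workerUUID = '8999';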

To clarify further, I want to generate temporary tables like


|------------|------|------|-------|
| workerUUID | City | Debt | TestN |
|------------|------|------|-------|
| 1234       | SF   | 100k | 23    |
| 1234       | SF   | 10k  | 45    |
|------------|------|------|-------|


|------------|------|------|-------|
| workerUUID | City | Debt | TestN |
|------------|------|------|-------|
| 6789       | NY   | 150k | 34    |
| 6789       | NY   | 1k   | 13    |
| 6789       | SF   | 150k | 34    |
|------------|------|------|-------|


|------------|------|------|-------|
| workerUUID | City | Debt | TestN |
|------------|------|------|-------|
| 8999       | IN   | 10k  | 45    |
|------------|------|------|-------|

and so on, for every unique workerUUID value that appears in the 3-hour window.
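
One possible way to express the two steps described in this question, as a rough sketch only. The post does not show the table name or a time column, so `worker_events` and `event_time` are hypothetical names, and `event_time` is assumed to be a TIMESTAMP. The sketch also assumes a Hive version that supports `current_timestamp` and `INTERVAL` arithmetic (Hive 1.2 or later, as far as I know). Instead of running one query per workerUUID, the second statement reads the window once and keeps each worker's rows together.

```sql
-- Assumed names, not from the original post:
--   worker_events : the table shown in the question
--   event_time    : a TIMESTAMP column recording when each row arrived

-- Step 1: the unique workerUUIDs seen in the last 3 hours.
SELECT DISTINCT workerUUID
FROM worker_events
WHERE event_time >= current_timestamp - INTERVAL '3' HOUR;

-- Step 2: every row from the same window, grouped physically by worker.
-- DISTRIBUTE BY sends all rows of one workerUUID to the same reducer;
-- SORT BY orders rows within each reducer, so each worker's rows are contiguous.
SELECT workerUUID, City, Debt, TestN
FROM worker_events
WHERE event_time >= current_timestamp - INTERVAL '3' HOUR
DISTRIBUTE BY workerUUID
SORT BY workerUUID, TestN;
```

If the per-worker manipulation can be written as an aggregate or a window function over `PARTITION BY workerUUID`, that would probably avoid materializing a temporary table per worker at all, which matters when a window holds a few million rows.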

Please provide an example of the required output; it is not clear what you are trying to achieve. - leftjoin, Jun 19 '19 at 00:59
