Hi, I have a document loaded into a Hive table named Data,
with sample lines like the ones below:
He is a good boy and but his brother is a bad boy.
He is a naughty boy.
The table's schema is:
create table Data(
document_data STRING)
row format delimited
fields terminated by '\n'
stored as textfile;
I want to write a query that counts the occurrences of just the words `boy`
and `naughty` and outputs them like this:
boy 3
naughty 1
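For reference, here is a minimal sketch of the kind of query I have in mind, not a tested solution. It assumes matching should be case-insensitive and that words are separated by any run of non-word characters; the alias `t` and the column names `word` and `cnt` are just placeholders I made up.

```sql
-- Sketch only: lower-casing and the '\\W+' delimiter are assumptions.
-- split() breaks each line into tokens on runs of non-word characters,
-- explode() with LATERAL VIEW turns the token array into one row per token,
-- and the WHERE clause keeps only the two words of interest.
SELECT word, COUNT(*) AS cnt
FROM Data
LATERAL VIEW explode(split(lower(document_data), '\\W+')) t AS word
WHERE word IN ('boy', 'naughty')
GROUP BY word;
```

Filtering after the explode also drops any empty tokens that trailing punctuation such as the final "." can produce. The row order is not guaranteed unless an ORDER BY is added.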