My scenario: I need to process an input file and, for each record, check whether certain fields in that file match the corresponding fields stored in a Hadoop cluster.
We are thinking of using MRJob to process the input file and Hive to fetch the data from the Hadoop cluster. Is it possible to connect to Hive from inside an MRJob module? If so, how?
If not, what would be the ideal approach to meet this requirement?
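To make the idea concrete, here is a minimal sketch of what connecting to Hive from inside an MRJob mapper might look like. It is only an illustration under several assumptions: the PyHive client is used to talk to HiveServer2, the lookup data is small enough to hold in memory on each mapper, and the input is CSV. The host name, query, table, and column positions are all hypothetical placeholders, not taken from a real setup.

```python
import csv

try:
    from mrjob.job import MRJob
except ImportError:
    # Fallback so the helper functions below can be tried without mrjob installed.
    MRJob = object

# Hypothetical values, for illustration only.
HIVE_HOST = "hive-server.example.com"
HIVE_PORT = 10000
LOOKUP_QUERY = "SELECT customer_id, email FROM reference_customers"
KEY_COLUMNS = (0, 2)  # positions of the fields to compare in each input record


def load_known_keys(cursor, query=LOOKUP_QUERY):
    """Run the lookup query on a DB-API cursor and return its rows as a set of string tuples."""
    cursor.execute(query)
    return {tuple(str(value) for value in row) for row in cursor.fetchall()}


def record_key(line, columns=KEY_COLUMNS):
    """Parse one CSV line and pick out the fields that should be compared."""
    fields = next(csv.reader([line]))
    return tuple(fields[i] for i in columns)


class MRFieldCheck(MRJob):
    def mapper_init(self):
        # Runs once per mapper task: fetch the lookup data from Hive a single time.
        # PyHive is imported here so it is only required where the task runs.
        from pyhive import hive
        conn = hive.Connection(host=HIVE_HOST, port=HIVE_PORT)
        self.known_keys = load_known_keys(conn.cursor())
        conn.close()

    def mapper(self, _, line):
        # Label each input record by whether its key fields were found in Hive.
        key = record_key(line)
        yield ("matched" if key in self.known_keys else "unmatched"), line
```

To run this as a job, the file would also need the usual `if __name__ == "__main__": MRFieldCheck.run()` at the bottom. Note that in this sketch every mapper task opens its own Hive connection and loads the whole lookup result, so it only seems reasonable when that result is small.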
I am new to Hadoop, MRJob and Hive.
Any suggestions would be appreciated.