I need to create data pipelines in hadoop. I have data import, export, scripts to clean data set up and need to set it up in a pipeline now.
I have been using Oozie for data import and export schedules but now need to integrate R scripts for data cleaning process as well.
I see falcon is used for the same.
- How to install falcon in cloudera?
- What other tools are available to create data pipelines in hadoop?