I'm working on a project that relies on Hadoop with the MRv1 architecture (Hadoop 1.1.2). I tried the Oozie scheduler for creating MapReduce workflows but eventually gave up, because it is a nightmare to configure and I couldn't get it to work. I was wondering whether I should try other workflow schedulers such as Azkaban or Apache Airflow. Would they be compatible with my requirements?
-
Hadoop 1.1.2 was released nearly a decade ago; are you sure you have to use it? You can submit MR jobs with CLI commands, so you can use whichever scheduler you want. Airflow is certainly widely used these days. You'll have an easier time setting up Apache Spark from scratch than trying to run MR jobs on Hadoop 1.1.2. – Ben Watson Apr 24 '22 at 13:28
-
Hi! Think of it less as a project and more as an educational experiment, which is why I chose Hadoop 1.1.2 to use MRv1. So do you think Apache Airflow works fine with Hadoop 1.1.2? – aniii Apr 24 '22 at 15:36
-
I'd say so, although I obviously don't know the full scope of what you're trying to achieve. Airflow is a popular scheduler; it can execute Bash scripts on a schedule, and MapReduce jobs can be run as Bash scripts. You could just use cron if your use case is basic. – Ben Watson Apr 24 '22 at 15:41
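
To make the "MR jobs are just CLI commands" point concrete, here is a minimal sketch of a wrapper that any scheduler (cron, Airflow, Azkaban) could call. It assumes the `hadoop` launcher is on the `PATH` and uses the `hadoop jar <jar> <main class> <args>` form; the jar name, class name, and HDFS paths in the usage comment are hypothetical placeholders, not taken from the question.

```python
import shlex
import subprocess
from typing import List


def build_mr_command(jar: str, main_class: str, input_dir: str, output_dir: str) -> List[str]:
    """Build the argument list for one MapReduce job submitted via `hadoop jar`."""
    return ["hadoop", "jar", jar, main_class, input_dir, output_dir]


def command_string(cmd: List[str]) -> str:
    """Render the argument list as one shell-safe string, e.g. for a Bash-based task."""
    return shlex.join(cmd)


def run_mr_job(cmd: List[str]) -> int:
    """Run the job and return its exit code; schedulers treat non-zero as failure.

    Assumes the `hadoop` executable is on PATH (raises FileNotFoundError otherwise).
    """
    return subprocess.run(cmd).returncode


# Hypothetical usage (placeholder jar, class, and paths):
#   cmd = build_mr_command("wordcount.jar", "WordCount", "/user/me/in", "/user/me/out")
#   raise SystemExit(run_mr_job(cmd))
```

The string from `command_string` is the kind of value you would hand to a Bash-style task (for example the `bash_command` argument of Airflow's `BashOperator`), while `run_mr_job` suits a plain cron entry that runs the script directly. Either way the scheduler only needs the exit code, so it does not have to know anything about MRv1.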