Questions tagged [azkaban]

Azkaban is a batch workflow job scheduler created at LinkedIn to run their Hadoop jobs.

Often there is a need to run a set of jobs and processes in a particular order within a workflow. Azkaban resolves the ordering through job dependencies and provides an easy-to-use web user interface to maintain and track your workflows. Here are a few features:

  • Compatible with any version of Hadoop
  • Easy-to-use web UI
  • Simple web and HTTP workflow uploads
  • Project workspaces
  • Scheduling of workflows
  • Modular and pluggable
  • Authentication and Authorisation
  • Tracking of user actions
  • Email alerts on failure and success
  • SLA alerting and auto killing
  • Retrying of failed jobs
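
The dependency-based ordering described above can be illustrated with a minimal sketch. This is not Azkaban's implementation; it only shows the general idea of deriving a run order from declared dependencies, using Python's standard `graphlib` module (Python 3.9+). The job names and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs: each key maps to the jobs it depends on.
dependencies = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "report": ["load", "transform"],
}


def execution_order(deps):
    """Return one valid run order in which every job follows its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A scheduler can start any job whose dependencies have all finished, so jobs that do not depend on each other may run in parallel; the list above is just one sequential order that respects the constraints.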

http://azkaban.github.io

64 questions
0
votes
1 answer

definition of airflow dag for a use case with variable dependencies

I would like to use Airflow for the following use case: compute a daily report for a given website (~150 websites to handle). Each report will be computed as follows: a set of tasks that should be run at site level, a set of tasks that should be…
Seb
  • 3
  • 2
0
votes
1 answer

Azkaban Execute error

I got the following error when executing a flow: Error submitting flow bar. azkaban.executor.ExecutorManagerException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:10000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed:…
0
votes
1 answer

Failed Azkaban Spark Job has Success final status instead of Failure

Use case: Azkaban starts a Spark job, and the Spark job fails somehow. Expected result: Hadoop ResourceManager reports the job's final status as FAILED. Actual result: Hadoop RM reports the job's final status as SUCCESSFUL. Does anybody know how this can be fixed?
dmreshet
  • 1,496
  • 3
  • 18
  • 28
0
votes
0 answers

How can I describe jobs in Azkaban UI?

I just discovered Azkaban and I want to know if it is possible to put some description next to the jobs/flows in the project view? (See this image for more detail)
Yoiro
  • 352
  • 3
  • 15
0
votes
1 answer

Parameter to disable Azkaban job by default

Given the following Azkaban configuration that has 1 flow containing 3 jobs, how do I disable Job2 by default? Is there a parameter/configuration for that? I know that I could go in the UI and disable the job manually. However, I'd like to have the…
Marsellus Wallace
  • 17,991
  • 25
  • 90
  • 154
0
votes
0 answers

Azkaban does not recognize shell variable

My goal is to create a file in a directory named with yesterday's date on HDFS via Azkaban. The command in the *.job file is as follows: command=sudo -u hdfs hadoop fs -touchz /user/"`date -d '${etl_date} yesterday' "+%F"`"/_SUCCESS The hadoop fs -touchz…
superz
  • 99
  • 1
  • 2
  • 10
0
votes
1 answer

Can we use Azkaban with Google Cloud Bigtable?

Can we use Azkaban with Google Cloud Bigtable, as we do with Apache HBase?
lakshay gaur
  • 39
  • 1
  • 9
0
votes
2 answers

Apache Activiti Workflow Execution happens as a separate process or within the Activiti Process

I have been investigating Azkaban and Apache Activiti for one of our Workflow Use Cases. What I understand is that each job within Azkaban runs as a separate process, is that the same with Activiti, or do Activiti Tasks run within the Activiti's…
piyugupt
  • 404
  • 2
  • 5
  • 14
0
votes
1 answer

Process to pre-validate Azkaban flows

I would like to validate my Azkaban flows before I upload them to the server, as simple as that. Do we have a plugin or something to do that? If not, what are the classes within Azkaban's GitHub repository that do this validation? I could just adapt them and use…
Henrique Martins
  • 352
  • 2
  • 11
0
votes
1 answer

Set up priority on Azkaban parallel flows/dependencies

I'm using Azkaban 3.4.1 and one of my flows has more than 30 dependencies. Some dependencies take longer than others, so I want to prioritize these flows to start before the others (because the number of running threads is limited). Currently…
0
votes
1 answer

azkaban 3.0 jobs id doesn't change

I'm trying to do some testing with Azkaban 3.0. Currently I'm facing a problem whenever I kick off a project that I already kicked off. So, before the execution id is assigned for the new run, it will be the same as the last execution id used for…
tkyass
  • 2,968
  • 8
  • 38
  • 57
0
votes
1 answer

I want to create my own plugin for Azkaban

I'm trying to do what the title says, but there seems to be very little information about it on the internet. Does anyone know how to do this, or know of any helpful webpages?
Sankame
  • 113
  • 9
0
votes
1 answer

Call an Upload API in Java

There is a POST API in Azkaban to upload a zip file. I am able to upload using curl as shown in the documentation. curl -k -i -H "Content-Type: multipart/mixed" -X POST --form 'session.id=47cb9240-f8fe-46f9-9cba-1c1a293a0cf3' --form…
hatellla
  • 4,796
  • 8
  • 49
  • 101
0
votes
1 answer

Accessing Azkaban's "runtime properties"

I've been trying (with no luck) to solve a simple problem: accessing Azkaban's "global" runtime properties (supposedly available to the flow). I've tried every normal and abnormal method to access them from within a flow, to no avail. Does anyone…
Dave
  • 1
  • 2
0
votes
1 answer

mysql + Azkaban: Reading "LongBlob"

I am trying to build a query layer on the 'azkaban' database (language used: Java). I am running into what I thought would be a simple problem, but it is turning out to be irritating. This is the query I am running: select exec_id, CONVERT(log USING…
Roger
  • 2,823
  • 3
  • 25
  • 32