I want to use an ETL service, but I am stuck between Apache Airflow and Matillion.
- Are they the same?
- What are the main differences?
Airflow's primary use case is orchestration and scheduling, not ETL. You can perform ETL tasks inside Airflow DAGs, but unless you plan to run Airflow on a containerized / Kubernetes architecture, you'll quickly see performance bottlenecks and even hung or stuck processes. There are ways to mitigate this, certainly, but heavy data processing is not Airflow's primary use case.
Matillion's primary use case is ETL (really ELT), so it isn't going to suffer the same performance issues or require a complex infrastructure to achieve that performance. It also provides a GUI-based, code-optional interface, so you don't have to be a Python expert to get results quickly.
I actually view Airflow and Matillion as (potentially) complementary. If you have inter-application dependencies, for example, you can orchestrate Matillion workflows with Airflow or another third-party scheduler and gain the benefits of both.
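To make the "orchestrate one with the other" idea concrete, here is a minimal sketch in plain Python. It uses neither Airflow nor Matillion: the standard-library `graphlib` module stands in for the orchestrator, and `trigger_etl_job` is a hypothetical placeholder for whatever call would start the ETL job. All task names are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run each task once, only after everything it depends on has finished."""
    order = []
    # static_order() yields task names with prerequisites first.
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        order.append(name)
    return order


log = []


def export_from_app():
    # Stand-in for an upstream application producing data.
    log.append("export_from_app: upstream application wrote its files")


def trigger_etl_job():
    # Hypothetical placeholder: a real setup would ask the ETL tool to run
    # a job here. The heavy data work happens in that tool, not in the
    # orchestrator process.
    log.append("trigger_etl_job: ETL tool loaded and transformed the data")


def refresh_reports():
    # Stand-in for a downstream application that needs the loaded data.
    log.append("refresh_reports: downstream reports rebuilt")


tasks = {
    "export_from_app": export_from_app,
    "trigger_etl_job": trigger_etl_job,
    "refresh_reports": refresh_reports,
}

# Each key maps a task to the set of tasks that must finish before it.
dependencies = {
    "trigger_etl_job": {"export_from_app"},
    "refresh_reports": {"trigger_etl_job"},
}

execution_order = run_pipeline(tasks, dependencies)
for line in log:
    print(line)
```

The split is the point: the orchestrator only decides what runs and when, while the ETL step is a single node in the graph whose actual work is delegated elsewhere.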
I've never used Matillion, so I can't speak to any specific use case you have.
But from a quick look at Matillion, I can tell that Matillion and Airflow aren't the same at all.
Matillion is an Extract/Transform/Load tool. You can compare it with tools like AWS Glue, Apache NiFi, or DMExpress.
Airflow is an orchestration tool. You can compare it with tools like Apache Oozie.
More importantly, Matillion isn't free.