18

I have recently installed airflow for my workflows. While creating my project, I executed following command:

airflow initdb

which returned following error:

[2016-08-15 11:17:00,314] {__init__.py:36} INFO - Using executor SequentialExecutor
DB: sqlite:////Users/mikhilraj/airflow/airflow.db
[2016-08-15 11:17:01,319] {db.py:222} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
ERROR [airflow.models.DagBag] Failed to import: /usr/local/lib/python2.7/site-packages/airflow/example_dags/example_twitter_dag.py
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 247, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/usr/local/lib/python2.7/site-packages/airflow/example_dags/example_twitter_dag.py", line 26, in <module>
    from airflow.operators import BashOperator, HiveOperator, PythonOperator
ImportError: cannot import name HiveOperator
Done.

I checked some similar issues on the web, which suggested installing airflow[hive] and pyhs2, but that doesn't seem to work.

Rusty

3 Answers

24

Are you using the HiveOperator? The error you are getting comes from one of the example DAGs. In production you should probably set load_examples to False, and install airflow[hive] only if you are actually using the HiveOperator.

That being said, I'm not sure why airflow[hive] isn't enough for you. You may try installing airflow[hive,hdfs,jdbc], but airflow[hive] should be enough to get rid of the HiveOperator import error. Could you perhaps add what other error you are getting?
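For reference, disabling the examples is a one-line change in airflow.cfg (the path below assumes the default AIRFLOW_HOME of ~/airflow):

```ini
# ~/airflow/airflow.cfg
[core]
# Don't load the bundled example DAGs (including example_twitter_dag.py)
load_examples = False
```

Re-running airflow initdb after the change should no longer try to import the example DAGs.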

Vineet Goel
  • Seems like that is the issue. In production `airflow[hive]` worked for me. Can you tell me how to set `load_examples` to `False`? – Rusty Aug 16 '16 at 08:18
  • 2
    Check out the `airflow.cfg` file. Airflow automatically creates the default `airflow.cfg` file for you in the AIRFLOW_HOME dir. The file has a variable `load_examples`, which by default is set to `True`. – Vineet Goel Aug 16 '16 at 18:41
  • 3
    The command `pip install airflow[hive]` was sufficient to resolve the error on a fresh install for me. – Taylor D. Edmiston Nov 14 '16 at 21:01
  • 3
    The tutorial triggers this problem. – Javier Dec 06 '16 at 17:24
  • I solved this by installing from the git repo: pip install -e "git+ssh://git@github.com/apache/incubator-airflow.git@master#egg=airflow" – mbarkhau Feb 15 '17 at 11:17
3
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 247, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/usr/local/lib/python2.7/dist-packages/airflow/example_dags/example_twitter_dag.py", line 26, in <module>
    from airflow.operators import BashOperator, HiveOperator, PythonOperator
ImportError: cannot import name HiveOperator

If you still want to continue with the example data installed, the following method works for Ubuntu 14.04 with the latest Python 2.7 (tested on DO).

1. apt-get update

2. apt-get install python-pip python-dev build-essential

3. pip install --upgrade pip

3a. which pip   # /usr/local/bin/pip

3b. pip -V   # pip 9.0.1 from /usr/local/lib/python2.7/dist-packages (python 2.7)

4. pip install --upgrade virtualenv

(Step 5 is optional)

5. apt-get install sqlite3 libsqlite3-dev

https://askubuntu.com/questions/683601/how-to-upgrade-python-setuptools-12-2-on-ubuntu-15-04

6. apt-get remove python-setuptools

7. pip install -U pip setuptools

8. export AIRFLOW_HOME=~/airflow

9. pip install airflow

10. pip install airflow[hive]

11. airflow initdb

You should get a response like the one below:

[2017-02-01 12:04:28,289] {__init__.py:36} INFO - Using executor SequentialExecutor
[2017-02-01 12:04:28,350] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-02-01 12:04:28,376] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
DB: sqlite:////root/airflow/airflow.db
[2017-02-01 12:04:28,522] {db.py:222} INFO - Creating tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
Done.

NOTE: Please apply sudo to the commands above where applicable.

Community
0

Check whether the HiveOperator is imported in the DAG file. If not, you can do something like:

from airflow.operators.hive_operator import HiveOperator
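As a quick check (a sketch, not tied to any particular DAG), you can probe whether that import path actually resolves before using the operator. In Airflow 1.x the explicit module path airflow.operators.hive_operator is the dependable one; the bare `from airflow.operators import HiveOperator` only works once the hive extras are installed.

```python
# Sketch: check whether HiveOperator is importable before using it in a DAG.
# The explicit module path (airflow.operators.hive_operator) is what the
# answer above recommends; the except branch fires if airflow or its hive
# extras (pip install airflow[hive]) are missing from the environment.
try:
    from airflow.operators.hive_operator import HiveOperator
except ImportError:
    HiveOperator = None  # hive extras not installed

print("HiveOperator available:", HiveOperator is not None)
```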