I have been following Paul Crickard's book "Data Engineering with Python" and wanted to install and configure Apache Airflow. After installing it, I tried to initialize the database with
airflow db init
It raised the following error:
[2022-06-28 10:40:08,549] {db.py:1448} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
INFO [alembic.runtime.migration] Running upgrade 13eb55f81627 -> 338e90f54d61, Add ``operator`` and ``queued_dttm`` to ``task_instance`` table
Traceback (most recent call last):
File "/home/kien/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/home/kien/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: duplicate column name: operator
I have not yet tried anything meaningful. I also do not know where I can find the file "13eb55f81627".
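One way to see what state the metadata database is in before retrying the migration is to read the recorded Alembic revision and the columns of the task_instance table directly. The following is only a minimal sketch, not something taken from the book: the function name inspect_airflow_db is hypothetical, and the path ~/airflow/airflow.db assumes the default AIRFLOW_HOME.

```python
import sqlite3
from pathlib import Path


def inspect_airflow_db(db_path):
    """Return (recorded Alembic revision or None, task_instance column names)."""
    conn = sqlite3.connect(str(db_path))
    try:
        tables = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        revision = None
        if "alembic_version" in tables:
            # Alembic stores the current revision id in this table.
            row = conn.execute(
                "SELECT version_num FROM alembic_version"
            ).fetchone()
            revision = row[0] if row else None
        # PRAGMA table_info returns one row per column; index 1 is the name.
        # It returns no rows if the table does not exist.
        columns = [
            row[1] for row in conn.execute("PRAGMA table_info(task_instance)")
        ]
        return revision, columns
    finally:
        conn.close()


if __name__ == "__main__":
    # Assumed default location; adjust if AIRFLOW_HOME points elsewhere.
    db = Path.home() / "airflow" / "airflow.db"
    if db.exists():
        revision, columns = inspect_airflow_db(db)
        print("alembic revision:", revision)
        print("operator column present:", "operator" in columns)
    else:
        print("no database found at", db)
```

If the operator column is already present while the recorded revision is still 13eb55f81627, the migration named in the log would be trying to add a column that already exists.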
I am using Ubuntu 20.04.4 LTS and Python 3.8.6.
pip freeze also gave me:
alembic==1.8.0
amqp==5.1.1
anyio==3.6.1
apache-airflow==2.3.2
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-ftp==3.0.0
apache-airflow-providers-http==3.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-postgres==5.0.0
apache-airflow-providers-slack==5.0.0
apache-airflow-providers-sqlite==3.0.0
apispec==3.3.2
appdirs==1.4.4
apt-xapian-index==0.49
apturl==0.5.2
argcomplete==2.0.0
argon2-cffi==20.1.0
asn1crypto==0.24.0
async-generator==1.10
atomicwrites==1.1.5
attrs==20.3.0
Automat==0.8.0
Babel==2.10.3
backcall==0.2.0
beautifulsoup4==4.8.2
billiard==3.6.4.0
bleach==3.2.1
blinker==1.4
Brlapi==0.7.0
cachelib==0.9.0
cattrs==1.10.0
celery==5.2.7
certifi==2019.11.28
cffi==1.14.3
chardet==3.0.4
Click==7.0
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
clickclick==20.10.2
colorama==0.4.3
colorlog==4.8.0
command-not-found==0.3
commonmark==0.9.1
configobj==5.0.6
connexion==2.14.0
constantly==15.1.0
cron-descriptor==1.2.24
croniter==1.3.5
cryptography==2.8
cssselect==1.1.0
cupshelpers==1.0
cycler==0.10.0
Cython==0.29.21
dbus-python==1.2.16
decorator==4.4.2
defer==1.0.6
defusedxml==0.6.0
deluge==2.0.3
Deprecated==1.2.13
dill==0.3.3
distlib==0.3.1
distro==1.4.0
distro-info===0.23ubuntu1
dlx==1.0.4
dnspython==2.2.1
docplex==2.15.194
docutils==0.18.1
email-validator==1.2.1
entrypoints==0.3
et-xmlfile==1.0.1
fastdtw==0.3.4
fastjsonschema==2.15.0
filelock==3.6.0
Flask==1.1.4
Flask-AppBuilder==3.4.5
Flask-Babel==2.0.0
Flask-Caching==1.11.1
Flask-JWT-Extended==3.25.1
Flask-Login==0.4.1
Flask-OpenID==1.3.0
Flask-Session==0.4.0
Flask-SQLAlchemy==2.5.1
Flask-WTF==0.14.3
flower==1.0.0
gpg===1.13.1-unknown
graphviz==0.20
greenlet==1.1.2
gunicorn==20.1.0
h11==0.12.0
h5py==3.1.0
html5lib==1.0.1
httpcore==0.15.0
httplib2==0.14.0
httpx==0.23.0
humanize==4.2.2
hyperlink==19.0.0
idna==2.8
importlib-metadata==4.12.0
importlib-resources==5.8.0
incremental==16.10.1
inflection==0.5.1
intel-openmp==2022.0.2
ipykernel==5.3.4
ipython==7.19.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
itsdangerous==1.1.0
jdcal==1.0
jedi==0.17.2
Jinja2==2.11.2
joblib==0.17.0
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==6.1.7
jupyter-console==6.2.0
jupyter-core==4.7.0
jupyterlab-pygments==0.1.2
keyring==18.0.1
kiwisolver==1.0.1
kombu==5.2.4
language-selector==0.1
launchpadlib==1.10.13
lazr.restfulclient==0.14.2
lazr.uri==1.0.3
lazy-object-proxy==1.7.1
lockfile==0.12.2
louis==3.12.0
lxml==4.6.2
macaroonbakery==1.3.1
Mako==1.1.0
Markdown==3.3.7
MarkupSafe==2.0.1
marshmallow==3.17.0
marshmallow-enum==1.5.1
marshmallow-oneofschema==3.0.1
marshmallow-sqlalchemy==0.26.1
matplotlib==3.1.2
mistune==0.8.4
mkl==2022.0.2
more-itertools==4.2.0
mpmath==1.2.1
multitasking==0.0.9
nbclient==0.5.1
nbconvert==6.0.7
nbformat==5.0.8
nest-asyncio==1.4.3
netifaces==0.10.4
networkx==2.5
notebook==6.1.5
ntlm-auth==1.5.0
numexpr==2.7.1
numpy==1.17.4
oauthlib==3.1.0
olefile==0.46
openpyxl==3.0.3
packaging==20.3
pandas==1.3.3
pandocfilters==1.4.3
parso==0.7.1
pathspec==0.9.0
pendulum==2.1.2
pexpect==4.6.0
pickleshare==0.7.5
Pillow==7.0.0
platformdirs==2.5.1
pluggy==1.0.0
ply==3.11
prison==0.2.1
prometheus-client==0.9.0
prompt-toolkit==3.0.8
protobuf==3.6.1
psutil==5.5.1
psycopg2-binary==2.9.3
ptyprocess==0.6.0
py==1.8.1
pyasn1==0.4.2
pyasn1-modules==0.2.1
pybind11==2.6.2
pycairo==1.16.2
pycparser==2.20
pycrypto==2.6.1
pycups==1.9.73
pydot==1.4.2
Pygments==2.7.2
PyGObject==3.36.0
PyHamcrest==1.9.0
PyJWT==1.7.1
pylatexenc==2.8
pymacaroons==0.13.0
PyNaCl==1.3.0
pyOpenSSL==19.0.0
pyparsing==2.4.6
PyQt5==5.14.1
pyRFC3339==1.1
pyrsistent==0.17.3
pytest==4.6.9
python-apt==2.0.0+ubuntu0.20.4.7
python-constraint==1.4.0
python-daemon==2.3.0
python-dateutil==2.8.1
python-debian===0.1.36ubuntu1
python-libtorrent==1.1.13
python-nvd3==0.15.0
python-slugify==6.1.2
python3-openid==3.2.0
pytz==2022.1
pytzdata==2020.1
pyxdg==0.26
PyYAML==5.3.1
pyzmq==20.0.0
qiskit==0.23.6
qiskit-aer==0.7.5
qiskit-aqua==0.8.2
qiskit-ibmq-provider==0.11.1
qiskit-ignis==0.5.2
qiskit-terra==0.16.4
qiskit-textbook==0.1.0
qtconsole==4.7.7
QtPy==1.9.0
Quandl==3.6.0
ranger-fm==1.9.3
rencode==1.0.6
reportlab==3.5.34
requests==2.22.0
requests-ntlm==1.1.0
requests-unixsocket==0.2.0
retworkx==0.7.2
rfc3986==1.5.0
rich==12.4.4
scikit-learn==1.0.2
scipy==1.3.3
scour==0.37
seaborn==0.11.1
SecretStorage==2.3.1
Send2Trash==1.5.0
service-identity==18.1.0
setproctitle==1.1.10
simplejson==3.16.0
sip==4.19.21
six==1.14.0
slack-sdk==3.17.2
sniffio==1.2.0
soupsieve==1.9.5
SQLAlchemy==1.4.9
SQLAlchemy-JSONField==1.0.0
SQLAlchemy-Utils==0.38.2
ssh-import-id==5.10
swagger-ui-bundle==0.0.9
sympy==1.7.1
system-service==0.3
systemd-python==234
tables==3.6.1
tabulate==0.8.10
tbb==2021.5.1
tenacity==8.0.1
termcolor==1.1.0
terminado==0.9.1
testpath==0.4.4
text-unidecode==1.3
threadpoolctl==2.1.0
tornado==6.1
traitlets==5.0.5
Twisted==18.9.0
typing-extensions==4.2.0
ubuntu-advantage-tools==27.8
ubuntu-drivers-common==0.0.0
ufw==0.36
unattended-upgrades==0.1
unicodecsv==0.14.1
urllib3==1.25.8
usb-creator==0.3.7
vine==5.0.0
virtualenv==20.14.0
wadllib==1.3.3
wcwidth==0.1.8
webencodings==0.5.1
websockets==8.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
wrapt==1.14.1
WTForms==2.3.3
xkit==0.0.0
xlrd==1.1.0
xlwt==1.3.0
yfinance==0.1.55
zipp==3.8.0
zope.interface==4.7.1
I then tried to get rid of the duplicate columns by doing the following:
sqlite> .output text.txt
sqlite> .open airflow.db
sqlite> .dump
This emptied airflow.db, and when I converted it to airflow.sql, the result was:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
COMMIT;
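A dump that consists only of the transaction wrapper is what SQLite produces for a database with no tables in it. A small sketch using Python's built-in sqlite3 module illustrates this. The helper name dump_lines is hypothetical, and Connection.iterdump() does not emit the PRAGMA line that the sqlite3 shell adds.

```python
import sqlite3


def dump_lines(db_path=":memory:"):
    """Return the SQL dump of a SQLite database as a list of statements."""
    conn = sqlite3.connect(db_path)
    try:
        # iterdump() yields the same kind of SQL text as the shell's .dump
        return list(conn.iterdump())
    finally:
        conn.close()


if __name__ == "__main__":
    # A brand-new in-memory database has no schema, so only the
    # BEGIN TRANSACTION / COMMIT wrapper is printed.
    for line in dump_lines():
        print(line)
```

If airflow.db dumps like this, the file that was opened contained no schema at all, so a later airflow db init would presumably be starting from a blank database.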
Re-running
airflow db init
finally worked: no further errors were raised and the initialization completed. So my question is: does anyone know what I might have done?