ModuleNotFoundError: No module named 'haystack.nodes'

Question

I am following the tutorial on Haystack's website for building an extractive QA system, and I am trying to convert a PDF to text. The blog post is here: https://www.deepset.ai/blog/automating-information-extraction-with-question-answering

I installed haystack with pip, but I get the error below. I even tried `!pip install haystack.nodes`, but that doesn't work either.

Note: I am using Google Colab for this.

Here is my detailed code and error:

!pip -q install haystack haystack.nodes
path = '/content/drive/MyDrive/Colab Notebooks/NLP/Information Extraction QA with Haystack (Adidas Financial corpus)'
from haystack.nodes import PDFToTextConverter

pdf_converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=['en'])

converted = pdf_converter.convert(file_path = path, meta = { 'company': 'Company_1', 'processed': False })

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-7-61021fb3b7b8> in <cell line: 1>()
----> 1 from haystack.nodes import PDFToTextConverter
      2 
      3 pdf_converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=['en'])
      4 
      5 converted = pdf_converter.convert(file_path = path, meta = { 'company': 'Company_1', 'processed': False })

Welcome to Stack Overflow. Are you using the same python binary that is associated to the pip command? — ewokx, Apr 17 '23 at 02:59
Hello. Please follow the installation instructions in this tutorial, where `PDFToTextConverter` is used: https://haystack.deepset.ai/tutorials/08_preprocessing It should work — Stefano Fiorucci - anakin87, Apr 17 '23 at 07:23
Also remember to install `farm-haystack` (not simply `haystack`). — Stefano Fiorucci - anakin87, Apr 19 '23 at 08:47

score 1 · Answer 1 · answered May 25 '23 at 10:05

To install Haystack, you need to run `pip install farm-haystack`. As Stefano mentioned, the PyPI package is called `farm-haystack`, not just `haystack`. A good starting point is the Haystack tutorials, which you can run as Python notebooks on Google Colab, for example this tutorial using the `PDFToTextConverter`.
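To confirm which package actually ended up in the Colab runtime, a small check like the following may help. It is only a sketch using the Python standard library: the helper names `installed_version` and `module_available` are made up for illustration, and the uninstall step mentioned in the comment is an assumption about how to clean up, not something stated in the tutorial.

```python
import importlib.util
from importlib import metadata

# Assumed cleanup in a Colab cell before running this check (not from the tutorial):
#   !pip uninstall -y haystack
#   !pip install farm-haystack


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_available(module_name):
    """Return True if module_name can be located by this interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when the parent package (e.g. 'haystack') is missing or fails to import.
        return False


# Show which of the two similarly named distributions is installed.
for dist in ("haystack", "farm-haystack"):
    print(dist, "->", installed_version(dist))

# The import from the question only works if this prints True.
print("haystack.nodes importable:", module_available("haystack.nodes"))
```

If `farm-haystack` shows a version and the last line prints `True`, the import from the question should go through. If only `haystack` is listed, the wrong package is installed.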
