Python version: 3.9
System version: macOS 12.5.1 (Apple M1)
I ran into the following problem while using langchain.
Sample code:
from langchain.document_loaders import DirectoryLoader

if __name__ == '__main__':
    loader = DirectoryLoader('./data/', glob='**/*.md')
    documents = loader.load()
Running it produces the following traceback:
Traceback (most recent call last):
  File "/Users/zhangsan/project/python/test/langchain/text01.py", line 5, in <module>
    documents = loader.load()
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/langchain/document_loaders/directory.py", line 84, in load
    raise e
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/langchain/document_loaders/directory.py", line 78, in load
    sub_docs = self.loader_cls(str(i), **self.loader_kwargs).load()
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/langchain/document_loaders/unstructured.py", line 70, in load
    elements = self._get_elements()
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/langchain/document_loaders/unstructured.py", line 102, in _get_elements
    from unstructured.partition.auto import partition
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/unstructured/partition/auto.py", line 9, in <module>
    from unstructured.partition.doc import partition_doc
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/unstructured/partition/doc.py", line 7, in <module>
    from unstructured.partition.docx import partition_docx
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/unstructured/partition/docx.py", line 6, in <module>
    import docx
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/docx/__init__.py", line 3, in <module>
    from docx.api import Document # noqa
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/docx/api.py", line 14, in <module>
    from docx.package import Package
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/docx/package.py", line 9, in <module>
    from docx.opc.package import OpcPackage
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/docx/opc/package.py", line 9, in <module>
    from docx.opc.part import PartFactory
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/docx/opc/part.py", line 12, in <module>
    from .oxml import serialize_part_xml
  File "/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/docx/opc/oxml.py", line 12, in <module>
    from lxml import etree
ImportError: dlopen(/Users/zhangsan/project/python/test/venv/lib/python3.9/site-packages/lxml/etree.cpython-39-darwin.so, 0x0002): symbol not found in flat namespace (_exsltDateXpathCtxtRegister)
The traceback shows that the failure starts at the "import docx" step, so I opened a Python interactive shell and ran that import manually:
import docx
This triggers the same ImportError as shown above.
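Since the last frame in the traceback is "from lxml import etree", it may help to test that import on its own and to record which architecture the interpreter is running as. The following is only a diagnostic sketch, not a confirmed fix; the helper names describe_environment and try_import are made up for illustration. On Apple Silicon, platform.machine() reports "arm64" for a native interpreter and "x86_64" for one running under Rosetta, and a mismatch between the interpreter and the installed lxml build is one possible (unverified here) cause of a dlopen "symbol not found" error.

```python
import importlib
import platform
import sys


def describe_environment():
    """Collect basic facts about the running interpreter."""
    return {
        "python": sys.version.split()[0],
        "machine": platform.machine(),  # e.g. "arm64" or "x86_64"
        "executable": sys.executable,   # confirms which venv is in use
    }


def try_import(module_name):
    """Return (True, None) on success, or (False, message) on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    print(describe_environment())
    # Import the innermost module first, then the package that depends on it.
    for name in ("lxml.etree", "docx"):
        ok, err = try_import(name)
        print(name, "OK" if ok else f"FAILED: {err}")
```

If "lxml.etree" already fails by itself, the problem lies in the lxml installation rather than in langchain, unstructured, or docx.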