I am trying to extract quotations and their attributions from text across multiple records using a function from textacy. So far, I have run the function successfully on a single record, as follows:
import textacy
data = ("\"Hello, nice to meet you,\" said world 1")
doc = textacy.make_spacy_doc((data), lang="en_core_web_sm")
quotes = textacy.extract.triples.direct_quotations(doc)
print(list(quotes))
This is the output:
[DQTriple(speaker=[world], cue=[said], content="Hello, nice to meet you,")]
However, I get an error when I run the function on multiple records. Here is what I have tried:
import textacy
data = [
("\"Hello, nice to meet you,\" said world 1"),
("\"Hello, nice to meet you,\" said world 2"),
]
doc = textacy.make_spacy_doc((data), lang="en_core_web_sm")
quotes = textacy.extract.triples.direct_quotations(doc)
print(list(quotes))
And the error message:
raise TypeError(errors.type_invalid_msg("data", type(data), types.DocData))
TypeError: data type = <class 'list'> is invalid; type must match typing.Union[str, textacy.types.Record, spacy.tokens.doc.Doc].
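The error message says that make_spacy_doc accepts a single str, Record, or Doc, not a list. One possible workaround, offered only as a sketch, is to build one Doc per record and run the extractor on each in turn. The helper names extract_per_record and direct_quotations_per_record are hypothetical, not part of textacy. The only textacy calls used are the two already shown above.

```python
from typing import Any, Callable, Iterable, List


def extract_per_record(
    records: Iterable[str],
    make_doc: Callable[[str], Any],
    extract: Callable[[Any], Iterable[Any]],
) -> List[List[Any]]:
    """Hypothetical helper: build one doc per record, then extract from it.

    Returns one list of results per input record, in input order.
    """
    return [list(extract(make_doc(text))) for text in records]


def direct_quotations_per_record(
    records: Iterable[str], lang: str = "en_core_web_sm"
) -> List[List[Any]]:
    """Hypothetical wrapper that plugs the textacy calls into the helper."""
    # Imported here so the generic helper above does not require textacy.
    import textacy

    return extract_per_record(
        records,
        # make_spacy_doc receives a single string, which is a type it accepts.
        make_doc=lambda text: textacy.make_spacy_doc(text, lang=lang),
        extract=textacy.extract.triples.direct_quotations,
    )
```

Calling direct_quotations_per_record(data) with the two-record list from the failing attempt should then give one list of DQTriple results per record, assuming the en_core_web_sm pipeline is installed.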