
I am trying to extract quotations and their attributions from multiple text records using textacy's direct_quotations function. So far, I have run it successfully on a single record, like this:

import textacy

data = ("\"Hello, nice to meet you,\" said world 1")

doc = textacy.make_spacy_doc((data), lang="en_core_web_sm")

quotes = textacy.extract.triples.direct_quotations(doc)

print(list(quotes))

This is the output:

[DQTriple(speaker=[world], cue=[said], content="Hello, nice to meet you,")]

But I run into errors when I attempt to run the function on multiple records. Here is what I have tried:

import textacy

data = [
        ("\"Hello, nice to meet you,\" said world 1"),
        ("\"Hello, nice to meet you,\" said world 2"),
        ]

doc = textacy.make_spacy_doc((data), lang="en_core_web_sm")

quotes = textacy.extract.triples.direct_quotations(doc)

print(list(quotes))

And the error message:

raise TypeError(errors.type_invalid_msg("data", type(data), types.DocData))
TypeError: data type = <class 'list'> is invalid; type must match typing.Union[str, textacy.types.Record, spacy.tokens.doc.Doc].


1 Answer

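As the error message says, make_spacy_doc accepts a single str, Record, or Doc, not a list. Build one doc per record by looping over the list:

import textacy

data = [
    "\"Hello, nice to meet you,\" said world 1",
    "\"Hello, nice to meet you,\" said world 2",
]

# Process each record on its own and print its quotation triples
for record in data:
    doc = textacy.make_spacy_doc(record, lang="en_core_web_sm")
    print(list(textacy.extract.triples.direct_quotations(doc)))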

This answer was posted as an edit to the question Perform function on multiple records using textacy by the OP jedmund under CC BY-SA 4.0.
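
If the results need to be kept per record rather than printed, the same loop can be wrapped in a small helper. The following is only a sketch: `collect_quotes` and `collect_textacy_quotes` are hypothetical names, not part of textacy, and the only textacy calls used are the two already shown in the question (`make_spacy_doc` and `extract.triples.direct_quotations`). The doc builder and extractor are passed in as arguments, so the looping logic can be tried without a spaCy model installed.

```python
from typing import Any, Callable, Iterable, List, Tuple


def collect_quotes(
    records: Iterable[str],
    make_doc: Callable[[str], Any],
    extract: Callable[[Any], Iterable[Any]],
) -> List[Tuple[int, List[Any]]]:
    """Run `extract` on each record separately and keep the results per record."""
    results = []
    for index, text in enumerate(records):
        # A nested list would be rejected by make_spacy_doc anyway;
        # fail early and say which record is the problem.
        if not isinstance(text, str):
            raise TypeError(
                f"record {index} is {type(text).__name__}, expected str"
            )
        doc = make_doc(text)
        # Materialize the extractor's output so it can be reused later.
        results.append((index, list(extract(doc))))
    return results


def collect_textacy_quotes(records, lang="en_core_web_sm"):
    """Hypothetical wrapper that plugs in the textacy calls from the question."""
    import textacy  # imported lazily so collect_quotes works without textacy

    return collect_quotes(
        records,
        make_doc=lambda text: textacy.make_spacy_doc(text, lang=lang),
        extract=textacy.extract.triples.direct_quotations,
    )
```

Calling `collect_textacy_quotes(data)` on the list from the question should return one `(index, triples)` pair per record, which makes it easy to tell which record each quotation came from.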
