I would like to reverse the tokenization that I have applied to my data.
data = [['this', 'is', 'a', 'sentence'], ['this', 'is', 'a', 'sentence', '2']]
Expected output:
['this is a sentence', 'this is a sentence 2']
I tried to do this with the following code block:
from nltk.tokenize.treebank import TreebankWordDetokenizer
data_untoken = []
for i, text in enumerate(data):
    data_untoken.append(text)
    data_untoken = TreebankWordDetokenizer().detokenize(text)
But I get the following error:
'str' object has no attribute 'append'
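
The error most likely comes from rebinding data_untoken to the string returned by detokenize() inside the loop: on the next iteration data_untoken is a str, so .append fails. Below is a minimal sketch of one possible fix that keeps data_untoken as a list and appends one joined string per token list. It uses plain str.join as a stand-in for the detokenizer (an assumption made so that the example has no dependencies); TreebankWordDetokenizer().detokenize(tokens) could be used in its place if punctuation spacing matters.

```python
data = [['this', 'is', 'a', 'sentence'], ['this', 'is', 'a', 'sentence', '2']]

data_untoken = []
for tokens in data:
    # Build one string per token list and append it to the result list.
    # data_untoken itself is never reassigned, so it stays a list.
    # Stand-in (assumption): " ".join; swap in
    # TreebankWordDetokenizer().detokenize(tokens) for punctuation handling.
    data_untoken.append(" ".join(tokens))

print(data_untoken)
# ['this is a sentence', 'this is a sentence 2']
```

The same result can be written as a list comprehension: [" ".join(tokens) for tokens in data].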