udpipe_accuracy() always gives the same error "The CoNLL-U line '....' does not contain 10 columns!"

Question

This is regarding the R package udpipe for NLP. I am using it to tokenize, tag, lemmatize and perform dependency parsing on text files.

I am not sure which format the CoNLL-U file needs to be in for the function udpipe_accuracy.

I loaded a CSV file with 10 columns, but the error persists.

I could not find any questions about this package on SO, and there is no udpipe tag either.

What are you trying to do? Can you give a sample of your problem and the expected output? Maybe there is another (possibly better) package that can solve your problem. — YOLO, Feb 25 '18 at 16:27
@ManishSaraswat, I am working on summarising large text documents. Before I can use any package like [textrank](https://github.com/bnosac/textrank/blob/master/vignettes/textrank.Rmd) I need to convert the text into CoNLLU format. I guess that's the standard format for any NLP work on text. — Lazarus Thurston, Feb 26 '18 at 04:58
Can you provide a sample data set? I think there might be another way to do it that doesn't require the CoNLL-U format. — YOLO, Feb 26 '18 at 11:29
I guess I no longer need to run udpipe_accuracy, as I sorted out the fundamental problem of accurately creating a terminology file. But I now have a problem ignoring page headers and footers in the text file imported from a PDF document, as the CoNLL-U output also includes the headers as sentences. I will close this question and open a new one, if that is fine, @ManishSaraswat. — Lazarus Thurston, Feb 26 '18 at 17:16

score 1 · Accepted Answer · answered Mar 08 '18 at 16:22

udpipe_accuracy is used in combination with udpipe_train. If you trained a custom udpipe model with udpipe_train on data in CoNLL-U format, you can check how good the model is by running udpipe_accuracy on hold-out CoNLL-U data that was not used to build it. The file you pass to udpipe_accuracy therefore has to be an annotated CoNLL-U file, not a CSV.
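To illustrate why a 10-column CSV still triggers the error, here is a minimal, language-independent sketch in Python of the kind of check the message describes. It assumes the standard CoNLL-U layout: token lines carry 10 tab-separated fields, lines starting with `#` are comments, and blank lines separate sentences. The helper `find_bad_lines` is hypothetical and is not part of udpipe; how udpipe performs its own check internally is not shown here.

```python
# Sketch only: find_bad_lines is a hypothetical helper, not a udpipe function.
# Assumption: token lines follow the CoNLL-U layout of 10 tab-separated fields
# (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
CONLLU_FIELDS = 10


def find_bad_lines(text):
    """Return (line_number, field_count) for each token line without 10 fields."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Blank lines separate sentences; lines starting with '#' are comments.
        if not line.strip() or line.startswith("#"):
            continue
        count = len(line.split("\t"))
        if count != CONLLU_FIELDS:
            bad.append((number, count))
    return bad


# A small, well-formed CoNLL-U sentence (fields separated by tabs).
good = (
    "# text = Hello world\n"
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t1\tvocative\t_\t_\n"
    "\n"
)

# The same first token written as a CSV row: ten values, but no tabs.
csv_like = "1,Hello,hello,INTJ,_,_,0,root,_,_\n"

print(find_bad_lines(good))      # no offending lines
print(find_bad_lines(csv_like))  # the comma-separated row counts as one field
```

A comma-separated row contains no tab characters, so it is read as a single field regardless of how many values it holds. Running a check like this on the hold-out file before calling udpipe_accuracy can show which lines are not in the expected layout.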
