
I'm having a slight issue running on the CNN data. The vocabulary file generated using the code above triggers an assertion error, and I can't work out what is causing it.

This is the error I get:

Traceback (most recent call last):
  File "/home/umair/sumModel/bazel-bin/textsum/seq2seq_attention.runfiles/__main__/textsum/seq2seq_attention.py", line 213, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "/home/umair/sumModel/bazel-bin/textsum/seq2seq_attention.runfiles/__main__/textsum/seq2seq_attention.py", line 165, in main
    assert vocab.CheckVocab(data.SENTENCE_START) > 0
AssertionError

The relevant function in seq2seq_attention.py:

def main(unused_argv):
  vocab = data.Vocab(FLAGS.vocab_path, 10000000)
  # Check for presence of required special tokens.
  assert vocab.CheckVocab(data.PAD_TOKEN) > 0
  assert vocab.CheckVocab(data.UNKNOWN_TOKEN) >= 0
  assert vocab.CheckVocab(data.SENTENCE_START) > 0
  assert vocab.CheckVocab(data.SENTENCE_END) > 0

1 Answer


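What about these special tokens? Your vocabulary file appears to be missing some of them, e.g. SENTENCE_START, which is the one checked by the assertion that fails in your traceback.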

# Special tokens
PARAGRAPH_START = '<p>'
PARAGRAPH_END = '</p>'
SENTENCE_START = '<s>'
SENTENCE_END = '</s>'
UNKNOWN_TOKEN = '<UNK>'
PAD_TOKEN = '<PAD>'
DOCUMENT_START = '<d>'
DOCUMENT_END = '</d>'

source: https://github.com/tensorflow/models/blob/master/textsum/data.py
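
To see which tokens your file lacks, something like the sketch below may help. It is not part of textsum: the helper names special_token_ids and add_missing_tokens are made up for illustration, and it assumes each valid vocabulary line has the form "<word> <count>" with ids assigned in line order starting from 0. Check that against how your copy of data.py reads the file.

```python
# Tokens checked by the assertions in main(); values copied from data.py.
REQUIRED_TOKENS = ['<UNK>', '<PAD>', '<s>', '</s>']


def special_token_ids(vocab_lines, required=REQUIRED_TOKENS):
    """Map each required token to its 0-based id, or None if it is absent.

    Assumption: every valid line is '<word> <count>' and ids follow line order.
    """
    ids = {}
    next_id = 0
    for line in vocab_lines:
        pieces = line.split()
        if len(pieces) != 2:
            continue  # malformed line, assumed to receive no id
        ids.setdefault(pieces[0], next_id)
        next_id += 1
    return {tok: ids.get(tok) for tok in required}


def add_missing_tokens(vocab_lines, required=REQUIRED_TOKENS, count=1):
    """Return the vocab lines with any missing required tokens appended."""
    found = special_token_ids(vocab_lines, required)
    extra = ['%s %d' % (tok, count) for tok in required if found[tok] is None]
    return [line.rstrip('\n') for line in vocab_lines] + extra


# Tiny hypothetical vocabulary that only contains '</s>'.
sample = ['the 120', 'said 45', '</s> 30']
print(special_token_ids(sample))
print(add_missing_tokens(sample))
```

For a real file you would pass the lines read from your vocab path instead of sample. Tokens appended at the end get positive ids as long as the file already contains at least one word. Note that three of the assertions require an id strictly greater than 0, so a '<PAD>', '<s>' or '</s>' sitting on the very first line would still fail the check.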

TomBa