I have been testing textsum with both the toy binary data and the Gigaword data: I trained a model on each and then ran decoding. The beam search decoder gives me all 'UNK' results with both sets of data and models. I was using the default parameter settings.
I first changed the data interface in data.py and batch_reader.py to read and parse the articles and abstracts from the Gigaword dataset. I trained a model for over 90K mini-batches on roughly 1.7 million documents. Then I tested the model on a separate test set, but it returned all 'UNK' results:
decoder result from model trained with gigaword
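For context, the data-interface change is roughly along these lines. This is a simplified sketch rather than the exact code: it assumes the Gigaword articles and headlines have already been extracted into two parallel text files with one example per line (the function name and file layout here are just illustrative), and the `<s>`/`</s>` markers are the sentence tags textsum's data.py defines.

```python
# Simplified sketch of the reader used in place of data.ExampleGen (illustrative only).
# Assumes articles and headlines were already pulled out of the Gigaword SGML into
# two parallel text files, one example per line.
import glob

SENTENCE_START = '<s>'   # sentence markers defined in textsum's data.py
SENTENCE_END = '</s>'


def GigawordGen(article_path, abstract_path, num_epochs=None):
  """Yields (article, abstract) text pairs formatted like the toy data."""
  epoch = 0
  while num_epochs is None or epoch < num_epochs:
    for art_file, abs_file in zip(sorted(glob.glob(article_path)),
                                  sorted(glob.glob(abstract_path))):
      with open(art_file) as fa, open(abs_file) as fb:
        for article, headline in zip(fa, fb):
          # Wrap each side in sentence tags the way the toy data does.
          article = '%s %s %s' % (SENTENCE_START, article.strip(), SENTENCE_END)
          abstract = '%s %s %s' % (SENTENCE_START, headline.strip(), SENTENCE_END)
          yield (article, abstract)
    epoch += 1
```

The text generator in batch_reader.py then takes its (article, abstract) pairs from a generator like this instead of pulling them out of tf.Example protos.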
Then I used the toy binary data that comes with the textsum code to train a small model for fewer than 1K mini-batches, and tested on the same binary data. It gives all 'UNK' results in the decoding file except for a few 'for' and '.' tokens:
decoder result from model trained with binary data
I also looked at the training loss in TensorBoard, and it shows that training converged.
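For reference, the toy binary file can be inspected with something like the snippet below. The record layout is what I understand from data.ExampleGen and data_convert_example.py (an 8-byte packed length followed by a serialized tf.Example with 'article' and 'abstract' byte features), and the data path is just where the toy file sits in my checkout.

```python
# Peek at the first record of the toy binary data that ships with textsum.
# Assumed record layout: 8-byte 'q'-packed length, then a serialized
# tf.Example with 'article' and 'abstract' byte features.
import struct

from tensorflow.core.example import example_pb2

with open('data/data', 'rb') as reader:  # path to the toy binary file; adjust as needed
  len_bytes = reader.read(8)
  str_len = struct.unpack('q', len_bytes)[0]
  example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0]
  example = example_pb2.Example.FromString(example_str)
  print(example.features.feature['article'].bytes_list.value[0])
  print(example.features.feature['abstract'].bytes_list.value[0])
```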
In training and testing, I didn't change any of the default settings. Has anyone tried the same thing and run into the same issue?