I see this issue when training on my own dataset, which I converted to binary with data_convert_example.py. After a week of training, the decode results make no sense when I compare the decode and ref files.
If anyone has been successful and gotten results similar to what is posted in the Textsum readme using their own data, I would love to know what has worked for you...environment, tf build, number of articles.
I have had no luck so far with TF 0.11. I have gotten some results with 0.9, but the decode results look like those shown below, and I have no idea where they are coming from.
I am currently running Ubuntu 16.04, TF 0.9, CUDA 7.5 and cuDNN 4. I tried TF 0.11 but ran into other issues, so I went back to 0.9. The decode results do seem to be generated from valid articles, but the reference file and decode file indices have NO correlation.
If anyone can provide any help or direction, it would be greatly appreciated. Otherwise, should I figure anything out, I will post here.
A few final questions regarding the vocab file. Does it need to be sorted by word frequency? I never did anything along these lines when generating it, and I wasn't sure whether this would throw something off as well.
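For context, sorting by frequency would be a small change on my end. Here is a minimal sketch of what I mean, assuming the vocab file holds one `word count` pair per line and that splitting on whitespace is an acceptable tokenizer. The function name `build_vocab_lines` is hypothetical, and I have not confirmed whether the order actually matters to Textsum.

```python
from collections import Counter


def build_vocab_lines(articles):
    """Return 'word count' lines, most frequent word first.

    Assumption: each article is a plain string and whitespace
    splitting is good enough as a tokenizer.
    """
    counts = Counter()
    for text in articles:
        counts.update(text.split())
    # Sort by descending count, then alphabetically so ties are stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ["%s %d" % (word, count) for word, count in ordered]
```

Writing the returned lines to a single file, one per line, would give a frequency-sorted vocab.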
Finally, when generating the data I assumed that the training articles should be split into smaller batches. I separated them into multiple files of 100 articles each, named data-0, data-1, etc. I also kept all the vocab in one file, which has not thrown any errors.
Are these assumptions correct?
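To be concrete about the split, this is roughly the logic I used, shown as a minimal sketch. The function name `shard_articles` is hypothetical, and the conversion of each chunk to binary (done separately with data_convert_example.py) is left out.

```python
def shard_articles(articles, shard_size=100):
    """Yield (filename, chunk) pairs: data-0, data-1, ...

    Assumption: `articles` is a list already loaded in memory.
    The last shard may hold fewer than `shard_size` articles.
    """
    for start in range(0, len(articles), shard_size):
        name = "data-%d" % (start // shard_size)
        yield name, articles[start:start + shard_size]
```

With 250 articles this yields data-0 and data-1 with 100 articles each, and data-2 with the remaining 50.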
Below are some ref and decode results. As you can see, they are quite odd and seem to have no correlation.
DECODE:
output=Wild Boy Goes About How I Can't Be Really Go For Love
output=State Department defends the campaign of Iran
output=John Deere sails profit - Business Insider
output=to roll for the Perseid meteor shower
output=Man in New York City in Germany
REFERENCE:
output=Battle Chasers: Nightwar Combines Joe Mad's Stellar Art With Solid RPG Gameplay
output=Obama Meets a Goal That Could Literally Destroy America
output=WOW! 10 stunning photos of presidents daughter Zahra Buhari
output=Koko the gorilla jams out on bass with Flea from Red Hot Chili Peppers
output=Brenham police officer refused service at McDonald's
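To put a rough number on the mismatch, the two files can be paired line by line and scored by word overlap. This is only a sketch: it assumes one `output=` line per example in each file, and that line i of the decode file is meant to correspond to line i of the ref file, which is exactly what I am unsure about. The helper names are hypothetical.

```python
def strip_prefix(line, prefix="output="):
    """Drop the leading 'output=' marker and surrounding whitespace."""
    line = line.strip()
    return line[len(prefix):] if line.startswith(prefix) else line


def word_overlap(decoded, reference):
    """Fraction of distinct reference words that also appear in the decode."""
    dec = set(decoded.lower().split())
    ref = set(reference.lower().split())
    if not ref:
        return 0.0
    return len(dec & ref) / float(len(ref))


def compare(decode_lines, ref_lines):
    """Pair lines by position and return (index, overlap, decode, ref) rows."""
    rows = []
    for i, (dec, ref) in enumerate(zip(decode_lines, ref_lines)):
        dec, ref = strip_prefix(dec), strip_prefix(ref)
        rows.append((i, word_overlap(dec, ref), dec, ref))
    return rows
```

If the overlap stays near zero for every row, the pairing by line index is probably not valid, or the model is not learning from the articles it is given.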