I'm trying to set up a generalized Reinforcement Learning framework in PyTorch so I can take advantage of all the high-level utilities out there that build on PyTorch Dataset and DataLoader, like Ignite or FastAI, but I've hit a blocker with the dynamic nature of Reinforcement Learning data:
- Data items are generated from code, not read from a file, and they depend on previous actions and model results, so each nextItem call needs access to the current model state.
- Training episodes don't have a fixed length, so I need a dynamic batch size as well as a dynamic total dataset size. My preference would be to use a terminating-condition function instead of a fixed number. I could "possibly" fake this with padding, as in NLP sentence processing, but that's a real hack. (A rough sketch of what I mean is below, after this list.)
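
For reference, this is roughly the direction I've been trying, just a sketch and not anything from an existing library: EpisodeDataset, env, policy, and stop_training are placeholder names. The idea is an IterableDataset whose __iter__ keeps a live reference to the model, generates variable-length episodes by stepping the environment, and stops producing data when a terminating-condition function says so, paired with DataLoader(batch_size=None) so nothing gets re-batched or padded.

```python
import torch
from torch.utils.data import IterableDataset, DataLoader

class EpisodeDataset(IterableDataset):
    """Generates transitions on the fly instead of reading them from a file.

    `env`, `policy`, and `stop_training` are placeholders for my own
    environment, the live model, and a terminating-condition function.
    """

    def __init__(self, env, policy, stop_training):
        self.env = env                      # needs reset() -> obs and step(action) -> (obs, reward, done)
        self.policy = policy                # live reference, so each new item sees the current weights
        self.stop_training = stop_training  # callable(episode) -> bool: when to stop producing data

    def __iter__(self):
        while True:
            obs = self.env.reset()
            episode, done = [], False
            while not done:                 # episode length is whatever the environment decides
                with torch.no_grad():
                    action = self.policy(obs)
                next_obs, reward, done = self.env.step(action)
                episode.append((obs, action, reward, next_obs, done))
                obs = next_obs
            yield episode                   # one variable-length episode per item
            if self.stop_training(episode): # dynamic total size via a terminating condition
                return

# Usage sketch (env, policy, stop_training are stand-ins for real objects):
# loader = DataLoader(EpisodeDataset(env, policy, stop_training), batch_size=None)
# for episode in loader:   # batch_size=None disables automatic batching/collation
#     ...                  # each episode arrives as-is, no padding needed
```

Even if something like this runs, I'm not sure utilities that expect a map-style Dataset with a fixed __len__ would accept it, which is part of why I'm asking.
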
My Google and StackOverflow searches so far have yielded zilch. Does anyone here know of existing solutions or workarounds for using DataLoader or Dataset with Reinforcement Learning? I'd hate to lose access to all the existing libraries out there that depend on them.