I'm trying to use this code from a repo on GitHub (https://github.com/nicknochnack/Reinforcement-Learning-for-Trading-Custom-Signals/blob/main/Custom%20Signals.ipynb), in Point 3:
model = A2C('MlpLstmPolicy', env, verbose=1)
model.learn(total_timesteps=1000000)
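For reference, that snippet runs against the original stable-baselines (v2), where A2C does ship an MlpLstmPolicy. If I'm reading the notebook right, the surrounding setup is roughly this (env2 is a placeholder name for the custom trading environment built earlier in the notebook, not necessarily its exact code):

from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv

# Wrap the custom trading env (placeholder name env2) in a VecEnv,
# then train A2C with the recurrent policy, which exists in v2.
env = DummyVecEnv([lambda: env2])
model = A2C('MlpLstmPolicy', env, verbose=1)
model.learn(total_timesteps=1000000)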
I ran into a lot of problems with stable-baselines on a different line, so I tried stable-baselines3 instead, but it seems that MlpLstmPolicy doesn't work there. ChatGPT suggested changing it to:
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.callbacks import CheckpointCallback
env = make_vec_env('env', n_envs=4, seed=0)
env = DummyVecEnv([lambda: env])
model = PPO('MlpLstmPolicy', env, verbose=1)
But I get this error: Error: Attempted to look up malformed environment ID: b'env'. (Currently all IDs must be of the form ^(?:[\w:-]+/)?([\w:.-]+)-v(\d+)$.)
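If I read the error right, make_vec_env treats its first argument as a registered Gym environment ID (the regex in the message is Gym's ID pattern, like CartPole-v1), so the literal string 'env' fails the registry lookup. From what I can tell from the SB3 docs, that first argument should be either a registered ID or a callable that returns an environment instance, along these lines (MyCustomEnv is a placeholder for the notebook's trading env):

from stable_baselines3.common.env_util import make_vec_env

# Option 1: a registered Gym ID that matches the pattern from the error
vec_env = make_vec_env('CartPole-v1', n_envs=4, seed=0)

# Option 2: a callable that builds the (unregistered) custom env;
# MyCustomEnv is a placeholder for the env defined earlier in the notebook
vec_env = make_vec_env(lambda: MyCustomEnv(), n_envs=4, seed=0)

Also, make_vec_env already returns a vectorized env, so the extra DummyVecEnv([lambda: env]) wrap is presumably redundant.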
I also see that in the first snippet the model = line was given env directly, so that's what I passed. I've replaced "env" with every other name I found in the code, but nothing worked.
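One more thing I found: SB3's PPO only seems to offer MlpPolicy, CnnPolicy, and MultiInputPolicy; the LSTM policy apparently moved to the sb3-contrib package as RecurrentPPO. So maybe the closest equivalent of the original line is something like this (again, MyCustomEnv is a placeholder):

from sb3_contrib import RecurrentPPO
from stable_baselines3.common.env_util import make_vec_env

# Build the vectorized env from a callable, then train the recurrent
# PPO variant, which accepts the 'MlpLstmPolicy' string in sb3-contrib
vec_env = make_vec_env(lambda: MyCustomEnv(), n_envs=4, seed=0)
model = RecurrentPPO('MlpLstmPolicy', vec_env, verbose=1)
model.learn(total_timesteps=1000000)

But I haven't been able to confirm that this is the right translation.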
Any help would be appreciated.