You can use the Command pattern in conjunction with ML.NET to solve your problem. In the Command pattern, each action is encapsulated as a command object, and the resulting sequence of commands is executed by a command interpreter, in the traditional architectural sense of the pattern.
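For illustration only (the type names here are hypothetical, not from your code base), the pattern looks roughly like this: each action is a small command object, and an interpreter executes the queued commands while recording them, so the recorded sequence can later be turned into training rows.

using System.Collections.Generic;

public class Game
{
    public void Jump() { /* apply the jump to your own game model */ }
}

public interface IGameCommand
{
    // Game is a stand-in for your own game model.
    void Execute(Game game);
}

public class JumpCommand : IGameCommand
{
    public void Execute(Game game) => game.Jump();
}

public class CommandInterpreter
{
    private readonly List<IGameCommand> _history = new List<IGameCommand>();

    public IReadOnlyList<IGameCommand> History => _history;

    public void Run(Game game, IEnumerable<IGameCommand> commands)
    {
        foreach (var command in commands)
        {
            command.Execute(game);
            _history.Add(command); // the recorded sequence becomes your training data
        }
    }
}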
We use the Command pattern to generate the game play training data as follows:
Create a class called GameState.
public class GameState
{
    public enum GameAction
    {
        Fire,
        Jump,
        MoveRight,
        MoveLeft
        // ... remaining actions for your game
    }

    public GameState Current { get; set; }
    public GameAction NextAction { get; set; }
    public GameOutcome Outcome { get; set; }

    public string Descriptor
    {
        get
        {
            // Returns a string that succinctly and uniquely
            // describes the current game state (implementation is game-specific).
        }
    }
}
and define a GameOutcome class:
public class GameOutcome
{
    public int GameID { get; set; }

    public enum OutcomeState
    {
        Win,
        Loss,
        Tie,
        Unfinished
    }

    public OutcomeState Outcome { get; set; }
}
If you can generate GameState sequences from actual game play as training data, you can use ML.NET to build a predictor (essentially a multiclass classifier) over GameState.Descriptor, GameState.Outcome.Outcome and GameState.NextAction, with the Descriptor and the outcome state as features and NextAction as the predicted label.
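A minimal training sketch, assuming you flatten each recorded GameState into a simple row type (the GameStateRecord class and the column names below are my own, not part of your code), featurize the descriptor text, one-hot encode the outcome, and train an SDCA maximum-entropy classifier:

using Microsoft.ML;

// Hypothetical flattened row built from each recorded GameState.
public class GameStateRecord
{
    public string Descriptor { get; set; } // GameState.Descriptor
    public string Outcome { get; set; }    // GameState.Outcome.Outcome as text, e.g. "Win"
    public string NextAction { get; set; } // GameState.NextAction as text (the label)
}

// ...

var mlContext = new MLContext(seed: 0);
IDataView trainingData = mlContext.Data.LoadFromEnumerable(records); // records: IEnumerable<GameStateRecord>

var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", nameof(GameStateRecord.NextAction))
    // Turn the free-form descriptor into a numeric feature vector.
    .Append(mlContext.Transforms.Text.FeaturizeText("DescriptorFeatures", nameof(GameStateRecord.Descriptor)))
    // One-hot encode the outcome state.
    .Append(mlContext.Transforms.Categorical.OneHotEncoding("OutcomeFeatures", nameof(GameStateRecord.Outcome)))
    .Append(mlContext.Transforms.Concatenate("Features", "DescriptorFeatures", "OutcomeFeatures"))
    .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy("Label", "Features"))
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

ITransformer model = pipeline.Fit(trainingData);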
In live (automated) play, you initialize the game state, set the desired OutcomeState to Win, and use the trained classifier to predict the learnt next action at each step.
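A sketch of that live-play loop, continuing the same assumptions and using a PredictionEngine; setting Outcome to "Win" asks the classifier for the action that most often led to a win from states like the current one. DescribeCurrentState and Apply are hypothetical helpers on your game model.

using Microsoft.ML.Data;

// Hypothetical prediction output type.
public class GameActionPrediction
{
    [ColumnName("PredictedLabel")]
    public string NextAction { get; set; }
}

// ...

var engine = mlContext.Model.CreatePredictionEngine<GameStateRecord, GameActionPrediction>(model);

var current = new GameStateRecord
{
    Descriptor = game.DescribeCurrentState(), // hypothetical helper producing GameState.Descriptor
    Outcome = "Win"                           // steer towards actions that historically led to a win
};

GameActionPrediction prediction = engine.Predict(current);
game.Apply(prediction.NextAction);            // hypothetical: execute the predicted GameAction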
The trick lies in encapsulating a rich yet succinct game-state description that takes into account the history of steps taken to reach the current state and the projected future outcome of the game (learnt from a large number of historical game plays).
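Purely as an illustration of what that might look like (ActionHistory and BoardHash below are invented placeholders), the Descriptor could fold the recent action history into the state string:

using System.Linq;

public string Descriptor
{
    get
    {
        // Last few actions taken to reach this state, e.g. "Jump-MoveRight-Fire".
        var recentActions = string.Join("-", ActionHistory.TakeLast(5));
        // Combine a compact board/state encoding with the action history.
        return $"{BoardHash}|{recentActions}";
    }
}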