I have created a data source and trained a machine learning model in Amazon Machine Learning. The data source was created from data stored in S3. However, my application adds new data to S3 every second, so I need a way to regenerate the data source and retrain the model periodically.

Is there a way to achieve this?

Any help is appreciated.

Vasanti

1 Answer

Yes. You need to do a few things:

  • Make sure your data source points to a prefix in S3 (bucket/data/) rather than to a single file (bucket/data/data.csv), so that newly added files under that prefix are picked up when the data source is created.
  • Write a script that you run regularly to create a new data source and a new model against this data (unfortunately, you can't update an existing model). Here's a sample script which does this using boto: https://github.com/mooreds/amazonmachinelearning-anintroduction/blob/master/updatemodel/updatemodel.py
  • Tag your new model, and make sure your clients look up the model to use via tags rather than via a hard-coded model ID.
  • Delete your old models (mostly to avoid confusion).
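
As a rough illustration of the steps above (this is not the linked script), here is a minimal sketch using boto3's `machinelearning` client. The bucket paths, the ID prefixes, the `stage=current` tag, and the `BINARY` model type are all assumptions made for the example; adjust them to your setup. It also assumes a schema file already exists in S3 and that tagging works while the model is still being trained.

```python
from datetime import datetime, timezone


def make_client():
    """Create the Amazon ML client (imported lazily so the rest runs without boto3)."""
    import boto3
    return boto3.client("machinelearning")


def retrain(client, s3_prefix, schema_location, stamp=None, model_type="BINARY"):
    """Create a fresh data source and model from everything under s3_prefix.

    All names, IDs and the tag below are hypothetical choices for this sketch.
    """
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    ds_id = "ds-" + stamp
    model_id = "ml-" + stamp

    # Point at the prefix (s3://bucket/data/), not a single file, so that
    # every file present under it at creation time is included.
    client.create_data_source_from_s3(
        DataSourceId=ds_id,
        DataSourceName="training data " + stamp,
        DataSpec={
            "DataLocationS3": s3_prefix,
            "DataSchemaLocationS3": schema_location,
        },
        ComputeStatistics=True,  # required for data sources used in training
    )

    # An existing model cannot be updated, so a new one is created each run.
    client.create_ml_model(
        MLModelId=model_id,
        MLModelName="model " + stamp,
        MLModelType=model_type,
        TrainingDataSourceId=ds_id,
    )

    # Tag the new model so clients can find it without a hard-coded ID.
    client.add_tags(
        Tags=[{"Key": "stage", "Value": "current"}],
        ResourceId=model_id,
        ResourceType="MLModel",
    )
    return {"data_source_id": ds_id, "model_id": model_id}


def delete_old_model(client, old_model_id):
    """Remove a superseded model, ideally once the new one has finished training."""
    client.delete_ml_model(MLModelId=old_model_id)
```

You could call `retrain(make_client(), "s3://bucket/data/", "s3://bucket/schema.json")` from cron or a scheduled Lambda function. Both calls that create resources are asynchronous, so it is probably safest to wait until the new model's status is `COMPLETED` before deleting the old one.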
mooreds