We have a Google App Engine application consisting of several modules, and we store our users' data in Google Cloud Datastore. We now want to run machine learning algorithms on this data, starting with a decision tree. We are considering one of the following approaches:

  1. Export the data in Datastore to a CSV file so we can use tools like Weka.

  2. Process the data directly in Datastore and use Google Cloud's machine learning services. (When I looked at the Google Cloud ML documentation, I couldn't find anything about running a decision tree on Datastore data.)

Does anyone know whether either of these approaches is possible in Google Cloud? If so, can you point me to specific documentation or describe how to do it?

dsesto
tolgatanriverdi

1 Answer

Based on your use case, I would say the best approach is to use the new Beta release of Cloud ML Engine for scikit-learn. As you may already know, scikit-learn is a machine learning library for Python, and among its many features it includes decision trees. Note that this is a Beta release, so there may still be some rough edges, but I think it should be a good option for you.
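To give an idea of what the scikit-learn side could look like, here is a minimal sketch that fits a `DecisionTreeClassifier` on CSV data. The column names (`age`, `sessions_per_week`, `is_premium`) and the inline sample rows are made up for illustration; your exported Datastore data will have its own schema, and this is not the only way to load it.

```python
import csv
import io

from sklearn.tree import DecisionTreeClassifier

# Hypothetical CSV content standing in for an exported Datastore kind.
# The columns and values are invented for this example.
SAMPLE_CSV = """age,sessions_per_week,is_premium
23,1,0
35,7,1
41,5,1
19,2,0
52,6,1
30,1,0
"""


def load_features_and_labels(csv_text, label_column):
    """Split CSV rows into numeric feature vectors and integer labels."""
    reader = csv.DictReader(io.StringIO(csv_text))
    feature_names = [name for name in reader.fieldnames if name != label_column]
    features, labels = [], []
    for row in reader:
        features.append([float(row[name]) for name in feature_names])
        labels.append(int(row[label_column]))
    return feature_names, features, labels


feature_names, X, y = load_features_and_labels(SAMPLE_CSV, "is_premium")

# random_state only makes tie-breaking between equally good splits repeatable.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Predict on the training rows, just to show the call; real use would
# evaluate on held-out data instead.
predictions = model.predict(X).tolist()
```

In a real project you would read the CSV from a file rather than a string, and keep a separate test set to check how well the tree generalizes.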

Cloud ML Engine is tightly integrated with Google Cloud Storage, which is the required storage option for input and output data, models, etc. That is why, of the two options you mentioned, I would say the first one ("1. Export the data in the Datastore to CSV file so we can use tools like Weka") is the most suitable. You will have to export your data to CSV files, upload them to Cloud Storage, and use ML Engine.
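The export step can be sketched roughly as below. The helper `entities_to_csv` and the field names are hypothetical; it only assumes that each entity behaves like a dictionary, which is the case for entities returned by the `google-cloud-datastore` Python client. Plain dictionaries are used here so the snippet runs without any Google Cloud credentials.

```python
import csv
import io


def entities_to_csv(entities, fieldnames, out_file):
    """Write dict-like entities to out_file as CSV and return the row count.

    Properties not listed in fieldnames are dropped, and properties missing
    from an entity are written as empty cells.
    """
    writer = csv.DictWriter(
        out_file, fieldnames=fieldnames, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    count = 0
    for entity in entities:
        writer.writerow(dict(entity))
        count += 1
    return count


# Stand-in data. With the google-cloud-datastore client, the entities could
# instead come from something like
#   datastore.Client().query(kind="UserProfile").fetch()
# where "UserProfile" is a made-up kind name for this example.
sample_entities = [
    {"age": 23, "sessions_per_week": 1, "is_premium": 0, "internal_note": "x"},
    {"age": 35, "is_premium": 1},
]

buffer = io.StringIO()
row_count = entities_to_csv(
    sample_entities, ["age", "sessions_per_week", "is_premium"], buffer
)
print(buffer.getvalue())
```

Writing to a real file instead of `io.StringIO` gives you a CSV that you can then copy to a Cloud Storage bucket (for example with `gsutil cp`) so that ML Engine can read it.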

Finally, let me share with you some additional documentation pages that may be helpful to start working with ML Engine and scikit-learn:

dsesto