
I have a dataset of 3.2 million entries in a CSV file. I'm trying to train a model with the CNN estimator in TensorFlow, but it's very slow. Every time I run the script it gets stuck, and the page (localhost) just stops responding. Any recommendations? (I've already tried with 22 CPUs and I can't increase that any further.)

Can I just run it in the background, e.g. with python xxx.py & on the command line, to keep the process going, and then come back to check on it after some time?

Elona Mishmika

2 Answers


Google offers serverless machine learning with TensorFlow for precisely this kind of situation: it is called Cloud ML Engine. Your workflow would basically look like this:

  1. Develop the program to train your neural network on a small dataset that can fit in memory (iron out the bugs, make sure it works the way you want; a rough sketch of such a script follows this list)

  2. Upload your full data set to the cloud (Google Cloud Storage, BigQuery, &c.) (documentation reference: training steps)

  3. Submit a package containing your training program to Cloud ML Engine (this will point to the location of your full data set in the cloud) (documentation reference: packaging the trainer)

  4. Start a training job in the cloud; this is serverless, so it will take care of scaling to as many machines as necessary, without you having to deal with setting up a cluster, &c. (documentation reference: submitting training jobs).
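
For concreteness, here is a minimal sketch of what the training script from step 1 might look like with the TF 1.x Estimator API. The column names, defaults, file paths, step count, and the `DNNClassifier` stand-in are all placeholders rather than part of this answer; the asker's CNN estimator would be wired in the same way.

```python
import tensorflow as tf

CSV_COLUMNS = ["feature_a", "feature_b", "label"]  # hypothetical schema
CSV_DEFAULTS = [[0.0], [0.0], [0]]                 # per-column default values

def input_fn(csv_path, batch_size=1024, shuffle=True):
    """Stream the CSV with tf.data so the whole file never sits in memory."""
    def parse_line(line):
        fields = tf.decode_csv(line, record_defaults=CSV_DEFAULTS)
        features = dict(zip(CSV_COLUMNS, fields))
        label = features.pop("label")
        return features, label

    dataset = tf.data.TextLineDataset(csv_path).skip(1)  # skip the header row
    if shuffle:
        dataset = dataset.shuffle(buffer_size=10000)
    return dataset.map(parse_line).batch(batch_size).repeat()

feature_columns = [tf.feature_column.numeric_column(name)
                   for name in CSV_COLUMNS if name != "label"]

# Placeholder model; a CNN estimator would be constructed the same way.
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32],
    model_dir="model_dir")  # on Cloud ML Engine this would be a gs:// path

# Step 1: point at a small local sample; in step 4 the same code runs in the
# cloud against the full CSV in Cloud Storage -- only the path changes.
estimator.train(input_fn=lambda: input_fn("small_sample.csv"), max_steps=1000)
```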

You can use this workflow to train neural networks on massive data sets - particularly useful for image recognition.

If this is a little too much information, or if this is part of a workflow that you'll be doing a lot and you want to get a stronger handle on it, Coursera offers a course on Serverless Machine Learning with Tensorflow. (I have taken it, and was really impressed with the quality of the Google Cloud offerings on Coursera.)

charlesreid1

I am sorry for answering even though I am completely ignorant of what datalab is, but have you tried batching?

I am not sure whether it is possible in this scenario, but could you feed in only 10,000 entries or so at a time, and repeat this over enough batches that eventually all entries have been processed?
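
To make the idea concrete, here is a rough sketch of chunked reading with pandas; the file name and chunk size are made up, and inside the loop you would feed each chunk to the model instead of just counting rows.

```python
import pandas as pd

CHUNK_SIZE = 10000  # hypothetical batch size; tune to the available memory

total_rows = 0
# read_csv with chunksize returns an iterator of DataFrames, so only one
# chunk of ~10,000 rows is in memory at a time instead of all 3.2 million.
for chunk in pd.read_csv("data.csv", chunksize=CHUNK_SIZE):
    # ...train on / process this chunk here...
    total_rows += len(chunk)

print("processed {} rows in chunks of {}".format(total_rows, CHUNK_SIZE))
```

With the TensorFlow Estimator API specifically, the same effect comes from building the input_fn on tf.data and calling .batch(), as in the sketch under the first answer, so the full CSV is never loaded at once.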