I am building a classification model using AutoML and I have some basic usage questions about the GCP.
1 - Data privacy question; if we save behavior data to train our model in BigQuery, does Google have access to that data? Could Google ever use that data to learn more about behavior of individuals we collected data from?
2 - Since training costs are charged by the hour, I would like to understand the relationship between data and training time. Does the time increase linearly with the size of the training data set? For example, we trained a classification using 1.7MB of data and it took 3 hrs. So, would training a model with 17MB of data take 30 hours?
3 - A batch prediction costs 1.16 USD per hour. However, our data is in a csv and it seems that we cannot upload a csv to do a batch prediction. So, we will try using the API. Therefore I have two questions: A) can we do a batch upload using the API and B) what are the associated costs?
4 - What exactly is an online prediction?
5 - When using the cost calculator (for machine learning), what is a node hour?