Wow, lot of questions, lot of hypothesis and lot of possibilities. The best answer is "all depends of your needs"!
Where do I store the "normal business data" for this micro services?
Want do you want to do in these microservices?
Or can I connect them with BigTable?
Yes you can, but do you need this? If you need the raw data before processing, yes connect to BigTable and query it!
If not, it's better to have a batch process which pre-process the raw data and store only the summary in a relational or document database (better latency for user, but less details)
Are there alternatives to BigTable?
Depends of your needs. BigTable is great for high throughput. If you have less than 1 million of stream write per second, you can consider BigQuery. You can also query BigTable table with BigQuery engine thanks to federated table
BigTable, BigQuery and Cloud Storage are reachable by dataproc, so as you need!
Another developer told me that an option is to store data on Google Cloud Storage (Buckets)
Yes, you can stream to Cloud Storage, but be careful, you don't have checksum validation and thus you can be sure that your data haven't been corrupted.
Note
You can think your application in other way. If you publish event into PubSub, one of common pattern is to process them with Dataflow, at least for the pre-processing -> your dataproc job for training your model will be easier like this!
If you train a Tensorflow model, you can also consider BigQuery ML, not for the training (except if a standard model fit your needs but I doubt), but for the serving part.
- Load your tensorflow model into BigQueryML
- Simply query your data with BigQuery as input of your model, submit them to your model and get immediately the prediction. That you can store directly into BigQuery with an
Insert Select
query. The processing for the prediction is free, you pay only the data scanned into BigQuery!
As I said, lot of possibility. Narrow your question to have a sharper answer! Anyway, hope this help