Why does the Google Cloud Click to Deploy Hadoop workflow require picking a size for a local persistent disk even if you plan to use the Hadoop connector for Cloud Storage? The default size is 500 GB. I was thinking that if it does need some disk, it should be much smaller. Is there a recommended persistent disk size when using the Cloud Storage connector with Hadoop on Google Cloud?
"Deploying Apache Hadoop on Google Cloud Platform
The Apache Hadoop framework supports distributed processing of large data sets across clusters of computers.
Hadoop will be deployed in a single cluster. The default deployment creates 1 master VM instance and 2 worker VMs, each having 4 vCPUs, 15 GB of memory, and a 500-GB disk. A temporary deployment-coordinator VM instance is created to manage cluster setup.
The Hadoop cluster uses a Cloud Storage bucket as its default file system, accessed through the Google Cloud Storage Connector. Visit the Cloud Storage browser to find or create a bucket that you can use in your Hadoop deployment.
Apache Hadoop on Google Compute Engine (Click to Deploy) form fields:
ZONE: us-central1-a
WORKER NODE COUNT
CLOUD STORAGE BUCKET: Select a bucket
HADOOP VERSION: 1.2.1
MASTER NODE DISK TYPE: Standard Persistent Disk
MASTER NODE DISK SIZE (GB)
WORKER NODE DISK TYPE: Standard Persistent Disk
WORKER NODE DISK SIZE (GB)"
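In case it matters, here is roughly how I expect my jobs to address the bucket through the connector rather than HDFS on the persistent disks. This is only a sketch: the bucket and project names are placeholders, and I'm assuming the Hadoop 1.x property name fs.default.name since the deployment lists Hadoop 1.2.1 (later versions use fs.defaultFS).

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class GcsConnectorSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Point the default filesystem at the Cloud Storage bucket instead of
            // HDFS on the local persistent disks.
            // "my-hadoop-bucket" and "my-project" are placeholder names.
            conf.set("fs.default.name", "gs://my-hadoop-bucket/");
            conf.set("fs.gs.impl",
                     "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
            conf.set("fs.gs.project.id", "my-project");

            // List the bucket root the same way a job would resolve its input paths.
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }

Given that all job input and output would go through gs:// paths like this, I'm unclear what the 500 GB per-node disk is actually needed for, which is why I'm asking what size is recommended.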