For Spark & Kafka integration, it is known that we have some options for executor location, as described in the link:
Is there any option like this for the storage layer? For example, let's assume I integrate Spark with Minio as the storage layer. Is it possible to run executors on the Minio nodes with some configuration? By "some configuration" I mean:
- It could be a Spark Standalone installation, with the Minio and Spark nodes on the same machines
- Or it could be Spark and Minio on K8S, where pod/machine configurations make them run on the same node, etc.
The main goal is to avoid network overhead between Spark and the storage layer as much as possible. Is there any configuration for this?
Notes: There is no HDFS, YARN, or Mesos. Rather than configurations specific to those, it would be better to evaluate K8S and Spark Standalone configurations.
Thanks.