There are two ways for the control plane and the data plane to communicate:
- Legacy: the VMs running in the data plane must have public IPs, and the control plane reaches them directly. This approach has always been a security headache. Azure still supports it and shows it in the UI, but it shouldn't be used.
- "No Public IP" (NPIP), also called "Secure Cluster Connectivity" (doc and more technical details). In this case, when the VMs in the data plane start, they open a bidirectional tunnel to a relay in the control plane, and that tunnel is always used for controlling the VMs and Spark. In this setup the VMs don't need public IPs, which makes it much more secure and easier to control.
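
To make the direction of the connection concrete, here is a toy sketch of the pattern in Python. It is not the real Databricks relay or its protocol. The names `relay_send_command`, `vm_agent`, and `demo`, as well as the `start-spark` command, are made up for illustration. The only point is that the "VM" dials out to the relay, and the relay then sends commands back over that same connection, so nothing ever has to connect inward to the VM.

```python
import socket
import threading


def relay_send_command(listener: socket.socket, command: str) -> str:
    """Control-plane side (hypothetical): wait for a VM to dial in,
    then reuse that connection to send it a command and read the reply."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(command.encode() + b"\n")
        return conn.makefile("r").readline().strip()


def vm_agent(relay_addr: tuple) -> None:
    """Data-plane side (hypothetical): open an OUTBOUND connection to the
    relay, then handle one command that arrives over it."""
    with socket.create_connection(relay_addr) as sock:
        command = sock.makefile("r").readline().strip()
        sock.sendall(f"done: {command}\n".encode())


def demo() -> str:
    # The relay is the only side that listens; the VM never accepts connections.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        agent = threading.Thread(target=vm_agent, args=(listener.getsockname(),))
        agent.start()
        reply = relay_send_command(listener, "start-spark")
        agent.join()
        return reply


if __name__ == "__main__":
    print(demo())
```

Because only the relay listens, the VM side needs nothing more than outbound network access, which is why public IPs become unnecessary in this mode.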
Regarding authentication: it's an internal detail, but it provides a way of ensuring that the VMs communicating with the control plane really are the VMs that form the cluster.