I hope my answer helps. I'm not a specialist, just a fellow Pentaho user trying to do exactly what you described. My experience so far is this:
(If anyone finds something wrong in my answer, please let me know. I want to learn too =D)
What are PDI clusters? - A scale-out solution
Pentaho Data Integration clusters are awesome (1) for breaking huge transformations that use a lot of CPU/memory into smaller chunks and (2) for speeding up execution with a clever design, or at least for making them run on commodity hardware (not a huge server with 24 CPUs and 256 GB of RAM).
Is there a way to automatically distribute transformations (round-robin) inside the cluster?
I'm sorry to say that so far I haven't been able to do that on my AWS instances. I used 3 EC2 instances to test the distribution with the following setups:
- One master, two slaves - I sent every transformation entry to the same master, hoping it would round-robin between the slaves and only run a transformation itself when the slaves were busy. It didn't happen that way: the master took all the work for itself and the slaves did nothing. (The same happens if you send a job that has parallel transformations to run.)
- Three masters, via Elastic Load Balancer - The AWS ELB is an awesome way to distribute app requests from different sources across all your EC2 instances, and I thought it could help me distribute my transformations across all the Carte machines (all masters). Well, it turns out that if the same host makes the requests, you get pointed to the same EC2 instance. So every time I sent the test job to run, one random master took all the requests and the others just sat there, waiting. No good news here.
- Three masters, Route 53 - Route 53 is the AWS DNS service and can route your website/webapp requests in a lot of different ways. One of them is round-robin. But I got the same problem as with Elastic Load Balancer: one random server got all the work. So, no good news here either.
Possible solution
Well, it's not all a nightmare where you can't distribute your transformations to a bunch of other machines. You actually can! But neither Carte, nor Elastic Load Balancer, nor Route 53 will do the round-robin for you. What you do is add all your slave servers (or master servers) to your job and assign a different slave server to each Transformation entry. That's doable in the Advanced tab, as in the screenshot:

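If you have a lot of transformations, deciding by hand which server goes on which entry gets tedious. Here is a minimal Python sketch of the round-robin assignment you would otherwise work out manually. The server and transformation names are made up for illustration, and the script only computes the pairing: it does not talk to Carte or edit the job file, so you would still set each server on its entry yourself in Spoon.

```python
from itertools import cycle


def assign_round_robin(transformations, servers):
    """Pair each transformation with a server, cycling through the servers in order."""
    if not servers:
        raise ValueError("at least one server is required")
    # cycle() repeats the server list forever; zip() stops once the
    # transformations run out, so every transformation gets exactly one server.
    return list(zip(transformations, cycle(servers)))


if __name__ == "__main__":
    # Hypothetical names, not taken from any real setup.
    servers = ["carte-1", "carte-2", "carte-3"]
    transformations = ["load_customers", "load_orders", "load_products", "load_invoices"]

    for name, server in assign_round_robin(transformations, servers):
        print(f"{name} -> {server}")
```

With three servers and four transformations, the fourth one wraps around to the first server again. It's a static split, so it doesn't take into account how busy each server actually is.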