Linux LAG groups can use various load distribution schemes, while a switch often provides just a single one. This is usually source address/destination address (SA/DA) hashing: the egress port is chosen by a hash over the source/destination combination of MAC addresses, IP addresses, or IP addresses plus TCP/UDP ports, depending on the switch's capabilities.
Accordingly, MAC SA/DA will only distribute flows from different end nodes between LAG interfaces. IP SA/DA is slightly better in that you can use multiple IP addresses to aid distribution. IP/port SA/DA is best in that it tries to distribute each socket's flow individually.
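As a rough illustration of the difference (this is a toy hash, not any vendor's actual algorithm), the sketch below shows why a MAC-only hash pins all traffic between two hosts onto one member port, while an IP/port hash can spread individual connections. All addresses and ports are made up:

```python
# Toy SA/DA hash -- not a real switch algorithm, just to show that the same
# header fields always map to the same member port.
import zlib

def member_port(n_ports, *fields):
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % n_ports

MACS = ("00:11:22:33:44:55", "66:77:88:99:aa:bb")
IPS = ("192.0.2.10", "192.0.2.20")

# MAC SA/DA: every connection between these two hosts lands on the same port.
print("MAC hash:", member_port(2, *MACS))

# IP/port SA/DA: different source ports can land on different member ports.
for sport in (40000, 40001, 40002, 40003):
    print(sport, "->", member_port(2, *MACS, *IPS, sport, 5001))
```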
On the Linux side you often have more control over egress traffic (e.g. the bonding driver's xmit_hash_policy only affects the frames you transmit), but if the bottleneck is in the ingress direction, it's the switch that defines your choices.
No distribution scheme will send frames belonging to a single stream over different interfaces. This prevents out-of-order reception within a stream, which most often carries a severe performance penalty. So basically, you need multiple streams / socket connections to start with. Generally, no LAG scheme will give a single flow the full aggregated bandwidth; each flow is capped at the speed of one member link.
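One practical consequence: if you control the application, you can open several parallel connections so an IP/port hash at least has a chance to place them on different members. A minimal sketch, assuming a hypothetical receiver at 198.51.100.10:5001 and a dumb chunked-push protocol:

```python
# Sketch: spread a bulk transfer over several TCP connections so an
# IP/port SA/DA hash can place them on different LAG members.
# HOST, PORT and the payload scheme are assumptions for illustration.
import socket
import threading

HOST, PORT = "198.51.100.10", 5001   # hypothetical receiver
N_CONNECTIONS = 4
CHUNK = b"x" * 1024 * 1024           # 1 MiB of dummy payload per send

def send_stream(stream_id, total_mib=256):
    # Each thread gets its own socket, hence its own source port and hash bucket.
    with socket.create_connection((HOST, PORT)) as s:
        for _ in range(total_mib // N_CONNECTIONS):
            s.sendall(CHUNK)
    print(f"stream {stream_id} done")

threads = [threading.Thread(target=send_stream, args=(i,)) for i in range(N_CONNECTIONS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```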
For your purpose, I've found that avoiding LAG trunks and using IP/MAC-based distribution with distinct connections yields more predictable results.
With virtual MAC end nodes you can assign these MACs to the physical NICs according to your workloads. Nodes with more bandwidth demand simply get multiple vNICs. The load balancing can be accomplished by simple round-robin DNS (a hostname with multiple A records whose answer order rotates) or by a more sophisticated scheme controlled by DNS or at the application level.
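A minimal application-level sketch of that rotation, assuming a hypothetical hostname storage.example.net with several A records (real round-robin DNS would rotate the answer order on the server side instead):

```python
# Sketch of application-level round-robin over the A records of one name.
# "storage.example.net" and port 5001 are placeholders.
import itertools
import socket

def resolve_all(hostname, port):
    # getaddrinfo returns every address record the resolver hands back
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

addresses = itertools.cycle(resolve_all("storage.example.net", 5001))

def open_connection(port=5001):
    # Each new connection targets the next address, i.e. (ideally) another vNIC
    return socket.create_connection((next(addresses), port))
```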
You could also use virtual IP addresses and map them to physical or virtual interface MAC addresses through tight control over ARP. Remapping an address mid-session can easily break extended sessions, so it's better suited for small content delivery and such.
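If you go that route, the usual way to (re)map a virtual IP to a MAC is a gratuitous ARP announcement. A sketch using scapy, assuming placeholder addresses, MAC and interface; it needs root:

```python
# Sketch: announce that a virtual IP now lives behind a particular (v)NIC by
# sending a gratuitous ARP. Requires scapy and root; all values are examples.
from scapy.all import ARP, Ether, sendp

def announce_vip(vip, mac, iface):
    # Gratuitous ARP reply: "vip is-at mac", broadcast so peers update their caches
    garp = Ether(dst="ff:ff:ff:ff:ff:ff", src=mac) / ARP(
        op=2, hwsrc=mac, psrc=vip, hwdst="ff:ff:ff:ff:ff:ff", pdst=vip
    )
    sendp(garp, iface=iface, verbose=False)

announce_vip("192.0.2.100", "02:00:00:00:00:01", "eth0")
```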