We have route-based VPN tunnels running from a pair of Cisco routers at each of two sites (four routers in total) to two different VPN gateways in the same Shared VPC (XPN). The routers share routes over BGP with a single Cloud Router. Each of our two sites has its own AS. The two routers in a pair peer with each other within that site's AS, but the peering between the two pairs is EIGRP. I'm redistributing EIGRP into BGP (and vice versa) with a route-map that sets a specific metric. At one site I'm also redistributing to and from OSPF, and at the other I'm redistributing in from static routes. Each site's "local" routes come from OSPF or static, and its "remote" routes come from EIGRP.
So as an example...
AS 65001 is announcing 10.1.1.0/24 with a metric of 50 (it's a local route, redistributed from OSPF), and 172.16.1.1/24 with a metric of 100 (it's a remote route, learned from EIGRP).
AS 65002 is announcing 172.16.1.1/24 with a metric of 50 (it's a local route, redistributed from static), and 10.1.1.0/24 with a metric of 100 (it's a remote route, learned from EIGRP).
There are actually 62 routes announced, but you get the picture. Both sites announce the same 62 routes; only the metric differs.
Now my problem: the Cloud Router takes all the routes from 65001 and makes them primary/active regardless of metric, and it ignores the lower-metric routes from 65002 unless I drop the tunnels/peering to 65001.
So my traffic always gets where it needs to go, but it takes sub-optimal routes.
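To make the mismatch concrete, here is a minimal Python sketch of the two-prefix example. It is only a toy model, not Cloud Router's actual best-path algorithm: `expected_best` assumes the lowest metric wins per prefix (what I expected), and `observed_best` assumes one AS is always preferred whenever it advertises the prefix (what I'm seeing). All names here (`Advert`, `ADVERTS`, and the three functions) are made up for illustration.

```python
from typing import NamedTuple


class Advert(NamedTuple):
    """One BGP announcement as described in the example (hypothetical model)."""
    prefix: str
    peer_as: int
    med: int


# The two example prefixes, as announced by each AS.
ADVERTS = [
    Advert("10.1.1.0/24", 65001, 50),     # local to 65001 (from OSPF)
    Advert("172.16.1.1/24", 65001, 100),  # remote to 65001 (from EIGRP)
    Advert("172.16.1.1/24", 65002, 50),   # local to 65002 (from static)
    Advert("10.1.1.0/24", 65002, 100),    # remote to 65002 (from EIGRP)
]


def expected_best(adverts):
    """Assumed expectation: per prefix, the lowest metric wins."""
    best = {}
    for a in adverts:
        if a.prefix not in best or a.med < best[a.prefix].med:
            best[a.prefix] = a
    return {prefix: a.peer_as for prefix, a in best.items()}


def observed_best(adverts, preferred_as=65001):
    """Observed behaviour: the preferred AS wins whenever it advertises the prefix."""
    best = {}
    for a in adverts:
        if a.prefix not in best or a.peer_as == preferred_as:
            best[a.prefix] = a
    return {prefix: a.peer_as for prefix, a in best.items()}


def suboptimal_prefixes(adverts, preferred_as=65001):
    """Prefixes where the observed choice differs from the lowest-metric choice."""
    expected = expected_best(adverts)
    observed = observed_best(adverts, preferred_as)
    return sorted(p for p in expected if expected[p] != observed[p])


if __name__ == "__main__":
    print("expected:  ", expected_best(ADVERTS))
    print("observed:  ", observed_best(ADVERTS))
    print("suboptimal:", suboptimal_prefixes(ADVERTS))
```

In this model 172.16.1.1/24 is the prefix that ends up on the wrong path: I expect it to go via 65002 (metric 50), but it is being sent via 65001 (metric 100). Removing the 65001 entries from the list reproduces the fallback to 65002 that I see when I drop those tunnels.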
This was working fine about two weeks ago; it stopped behaving as expected at some point since then.
I changed one of the AS 65002 routers over to 65001 (just with local-as on the neighbor statement, plus the matching change on the GCP side, because I'm also using AS 65002 to peer with Azure from those routers), but that didn't seem to change the behavior.