
I have the following setup:

  • A VPC network V and a VPC Connector for V using CIDR range "10.8.0.0/28" (EDITED)
  • The following services A and B are connected to the VPC via the Connector
  • Cloud Run Service A: This service is set to ingress=internal to secure the API. Its egress is set to private-ranges-only.
  • Cloud Run Service B: This service provides an API for another Service C hosted in the Azure cloud. B also needs access to Service A's API. Both egress and ingress are set to all, so that all outgoing traffic is routed through the VPC connector and requests to the internal Service A succeed.

The current problem is the following: requests from Service C -> Service B result in a 504 Gateway Timeout. If the egress of Service B is changed to private-ranges-only, the requests from Service C succeed, but in return all requests from B -> A return 403 Forbidden. As far as I know, this is because with private-ranges-only the traffic to Service A is no longer routed through the VPC Connector. All requests from Cloud Run services to other Cloud Run services are currently issued to "*.run.app" URLs.
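
The egress behavior described above can be sketched in Python. This is a simplified model, not Cloud Run's actual implementation: it assumes that private-ranges-only sends only RFC 1918 and RFC 6598 destinations through the connector, and the function name and the example IPs are hypothetical.

```python
import ipaddress

# Assumption: the ranges treated as "private" are RFC 1918 plus RFC 6598.
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")
]


def goes_through_connector(dest_ip: str, egress: str) -> bool:
    """Return True if traffic to dest_ip would use the VPC connector (simplified model)."""
    if egress == "all":
        # Every outgoing request is sent through the connector.
        return True
    if egress == "private-ranges-only":
        # Only destinations inside a private range use the connector.
        addr = ipaddress.ip_address(dest_ip)
        return any(addr in net for net in PRIVATE_RANGES)
    raise ValueError(f"unknown egress setting: {egress}")


# A "*.run.app" URL resolves to a public address (8.8.8.8 is only a stand-in here),
# so under private-ranges-only the request bypasses the connector.
print(goes_through_connector("8.8.8.8", "private-ranges-only"))
print(goes_through_connector("8.8.8.8", "all"))
print(goes_through_connector("10.8.0.5", "private-ranges-only"))
```

Under this model, a request from B to the public address behind Service A's "*.run.app" URL only reaches A through the VPC when egress is set to all, which matches the 403 seen with private-ranges-only.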

I cannot come up with a practical fix for this setup. Is there an explanation for why egress=all on Service B results in a Gateway Timeout for requests from Service C? I tried to follow the logs from the VPC but did not find any cause.

PMatt
  • What are you doing in your service B when it is called by service C? Do you perform external calls? – guillaume blaquiere Apr 29 '21 at 07:25
  • @guillaumeblaquiere Service B responds with a welcome message to the initial POST from Service C. In further interactions it may call Service A and send a message to Service C. – PMatt Apr 29 '21 at 08:18
  • That's not normal. Let me test (maybe tomorrow) and I will get back to you. Egress control is enforced on traffic originating from the Cloud Run service, not on responses to incoming requests. – guillaume blaquiere Apr 29 '21 at 12:59
  • I assume that the POST from Service C to the endpoint in Service B can arrive since ingress = all for Service B, but on egress the response gets routed through the VPC Connector, is somehow "lost" within the VPC network, and cannot get back to Service C. Hence the gateway timeout error. But I am not very experienced in networking topics. – PMatt Apr 29 '21 at 14:12
  • I cannot reproduce this case on my side. Your explanation might be missing some detail, or it was a transient bug that is fixed now. Can you try again, or provide more detail (or try in another project without too many changes to your VPC configuration, routes and so on)? – guillaume blaquiere May 02 '21 at 19:38
  • The mentioned Service C is a Bot Channel Registration Service from Azure which needs to POST at an API provided by Service B (/api/messages). Service B is a Python Django Web Server where the source code is close to this [repo](https://github.com/microsoft/BotBuilder-Samples/tree/main/generators/python/app/templates/echo/%7B%7Bcookiecutter.bot_name%7D%7D). Service A is a Flask Web Server with business logic. – PMatt May 04 '21 at 07:05
  • @guillaumeblaquiere If you need specific configurations, I could provide censored screenshots. – PMatt May 04 '21 at 07:14
  • Service C being on Azure changes nothing (unless you have filtering on Azure). The Django and Flask frameworks also change nothing, as long as each service answers correctly on its own. – guillaume blaquiere May 04 '21 at 07:20
  • My setup works now with a different VPC Connector used by Service B. The new connector uses a subnet of the overlying VPC network. My initial post had an error: the original VPC Connector used a given CIDR range. Thus Service A now uses a different connector than Service B. @guillaumeblaquiere does this change explain my timeout? – PMatt May 04 '21 at 10:40
  • I can't explain your issue. I tried putting the same connector on both services, and I didn't have any issue. It's clearly strange. Note that my range is 10.9.0.0/28. – guillaume blaquiere May 04 '21 at 18:57
  • Thanks for your help so far. I will debug the setup to hopefully find the reason and post it here. – PMatt May 05 '21 at 05:43

1 Answer


The following changes were necessary to make it run:

  1. Follow this guide to create a static outbound IP address for Service B
  2. Remove the previously created VPC Connector (created with a CIDR range, not a subnet as in the guide)
  3. Update Cloud Run Service B to use the VPC Connector created during Step 1

Since removing the static outbound IP address breaks the setup, I assume the Azure service requires a static IP address to communicate with.
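
To check from inside Service B whether its outgoing traffic really leaves through the reserved address, something like the following Python sketch could be used. It is only an illustration under assumptions: the helper names are hypothetical, the echo endpoint https://api.ipify.org is an external service chosen for the example, and 203.0.113.7 is a documentation placeholder rather than a real reserved IP.

```python
import urllib.request
from typing import Callable


def fetch_public_ip(url: str = "https://api.ipify.org") -> str:
    """Ask an external echo service which source IP it sees (assumed endpoint)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8").strip()


def egress_ip_is_static(
    expected_ip: str,
    fetch: Callable[[], str] = fetch_public_ip,
    attempts: int = 5,
) -> bool:
    """Return True if every observed outbound IP equals the expected reserved IP."""
    seen = {fetch() for _ in range(attempts)}
    return seen == {expected_ip}


if __name__ == "__main__":
    # Placeholder address; replace with the IP reserved while following the guide.
    print(egress_ip_is_static("203.0.113.7"))
```

If the observed address changes between attempts, or differs from the reserved one, the traffic is not leaving through the static IP.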

PMatt