EDIT
It seems that my first error I describe is very easy to reproduce. Actually, Google Run fails to run any GRPC query on a .NET5 GRPC server it seems (at least, it did work before but as of today, February 21st, it seems that something changed). To reproduce:
- Create a .NET5 GRPC server (also fails with .NET6):
dotnet new grpc -o TestGrpc
- Change
Program.cs
so that it listens on$PORT
, typically:
public static IHostBuilder CreateHostBuilder(string[] args)
{
var port = Environment.GetEnvironmentVariable("PORT") ?? "8080";
var url = string.Concat("http://0.0.0.0:", port);
return Host.CreateDefaultBuilder(args)
.ConfigureWebHostDefaults(webBuilder =>
{
webBuilder.UseStartup<Startup>().UseUrls(url);
});
}
- A very simple Dockerfile to have an image for the server (fails also with a more standard one, like here):
FROM mcr.microsoft.com/dotnet/sdk:5.0
COPY . ./
RUN dotnet restore ./TestGrpc.csproj
RUN dotnet build ./TestGrpc.csproj -c Release
CMD dotnet run --project ./TestGrpc.csproj
- Build and push to Google Artifcats Registry.
- Create a Cloud Run instance with HTTP/2 enabled (Ketrel requires HTTP/2 so we need to set HTTP/2 end-to-end, yet I tested without as well but it's not better).
- Use Grpcurl for instance and try:
grpcurl {CLOUD_RUN_URL}:443 list
And you will obtain the same error as I got with my (more complex) project:
Failed to list services: rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers. reset reason: remote reset
On the Google Cloud Run instance I only have the log:
2022-02-21T16:44:32.528530Z POST 200 1.02 KB 41 ms grpcurl/v1.8.6 grpc-go/1.44.1-dev https://***/grpc.reflection.v1alpha.ServerReflection/ServerReflectionInfo
(I don't really understand why it's a 200 though... and never seems to reach the actual server implementation, just as if there was some kind of middleware blocking the query to reach the implementation... )
I'm pretty sure this used to worked as I started my project this way (and then changed the protos, the service, etc. ). If anyone has a clue I'd be more than grateful :-)
INITIAL POST (less precise than explanations above but I leave it here if it may give clues)
I have a server running within a Docker (.NET5 GRPC application). This server, when deployed locally works perfectly fine. But recently I have an error when I deploy it on Google Cloud Run: upstream connect error or disconnect/reset before headers. reset reason: remote reset
when it was working fine before. I keep on having this error from any client I use, for instance with Curl:
curl -v https://{ENDPOINT}/{Proto-base}/{Method} --http2
* Trying ***...
* TCP_NODELAY set
* Connected to *** (***) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=*.a.run.app
* start date: Feb 7 02:07:06 2022 GMT
* expire date: May 2 02:07:05 2022 GMT
* subjectAltName: host "***" matched cert's "*.a.run.app"
* issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x5564aad30860)
> GET /{Proto}/{Method} HTTP/2
> Host: ***
> user-agent: curl/7.68.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 503
< content-length: 85
< content-type: text/plain
< date: Mon, 21 Feb 2022 13:51:31 GMT
< server: Google Frontend
< traceparent: 00-5a74487dafb5687961deeb17e0158ca9-5ab63cd23680e7d7-01
< x-cloud-trace-context: 5a74487dafb5687961deeb17e0158ca9/6536478782730069975;o=1
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
<
* Connection #0 to host *** left intact
upstream connect error or disconnect/reset before headers. reset reason: remote reset
Same happens with Grpcurl:
grpcurl ***:443 list {Proto-base}
Failed to list methods for service "***.Company": rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers. reset reason: remote reset
I cannot find much resource on this error as most of the threads I read deal with another type of reset reason
(like protocol, or connection, etc.). But I strictly have no idea what remote reset
means and what I did wrong.
Looking at the logs in Google Cloud Run, I can see that the server is definitly hit, though I added trace logging in the route which is not triggered, hence it never reaches my code:
2022-02-21T14:44:22.840580Z POST 200 1.01 KB 1 msgrpc-python/1.44.0 grpc-c/22.0.0 (linux; chttp2) https://***/{Protos-base}/{Method}
(if I reached my code it should print some "Hellos" everywhere which it doesn't)
Has anyone ever found this?
P.S.: there are many things around about Envoy, but I don't even use this. I simply have a Cloud Run instance (with HTTP/2 - and I tried without but it fails due to protocol issue).