
I recently started deploying my sites using Traefik for both SSL and reverse proxying. All seemed to be going well, except that phones on AT&T data plans can't connect to my sites. I get no error messages when they fail to connect, and there are no known issues with any other internet service provider, whether on cellular data or Wi-Fi. I have no idea where to even start with an issue like this. I'm by no means a networking guru, and Google searches for similar problems turn up nothing. Posted below are my Traefik-related configuration files; hopefully they provide a useful window into my configuration errors. Any help is much appreciated. Thank you.

docker-compose.yml

  traefik:
    build:
      context: .
      dockerfile: ./compose/production/traefik/Dockerfile
    image: app_production_traefik
    depends_on:
      - django
    volumes:
      - production_traefik:/etc/traefik/acme:z
    ports:
      - "0.0.0.0:80:80"
      - "0.0.0.0:443:443"

Dockerfile

FROM traefik:v2.2.11 
# I have tried with the updated version and got the same result
RUN mkdir -p /etc/traefik/acme \
  && touch /etc/traefik/acme/acme.json \
  && chmod 600 /etc/traefik/acme/acme.json
COPY ./compose/production/traefik/traefik.yml /etc/traefik

traefik.yml

log:
  level: INFO

entryPoints:
  web:
    # http
    address: ":80"

  web-secure:
    # https
    address: ":443"

certificatesResolvers:
  letsencrypt:
    # https://docs.traefik.io/master/https/acme/#lets-encrypt
    acme:
      email: "me@email.com"
      storage: /etc/traefik/acme/acme.json
      # https://docs.traefik.io/master/https/acme/#httpchallenge
      httpChallenge:
        entryPoint: web

http:
  routers:
    web-router:
      rule: "Host(`mysite.com`) || Host(`www.mysite.com`)"
  
      entryPoints:
        - web
      middlewares:
        - redirect
        - csrf
      service: django

    web-secure-router:
      rule: "Host(`mysite.com`) || Host(`www.mysite.com`)"
  
      entryPoints:
        - web-secure
      middlewares:
        - csrf
      service: django
      tls:
        # https://docs.traefik.io/master/routing/routers/#certresolver
        certResolver: letsencrypt

  middlewares:
    redirect:
      # https://docs.traefik.io/master/middlewares/redirectscheme/
      redirectScheme:
        scheme: https
        permanent: true
    csrf:
      # https://docs.traefik.io/master/middlewares/headers/#hostsproxyheaders
      # https://docs.djangoproject.com/en/dev/ref/csrf/#ajax
      headers:
        hostsProxyHeaders: ["X-CSRFToken"]

  services:
    django:
      loadBalancer:
        servers:
          - url: http://django:5000

providers:
  # https://docs.traefik.io/master/providers/file/
  file:
    filename: /etc/traefik/traefik.yml
    watch: true

Update

I configured logging as shown in the answer below. I get log entries every time a user successfully reaches the site, but when I tried again with an AT&T phone, nothing was logged and the phone did not connect. Chrome simply shows a message that reads "This site can't be reached, example.com unexpectedly closed the connection."

Display name
  • Maybe the request simply never reaches your Traefik server. Is it possible that your domain name or IP address has somehow been blacklisted by the AT&T network? – Pierre B. Feb 24 '21 at 14:23
  • @PierreB. I don't think I've been blacklisted. I have tried with three different domains, and none of my websites' content would be grounds for any form of blacklisting. One's a portfolio, another is ecological modeling, and the third is finance related. Nothing sketchy about them at all. – Display name Feb 24 '21 at 14:28
  • Is the problem related to the AT&T network or to the phone browsers? Can you share your phone's connection with a laptop over Wi-Fi and try the same request from Postman or a laptop browser? – saad Mar 01 '21 at 23:20
  • @saad, I apologize for the delayed response; I seem to have glanced right over your comment multiple times without noticing it. The issue is specific to AT&T: I can take the same phone and reproduce the result in multiple browsers. On cellular (AT&T) there is no connection; on Wi-Fi the same phone connects in any browser. – Display name Mar 11 '21 at 02:52

2 Answers


The first step is to put Traefik into debug mode with the following configuration:

log:
  level: DEBUG

accessLog: {}

More about accessLog: https://doc.traefik.io/traefik/observability/access-logs/

This should show you every request arriving at the entryPoints, and you can then add what you find to the original question.

DarthHTTP
  • I have updated my question in response to your answer – Display name Feb 23 '21 at 17:34
  • My first thought was a misconfiguration of Traefik or its middlewares; however, that does not seem to be the case, since traffic is not hitting your Traefik at all. I would advise putting Cloudflare between the phone and the app you are serving. It's fairly easy to configure: you point a DNS record at your server, relax the SSL requirements (no strict checking of certificates), and proxy everything via Cloudflare. I suspect AT&T is doing some transparent proxying of its own that doesn't like your setup. With Cloudflare you're essentially cloaking behind them. – DarthHTTP Feb 23 '21 at 18:14
  • Feel free to comment and link to a new question about Cloudflare if needed and I'll help you out; I have used the free-tier configuration for years. – DarthHTTP Feb 23 '21 at 18:16
  • @Julian what was the conclusion of our findings? Thanks for awarding the bounty anyhow; I'm curious to learn what the cause was. – DarthHTTP Mar 08 '21 at 09:38
  • Thank you for your continued interest in the question. As of now, your suggestion to use Cloudflare as a proxy is the primary solution on the table. However, before reconfiguring everything, I'm going to take some time to get familiar with Traefik and see if I made any silly mistakes. Thousands of developers use this product, and I'm convinced they don't all have this problem, which leads me to think it's still a config issue. – Display name Mar 09 '21 at 00:08
  • I noticed I am getting a delayed message from Traefik, about 8 minutes after a failed AT&T connection, that reads 'http: TLS handshake error from [IP ADDRESS]: tls: no cipher suite supported by both client and server'. I'm going to see what I can do with this info, but thought I would keep you updated since you expressed continued interest. Thank you for your time. – Display name Mar 11 '21 at 02:57
  • @Julian please apply to Traefik the config generated by https://ssl-config.mozilla.org/#server=traefik&version=2.1.2&config=old&guideline=5.6 as it may be a lead to what's happening. It's possible that the browser on that phone and/or AT&T's transparent proxy would accept a more permissive TLS configuration. – DarthHTTP Mar 11 '21 at 09:25
  • You can test your TLS setup here: https://www.ssllabs.com/ssltest/ – DarthHTTP Mar 11 '21 at 09:28
  • I used the tools you sent over and got my SSL rating to an A+ by playing around with the configuration generated by Mozilla. Unfortunately, still no luck with AT&T phones in any browser. Will continue my search! Thank you for all the help thus far; it is all very much appreciated. – Display name Mar 11 '21 at 16:52
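
The handshake error quoted in the comments above can also be checked from the client side. The following is a minimal sketch, not something from the thread: the helper names `make_pinned_context` and `probe` are made up for illustration, and it uses only Python's standard `ssl` and `socket` modules. It offers the server exactly one TLS version at a time and reports which cipher was negotiated, if any.

```python
import socket
import ssl


def make_pinned_context(version):
    """Build a client context that offers exactly one TLS protocol version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def probe(hostname, version, port=443, timeout=5.0):
    """Return (protocol, cipher_name) if the handshake succeeds, else None."""
    ctx = make_pinned_context(version)
    try:
        # create_connection tries every resolved address in turn, so a
        # success here does not say which address family was used.
        with socket.create_connection((hostname, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
                return tls.version(), tls.cipher()[0]
    except OSError:  # ssl.SSLError is a subclass of OSError
        return None
```

For example, `probe("mysite.com", ssl.TLSVersion.TLSv1_2)` and `probe("mysite.com", ssl.TLSVersion.TLSv1_3)` (with the placeholder hostname from the question) show whether each version can be negotiated from the network the script runs on. A `None` result only says the connection or handshake failed, not why.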

Finally solved a few weeks later. Took a LOT longer than I thought it would but solved it is!

Docker does not enable IPv6 by default. However, Linode by default adds the server's IPv6 address as an AAAA record when you transfer your domains to them. Since the IPv6 address was being advertised in DNS, AT&T seemed to default to reaching out to it and then didn't fall back to IPv4 when that failed. Interesting that they're the only service provider that does that. Nonetheless, I removed the AAAA record from my DNS config, and that fixed it once the change took effect (about 27 hours).
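
For anyone hitting the same symptom, one way to test for this is to resolve the hostname and try a TCP connection to each address separately. The following is a minimal sketch, not part of my actual setup: the function names (`group_by_family`, `can_connect`, `report`) are made up for illustration, and `mysite.com` is the placeholder from the question. It only gives a meaningful IPv6 result when run from a network that has IPv6 connectivity, for example a laptop tethered to the affected phone.

```python
import socket
from collections import defaultdict

# Labels for the two address families we care about.
FAMILY_NAMES = {socket.AF_INET: "IPv4 (A)", socket.AF_INET6: "IPv6 (AAAA)"}


def group_by_family(addrinfo):
    """Group getaddrinfo() results into {family label: sorted unique addresses}."""
    grouped = defaultdict(set)
    for family, _type, _proto, _canon, sockaddr in addrinfo:
        label = FAMILY_NAMES.get(family)
        if label:
            grouped[label].add(sockaddr[0])
    return {label: sorted(addrs) for label, addrs in grouped.items()}


def can_connect(family, sockaddr, timeout=5.0):
    """Return True if a TCP connection to sockaddr succeeds within timeout."""
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        return True
    except OSError:
        return False


def report(hostname, port=443):
    """Print the resolved addresses and whether each one accepts a connection."""
    info = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    for label, addresses in group_by_family(info).items():
        print(f"{label}: {', '.join(addresses)}")
    seen = set()
    for family, _type, _proto, _canon, sockaddr in info:
        if sockaddr[0] in seen:
            continue
        seen.add(sockaddr[0])
        status = "ok" if can_connect(family, sockaddr) else "FAILED"
        print(f"connect {sockaddr[0]} port {port}: {status}")
```

Calling `report("mysite.com")` lists the addresses per family and then the connection result for each. If an AAAA address is listed but its connection fails while the A address works, the DNS is advertising an IPv6 address the server does not answer on, which matches what happened here.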

On the bright side, I at least bettered my Traefik config throughout the process haha

Display name