I'm using Netty 4.1 as the NIO socket server for an MMORPG. It ran perfectly for years, but recently we have been suffering from DDoS attacks. I have been fighting them for a long time, but I have run out of ideas for improving the situation. The attacker floods the server with new connections from thousands of IPs all over the world. It's difficult to block this at the network level because the attack traffic looks very similar to that of normal players. The attacks are not very big compared to attacks on HTTP servers, but they are big enough to crash our game.

How I'm using Netty:

public void startServer() {

    bossGroup = new NioEventLoopGroup(1);
    workerGroup = new NioEventLoopGroup();

    try {
        int timeout = (Settings.SOCKET_TIMEOUT*1000);
        bootstrap = new ServerBootstrap();

        int bufferSize = 65536;
        bootstrap.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .childOption(ChannelOption.SO_KEEPALIVE, true)
                .childOption(ChannelOption.SO_TIMEOUT, timeout)
                .childOption(ChannelOption.SO_RCVBUF, bufferSize)
                .childOption(ChannelOption.SO_SNDBUF, bufferSize)
                .handler(new LoggingHandler(LogLevel.INFO))
                .childHandler(new CustomInitalizer(sslCtx));


        ChannelFuture bind = bootstrap.bind(DrServerAdmin.port);
        bossChannel = bind.sync();

    } catch (InterruptedException e) {
        e.printStackTrace();
    } finally {
        bossGroup.shutdownGracefully();
        workerGroup.shutdownGracefully();
    }
}

Initializer:

public class CustomInitalizer extends ChannelInitializer<SocketChannel> {

    public static  DefaultEventExecutorGroup normalGroup = new DefaultEventExecutorGroup(16);
    public static  DefaultEventExecutorGroup loginGroup = new DefaultEventExecutorGroup(8);
    public static  DefaultEventExecutorGroup commandsGroup = new DefaultEventExecutorGroup(4);

    private final SslContext sslCtx;

    public CustomInitalizer(SslContext sslCtx) {
        this.sslCtx = sslCtx;
    }

    @Override
    public void initChannel(SocketChannel ch) throws Exception {
        ChannelPipeline pipeline = ch.pipeline();

        if (sslCtx != null) {
            pipeline.addLast(sslCtx.newHandler(ch.alloc()));
        }

        pipeline.addLast(new CustomFirewall()); //it is AbstractRemoteAddressFilter<InetSocketAddress>
        int limit = 32768;        
        pipeline.addLast(new DelimiterBasedFrameDecoder(limit, Delimiters.nulDelimiter()));
        pipeline.addLast("decoder", new StringDecoder(CharsetUtil.UTF_8));
        pipeline.addLast("encoder", new StringEncoder(CharsetUtil.UTF_8));

        pipeline.addLast(new CustomReadTimeoutHandler(Settings.SOCKET_TIMEOUT));

        int id = DrServerNetty.getDrServer().getIdClient();
        CustomHandler normalHandler = new CustomHandler();
        FlashClientNetty client = new FlashClientNetty(normalHandler,id);
        normalHandler.setClient(client);

        pipeline.addLast(normalGroup,"normalHandler",normalHandler);

        CustomLoginHandler loginHandler = new CustomLoginHandler(client);
        pipeline.addLast(loginGroup,"loginHandler",loginHandler);


        CustomCommandsHandler commandsHandler = new CustomCommandsHandler(loginHandler.client);
        pipeline.addLast(commandsGroup, "commandsHandler", commandsHandler);

    }
}

I'm using 5 groups:

  • bootstrap bossGroup - for new connections
  • bootstrap workerGroup - for delivering messages
  • normalGroup - for most messages
  • loginGroup - for heavy login process
  • commandsGroup - for some heavy logic

I monitor the number of new connections and messages, so I can tell immediately when an attack is under way. During an attack I stop accepting new connections by returning false from the custom firewall (an AbstractRemoteAddressFilter):

protected boolean accept(ChannelHandlerContext ctx, InetSocketAddress remoteAddress) throws Exception {
    if(ddosDetected())
       return false;
    else
        return true;
}
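
For illustration, a check like ddosDetected() could be backed by a sliding-window counter of recent connection attempts. The following is only a minimal sketch using the JDK alone; the class name ConnectionRateMonitor, the method recordAndCheck, and the threshold values are hypothetical and are not taken from the code above.

```java
import java.util.ArrayDeque;

/**
 * Hypothetical helper (not part of Netty or of the code above): reports an
 * attack when more than maxConnections attempts fall inside one time window.
 */
final class ConnectionRateMonitor {
    private final int maxConnections;
    private final long windowMillis;
    // Timestamps of the most recent connection attempts, oldest first.
    private final ArrayDeque<Long> timestamps = new ArrayDeque<>();

    ConnectionRateMonitor(int maxConnections, long windowMillis) {
        this.maxConnections = maxConnections;
        this.windowMillis = windowMillis;
    }

    /**
     * Records one connection attempt at nowMillis and returns true when the
     * current window holds more than maxConnections attempts.
     * Synchronized because it may be called from several threads.
     */
    synchronized boolean recordAndCheck(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Evict attempts that have fallen out of the window.
        while (!timestamps.isEmpty()
                && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.removeFirst();
        }
        // Only the newest maxConnections + 1 entries matter for the check,
        // so cap the queue to keep memory bounded during a flood.
        while (timestamps.size() > maxConnections + 1) {
            timestamps.removeFirst();
        }
        return timestamps.size() > maxConnections;
    }
}
```

Under these assumptions, accept() could return !monitor.recordAndCheck(System.currentTimeMillis()) for a shared monitor instance. The timestamp is passed in rather than read inside the method so that the behaviour can be tested deterministically.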

But even though I drop new connections right away, my worker group gets overloaded. The pending tasks for the worker group (all other groups are fine) keep growing, which makes communication slower and slower for normal players until they finally get kicked by socket timeouts. I'm not sure why this happens. During normal server usage, the busiest groups are the login and normal groups. At the network level the server is fine: it's using just ~10% of its bandwidth limit. CPU and RAM usage also aren't very high during the attack. But after a few minutes of such an attack, all my players are kicked out of the game and are no longer able to connect.

Is there a better way to instantly drop all incoming connections and protect users that are already connected?

INDRAJITH EKANAYAKE
drygu
  • A bit unrelated, but maybe you should consider external services, e.g. [CloudFlare](https://www.cloudflare.com/ddos/), depending on the scale of the DDoS. – Karol Dowbecki Mar 06 '19 at 12:01
  • Last time I was researching this topic I couldn't find any services that supported TCP protection. But now I can see that Cloudflare has something like this: https://www.cloudflare.com/products/cloudflare-spectrum/. Thank you, I'll check it out. – drygu Mar 06 '19 at 17:04

1 Answer

I think you will need to "fix this" at the kernel level, for example via iptables. Otherwise you can only close a connection after you have already accepted it, which does not sound good enough in this case.

Norman Maurer