I tried to send 10 MB of data to a cluster via JGroups using UDP multicast, and it takes tens of seconds. I have changed some UDP properties, but that did not help. Here is my udp.xml file. We are testing on a local network.

There is no problem when sending small amounts of data.

<UDP
        singleton_name="fr_cluster_TP_UDP"
        mcast_addr="${jgroups.udp.mcast_addr:235.5.5.10}"
        mcast_port="${jgroups.udp.mcast_port:45588}"
        ip_ttl="1"
        tos="8"
        ucast_recv_buf_size="5M"
        ucast_send_buf_size="5M"
        mcast_recv_buf_size="5M"
        mcast_send_buf_size="5M"
        max_bundle_size="64K"
        max_bundle_timeout="30"
        enable_diagnostics="true"
        thread_naming_pattern="cl"

        timer_type="new3"
        timer.min_threads="2"
        timer.max_threads="4"
        timer.keep_alive_time="3000"
        timer.queue_max_size="500"

        thread_pool.enabled="true"
        thread_pool.min_threads="2"
        thread_pool.max_threads="8"
        thread_pool.keep_alive_time="5000"
        thread_pool.queue_enabled="true"
        thread_pool.queue_max_size="10000"
        thread_pool.rejection_policy="discard"

        oob_thread_pool.enabled="true"
        oob_thread_pool.min_threads="1"
        oob_thread_pool.max_threads="8"
        oob_thread_pool.keep_alive_time="5000"
        oob_thread_pool.queue_enabled="false"
        oob_thread_pool.queue_max_size="100"
        oob_thread_pool.rejection_policy="discard"/>

<PING/>
<MERGE3 max_interval="30000"
        min_interval="10000"/>
<FD_SOCK/>
<FD_ALL/>
<VERIFY_SUSPECT timeout="1500"/>
<BARRIER/>
<pbcast.NAKACK2 xmit_interval="500"
                xmit_table_num_rows="100"
                xmit_table_msgs_per_row="2000"
                xmit_table_max_compaction_time="30000"
                max_msg_batch_size="500"
                use_mcast_xmit="false"
                discard_delivered_msgs="true"/>
<UNICAST3 xmit_interval="500"
          xmit_table_num_rows="100"
          xmit_table_msgs_per_row="2000"
          xmit_table_max_compaction_time="60000"
          conn_expiry_timeout="0"
          max_msg_batch_size="500"/>
<pbcast.STABLE stability_delay="1" desired_avg_gossip="50000"
               max_bytes="4M"/>
<pbcast.GMS print_local_addr="true" join_timeout="2000"
            view_bundling="true"/>
<FRAG2 frag_size="60K"/>
<RSVP resend_interval="2000" timeout="10000"/>
<pbcast.STATE_TRANSFER/>
<!-- pbcast.FLUSH  /-->

Has anyone handled a problem like this? I suspect it is caused by the local UDP configuration of the machines rather than by the udp.xml file.

My demo is:

 public static void main(String[] args) throws Exception {

    JChannel jChannel = new JChannel(IOUtils.getResourceAsStream("/com/fr/cluster/engine/core/context/protocol/udp.xml", Test.class));
    jChannel.connect("testing----");
    jChannel.setDiscardOwnMessages(true);
    jChannel.setReceiver(new ReceiverAdapter() {

        public void receive(Message msg) {

            long current = System.currentTimeMillis();

            // "SSS" is the millisecond field; lowercase "sss" would print the seconds again
            System.out.println(new SimpleDateFormat("HH:mm:ss.SSS").format(new Date(current)) + " msg received. size : " + msg.getLength());
        }
    });
    Message message = new Message(null, getBytes());
    while (true) {
        System.in.read();
        System.out.println("Message sending.....");
        long current = System.currentTimeMillis();

        jChannel.send(new Message(null, new SimpleDateFormat("HH:mm:ss.SSS").format(new Date(current))));
        jChannel.send(message);
        System.out.println("Message send complete.Time used " + (System.currentTimeMillis() - current));
    }
}

private static byte[] getBytes() throws FileNotFoundException {
    //10M
    int len = 10 * 1024 * 1024;
    byte[] data = new byte[len];
    for (int i = 0; i < len; i++) {
        data[i] = 1;
    }

    return data;
}
Rinoux

2 Answers

Do you have a sample program that I can take a look at?

I ran a quick test and sending 10MB to a member took 97ms...

Note that in your config you most likely always have only 2 threads running in your thread pool, as you have a queue enabled and its size is 10000. Perhaps this slows things down... Try disabling the queue and raising max_threads.
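The effect described above can be reproduced with a plain `java.util.concurrent.ThreadPoolExecutor`. This is only a sketch, and it assumes the transport's pool behaves like a standard executor: with a large bounded queue, extra threads beyond the minimum are created only once the queue is full, whereas with no queue (a `SynchronousQueue` handoff) the pool grows to its maximum. The class name `PoolGrowthDemo` and the task count are made up for illustration; the sizes 2, 8 and 10000 mirror the `thread_pool.*` values in the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {

    // Submits 100 blocking tasks and reports how many worker threads the pool ever created.
    static int largestPoolSize(boolean queueEnabled) throws InterruptedException {
        BlockingQueue<Runnable> queue = queueEnabled
                ? new LinkedBlockingQueue<>(10_000)   // like queue_enabled="true", queue_max_size="10000"
                : new SynchronousQueue<>();           // no queueing: direct handoff to a thread
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 8, 5000, TimeUnit.MILLISECONDS, queue,
                new ThreadPoolExecutor.DiscardPolicy()); // like rejection_policy="discard"

        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return largest;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("queue enabled:  " + largestPoolSize(true));
        System.out.println("queue disabled: " + largestPoolSize(false));
    }
}
```

With the queue enabled, the 98 tasks that arrive after the first two simply wait in the queue, so the pool stays at 2 threads; with the queue disabled it grows to 8 and discards the rest. Whether this is what limits throughput in a given cluster still has to be measured.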

Bela Ban

You made a classic mistake: you reused the message [1]! This won't work if you send messages in quick succession. Move the creation of the message into the loop so that a new message is created on each iteration, and this will work.

I tried this out both in 3.6 and 4.x and it worked perfectly.

[1] https://github.com/belaban/workshop/blob/master/slides/admin.adoc#problem-9-reusing-a-message-the-sebastian-problem
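To illustrate why reusing one message object can go wrong, here is a toy model. It is not JGroups code: `ToyMessage` and `ToyStack` are hypothetical stand-ins, and the model assumes only the general mechanism that a reliable layer stamps per-message metadata on send and keeps a reference to the message for possible retransmission. The linked slide describes the actual behaviour.

```java
import java.util.ArrayList;
import java.util.List;

public class MessageReuseDemo {

    // Hypothetical stand-in for a mutable message; NOT the JGroups Message class.
    static final class ToyMessage {
        final byte[] payload;
        long seqno; // metadata stamped by the stack when the message is sent

        ToyMessage(byte[] payload) {
            this.payload = payload;
        }
    }

    // Hypothetical stand-in for a reliable layer that stamps and retains sent messages.
    static final class ToyStack {
        private long nextSeqno = 1;
        final List<ToyMessage> retransmitTable = new ArrayList<>();

        void send(ToyMessage msg) {
            msg.seqno = nextSeqno++;  // overwrites whatever an earlier send stamped
            retransmitTable.add(msg); // keeps a reference, not a copy
        }
    }

    // Sends three messages and returns the seqnos found in the retransmit table afterwards.
    static List<Long> seqnosAfterSending(boolean reuse) {
        ToyStack stack = new ToyStack();
        byte[] data = new byte[16];
        ToyMessage shared = new ToyMessage(data);
        for (int i = 0; i < 3; i++) {
            // The fix from the answer: create one message per iteration.
            stack.send(reuse ? shared : new ToyMessage(data));
        }
        List<Long> seqnos = new ArrayList<>();
        for (ToyMessage m : stack.retransmitTable) {
            seqnos.add(m.seqno);
        }
        return seqnos;
    }

    public static void main(String[] args) {
        System.out.println("reused message: " + seqnosAfterSending(true));
        System.out.println("fresh messages: " + seqnosAfterSending(false));
    }
}
```

In this model the reused object ends up stored three times with the last sequence number, so entries 1 and 2 can no longer be retransmitted correctly, while fresh messages keep 1, 2 and 3. Sharing the payload byte array between fresh messages is fine here because only the per-message metadata is mutated.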

Bela Ban
  • It's stable now, but it still takes about 20 seconds on version 3.5/3.6 and 14 seconds on version 4.0. Are there any properties or machine settings that affect the performance? BTW, we ran that program on two MacBooks. – Rinoux Jan 15 '18 at 10:50
  • The long transfer time was caused by our router; it's OK now. – Rinoux Jan 16 '18 at 07:20