I have an Aeron stream over multicast. Some hosts subscribed to the stream see new sessions, but their images then go unavailable very quickly and never become available again, whilst other hosts successfully receive data from the same session.

On my Aeron stream I have 3 hosts, A, B and C, that all publish and subscribe. When I start my publication from A, both B and C initially see it (confirmed via AVAILABLE_IMAGE events in the event log). However, after 11s host C reports the image as unavailable. It then never becomes available again, even after restarting the publication (the publication from that host reuses the same session ID). In the meantime, host B continues to receive data successfully from this session. Upon restarting the driver everything works correctly.

One difference between the subscription on C and the subscription on B is that the subscription on C uses tether=true. It should also be noted that other sessions on the same multicast group, published from host A and received by host B, are working correctly, so it doesn't seem to be a network issue.
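To make the tether difference concrete, here is a minimal sketch of how the two subscription channel URIs might differ. This is illustrative only and not my actual code: the multicast endpoint `224.0.1.1:40456` is a made-up placeholder, the helper `channel` is hypothetical, and `tether=false` on host B is an assumption (all I know for certain is that C uses `tether=true`).

```java
// Illustrative sketch only: builds Aeron UDP channel URI strings that differ
// solely in the `tether` parameter. Endpoint and helper name are hypothetical.
final class TetherChannelSketch {
    // Placeholder multicast group and port, not taken from the real setup.
    static final String ENDPOINT = "224.0.1.1:40456";

    // Builds a channel URI of the form aeron:udp?endpoint=...|tether=...
    static String channel(String endpoint, boolean tether) {
        return "aeron:udp?endpoint=" + endpoint + "|tether=" + tether;
    }

    public static void main(String[] args) {
        // Assumption: host B subscribes untethered, host C tethered.
        System.out.println("host B: " + channel(ENDPOINT, false));
        System.out.println("host C: " + channel(ENDPOINT, true));
    }
}
```

Each string would be passed as the channel argument when adding the subscription on the corresponding host.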

I'd expect the images to become available again as long as the subscribers continue to see data for those sessions. However, this never happens.

What could be causing the image to go unavailable in the first place, and why doesn't the driver eventually try to recreate it?

Reiss
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jan 20 '23 at 07:59
  • A worked example would be required to better understand what you are doing. – Martin Thompson Jan 25 '23 at 14:20
  • @MartinThompson what details do you need? I've not been able to reproduce this. I think my question can be simplified to the following: What are all the causes of an unavailable image? Which of these cases are possible without network issues? Do images ever fail to become available again, even if the publisher successfully continues to send data? – Reiss Jan 26 '23 at 08:20
  • There are many potential reasons an image would go unavailable. It would take a blog post to cover them all. – Martin Thompson Jan 26 '23 at 14:08

1 Answer

This was happening only because of various bugs present in Aeron 1.40.0 and earlier. All causes of stuck sessions that I'm aware of have had fixes submitted as of today.

Reiss