I'm attempting to port my zeroconf-enabled C/C++ app to Linux, but I'm getting D-Bus related segfaults. I'm not sure if this is a bug in Avahi, my misuse of Avahi, or a bug in my code.

I am using a ZeroconfResolver object that encapsulates an AvahiClient, AvahiSimplePoll, and AvahiServiceResolver. The ZeroconfResolver has a Resolve function that first instantiates the AvahiSimplePoll, then AvahiClient, and finally the AvahiServiceResolver. At each instantiation I am checking for errors before continuing to the next. After the AvahiServiceResolver has been successfully created it calls avahi_simple_poll_loop with the AvahiSimplePoll.

This whole process works great when done synchronously but fails with segfaults when multiple ZeroconfResolvers are used concurrently (i.e., I have multiple threads, each creating its own ZeroconfResolver object). A trivial adaptation of the object that reproduces the segfaults can be seen in the code below (it may not produce a segfault right away, but in my use case it happens frequently).

I understand that "out of the box" Avahi is not thread safe, but according to my interpretation of [1] it is safe to have multiple AvahiClient/AvahiPoll objects in the same process as long as they are not 'accessed' from more than one thread. Each ZeroconfResolver has its own set of Avahi objects that do not interact with each other across thread boundaries.

The segfaults occur in seemingly random functions within the Avahi library. In general they happen within the avahi_client_new or avahi_service_resolver_new functions referencing dbus. Does the Avahi wiki mean to imply that the 'creation' of AvahiClient/AvahiPoll objects is also not thread safe?

[1] http://avahi.org/wiki/RunningAvahiClientAsThread

#include <dispatch/dispatch.h>
#include <cassert>
#include <cstdint>
#include <cstdio>

#include <sys/types.h>
#include <netinet/in.h>

#include <avahi-client/lookup.h>
#include <avahi-client/client.h>
#include <avahi-client/publish.h>
#include <avahi-common/alternative.h>
#include <avahi-common/simple-watch.h>
#include <avahi-common/malloc.h>
#include <avahi-common/error.h>
#include <avahi-common/timeval.h>

void resolve_reply(
  AvahiServiceResolver *r,
  AVAHI_GCC_UNUSED AvahiIfIndex interface,
  AVAHI_GCC_UNUSED AvahiProtocol protocol,
  AvahiResolverEvent event,
  const char *name,
  const char *type,
  const char *domain,
  const char *host_name,
  const AvahiAddress *address,
  uint16_t port,
  AvahiStringList *txt,
  AvahiLookupResultFlags flags,
  void * context) {

    assert(r);

    if (event == AVAHI_RESOLVER_FOUND)
      printf("resolve_reply(%s, %s, %s, %s)[FOUND]\n", name, type, domain, host_name);

    avahi_service_resolver_free(r);
    avahi_simple_poll_quit((AvahiSimplePoll*)context);
}


int main() {
  // Run until segfault
  while (true) {
    // Add a block to the concurrent GCD queue (managed thread pool)
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), [=]{
      char name[] = "SomeHTTPServerToResolve";
      char domain[] = "local.";
      char type[] = "_http._tcp.";

      AvahiSimplePoll * simple_poll = NULL;
      if ((simple_poll = avahi_simple_poll_new())) {
        int error;
        AvahiClient * client = NULL;
        if ((client = avahi_client_new(avahi_simple_poll_get(simple_poll), AVAHI_CLIENT_NO_FAIL, NULL, NULL, &error))) {
          AvahiServiceResolver * resolver = NULL;
          if ((resolver = avahi_service_resolver_new(client, AVAHI_IF_UNSPEC, AVAHI_PROTO_UNSPEC, name, type, domain, AVAHI_PROTO_UNSPEC, AVAHI_LOOKUP_NO_ADDRESS, (AvahiServiceResolverCallback)resolve_reply, simple_poll))) {
            avahi_simple_poll_loop(simple_poll);
            printf("Exit Loop(%p)\n", simple_poll);
          } else {
            printf("Resolve(%s, %s, %s)[%s]\n", name, type, domain, avahi_strerror(avahi_client_errno(client)));
          }
          avahi_client_free(client);
        } else {
          printf("avahi_client_new()[%s]\n", avahi_strerror(error));
        }
        avahi_simple_poll_free(simple_poll);
      } else {
        printf("avahi_simple_poll_new()[Failed]\n");
      }
    });
  }

  // Never reached
  return 0;
}
BigMacAttack
  • @JohanLundberg Nope. I gave up and decided to compile in an embedded version of Bonjour instead. I used the open-source uMundo project as a guide. Not necessarily an easy task but perhaps my only move since the Avahi community appears to be dying (almost all questions go unanswered on the mailing-list; the creators are silent). If this trend continues, I think the Linux distros should switch. – BigMacAttack Apr 06 '13 at 04:17
  • You can't really have more than one Zeroconf responder on a system. So you really have no choice but to use Avahi on Linux systems (unless you have complete control over the target systems' installations I suppose). – Craig McQueen May 02 '13 at 03:08
  • @CraigMcQueen "You can't really have more than one Zeroconf responder on a system". Says who? I know that it is not recommended, since it is technically duplicate functionality and increases network traffic, but I haven't seen it break anything in my tests. You just need to remember not to bind to the multicast ports exclusively, or Avahi won't be able to set itself up alongside the embedded version(s) to receive the same mDNS packets. – BigMacAttack May 02 '13 at 22:18
  • @BigMacAttack: I've looked it up, and I think you're right. It's not recommended, but it is possible, according to [Section 15 of RFC 6762](http://tools.ietf.org/html/rfc6762#section-15). – Craig McQueen May 02 '13 at 23:05
  • Thank you for the reference. It would be nice if they'd elaborate on the "known issues" they mention. – BigMacAttack May 03 '13 at 00:30
  • Sad state, and worse 10 years later. None of these basic infrastructure projects like Avahi/Cairo/Pango have maintainers anymore, all of them have multithreading issues, and nobody is willing to help out. – Lothar Jul 09 '23 at 21:06

1 Answer

One solution that seems to work fine is to add your own synchronization (a common mutex) around avahi_client_new, avahi_service_resolver_new and the corresponding free operations. Avahi does not appear to claim that those operations are internally synchronized.

What is claimed is that independent objects do not interfere.

I had success with this approach, using a helper class with a static mutex. To be specific, a static member function (or free function) like this:

std::mutex& avahi_mutex(){
  static std::mutex mtx;
  return mtx;
}

and a lock around any section of code (as small as possible) doing free or new:

{
  std::unique_lock<std::mutex> alock(avahi_mutex());
  simple_poll = avahi_simple_poll_new();
}
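
To show how the lock might be applied consistently to both creation and destruction, here is a minimal sketch. It deliberately does not call Avahi, so it compiles without the library: `fake_new` and `fake_free` are hypothetical stand-ins for non-thread-safe calls such as avahi_client_new and avahi_client_free, and `with_avahi_lock` and `run_demo` are helper names I made up for illustration. Only `avahi_mutex` comes from the snippet above.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Process-wide mutex shared by every thread that creates or frees objects.
std::mutex& avahi_mutex() {
  static std::mutex mtx;
  return mtx;
}

// Hypothetical helper: runs fn while holding the shared mutex and
// returns whatever fn returns (including void).
template <typename Fn>
auto with_avahi_lock(Fn&& fn) -> decltype(fn()) {
  std::lock_guard<std::mutex> lock(avahi_mutex());
  return std::forward<Fn>(fn)();
}

// Stand-ins (assumptions, not Avahi API) for calls that touch shared,
// unsynchronized state, the way the new/free calls appear to.
int fake_total_created = 0;
int fake_live_objects = 0;

int* fake_new() {
  ++fake_total_created;
  ++fake_live_objects;
  return new int(0);
}

void fake_free(int* p) {
  delete p;
  --fake_live_objects;
}

// Starts several threads that each create and free objects repeatedly,
// then returns the number of objects still alive after all threads join.
int run_demo(int threads, int iterations) {
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([iterations] {
      for (int i = 0; i < iterations; ++i) {
        int* obj = with_avahi_lock([] { return fake_new(); });
        // The per-thread poll loop would run here, outside the lock.
        with_avahi_lock([obj] { fake_free(obj); });
      }
    });
  }
  for (auto& th : pool) th.join();
  return fake_live_objects;
}
```

In the real code the lambdas would wrap avahi_simple_poll_new, avahi_client_new, avahi_service_resolver_new and their free counterparts, while avahi_simple_poll_loop stays outside the lock so the resolvers still run concurrently.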
Johan Lundberg