
I've been messing around with an application that uses boost::asio for both UDP and SocketCAN communication. Today, I noticed something weird - it was leaking memory!

So I grabbed my trusty toolkit consisting of

echo 0 $(awk '/Private/ {print "+", $2}' /proc/`pidof main`/smaps) | bc

and Allinea DDT and got to work diagnosing this issue.
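For reference, the one-liner above adds up the second field of every smaps line containing "Private" (the Private_Clean and Private_Dirty counters, in kB). A rough C++ equivalent, as a sketch only: sumPrivateKb and privateKbOfSelf are hypothetical helper names, and it reads /proc/self/smaps instead of looking the process up with pidof.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: sums the kB value of every line that contains
// "Private" in smaps-formatted input (same filter as awk '/Private/').
long sumPrivateKb(std::istream& in) {
  long total = 0;
  std::string line;
  while (std::getline(in, line)) {
    if (line.find("Private") == std::string::npos) {
      continue;
    }
    std::istringstream fields(line);
    std::string key;
    long kb = 0;
    // Lines whose second field is not a number are skipped.
    if (fields >> key >> kb) {
      total += kb;
    }
  }
  return total;
}

// Hypothetical helper: private memory of the calling process, in kB.
long privateKbOfSelf() {
  std::ifstream smaps("/proc/self/smaps");
  return sumPrivateKb(smaps);
}
```

Calling privateKbOfSelf() periodically and logging the result gives the same trend as re-running the shell command by hand.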

What I ended up with was the following snippet, which uses boost::asio::posix::basic_stream_descriptor as its base:

void Can::write(struct can_frame frame) {
  stream_.async_write_some(boost::asio::buffer(&frame, sizeof(frame)),
                           boost::bind(&Can::datSend, this)
  );
}

Here, the datSend is just an empty function that pings a watchdog. I've also tried

void Can::write(struct can_frame frame) {
  stream_.write_some(boost::asio::buffer(&frame, sizeof(frame)));
}

But this gives an exception (invalid data) for some reason.

The setup behind this code looks something like this:

boost::asio::io_service ioService_;
boost::asio::posix::basic_stream_descriptor<> stream_;

Constructor() : stream_(ioService_) {
  socketDescriptor_ = socket(PF_CAN, SOCK_RAW, CAN_RAW);

  struct timeval timeout {
      .tv_sec = 5,
      .tv_usec = 0
  };

  if (setsockopt(socketDescriptor_, SOL_SOCKET, SO_RCVTIMEO,
                 reinterpret_cast<char *>(&timeout),
                 sizeof(timeout)) < 0) {
    throw std::string("Error setting CAN socket timeout");
  }

  strcpy(interfaceRequest_.ifr_name, interfaceName.c_str());
  ioctl(socketDescriptor_, SIOCGIFINDEX, &interfaceRequest_);
  socketAddress_.can_family = AF_CAN;
  socketAddress_.can_ifindex = interfaceRequest_.ifr_ifindex;
  stream_.assign(socketDescriptor_);

  if (bind(socketDescriptor_, (struct sockaddr *)&socketAddress_,
           sizeof(socketAddress_)) < 0) {
    throw std::string("Error in socket bind");
  }

}

Afterwards I just run the io_service and that's that:

void Can::iosThreadWorker() { ioService_.run(); }

I've gone over quite a few Stack Overflow topics as well as the Boost documentation, but can't seem to find why this function would leak memory.

Boost version: 1.60; G++: 6.30; OS: Ubuntu 17.04

  • Your first suspicion should always be your own code, not that of a well known and heavily used library. You dismiss datSend() as irrelevant, but that may indeed be where the leak happens. Please provide the code for that function as well. – Eyal K. Sep 26 '17 at 12:11
  • It is literally an empty function : void Can::datSend(){} I used to log this occasion, but currently, it does nothing but act as a placeholder. – Rainer Keerdo Sep 26 '17 at 12:18
  • You can set a debug hook in your main file, and once entering the suspected function, set a breakpoint on any memory allocations. See if any of them stand out – Eyal K. Sep 26 '17 at 12:20
  • I saved this memory debugger stack trace from earlier, and it seems that inside asio, this is the function that gets called a lot: boost::asio::asio_handler_allocate(unsigned long, ...) (handler_alloc_hook.ipp). It is called millions of times, and the memory is never deallocated. – Rainer Keerdo Sep 26 '17 at 12:23
  • Please post a self contained example that demonstrates the issue. – sehe Sep 26 '17 at 12:23
  • The call to allocate must have a corresponding deallocation. The problem is probably higher up the stack trace – Eyal K. Sep 26 '17 at 12:28
  • https://pastebin.com/wSxktwy6 This would be similar to a minimal use case of what my complete system does. – Rainer Keerdo Sep 26 '17 at 12:30
  • @EyalK. - any suggestions to dive in deeper? I basically let my code run with close to no manual memory allocation, and I've so far verified that commenting out the one stream_.async_write_some call stops the leak. – Rainer Keerdo Sep 26 '17 at 12:33
  • Creating the buffer is definitely a memory allocation. Try creating it only once and reusing it (if possible) – Eyal K. Sep 26 '17 at 13:05
  • It turns out that was not the issue. Technically, it was not even a leak, just continuous allocation of memory as intended. – Rainer Keerdo Sep 26 '17 at 13:22

1 Answer


So I dug a bit deeper and found this tidbit about boost::asio::io_service:

io_service.run() returns as soon as it has no work left to do. So by running it first and only then submitting more async work, that work was queued but never completed, which would explain the steadily growing memory.

This stemmed from a change where someone had swapped the order of the ioservice.run() call and the async read callback assignment to make the code look better. After that, there was no work piled up before running, so the io_service could just complete its job and return.

  • Kudos for solving it. Next time, edit the question with additional information, so people will notice (I didn't find your comment before you posted this answer) – sehe Sep 26 '17 at 13:26