
I'm attempting to create a Java-wrapped C implementation of a CAN bus interface for use in an Android environment. I've used SocketCAN, which is included in the Linux kernel, to create a socket and bind it to the CAN interface.

The development environment doesn't have a physical CAN bus, so I'm creating a virtual bus via sudo ip link add dev vcan0 type vcan and sudo ip link set up vcan0.

Running the native C code in this environment works as expected: the socket binds when the interface is present, and bind(...) returns an error when it is not. However, when running the same native C code via JNI, the bind(...) call always returns 0 regardless of the state of the interface, although any subsequent write(...) calls fail as expected.

Is there something I've overlooked that would explain this?

The JNI code is as follows (lifted directly from my C implementation with additional type casting where necessary):

JNIEXPORT jboolean JNICALL Java_SocketCAN_nativeOpen
  (JNIEnv * env, jobject jobj, jint bus_id)
{
  if ((int) bus_id < MAX_NUMBER_OF_CAN_BUSES)
  {
    int s;

    if((s = socket(PF_CAN, SOCK_RAW, CAN_RAW)) == -1) 
    {
      printf("Error while opening socket\n");
      return JNI_FALSE;
    }

    struct sockaddr_can addr;
    struct ifreq ifr;

    strcpy(ifr.ifr_name, "vcan0");
    ioctl(s, SIOCGIFINDEX, &ifr);
    
    addr.can_family  = AF_CAN;
    addr.can_ifindex = ifr.ifr_ifindex;

    if(bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1) 
    {
      printf("Error in socket bind\n");
      return JNI_FALSE;
    }

    // Set the socketId in the Java class.
    jclass   jcls        = (*env)->FindClass(env, "SocketCAN");
    jfieldID socket_id   = (*env)->GetFieldID(env, jcls, "socket", "I");
    jint     j_socket_id = (*env)->GetIntField(env, jobj, socket_id);
    j_socket_id = s;
    (*env)->SetIntField(env, jobj, socket_id, j_socket_id);

    return JNI_TRUE;
  }
  
  return JNI_FALSE;
}

Any help is much appreciated, thanks!

EDIT: If anyone else is experiencing this issue and wants a workaround (although it may simply be the correct way to do this and I'd overlooked it), check the return value of the ioctl(...) call. It returns -1 when "vcan0" isn't set up, both in the plain C program and via JNI.

My updated code after modifying for the suggestions made by @12431234123412341234123 and @AndrewHenle is as follows:

JNIEXPORT jint JNICALL Java_SocketCAN_nativeOpen
  (JNIEnv * env, jobject jobj, jint bus_id)
{
  if ((int) bus_id < MAX_NUMBER_OF_CAN_BUSES)
  {
    int s;

    if((s = socket(PF_CAN, SOCK_RAW, CAN_RAW)) == -1) 
    {
      printf("Error while opening socket\n");
      return -1;
    }

    struct sockaddr_can addr;
    struct ifreq ifr;

    strcpy(ifr.ifr_name, "vcan0");
    if (ioctl(s, SIOCGIFINDEX, &ifr) == -1)
    {
      printf("Error in ioctl\n");
      close(s); // Don't leak the descriptor on failure.
      return -1;
    }

    addr.can_family  = AF_CAN;
    addr.can_ifindex = ifr.ifr_ifindex;

    if(bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1)
    {
      printf("Error in socket bind\n");
      close(s); // Don't leak the descriptor on failure.
      return -1;
    }

    return (jint) s;
  }
  
  return -1;
}
  • If this is your problem: In my C implementation, after I have opened the connection I try to read a frame with a timeout of 0 (`recv(p->socket, &frame, sizeof(frame), MSG_DONTWAIT)`). If there is an error which is not `EAGAIN`, I know the CAN network does not work. The received frame is ignored. – 12431234123412341234123 Dec 17 '20 at 12:05
  • Apologies for not being very clear. In my case it returns 0 regardless of whether the "vcan0" interface is present or not when the JNI implementation is called. When the interface is not present running the C results in bind returning -1, however running the Java results in bind returning 0. – J. Sugarbum Dec 17 '20 at 12:07
  • OT: Your code has a couple of other issues. First, if the `bind()` fails you're leaking the descriptor. Second, it's more complex than it needs to be - just *return* the socket as a `jint` instead of setting the field with JNI code. You can drop all the JNI field setting code, and the `socket` field would be directly set in Java with something like `socket = nativeOpen( bus_id );` instead of being set as an invisible side effect of the function call. Just use the standard "`-1` means failure" POSIX descriptor paradigm to check for errors. – Andrew Henle Dec 17 '20 at 12:23
  • @12431234123412341234123 - I've attempted adding in your recv(...) call as a workaround, but it returns EPERM regardless of whether "vcan0" is set up. – J. Sugarbum Dec 17 '20 at 12:29
  • @AndrewHenle - Thanks for this, I'll make those changes. Do you have any insight why it might be that the bind never fails? – J. Sugarbum Dec 17 '20 at 12:31
  • @J.Sugarbum Do you have permission to access the CAN network? Do you get the same error when you use C? – 12431234123412341234123 Dec 17 '20 at 13:14
  • Why do you ignore the return value of `ioctl()`? I think this is your original problem. – 12431234123412341234123 Dec 17 '20 at 13:17
  • @J.Sugarbum I've never used CAN, so no. But [this kernel.org document](https://www.kernel.org/doc/Documentation/networking/can.txt) doesn't do any checking for `bind()` failure, so maybe it can't fail? In your situation, I'd check the return from `ioctl()` and `memset()` your `addr` structure to all zeros. I'd also run under `strace` to see all the system calls and their results. – Andrew Henle Dec 17 '20 at 13:19
  • @12431234123412341234123 - The C code behaves as I expect, with errors returned from the bind when the "vcan0" interface isn't present. The terminal prints "Error in socket bind : No such device". I've got the necessary permissions. As for ignoring the return value of `ioctl()` I've just used the [example at kernel.org](https://www.kernel.org/doc/html/latest/networking/can.html) in which there isn't any return value checks either. I'll see what it returns and go from there. – J. Sugarbum Dec 17 '20 at 13:27
  • @AndrewHenle - Calling `bind(...)` from the C code when the "vcan0" interface is not setup does result in a return of -1 so I do think it can fail. I'll check the return value from `ioctl(...)` as you suggest. – J. Sugarbum Dec 17 '20 at 13:30
  • Got it! Checking the return value of the `ioctl(...)` is the answer. That returns -1 when running both C and the JNI when "vcan0" isn't setup. Although it doesn't answer the initial question it does allow me to get on with the rest of the project. Cheers! – J. Sugarbum Dec 17 '20 at 13:39
  • @J.Sugarbum The documentation with the example says: "(example for CAN_RAW sockets without error checking)". I bet you also get an error with C when you call `ioctl()` with a non-existing interface. – 12431234123412341234123 Dec 17 '20 at 14:05
  • @12431234123412341234123 - You do, which is what I'm after. I want to be able to provide notification to the user when there's an error opening the connection to the CAN bus, and that's exactly what adding that check in has achieved. – J. Sugarbum Dec 17 '20 at 14:21
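The non-blocking read check suggested in the comments above could look roughly like the following. This is only a sketch: the helper name `can_socket_probe` is hypothetical, and treating every error other than `EAGAIN` as "bus not usable" is an assumption taken from that comment. Note that the asker reported this call returning `EPERM` in their JNI setup, so it may not be a reliable test in every environment.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/can.h>

/* Hypothetical helper: returns 0 if the socket looks usable, -1 otherwise.
 * Any frame that happens to be received is discarded. */
int can_socket_probe(int fd)
{
    struct can_frame frame;

    /* MSG_DONTWAIT makes this a single non-blocking read attempt. */
    ssize_t n = recv(fd, &frame, sizeof(frame), MSG_DONTWAIT);
    if (n >= 0)
        return 0;               /* a frame was waiting; ignore its contents */

    /* "Nothing to read right now" is not an error for this purpose. */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;

    return -1;                  /* any other errno is treated as a failure */
}
```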

1 Answer


The bind() call needs a valid interface index in can_ifindex. To get this value you can use the ioctl() call with SIOCGIFINDEX, as you do. However, when the ioctl() call fails, the ifreq structure does not necessarily contain a correct index. It probably still holds the "random" value left behind by whatever object last occupied the same memory region. Because you ignored the return value from ioctl(), you called bind() with a "random" interface index. This also means that bind() may or may not fail, depending on that value, and reading an uninitialized value is undefined behaviour in most cases. To avoid this error, check the return value from ioctl() and handle errors accordingly.

It seems that this "random" value differs between the plain C version and the JNI version. One way to avoid such random differences is to initialize every new automatic object explicitly. In your case you could set everything to 0: struct ifreq ifr={0};, and the same for addr. This extra step should give more consistent behaviour.

  • Thanks for this explanation. It's provided the context I needed to understand what was going on here and cleared up the potential reasons why I was experiencing differences between the C and JNI implementations. – J. Sugarbum Dec 17 '20 at 15:38