
On OS X El Capitan, my log file system.log at times fills with hundreds of lines like the following:

03/07/2016 11:52:17.000 kernel[0]: hfs_clonefile: cluster_read failed - 34

but there is no indication of which process triggers it. Disk Utility could not find any fault with the file system. I would still like to know what is going on, and it seems to me that dtrace should be perfectly suited to finding the faulty process, but I am stuck. I know of the function return probe, but it seems to require the PID, e.g.

dtrace -n 'pidXXXX::hfs_clonefile:return { printf("ret: %d", arg1); }'

Is there a way to tell dtrace to probe all processes? And then how would I print the process name?

  • What version of macOS? – Ken Thomases Jul 02 '16 at 10:44
  • According to this discussion https://discussions.apple.com/thread/4940204?tstart=0, `34` is for "disk full" errors. – Andrew Henle Jul 02 '16 at 15:47
  • What is actually in your log files? How do they strongly suggest that `hfs_clonefile` is encountering an error? Is there more going on than log messages? That is, are there "real" symptoms of a problem? For what it's worth, `hfs_clonefile()` is not a syscall. It seems to be a function internal to the HFS(+) file system implementation. – Ken Thomases Jul 02 '16 at 16:36

2 Answers


You can use the syscall provider rather than the pid provider to do this sort of thing. Something like:

sudo dtrace -n 'syscall::hfs_clonefile*:return /errno != 0/ { printf("ret: %d\n", errno); }'

The above command is a minor variant of what's used within the built-in DTrace-based errinfo utility. You can view /usr/bin/errinfo in any editor to see how it works.

However, there's no hfs_clonefile syscall, at least as far as DTrace is concerned, on my El Capitan (10.11.5) system:

$ sudo dtrace -l -n 'syscall::hfs*:'

   ID   PROVIDER            MODULE                          FUNCTION NAME
dtrace: failed to match syscall::hfs*:: No probe matches description

Also, unfortunately, the syscall provider is prevented from tracing system processes by the System Integrity Protection (SIP) feature introduced with El Capitan (OS X 10.11). So you will have to disable SIP, which makes your system less secure.
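
To address the "probe all processes and print the process name" part of the question, here is a minimal sketch using the syscall provider without a PID. It assumes the kernel error 34 from the log actually surfaces as the errno of some syscall, which is not confirmed, and it needs SIP disabled to see system processes. execname, pid, probefunc and errno are built-in D variables.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Fire on the return of every syscall, in any process, when errno is 34.
   Assumption: the failure seen in system.log reaches user space as errno 34. */
syscall:::return
/errno == 34/
{
    printf("%s (pid %d): %s returned errno %d\n",
        execname, pid, probefunc, errno);
}
```

Run it with sudo and stop it with Ctrl-C. If nothing is printed while the log messages keep appearing, the error is probably handled inside the kernel and never reaches a syscall return.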

Ken Thomases
  • Interesting that there's no dTrace probe. This seems to be the source code: http://fxr.watson.org/fxr/source/bsd/hfs/hfs_readwrite.c?v=xnu-1456.1.26;im=excerpts#L3934 Does dTrace on OS X provide a useful set of `fbt` probes? – Andrew Henle Jul 02 '16 at 15:40
  • You've found a function. Not all functions in the kernel are syscalls. That appears to be an internal function of the HFS(+) VFS (virtual file system) implementation. I'm not sure about the `fbt` provider's probes, but, again, it's disabled by SIP. You might edit your question to explain what reason you have to suspect an error with this function in the first place and why you care. I.e. what symptoms other than a log message are there? – Ken Thomases Jul 02 '16 at 15:46
  • Why do I suspect the problem is in `hfs_clonefile()`? Because the OP's question states "hfs_clonefile fails with an error code 34". – Andrew Henle Jul 02 '16 at 15:52
  • Sorry, wasn't looking closely and assumed you were OP. I was intending to ask OP why he suspected that. – Ken Thomases Jul 02 '16 at 16:34

You can try something like this (I don't have access to an OS X machine to test it):

#!/usr/sbin/dtrace -s
#pragma D option quiet

fbt::hfs_clonefile:return
/ args[ 1 ] != 0 /
{
    printf( "\n========\nprocess: %s, pid: %d, ret value: %d\n", execname, pid, args[ 1 ] );
    /* get kernel and user-space stacks */
    stack( 20 );
    ustack( 20 );
}

For fbt return probes, args[ 1 ] is the value returned by the function.

The dTrace script will print out the process name, pid, and return value from hfs_clonefile() whenever the return value is not zero. It also adds the kernel and user space stack traces. That should be more than enough data for you to find the source of the errors.

Assuming it works on OS X, anyway.
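
If the probe does not match (you can check with sudo dtrace -l -n 'fbt::hfs_*:return'), one fallback is to probe a function that calls hfs_clonefile instead. The sketch below uses hfs_relocate, which the comments report as working. It is untested here, and it will also fire for hfs_relocate failures that have nothing to do with hfs_clonefile.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* hfs_relocate is a caller of hfs_clonefile; in an fbt return probe,
   arg1 holds the function's return value. */
fbt::hfs_relocate:return
/(int)arg1 != 0/
{
    printf("\n========\nprocess: %s, pid: %d, hfs_relocate returned: %d\n",
        execname, pid, (int)arg1);
    /* kernel stack shows the path that led into hfs_relocate */
    stack(20);
}
```

Comparing the timestamps of this output with the hfs_clonefile lines in system.log should show whether the two are related.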

Andrew Henle
  • As [pointed out by Ken Thomases](http://stackoverflow.com/questions/38156192/find-process-where-a-particular-system-call-returns-a-particular-error#comment63751899_38160117), the function hfs_clonefile is declared static in [hfs_readwrite.c](https://github.com/opensource-apple/xnu/blob/27ffc00f33925b582391b1ef318b78b8bd3939d1/bsd/hfs/hfs_readwrite.c#L97), so dtrace does not have a probe on it. But by analysing the source code, I can find which functions call hfs_clonefile and probe those instead. Your approach works with hfs_relocate, for example. So thanks! –  Jul 03 '16 at 10:28