
I'm trying to determine whether it's possible to distinguish between two separate handles on the same file, and a single handle with two file descriptors pointing to it, using metadata from procfs.

Case 1: Two File Handles

```shell
# setup
exec 3>test.lck
exec 4>test.lck
# usage
flock -x 3  # this grabs an exclusive lock
flock -s 4  # this blocks
echo "This code is never reached"
```

Case 2: One Handle, Two FDs

```shell
# setup
exec 3>test.lck
exec 4>&3
# usage
flock -x 3  # this grabs an exclusive lock
flock -s 4  # this converts that lock to a shared lock
echo "This code gets run"
```
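For reference, the difference between the two cases can be observed without hanging the shell by using `flock -n`. The sketch below is only an illustration: `fds_share_description` is a hypothetical helper name, the lock file is a throwaway `mktemp` file rather than `test.lck`, and it assumes the util-linux `flock` on a local filesystem. It also takes and releases locks itself, so it is an active probe, not the passive inspection being asked about.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any tool): report whether two fds of the
# current shell refer to the same open file description, by probing with flock.
# WARNING: this takes and releases locks, so it disturbs the state it inspects.
fds_share_description() {
    local a=$1 b=$2 rc=1
    # Take an exclusive lock through the first fd. If some other holder
    # already conflicts, the probe is inconclusive (return 2).
    flock -n -x "$a" || return 2
    # Same description: this converts the lock to shared and succeeds.
    # Separate description: this conflicts with the lock above and fails.
    if flock -n -s "$b"; then
        rc=0
    fi
    # Release the lock taken via the first fd.
    flock -u "$a"
    return "$rc"
}

lockfile=$(mktemp)   # assumption: a scratch file stands in for test.lck

# Case 1: two independent opens of the same file.
exec 3>"$lockfile"
exec 4>"$lockfile"
if fds_share_description 3 4; then case1=shared; else case1=separate; fi

# Case 2: fd 4 is a duplicate of fd 3.
exec 4>&3
if fds_share_description 3 4; then case2=shared; else case2=separate; fi

echo "case 1: $case1, case 2: $case2"

exec 3>&- 4>&-
rm -f "$lockfile"
```

The design choice here is to fail fast (`-n`) instead of blocking, so the "blocks forever" behaviour of Case 1 shows up as a non-zero exit status.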

If I'm inspecting a system's state from userland after the "setup" stage has finished and before the "usage", and I want to distinguish between those two cases, is the necessary metadata available? If not, what's the best way to expose it? (Is adding kernelspace pointers to /proc/*/fdinfo a reasonable action, which upstream is likely to accept as a patch?)

Charles Duffy
  • Did you take a look into `/proc/*/fd`? – nsilent22 Feb 20 '16 at 18:28
  • Of course, but if you have two `/proc/*/fd` entries that point to the same file, how do you distinguish case-A from case-B? `/proc/*/fdinfo` is slightly more helpful, but only slightly -- the most relevant information it can potentially contain is only present *after* a lock is grabbed. – Charles Duffy Feb 20 '16 at 18:38
  • But your two cases are identical (except for comments). – nsilent22 Feb 20 '16 at 18:48
  • @nsilent22, oh, hell! Damnit. Utter thinko; fixed them up to actually behave as described. – Charles Duffy Feb 20 '16 at 19:08
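To illustrate the point made in the comments above, here is a rough sketch that captures what `/proc/<pid>/fdinfo` reports for fds 3 and 4 after each setup and compares the two. It assumes Linux with `/proc` mounted; `fdinfo_of` is a hypothetical helper name and the lock file is a scratch `mktemp` file. Before any lock is taken, I would expect the entries to match in both cases, which is exactly why they do not answer the question.

```shell
#!/usr/bin/env bash
# Sketch: compare procfs metadata for fds 3 and 4 in both setups.
# Assumes Linux with /proc mounted; "$$" is the PID of this shell.
lockfile=$(mktemp)   # assumption: a scratch file stands in for test.lck

# Hypothetical helper: print the fdinfo entry for one fd of this shell.
fdinfo_of() {
    cat "/proc/$$/fdinfo/$1"
}

# Case 1: two independent opens of the same file.
exec 3>"$lockfile"
exec 4>"$lockfile"
case1_fd3=$(fdinfo_of 3)
case1_fd4=$(fdinfo_of 4)
case1_target3=$(readlink "/proc/$$/fd/3")
case1_target4=$(readlink "/proc/$$/fd/4")

# Case 2: fd 4 is a duplicate of fd 3.
exec 4>&3
case2_fd3=$(fdinfo_of 3)
case2_fd4=$(fdinfo_of 4)

# Report whether procfs shows any difference between the two fds.
for c in 1 2; do
    a="case${c}_fd3"; b="case${c}_fd4"
    if [ "${!a}" = "${!b}" ]; then
        echo "case $c: fdinfo entries are identical"
    else
        echo "case $c: fdinfo entries differ"
    fi
done

exec 3>&- 4>&-
rm -f "$lockfile"
```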

1 Answer


I'm unaware of anything exposing this in proc as it is. Figuring this out may be useful when debugging some crap, but then you can just inspect the state with the kernel debugger or a systemtap script.

From your question it seems you want to achieve this in a manner which can be easily scripted, so I have to ask: what is the real problem?

I have no idea if Linux folks would be interested in exposing this. One problem is that exposing a pointer to the file object adds another infoleak and thus would likely be plugged in the future. Other means would require numbering all file objects, and that's not going to happen. Regardless, you would be asked for a justification, much as I asked you above.

  • re: "exposing a pointer" -- if printed with the `%pK` printk format, they're sanitized (printed as 0s) when read by any unprivileged user with `kptr_restrict=1`, or in all conditions with `kptr_restrict=2`. As such, if `kptr_restrict` is at a value other than 2, there's arguably user consent for exposing kernel pointers to userspace (albeit, with =1, only for users with root -- a restriction I'm perfectly fine with). – Charles Duffy Feb 23 '16 at 19:03
  • "The real problem", by the way, revolves around letting sysdig track the effects of flock() calls, for use in detecting lock contention, locking-related latency, &c. The tool, as a whole, builds a model of kernel state and allows recording and querying of that model in a manner that's quite effective -- but as of present, it's not sufficiently sophisticated for the query immediately at hand. – Charles Duffy Feb 23 '16 at 19:04
  • Returned pointers being 0 render the feature useless which makes it questionable whether it should be implemented like this in the first place. I don't know much about sysdig, it has to have a kernel component and that bit could take care of collecting necessary data. –  Feb 23 '16 at 19:27
  • The kernel component tracks a forward-going event stream; *initial* state is collected via a scan of procfs (started *after* the event stream is being recorded, of course, allowing changes to system state occurring during the procfs scan to be reconciled into a consistent reconstructed snapshot). – Charles Duffy Feb 23 '16 at 19:29
  • ...as for returned pointers being zero if `kptr_restrict=2`, that just means we don't support this feature if that's enabled, or likewise if running with a kernel that doesn't expose such pointers at all. – Charles Duffy Feb 23 '16 at 19:29
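As a small aside on the `kptr_restrict` discussion in the comments above, a script could check the setting before relying on kernel pointers showing up in procfs. This is a minimal sketch: `describe_kptr_restrict` is a hypothetical helper, and the wording for each value simply restates the behaviour described in the comments (newer kernels may additionally hash pointers even when the value is 0).

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise what a kptr_restrict value means for
# pointers printed with the %pK format, as described in the comments above.
describe_kptr_restrict() {
    case "$1" in
        0) echo "not zeroed by kptr_restrict" ;;
        1) echo "zeroed unless the reader is privileged" ;;
        2) echo "always zeroed" ;;
        *) echo "unknown setting" ;;
    esac
}

# Read the live value if the sysctl file is readable; otherwise say so.
if [ -r /proc/sys/kernel/kptr_restrict ]; then
    current=$(cat /proc/sys/kernel/kptr_restrict)
else
    current=unavailable
fi
echo "kptr_restrict=$current: $(describe_kptr_restrict "$current")"
```

A tool that depended on such pointers would presumably treat the "always zeroed" case as "feature unsupported", as suggested in the last comment.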