In addition of ownership/drive mapping issues mentioned in "Git: fatal: fsync error on 'sha1 file': Bad file descriptor", you can actually know what file is problematic.
When creating a loose object file, we didn't report the exact filename of the file we failed to fsync
, even though the information was readily available, which has been corrected with Git 2.36 (Q2 2022).
See commit c4e707f (30 Mar 2022) by Neeraj Singh (neerajsi-msft
).
(Merged by Junio C Hamano -- gitster
-- in commit 98f6a3a, 04 Apr 2022)
object-file
: pass filename to fsync_or_die
Reported-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Neeraj Singh
If we die while trying to fsync a loose object file, pass the actual filename we're trying to sync.
This is likely to be more helpful for a user trying to diagnose the cause of the failure than the former 'loose object file' string.
It also sidesteps any concerns about translating the die
message differently for loose objects versus something else that has a real path.
No more:
fatal: fsync error on 'loose object file': Bad file descriptor
But (Git 2.36+, Q2 2022):
fatal: fsync error on 'path/to/file': Bad file descriptor
Plus, you can use trace2 to add more details, since Git 2.36 (Q2 2022): trace2
code has been taught to report stats for fsync
operations.
See commit 9a49876 (30 Mar 2022) by Neeraj Singh (neerajsi-msft
).
(Merged by Junio C Hamano -- gitster
-- in commit fe496dc, 04 Apr 2022)
trace2
: add stats for fsync
operations
Signed-off-by: Neeraj Singh
Add some global trace2 statistics for the number of fsyncs
performed during the lifetime of a Git process.
These stats are printed as part of trace2_cmd_exit_fl,
which is presumably where we might want to print any other cross-cutting statistics.
You would see:
fsync the_repository fsync/writeout-only count_fsync_writeout_only
fsync the_repository fsync/hardware-flush count_fsync_hardware_flush
Git 2.38 (Q3 2022) now omits fsync-related trace2 entries when their values are all zero.
See commit 3a251ba (18 Jul 2022) by Ævar Arnfjörð Bjarmason (avar
).
(Merged by Junio C Hamano -- gitster
-- in commit 7c7719a, 27 Jul 2022)
trace2
: only include "fsync
" events if we git_fsync()
Signed-off-by: Ævar Arnfjörð Bjarmason
Fix the overly verbose trace2 logging added in 9a49876 ("trace2
: add stats for fsync operations", 2022-03-30, Git v2.36.0-rc0 -- merge) (first released with v2.36.0).
Since that change every single "git
" command invocation has included these "data" events, even though we'll only make use of these with core.fsyncMethod=batch
, and even then only have non-zero values if we're writing object data to disk.
See c0f4752 ("core.fsyncmethod
: batched disk flushes for loose-objects", 2022-04-04, Git v2.37.0-rc0 -- merge listed in batch #7) for that feature.
As we're needing to indent the trace2_data_intmax()
lines let's introduce helper variables to ensure that our resulting lines (which were already too) don't exceed the recommendations of the CodingGuidelines.
Doing that requires either wrapping them twice, or introducing short throwaway variable names, let's do the latter.
The result was that e.g. "git version
"(man) would previously emit a total of 6 trace2 events with the GIT_TRACE2_EVENT
target (version, start, cmd_ancestry
, cmd_name
, exit, atexit), but afterwards would emit 8. We'd emit 2 "data" events before the "exit" event.
The reason we didn't catch this was that the trace2 unit tests added in a15860d (trace2
: t/helper/test-trace2, 2019-02-22, Git v2.22.0-rc0 -- merge listed in batch #2) (trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh, 2019-02-22) would omit any "data" events that weren't the ones it cared about.
Let's make the trace2 testing more strict, and further append any new events types we don't know about in "t/t0212/parse_events.perl
".
Since we only invoke the "test-tool trace2
" there's no guarantee that we'll catch other overly verbose events in the future, but we'll at least notice if we start emitting new events that are issues every time we log anything with trace2's JSON target.
We exclude the "data_json
" event type, we'd otherwise would fail on both "win test" and "win+VS test" CI due to the logging added in 353d3d7 ("trace2
: collect Windows-specific process information", 2019-02-22, Git v2.22.0-rc0 -- merge listed in batch #2).
It looks like that logging should really be using trace2_cmd_ancestry()
instead, which was introduced later in 2f732bf ("tr2
: log parent process name", 2021-07-21, Git v2.34.0-rc0 -- merge listed in batch #1), but let's leave it for now.
The fix-up to aaf8122 ("unpack-objects
: use stream_loose_object()
to unpack large objects", 2022-06-11, Git v2.38.0 -- merge listed in batch #3) is needed because we're changing the behavior of these events as discussed above.
Since we'd always emit a "hardware-flush" event the test added in aaf8122 wasn't testing anything except that this trace2 data was unconditionally logged.
Even if "core.fsyncMethod
" wasn't set to "batch" we'd pass the test.
Now we'll check the expected number of "writeout" v.s.
"flush" calls under "core.fsyncMethod=batch
", but note that this doesn't actually test if we carried out the sync using that method, on a platform where we'd have to fall back to fsync()
each of those "writeout" would really be a "flush" (i.e. a full fsync()
).
But in this case what we're testing is that the logic in "unpack-objects" behaves as expected, not the OS-specific question of whether we actually were able to use the "bulk" method.