TL;DR: Yes, but. More but than yes.
First things first. Since the standard C library must itself automatically garbage collect open file handles in the exit() function (see standard quotes below), it is not necessary to ever call fclose as long as:
1. You are absolutely certain that your program will eventually terminate either by returning from main() or by calling exit().
2. You don't care how much time elapses before the file is closed (making data written to the file available to other processes).
3. You don't need to be informed if the close operation failed (perhaps because of disk failure).
4. Your process will not open more than FOPEN_MAX files, and will not attempt to open the same file twice. (FOPEN_MAX must be at least eight, but that includes the three standard streams.)
Of course, aside from very simple toy applications, those guarantees are pretty restrictive, particularly for files opened for writing. For a start, how are you going to guarantee that the host does not crash or get powered down (voiding condition 1)? So most programmers regard it as very bad style to not close all open files.
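The third condition (being told when the close fails) is the easiest one to overlook. Here is a minimal sketch of what an explicit close buys you; the helper name save_text is made up for illustration. Buffered data may only reach the file when the stream is closed, so a write error can show up for the first time in the return value of fclose():

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write text to path and report whether everything,
 * including the final flush performed by fclose(), succeeded.
 * Returns 0 on success and -1 on any failure. */
int save_text(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int failed = (fputs(text, fp) == EOF);

    /* fclose() flushes the buffer and returns EOF if the flush or the
     * close itself fails. Leaving the close to exit() would discard
     * this information. */
    if (fclose(fp) == EOF)
        failed = 1;

    return failed ? -1 : 0;
}
```

If the stream were simply left for exit() to close, the program would have no opportunity to notice or react to that failure.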
All the same, it is possible to imagine an application which only opens files for reading. In that case, the most serious issue with never calling fclose will be the last one, the simultaneous open file limit. Five (the guaranteed minimum of eight, less the three standard streams) is a pretty small number, and even though most systems have much higher limits, they almost all have limits; if an application runs long enough, it will inevitably open too many files. (The other half of condition 4, not opening the same file twice, might be a problem too, although not all operating systems impose this restriction, and few systems impose it on files opened only for reading.)
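To see the limit on a particular system, a small experiment along the following lines may help. It is only a sketch: the function name count_openable_streams and the cap parameter are invented here, and the number it reports depends entirely on the platform and its configuration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical experiment: open `path` for reading over and over without
 * closing it, which is what a program that never calls fclose()
 * effectively does. Returns how many streams were open when fopen()
 * first failed (or `cap` if it never failed), or -1 if the bookkeeping
 * array could not be allocated. */
long count_openable_streams(const char *path, long cap)
{
    assert(path != NULL && cap > 0);

    FILE **open_streams = malloc((size_t)cap * sizeof *open_streams);
    if (open_streams == NULL)
        return -1;

    long n = 0;
    while (n < cap) {
        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            break;              /* the limit, or some other failure */
        open_streams[n++] = fp;
    }

    /* Clean up so the caller gets its stream slots back. */
    for (long i = 0; i < n; i++)
        fclose(open_streams[i]);
    free(open_streams);

    return n;
}
```

Comparing the result with FOPEN_MAX from <stdio.h> shows how much headroom the system gives beyond the standard's guarantee; whatever the number is, a long-running program that leaks streams will reach it eventually.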
As it happens, these are precisely the issues that garbage collection can, in theory, help solve. With a bit of work, it is possible to get a garbage collector to help manage the number of simultaneously open files. But... as mentioned, there are a number of Buts. Here are a few:
1. The standard library is under no obligation to dynamically allocate FILE objects using malloc, or indeed to dynamically allocate them at all. (A library which only allowed eight open files might have an internal statically allocated array of eight FILE structures, for example.) So the garbage collector might never see the storage allocations. In order to involve the garbage collector in the removal of FILE objects, every FILE* needs to be wrapped inside a dynamically-allocated proxy (a "handle"), and every interface which takes or returns FILE* pointers must be wrapped with one which accepts or creates a proxy. That's not too much work, but there are a lot of interfaces to wrap and the use of the wrappers basically relies on source modification; you might find it difficult to introduce FILE* proxies if some files are opened by external library functions.
2. Although the garbage collector can be told what to do before it deletes certain objects (see below), most garbage collector libraries have no interface which provides for an object creation limit other than the availability of memory. The garbage collector can only solve the "too many open files" problem if it knows how many files are allowed to be open simultaneously, but it doesn't know and it doesn't have a way for you to tell it. So you have to arrange for the garbage collector to be called manually when this limit is about to be breached. Of course, since you are already wrapping all calls to fopen, as per point 1, you can add this logic to your wrapper, either by tracking the open file count, or by reacting to an error indication from fopen(). (The C standard doesn't specify a portable mechanism for detecting this particular error, but POSIX says that fopen should fail and set errno to EMFILE if the process has too many files open. POSIX also defines the ENFILE error value for the case where there are too many files open in total over all processes; it's probably worthwhile to consider both of these cases.)
3. In addition, the garbage collector doesn't have a mechanism to limit garbage collection to a single resource type. (It would be very difficult to implement this in a mark-sweep garbage collector, such as the BDW collector, because all used memory needs to be scanned to find live pointers.) So triggering garbage collection whenever all file descriptor slots are used up could turn out to be quite expensive.
4. Finally, the garbage collector does not guarantee that garbage will be collected in a timely manner. If there is no resource pressure, the garbage collector could stay dormant for a long time, and if you are relying on the garbage collector to close your files, that means that the files could remain open for an unlimited amount of time even though they are no longer in use. So the first two conditions in the original list of requirements for omitting fclose() continue to be in force, even with a garbage collector.
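Reacting to the error indication from fopen() (the second option in point 2) could look roughly like this on a POSIX system. The names is_file_exhaustion and fopen_report are hypothetical, and the sketch assumes fopen() sets errno on failure, which POSIX specifies but ISO C does not:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: does this errno value mean "out of file slots"?
 * EMFILE: this process has too many files open.
 * ENFILE: the system as a whole has too many files open. */
int is_file_exhaustion(int err)
{
    return err == EMFILE || err == ENFILE;
}

/* Hypothetical wrapper: behaves like fopen(), and additionally reports
 * through *exhausted whether a failure looks like resource exhaustion,
 * i.e. whether retrying after a garbage collection could help. */
FILE *fopen_report(const char *path, const char *mode, int *exhausted)
{
    assert(path != NULL && mode != NULL && exhausted != NULL);

    errno = 0;                      /* so a stale value is not misread */
    FILE *fp = fopen(path, mode);
    *exhausted = (fp == NULL) && is_file_exhaustion(errno);
    return fp;
}
```

Any other failure (a missing file, a permissions problem) leaves *exhausted at 0, so the caller does not pay for a full collection that could not possibly help.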
So. Yes, but, but, but, but. Here's what the Boehm GC documentation recommends (abbreviated):
- Actions that must be executed promptly… should be handled by explicit calls in the code.
- Scarce system resources should be managed explicitly whenever convenient. Use [garbage collection] only as a backup mechanism for the cases that would be hard to handle explicitly.
- If scarce resources are managed with [the garbage collector], the allocation routine for that resource (e.g. open file handles) should force a garbage collection (two if that doesn't suffice) if it finds itself short of the resource.
- If extremely scarce resources are managed (e.g. file descriptors on systems which have a limit of 20 open files), it may be necessary to introduce a descriptor caching scheme to hide the resource limit.
Now, suppose you've read all of that, and you still want to do it. It's actually pretty simple. As mentioned above, you need to define a proxy object, or handle, which holds a FILE*. (If you are using POSIX interfaces like open() which use file descriptors -- small integers -- instead of FILE structures, then the handle holds the fd. This is a different object type, obviously, but the mechanism is identical.)
In your wrapper for fopen() (or open(), or any of the other calls which return open FILE*s or files), you dynamically allocate a handle, and then (in the case of the Boehm GC) call GC_register_finalizer to tell the garbage collector what function to call when the resource is about to be deleted. Almost all GC libraries have some such facility; search for finalizer in their documentation. Here's the documentation for the Boehm collector, out of which I extracted the list of warnings above.
Watch out to avoid race conditions when you are wrapping the open call. The recommended practice is as follows:
- Dynamically allocate the handle.
- Initialize its contents to a sentinel value (such as -1 or NULL) which indicates that the handle has not yet been assigned to an open file.
- Register a finalizer for the handle. The finalizer function should check for the sentinel value before attempting to call fclose(), so registering the handle at this point is fine.
- Open the file (or other such resource).
- If the open succeeds, reset the handle to use the value returned from the open. If it fails because of resource exhaustion, trigger a manual garbage collection and try again, repeating as necessary. (Be careful to limit the number of times you do that for a single open wrapper. Sometimes you need to do it twice, but three consecutive failures probably indicates some other kind of problem.)
- If the open eventually succeeded, return the handle. Otherwise, optionally deregister the finalizer (if your GC library allows that) and return an error indication.
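The steps above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: file_handle, gc_fopen, file_handle_finalize, the HANDLE_* macros and the USE_BOEHM_GC build switch are all names invented here. Only GC_MALLOC, GC_register_finalizer, GC_gcollect and GC_invoke_finalizers come from the Boehm collector's gc.h. Without the switch the macros fall back to plain malloc() stand-ins, so the control flow can be compiled and exercised without the collector installed (but nothing is then closed automatically).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct file_handle {
    FILE *fp;                   /* NULL is the "not open" sentinel */
} file_handle;

/* Finalizer; the signature matches the collector's GC_finalization_proc. */
void file_handle_finalize(void *obj, void *client_data)
{
    file_handle *h = obj;
    (void)client_data;
    if (h->fp != NULL) {        /* check the sentinel first */
        fclose(h->fp);          /* a failure here cannot be reported */
        h->fp = NULL;
    }
}

#ifdef USE_BOEHM_GC             /* hypothetical build switch; link with -lgc */
#include <gc.h>
#define HANDLE_ALLOC(n)    GC_MALLOC(n)
#define HANDLE_REGISTER(h) \
    GC_register_finalizer((h), file_handle_finalize, NULL, NULL, NULL)
/* Collect, then run any finalizers that are still pending. */
#define HANDLE_COLLECT()   (GC_gcollect(), (void)GC_invoke_finalizers())
#define HANDLE_DISCARD(h)  ((void)(h))  /* sentinel makes the finalizer a no-op */
#else                           /* stand-ins: nothing is ever collected */
#define HANDLE_ALLOC(n)    malloc(n)
#define HANDLE_REGISTER(h) ((void)(h))
#define HANDLE_COLLECT()   ((void)0)
#define HANDLE_DISCARD(h)  free(h)
#endif

#define MAX_COLLECTIONS 2       /* "two if that doesn't suffice" */

file_handle *gc_fopen(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    file_handle *h = HANDLE_ALLOC(sizeof *h);   /* allocate the handle */
    if (h == NULL)
        return NULL;
    h->fp = NULL;                               /* set the sentinel */
    HANDLE_REGISTER(h);                         /* register the finalizer */

    for (int collections = 0; ; collections++) {
        errno = 0;
        h->fp = fopen(path, mode);              /* open the file */
        if (h->fp != NULL)
            return h;                           /* success: handle is live */
        /* Retry only for exhaustion (POSIX errno values), and only a
         * bounded number of times. */
        if ((errno != EMFILE && errno != ENFILE) ||
            collections == MAX_COLLECTIONS)
            break;
        HANDLE_COLLECT();
    }

    HANDLE_DISCARD(h);
    return NULL;                                /* error indication */
}
```

When building against the collector, the usual advice is to call GC_INIT() at the start of main() before the first allocation. Callers then use h->fp with the ordinary stdio functions, and can still close promptly by calling file_handle_finalize(h, NULL) themselves; the sentinel check makes a later finalizer run harmless.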
Obligatory C standard quotes
Returning from main() is the same as calling exit()
§5.1.2.2.3 (Program termination): (Only applies to hosted implementations)
- If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0.
Calling exit() flushes all file buffers and closes all open files
§7.22.4.4 (The exit function):
- Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed…