What is meant by "flushing the buffer" here?
std::endl writes a newline and then causes the data in the stream's internal staging memory (its "buffer") to be "flushed" (transferred) to the operating system. The subsequent behavior depends on what type of device the stream is mapped to, but in general, flushing will give the appearance that the data has been physically transferred to the associated device. A sudden loss of power, however, might defeat the illusion.
This flushing involves some overhead (wasted time), and should therefore be minimized when execution speed is an important concern. Minimizing the overall impact of this overhead is the fundamental purpose of data buffering, but this goal can be defeated by excessive flushing.
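To make the cost visible, here is a minimal sketch that counts how often a stream's buffer is asked to flush. CountingBuf and the two helper functions are names invented for this illustration; the sketch relies only on the fact that std::ostream::flush() ends up calling the buffer's virtual sync().

```cpp
#include <ostream>
#include <sstream>

// Hypothetical helper: an in-memory buffer that counts flush requests.
// std::ostream::flush() calls rdbuf()->pubsync(), which calls sync().
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;
protected:
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Ends every line with std::endl: one flush per line.
int flushes_with_endl(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "line " << i << std::endl;
    return buf.flushes;
}

// Ends every line with '\n' and flushes once at the end.
int flushes_with_newline(int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "line " << i << '\n';
    out.flush();
    return buf.flushes;
}
```

With a real file or terminal behind the stream, each of those flushes would typically turn into at least one system call, which is where the overhead comes from.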
Background information
The I/O of a computing system is typically very sophisticated and composed of multiple abstraction layers. Each such layer may introduce a certain amount of overhead. Data buffering is a way of reducing this overhead by minimizing the number of individual transactions performed between two layers of the system.
CPU/memory system-level buffering (caching): For very high activity, even the random-access-memory system of a computer can become a bottleneck. To address this, the CPU virtualizes memory accesses by providing multiple layers of hidden caches (the individual buffers of which are called cache lines). These processor caches buffer your algorithm's memory writes (pursuant to a writing policy) in order to minimize redundant accesses on the memory bus.
Application-level buffering: Although it isn't always necessary, it is not uncommon for an application to allocate chunks of memory to accumulate output data before passing it to the I/O library. This provides the fundamental benefit of allowing for random accesses (if necessary), but a significant reason for doing this is that it minimizes the overhead associated with making library calls -- which may be substantially more time-consuming than simply writing to a memory array.
I/O library buffering: The C++ I/O stream library optionally manages a buffer for every open stream. This buffer is used, in particular, to limit the number of system calls to the operating system kernel because such calls tend to have some non-trivial overhead. This is the buffer which is flushed when using std::endl.
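One way to observe the library-level buffer is to write to a file and check how many bytes a second reader can see before and after flush(). The sketch below uses invented names (visible_bytes, Visibility, observe_flush). How much is visible before the flush depends on the implementation and its buffer size, so only the "after" value is something to rely on.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: number of bytes a separate reader sees in the file.
std::size_t visible_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::string contents((std::istreambuf_iterator<char>(in)),
                         std::istreambuf_iterator<char>());
    return contents.size();
}

struct Visibility {
    std::size_t before_flush;
    std::size_t after_flush;
};

// Writes `text` to `path` and samples the file size around flush().
Visibility observe_flush(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out << text;  // short text usually stays in the stream's own buffer
    Visibility v;
    v.before_flush = visible_bytes(path);  // implementation-dependent, often 0
    out.flush();                           // hand the data to the operating system
    v.after_flush = visible_bytes(path);   // now the full text is visible
    return v;
}
```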
Operating system kernel and device drivers: The operating system routes the data to a specific device driver (or subsystem) based on what output device the stream is attached to. At this point, the actual behavior may vary widely depending on the nature and characteristics of that type of device. For example, when the device is a hard disk, the device driver might not initiate an immediate transfer to the device, but rather maintain its own buffer in order to further minimize redundant operations (since disks, too, are most efficiently written to in chunks). In order to explicitly flush kernel-level buffers, it may be necessary to call a system-level function such as fsync() on Linux -- even closing the associated stream does not necessarily force such a flush.
Example output devices might include...
- a terminal on the local machine
- a terminal on a remote machine (via SSH or similar)
- data being sent to another application via pipes or sockets
- many variations of mass-storage devices and associated file-systems, which may be (again) locally attached or distributed via a network
Hardware buffers: Specific hardware may contain its own memory buffers. Hard drives, for example, typically contain a disk buffer in order to (among other things) allow the physical writes to occur without requiring the system's CPU to be engaged in the entire process.
Under many circumstances, these various buffering layers tend to be (to a certain extent) redundant -- and therefore essentially overkill. However, the buffering at each layer can provide a tremendous gain in throughput if the other layers, for whatever reason, fail to deliver optimum buffering with respect to the overhead associated with each layer.
Long story short, std::endl only addresses the buffer which is managed by the C++ I/O stream library for that particular stream. After using std::endl, the data will have been moved to kernel-level management, and what happens next with the data depends on a great many factors.
How to avoid the overhead of std::endl
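    #include <ostream>

    extern bool debug_mode; // supply 'debug_mode' however you want

    // Writes a newline like std::endl, but flushes only when debug_mode is set.
    inline std::ostream & endl( std::ostream & os )
    {
        os.put( os.widen('\n') ); // http://en.cppreference.com/w/cpp/io/manip/endl
        if ( debug_mode ) os.flush();
        return os;
    }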
In this example, you provide a custom endl which can be called with or without invoking the internal call to flush() (which is what forces the transfer to the operating system). Enabling the flush (with the debug_mode variable) is useful for debugging scenarios where you want to be able to examine the output (for example, a disk file) when the program has terminated before cleanly closing the associated streams (which would have forced a final flush of the buffer).
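One possible way to wire this up is sketched below. The namespace app, the FlushCounter class, and count_flushes are all invented for the illustration. Putting the custom endl in its own namespace is a design choice that avoids an ambiguity with std::endl in code that has a using-directive for namespace std.

```cpp
#include <ostream>
#include <sstream>

namespace app {  // hypothetical namespace, chosen to avoid clashing with std::endl

bool debug_mode = false;  // one simple way to supply the flag

// Same idea as the custom endl above: newline always, flush only when debugging.
inline std::ostream& endl(std::ostream& os) {
    os.put(os.widen('\n'));
    if (debug_mode) os.flush();
    return os;
}

}  // namespace app

// Hypothetical helper: in-memory buffer that counts flush requests.
class FlushCounter : public std::stringbuf {
public:
    int flushes = 0;
protected:
    int sync() override {
        ++flushes;
        return std::stringbuf::sync();
    }
};

// Writes `lines` lines with app::endl and reports how many flushes occurred.
int count_flushes(bool debug, int lines) {
    app::debug_mode = debug;
    FlushCounter buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) out << "line " << i << app::endl;
    return buf.flushes;
}
```

With the flag off, the stream is left to flush on its own schedule (or when it is closed); with the flag on, the behavior matches std::endl.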