The accepted answer is a good one. Here I'd like to illustrate it by giving concrete examples from the field of data compression.
LZW, the "GIF Patents"
In the late 1970s, two fundamental dictionary-compression algorithms were published by Lempel and Ziv, known as LZ77 and LZ78 for short. Many modern data-compression algorithms can be described as being of the LZ77 or LZ78 family, though they invariably include significant enhancements over the originals, particularly in the use of an entropy coder.
One LZ78-family algorithm that became very popular in the 1980s was LZW, named after its progenitor algorithm and its own author (Welch), hence Lempel-Ziv-Welch. LZW was a very straightforward development of LZ78 which applied it to an 8-bit byte stream and a 4096-entry dictionary that could conveniently fit in the typical 64KB address space of the time. Most LZW implementations store the dictionary codes using a variable bit width, increasing as the dictionary fills up.
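To give a flavour of how simple the core algorithm is, here is a minimal sketch of an LZW encoder in Python (my illustration, not any particular historical implementation). It returns plain integer codes and omits the details a real GIF or compress encoder needs, such as the variable-width bit packing and the clear/end-of-information codes.

```python
def lzw_encode(data: bytes):
    # Dictionary starts with all 256 single-byte strings (codes 0-255).
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    w = b""
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                        # keep extending the current match
        else:
            codes.append(dictionary[w])   # emit the longest known prefix
            if next_code < 4096:          # 12-bit cap, as in GIF and compress
                dictionary[wc] = next_code
                next_code += 1
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])       # flush the final match
    return codes

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
```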
The two most famous uses of LZW are in the UNIX compress utility, and in the GIF image format which remains popular today. Early versions of the PKZIP utility also used it (as the "shrinking" method). By modern standards it isn't a very good compression algorithm, due mainly to its small dictionary and lack of further entropy coding. LZ78-based algorithms are now thought to converge on their optimal compression rate more slowly than equivalent LZ77 implementations.
Both LZ78 and LZW are quite simple in principle. Nevertheless, LZW was covered by no fewer than three US patents, plus corresponding patents filed internationally. Two of the US patents were assigned to Unisys, and the third to IBM.
The LZW algorithm was widely published outside the patent ecosystem, which led to its initial popularity. Software engineers usually don't read patents, and typically have trouble understanding patent language well enough to accurately implement the algorithms it describes on a practical computer. It's normally easier to understand an academic paper, or even a source-code listing, and those were the usual means of disseminating algorithms at the time. (To some extent, they still are.)
In the 1990s, Unisys expressed its intention to enforce the LZW patents it held. This would have required implementors of the algorithm to pay licence fees - even if they had never read or even heard of the patent, and were now responsible for the millions upon millions of copies of LZW code already running on computers around the world.
It was soon realised that the patents essentially covered the encoding algorithm, not the decoder, so people who never compressed data using LZW, but only decompressed existing data, were relatively safe. A workaround was discovered whereby an LZW-compatible file could be created without infringing the patents, but this produced compression equivalent to a simple RLE scheme - it was therefore useful only for making (big, inefficient) GIFs without infringing the patents.
Concern over the legality of using LZW led to a search for patent-free alternatives, and more-or-less directly to the development of the "deflate" algorithm, which is a straightforward combination of the LZ77 dictionary compressor (allowing dictionary windows of up to 32KB) with the 1950s Huffman entropy coder.
PKZIP version 2 dropped LZW support in favour of the measurably superior Deflate. Several other compression utilities, such as StuffIt on the Mac, followed suit. The GNU project introduced gzip as a direct replacement for UNIX compress, and zlib as an easy way to use Deflate in other applications and formats.
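To show just how accessible Deflate became through zlib, here is a short sketch using Python's standard zlib module (which wraps the zlib library); the sample data is arbitrary, chosen only to give the compressor something repetitive to work with.

```python
import zlib

# Repetitive sample data - Deflate's LZ77 stage finds the repeats,
# and its Huffman stage then entropy-codes the result.
data = b"the quick brown fox jumps over the lazy dog " * 200

compressed = zlib.compress(data, level=9)   # zlib-wrapped Deflate stream
restored = zlib.decompress(compressed)

assert restored == data
print(f"{len(data)} bytes -> {len(compressed)} bytes")
```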
The PNG image format then aimed to replace GIF with the twin advantages of better compression (via zlib) and support for more than 256 colours - though it lost animation support, which proved to be a significant oversight.
Practically the only use for the GIF format today is its animation feature - and now that the LZW patents have expired, its popularity has resurged substantially. It must be noted however that GIF animation has nothing whatsoever to do with the LZW algorithm.
In this case, the patent-free alternative was found relatively easily and proved to be objectively superior, making migration away from the patented algorithm a relatively easy choice. Not every case is so fortunate.
Arithmetic Coding
The earliest entropy coder to be proved "optimal" was the Huffman coder of 1951, which itself was an improvement over the very similar Shannon-Fano coder. It essentially builds a binary tree over the symbol dictionary, balanced by symbol frequency, and stores one bit for each branch decision on the tree. Practical computing was still very much in its infancy at the time; if any patents were ever filed on the original Huffman algorithm (which I doubt), they have long since expired, making it legally safe to use.
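As an illustration of the idea (a sketch of the textbook construction, not any particular format's code table), the following Python builds the tree from symbol frequencies with a priority queue and then reads off one bit per branch:

```python
import heapq
from collections import Counter

def huffman_code(data: bytes) -> dict:
    """Build a Huffman code table mapping byte value -> bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    # Heap entries are (weight, tie-break, tree); a tree is either a byte
    # value (leaf) or a (left, right) tuple (internal node).
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate case: one distinct symbol
        return {heap[0][2]: "0"}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)     # the two lightest subtrees...
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tie, (t1, t2)))   # ...are merged
        tie += 1
    codes = {}
    def walk(tree, prefix=""):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix            # one bit per branch on the path
    walk(heap[0][2])
    return codes

# Frequent byte values end up with the shortest bit strings.
print(huffman_code(b"abracadabra"))
```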
However, it is rare for the branches of a Huffman tree to be perfectly balanced; this requires that the symbol frequencies are all powers of two. So in general, it is possible to obtain further compression by arithmetic coding, rather than being constrained by bit boundaries. Essentially, a number with arbitrary precision is constructed, in a number space that is repeatedly subdivided in arbitrarily lopsided ratios according to the relative symbol frequencies. Very frequent symbols are assigned large parts of the number space, and can thereby consume less than a bit per symbol in extreme cases. This is inherently superior to Huffman coding, which must use at least one bit per symbol.
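To make the interval-subdivision idea concrete, here is a deliberately naive sketch using exact fractions. A practical arithmetic coder instead uses fixed-precision integers, renormalises and emits bits as it goes, and needs an end-of-stream convention so the decoder knows when to stop; all of that is omitted here.

```python
from fractions import Fraction

def arithmetic_encode(symbols: bytes, freqs: dict) -> Fraction:
    """Toy encoder: returns a single number inside the final sub-interval."""
    total = sum(freqs.values())
    # Cumulative frequency table: symbol -> (low_count, high_count).
    cum, running = {}, 0
    for sym, f in freqs.items():
        cum[sym] = (running, running + f)
        running += f

    low, width = Fraction(0), Fraction(1)
    for sym in symbols:
        lo, hi = cum[sym]
        low += width * Fraction(lo, total)      # move into the symbol's slice...
        width *= Fraction(hi - lo, total)       # ...and shrink to its width
    return low + width / 2                      # any value in [low, low + width) will do

print(arithmetic_encode(b"aaab", {ord("a"): 3, ord("b"): 1}))
```

A symbol that occupies, say, 90% of the number space shrinks the interval by a factor of only 10/9 each time it occurs, i.e. it costs about 0.15 bits - exactly the kind of fractional-bit saving that Huffman coding cannot achieve.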
However, after arithmetic coding was developed in the 1970s, it was patented multiple times over a period of about 15 years. Most of the patents cover methods of implementing the algorithm efficiently for a particular type of data or on a particular class of computing hardware, but between them they covered most of the reasonable options (though these patents, like the LZW patents, have now expired). It was widely believed that a closely-related algorithm called range coding was patent-free, even though it is mathematically equivalent to arithmetic coding.
The arithmetic-coding patents had direct, detrimental effects on at least two compression formats: bzip/bzip2 and JPEG.
The bzip compression utility was an attempt to improve on gzip's compression ratio by using more modern techniques, specifically the Burrows-Wheeler Transform (BWT) and arithmetic coding. Although the experiment was successful, the author realised that his software could not legally be published as Free Software in the US due to the patents, so he wrote bzip2 with the arithmetic coding replaced by Huffman coding. This was still measurably better than gzip, so bzip2 became popular in the Free Software community for a while (until even more advanced utilities became available). The original bzip software was never widely published, but a decompress-only utility is available in case any files compressed by it come to light.
The JPEG standard of 1990 is extremely widely used; practically every digital camera, image editor and Web browser supports it today. However, almost without exception, it is the patent-free variant of the format using Huffman coding that is used. The standard also defines a variant using arithmetic coding, but due to the patents it is rarely used and isn't even supported by most software. Several of the later arithmetic-coding patents are, in fact, specifically for JPEG-compliant arithmetic coders, sharply increasing the probability that an independent implementation of the JPEG standard might accidentally infringe.
A few archive compression utilities, including some versions of StuffIt, are capable of recognising JPEG files and handling them specially, essentially converting them to the arithmetic-coded variant in a losslessly reversible manner. This typically saves about 25% space, much better than applying a general-purpose compression algorithm to the original file. Without the arithmetic-coding patents stifling the more general use of the more efficient variant, JPEG files would mostly be this much smaller from the outset.
Now that the arithmetic-coding patents have mostly expired, use of this entropy-coding method is starting to increase. Several recent video compression standards rely on it heavily; notably both H.264 and H.265 use an arithmetic-coding algorithm called CABAC. H.264 also supports the less-advanced CAVLC, which does not use arithmetic coding, for the Baseline profile.
This is a case where the patented algorithm was significantly superior to the available patent-free alternatives, resulting in an observable and verifiable reduction in the software capabilities available to end users over a multi-decade period.