Artificial intelligence and copyright
In the 2020s, the rapid increase in the capabilities of deep learning-based generative artificial intelligence models, including text-to-image models such as Stable Diffusion and large language models such as ChatGPT, has raised questions about how copyright law applies to the training and use of such models. Because there is limited existing case law, experts consider this area to be fraught with uncertainty.
The central issue is whether infringement occurs when a generative AI model is trained or used. Popular deep learning models are generally trained on very large datasets of media scraped from the Internet, much of which is copyrighted. Since the process of assembling training data involves making copies of copyrighted works, it may violate the copyright holder's exclusive right to control the reproduction of their work, unless the use is covered by exceptions under a given jurisdiction's copyright statute. Additionally, the use of a model's outputs could be infringing, and the model creator may be held vicariously liable for that infringement. As of 2023, there are a number of pending US lawsuits challenging the use of copyrighted data to train AI models, with defendants arguing that this falls under fair use.
Another issue is that, in jurisdictions such as the US, output generated solely by a machine is ineligible for copyright protection, as most jurisdictions protect only "original" works having a human author. However, some have argued that the operator of an AI model may qualify for copyright protection if they exercise sufficient originality in their use of the model.