OpenAI claims it is impossible to train its artificial intelligence (AI) models without using copyrighted work. The company has therefore approached the British Parliament to ask for permission to use copyrighted materials to train its AI, as reported by Futurism.
“Because copyright today covers virtually every sort of human expression — including blog posts, photographs, forum posts, scraps of software code, and government documents — it would be impossible to train today’s leading AI models without using copyrighted materials,” OpenAI said in its evidence filing, as first reported by The Telegraph.
For chatbots to understand human language, developers have to train them on vast amounts of data, usually taken from the internet. OpenAI’s ChatGPT is no exception. However, the majority of data on the internet is copyrighted. Copyright lets creators, including writers, protect their works so that they cannot be used without permission.
OpenAI said that while it is possible to train ChatGPT on public domain books alone, doing so would significantly reduce the chatbot’s capability, as reported by The Guardian. “Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens,” the company said.
However, if OpenAI continues to use copyrighted materials to train its AI without the creators’ permission, it could face legal consequences. Just a few weeks ago, the New York Times sued OpenAI and its largest investor, Microsoft, alleging that both companies profit from using the paper’s intellectual property.
This was not the first time OpenAI had been taken to court over allegations of profiting from copyrighted materials. A few months earlier, OpenAI was sued by the Authors Guild, which represented several big-name authors, including George R. R. Martin, Jonathan Franzen, and David Baldacci. Like the New York Times, the Authors Guild objected to OpenAI’s use of the authors’ copyrighted works to train its chatbot.