Artificial intelligence, such as ChatGPT and GPT-4, is
challenging the established state of the law. There are currently
multiple ongoing lawsuits which should interest intellectual
property lawyers.
Do you remember The Next Rembrandt from 2016? It was a project in
which an algorithm, through machine learning, analysed the works of
the Dutch painter Rembrandt and then produced its own
interpretation, a new “Rembrandt” work. As Rembrandt died in 1669,
the copyright protection for his works has long since expired. This
means that the works may be used by others without consent or
remuneration. One of the legal questions that the Rembrandt project
triggered was whether the created works were protected by copyright
and, if so, who would hold the rights to works created by
artificial intelligence (AI). Another question is whether the
algorithm could have used Rembrandt’s paintings if the works had
still been protected by copyright.
Since 2016, the Rembrandt project has been followed by a number
of new and more complex generative AI applications. DALL-E 2,
Stable Diffusion, Midjourney and Imagen are all examples of AI that
can generate pictures based on text descriptions. In this article,
we will take a closer look at whether works protected by copyright
can legally be used to train such AI applications.
Crucial AI training
An AI’s success will to a high degree depend on the training
process. An AI must be trained before it is capable of executing
tasks correctly. AI is often trained by using organized collections
of data referred to as datasets. The size of the dataset depends on
whether the AI has been pre-trained or whether it must be trained
from scratch. Datasets will typically encompass a training set used
for training the AI, and a test set, which includes data that the
AI has not seen before, to test whether the AI has learned anything
or only memorised the data from the training set.
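The division into a training set and a test set can be illustrated
with a minimal Python sketch. The function name, the 20 percent test
share and the fixed random seed are assumptions chosen for
illustration; they are not taken from any particular AI developer's
practice.

```python
import random


def split_dataset(items, test_fraction=0.2, seed=0):
    """Split items into a training set and a held-out test set.

    test_fraction and seed are illustrative defaults, not a standard.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The first n_test items are held back; the AI never trains on them.
    return shuffled[n_test:], shuffled[:n_test]


training_set, test_set = split_dataset(range(10))
```

Because the two sets do not overlap, good results on the test set
suggest that the AI has learned something general rather than merely
memorised the training data.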
There are a number of available datasets online. As an example,
Stable Diffusion is trained by using datasets provided by the
organisation LAION,
which consist of up to 5.85 billion text-image pairs. Explained in
simple terms, LAION offers lists of URLs to original pictures
online. Hence, the datasets do not contain actual pictures;
the pictures must be downloaded from the internet by those using
the datasets. LAION’s datasets are created by
“scraping” hundreds of domains on the internet.
Infringement
Those who hold the copyright to a protected work, such as
a picture, will generally have the exclusive right to make the
picture available to the public and to make reproductions of the
picture, regardless of the means and form, and regardless of whether
the reproduction is permanent or temporary.
When the AI is trained by using substantial amounts of data from
datasets, reproductions of the content, such as pictures, will
typically be saved in the machine’s memory. In this regard, one
could argue that AI training infringes the exclusive right to
reproduce copyright-protected works.
Given that the analysis and use of substantial amounts of data,
including copyright-protected data, is necessary in a number of
important areas of society, the EU explicitly adopted exceptions
for so-called text and data mining in Directive 2019/790
(DSM Directive) to ensure that such activity is
not restricted by copyright. Text and data mining
(TDM) generally refers to machine-based analysis
of large amounts of data in order to obtain knowledge. It is
assumed that training of AI in most cases will fit the definition
of TDM in the DSM Directive.
The TDM exceptions in the DSM Directive are found in
Articles 3 and 4. While Article 3 allows TDM by, among others,
research organizations for the purpose of scientific
research, Article 4 allows TDM for all purposes, regardless of
whether the motive is commercial or not. For that reason, Article 4
has been debated. Although TDM can be seen as a prerequisite for
the development of AI such as DALL-E 2, Stable Diffusion and
ChatGPT, it is disputed whether TDM for commercial purposes should
be exempt from copyright protection.
Exceptions should be seen in the context of regulations in the USA
A closer examination of Article 4 shows, however, that the
exception provides significantly less room for TDM than it first
appears. The provision allows for the reproduction
and extraction of certain works protected by intellectual property
rights for TDM, provided that the content is legally available,
that reproductions are not retained longer than necessary and that
the right holders have not made an express reservation against the
work and other subject matter being used for TDM
(“opt-out”). Such reservations must be made
appropriately. For content made available online, it will
only be considered appropriate to make reservations in a
machine-readable manner.
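What a machine-readable reservation could look like in practice can
be sketched as follows. The example assumes, purely for
illustration, that a right holder expresses the opt-out in the
website's robots.txt file and that the AI developer checks it before
downloading content. The crawler name ExampleTDMBot, the function
name and the URL are hypothetical, and whether robots.txt is
sufficient to satisfy Article 4 is a legal question that this sketch
does not answer.

```python
from urllib.robotparser import RobotFileParser


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt does not disallow user_agent from url."""
    parser = RobotFileParser()
    # Parse the rules from text, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt in which the right holder opts out of one
# named TDM crawler but allows all other crawlers.
ROBOTS_TXT = """\
User-agent: ExampleTDMBot
Disallow: /

User-agent: *
Allow: /
"""

picture_url = "https://example.com/images/picture.jpg"
tdm_allowed = may_mine(ROBOTS_TXT, "ExampleTDMBot", picture_url)
other_allowed = may_mine(ROBOTS_TXT, "OtherBot", picture_url)
```

In this sketch the reservation only has an effect if the party
carrying out TDM actually reads and respects it, which is why the
form in which the reservation is made matters.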
The opt-out mechanism enables right holders to make reservations
against TDM. In reality, it is thus up to the right holders whether
profit-based TDM is to be legal in the EU. This is in contrast to
the US, where the “fair use” doctrine has been presumed
to allow TDM for commercial purposes without permission from the
right holder. This disparity can mean that AI developers in the EU
are put in an inferior position compared to AI developers in the
United States. If the EU is serious about becoming a hub for the
development and use of AI technologies, as the European Commission
has stated, it is important that the framework for
innovation in the EU is seen in the context of the regulations in
the USA.
The DSM Directive has not yet been implemented in Norway, but it
is expected that this will happen in the near future.
Several interesting court cases
With the rapid increase in the use of AI, we are also seeing an
increase in AI-related lawsuits. The question of whether
copyright-protected works can be used to train AI has been raised
before the US courts. Although it has been assumed that “fair use”
in US copyright law covers certain forms of unlicensed TDM
activities, these lawsuits require an assessment of whether this is
in fact the case, and if so to what extent.
Stability AI, which is the company behind Stable Diffusion, is the
subject of several lawsuits. In a class
action brought by artists in the USA, Stability AI has been sued
along with DeviantArt and Midjourney. The background to the class
action is the AI solution Stable Diffusion, which allegedly
contains copies of millions of copyright-protected images. The
question is whether this large-scale use of images is legal without
obtaining permission from the right holders.
Stability AI has also been sued by Getty Images in both London and Delaware for
copying and using millions of protected images from Getty’s
database to train Stable Diffusion without consent.
Although the cases have been brought outside Norway and the EU and
concern foreign regulations, so that their outcome will have
limited application and legal value here, it is interesting to
follow these first court cases related to copyright and AI training.
Is development running wild?
There is no doubt that AI represents a new technology that
challenges the established legal system. The question of whether
material protected by copyright can be used for training of AI does
not have a clear answer. The answer can also vary depending on the
jurisdiction.
In addition to the clarification we might expect from the
ongoing lawsuits, legislators and other actors worldwide are
proposing initiatives that could contribute to beneficial
regulations regarding AI. A challenge for legislators is how they
may take into account the rapid technological development. It is
also worth noting that several technology leaders, including Elon
Musk, have signed an open letter asking for a pause in the further
development of AI models. It is interesting that parties who
themselves are or have been involved in the development of AI are
now expressing concern that development is going too fast.
Finally, did you have to think twice about the title of this
article?
It was created by ChatGPT. And if you’re wondering what AI
training might look like, take a look at the picture DALL-E 2 has
created.
The content of this article is intended to provide a general
guide to the subject matter. Specialist advice should be sought
about your specific circumstances.
Source: http://www.mondaq.com/Article/1351290