Five key takeaways from the House Judiciary Committee hearing on AI and copyright law

In light of several high-profile lawsuits in recent months, countries’ legislative frameworks are finally beginning to grapple with the challenges thrown up by copyright law and generative artificial intelligence (AI).

In January 2023, Getty Images announced a lawsuit against Stability AI in London’s High Court of Justice, alleging that the Stable Diffusion image generator infringed Getty’s copyrighted photographs and trademarks.

And, in February, the award-winning visual artists Sarah Andersen, Kelly McKernan, and Karla Ortiz filed a class action complaint in a US District Court in California against defendants Stability AI, Midjourney and DeviantArt, alleging that their works were used without permission as part of the companies’ AI training set.

Earlier, in November 2022, a group of anonymous programmers filed a class action lawsuit against GitHub, a Microsoft subsidiary, and OpenAI, alleging unauthorised and unlicensed use of the programmers’ software code to develop the defendants’ AI machines, Codex and Copilot.

Recognising a need for action, the House Judiciary Committee in the US has held a hearing, examining the intersection of generative AI and copyright law. The hearing, which took place on 17 May 2023, followed the Senate hearing on AI oversight the previous day, in which OpenAI CEO Sam Altman took the stand. What were the five key takeaways from the witness testimony?

1. Copyright’s well-established fair use doctrine arguably provides legal coverage for the training of AI models.

Sy Damle, of Latham & Watkins LLP and a former General Counsel of the US Copyright Office, argued that “the use of a copyrighted work to learn unprotectable facts and use those facts to create products that do not themselves infringe copyright is quintessential fair use”, and that the training of AI models generally adheres to this principle.

He spoke against the view that generative AI’s ability to replicate artistic styles undermines any fair use defence, saying, “This concern has nothing to do with copyright, which does not, and has never, granted monopolies over artistic or musical styles.”

2. Implementing a statutory or collective licensing regime would be a project “many orders of magnitude larger than any similar scheme in the history of American law”.

Sy Damle argued that it would be bad policy to introduce statutory or collective licensing under which any use of copyrighted content to train an AI model would automatically trigger a payment obligation. Such a regime would prevent case-by-case evaluation, effectively eliminating the fair use doctrine.

Moreover, he observed that implementing such a regime would be overwhelmingly complex. A statutory licensing scheme would need to cover every publicly accessible work on the Internet – a body of work which likely numbers in the tens of billions. There is also an uncountable number of “orphan works” without identifiable owners, which would lead to massive volumes of unmatched royalties.

3. AI systems could generate outputs that potentially infringe on artists’ copyrights and right of publicity in various ways.

Chris Callison-Burch, Associate Professor of Computer and Information Science at the University of Pennsylvania and Visiting Research Scientist at the Allen Institute for Artificial Intelligence, pointed out that outputs of generative AI can violate copyright laws. For example, via memorisation of datasets, AI systems can output identical copies of copyrighted materials.

However, he observed that Google and other companies are developing strategies to prevent sophisticated prompting by the user that would elicit the underlying training data.

Text-to-image generation systems can also produce images of copyrightable characters that appear in their training data – a problem that may be hard for AI developers to avoid without a registry of copyrighted or trademarked characters.

He suggested that other uses of generative AI may violate “right-of-publicity” rather than copyright law. For example, there is the case of the AI-generated song “Heart on My Sleeve”, designed to sound like the artists Drake and The Weeknd. There is also the issue of “substantial similarity”, where outputs of generative AI systems look very similar to some of their training data.

4. Copyright holders can, under certain circumstances, opt out of having their works used to train AI systems.

Callison-Burch pointed out that industry is designing several technical mechanisms to let copyright holders opt out. The first is an industry-standard protocol that allows websites to specify which parts should be indexed by web crawlers and which parts should be excluded. The protocol is implemented by placing a file called robots.txt on the website that hosts the copyrighted materials.

Organisations that collect training data, like Common Crawl and LAION, follow this protocol and exclude files that have been listed in robots.txt as “do not crawl”. There are also emerging industry efforts to allow artists and other copyright holders to opt out of future training.
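As a rough illustration of how a crawler might honour such an opt-out, the sketch below uses Python's standard-library robots.txt parser. The robots.txt contents, the example.com URLs and the "ExampleBot" name are hypothetical and chosen only for illustration; "CCBot" is the user-agent token used by Common Crawl's crawler. It is not a description of how any particular organisation's pipeline works, and compliance with robots.txt is voluntary on the crawler's side.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an artist's website (illustrative only).
# It blocks Common Crawl's crawler (CCBot) from the whole site and asks
# every other crawler to stay out of the /portfolio/ directory.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /portfolio/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# CCBot is excluded from the entire site.
print(may_crawl(parser, "CCBot", "https://example.com/about.html"))            # False
# Other crawlers are excluded only from /portfolio/.
print(may_crawl(parser, "ExampleBot", "https://example.com/portfolio/a.png"))  # False
print(may_crawl(parser, "ExampleBot", "https://example.com/about.html"))       # True
```

A well-behaved crawler would run a check like `may_crawl` before each request and skip any URL for which it returns False.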

5. Copyright or IP exemptions that allow AI developers to exploit creators without permission or compensation would disincentivise human-created works.

Dan Navarro, a Grammy-nominated songwriter, singer, recording artist, and voice actor, argued that copyright or IP exemptions for AI developers would be “devastating”. He said, “creating shortcuts for AI will only erode the incentives to create new works – the works AI itself depends on.” Copyright, he said, exists to incentivise humans to create, whereas machines do not need incentives.

He finished by emphasising the irreplaceability of human-created works: “At the heart of the connection between artist and audience are shared, lived experiences only humans can relate to and convey.”

Ultimately, it appears that courts will likely favour AI developers if past legal precedents are any guide. Moreover, there are moves in the UK to expand the exception to copyright infringement rules that currently exists for data mining for non-commercial research purposes, to allow this to be used for any purpose. The EU is pursuing a similar strategy, making an infringement exception that would apply unless the rights owner had expressly reserved their rights.

For now, however, courts will continue to function as the battleground where AI companies, keen to remain at the forefront of innovation, clash with creatives, as they fight to safeguard the integrity of art.
