Where credit is due

Over time, I have become increasingly opposed to the idea of intellectual property. It is something I have been thinking about for a while, but recent open questions related to generative AI have brought this to the forefront of my mind. In particular, what I have been thinking about recently is that what copyright law says is decidedly not the same as what is ethical. To some readers, this may seem obvious, but allow me to explain.

Part of what prompted me to think about this was reading posts from people strongly opposed to Stable Diffusion claiming that it violated the copyrights of all the artists whose work was used to train the neural net. The standard rebuttal is of course that Stable Diffusion is Fair Use, but is this accurate?

The trained model itself is very clearly transformative. It does not contain reproductions of the images used as training data, and there is no algorithm for reconstructing any desired training image from the model.* Compare this to cutting up photos from thousands of magazines into small square pieces and arranging those pieces to form a map of the world (a clear example of fair use). Additionally, since Stable Diffusion is open source, it is arguably not a commercial use of the training data. While the creators profit from the creation of the model in various ways, the model is not bought or sold.

*Note that it has been shown to be possible to extract whole images from training data, but not arbitrary images.

The actual concern artists have is the use of the model to create generated images. In my opinion, generated images may or may not violate an artist’s copyright. In the case of reproducing a copyrighted image nearly exactly, I think distributing this reproduction would clearly violate copyright. For generated images that imitate an artist’s style but do not directly resemble any image actually created by that artist, it’s more of a gray area. It is legal for one artist to imitate the style of another artist. Fair Use is notably not a black-and-white standard; it is a judgment call that depends on multiple factors. One factor is how the derivative work will (potentially) affect the marketability of the original, but working artists may have their original work devalued either by illegal copies or by legal works that consumers regard as substitutes.

Taking a step back

Copyright infringement is not the same thing as plagiarism. Many forms of plagiarism are perfectly legal. Moreover, copyright isn’t really designed to prevent plagiarism. Copyright is an economic tool, a market regulation designed to correct for a specific type of market failure. In order for artists to keep producing new work, they need an income. Prior to capitalism, artists often had patrons to support them. Without that consistent support, artists have to rely on their art directly to make money. However, original artistic ideas don’t make money in capitalism, products do, and the artist isn’t necessarily the best equipped to manufacture products at scale. This is why we have book publishers, for example.

If an artistic product is easily duplicated, which early on basically meant anything that could be written down, then there’s no natural mechanism within capitalism to direct revenue back to the artist. Enter copyright, the right to sell or distribute a copy of a work. This is a new invention in human history, necessitated by capitalism. But since copyright can be bought and sold, it becomes another product. When an artist works for a corporation, their works are usually assets belonging to their employer. Every character in a superhero movie was some artist’s original character once, and now they are multimillion-dollar properties.

Is Stable Diffusion fair?

Fair Use is a set of criteria by which exceptions to copyright can be made. It’s correcting for market failures caused by introducing copyright in the first place. Fairness in the usage of someone’s art is something different. The art community has internal norms and ethics regarding the reuse of art, and while there is some disagreement about edge cases, by and large these things are widely agreed upon.

Rather than Stable Diffusion being copyright infringement, I think there is a much stronger case to be made that it is plagiarism. This is because the people who did the work to create the images used as training data are not individually credited for their contributions, even though some individuals may have contributed hundreds or even thousands of images. Moreover, the creation of the AI model was done without their consent and clearly many artists would not have given consent if asked. This is an ethical issue with how AI models are trained, not a legal issue pertaining to fictitious economic rights. The reasons artists are unhappy with their work being included in the training data are immaterial.

Why is this distinction important? Generative AI is still very new, and there aren’t many established legal precedents concerning its use. I don’t think we should pin our hopes on the legal system to resolve these issues satisfactorily; rather, I think approaching it as an ethical issue (plagiarism) is going to be more productive. After all, instances of plagiarism prior to Stable Diffusion and ChatGPT have rarely ever been about copyright: when a student copies off Wikipedia for an essay, we don’t regard this as a copyright issue.

Of course, students are not typically making money on their plagiarized essays. Let’s look at a more economically comparable example: plagiarism on YouTube. The video essayist HBomberGuy recently released a video showing widespread plagiarism on YouTube. As it turns out, multiple rather large YouTube channels have plagiarized their videos from documentaries, books, articles, sites like Wikipedia, and other YouTubers. Now, why is this analogous to AI art? For one, it is in many cases not (clearly) a violation of copyright. Additionally, this is a form of plagiarism that impinges on the marketability of the original. Larger YouTube channels have, in this sense, at times “stolen” the revenue of smaller YouTubers by copying their ideas. More importantly, it prevents voices from being heard, often already-marginalized voices.

There is something a little bit soul-crushing about watching a man be paid thousands and thousands of dollars to literally plagiarize the phrase, “What is the real tangible impact of gay erasure?”

HBomberGuy

What is the solution?

In his video, HBomberGuy doesn’t offer a solution. Exposing plagiarism and supporting original creators seem to be the obvious steps to take, however. There is always a risk of going too far in trying to stamp out plagiarism, such that creativity is stifled. As Pablo Picasso said, “good artists borrow, great artists steal.” There’s no clear answer, though it’s always best to cite when possible. Could there someday be generative AI systems that cite their sources?
