Four of the nation’s leading book publishers have sued the Internet Archive, the online library best known for maintaining the Internet Wayback Machine. The Internet Archive makes scanned copies of books—both public domain and under copyright—available to the public on a site called the Open Library.
“Despite the Open Library moniker, IA’s actions grossly exceed legitimate library services, do violence to the Copyright Act, and constitute willful digital piracy on an industrial scale,” write publishers Hachette, HarperCollins, Wiley, and Penguin Random House in their complaint. The lawsuit was filed in New York federal court on Monday.
For almost a decade, the Open Library has offered users the ability to “borrow” scans of in-copyright books via the Internet. Until recently, the service was based on a concept called “controlled digital lending” that mimicked the constraints of a conventional library. The library would only “lend” as many digital copies of a book as it had physical copies in its warehouse. If all copies of a book were “checked out” by other patrons, you’d have to join a waiting list.
In March, as the coronavirus pandemic was gaining steam, the Internet Archive announced it was dispensing with this waiting-list system. Under a program it called the National Emergency Library, IA began allowing an unlimited number of people to check out the same book at the same time—even if IA only owned one physical copy.
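The difference between the two lending models comes down to one check. The Python sketch below is a toy illustration of that check, not the Internet Archive’s actual software: the `LendingDesk` class, its method names, and the patron names are all hypothetical, and it models a single title.

```python
from collections import deque


class LendingDesk:
    """Toy model of lending for one title (all names are hypothetical)."""

    def __init__(self, physical_copies, controlled=True):
        self.physical_copies = physical_copies  # copies owned in the warehouse
        self.controlled = controlled            # True: controlled digital lending
        self.borrowers = set()                  # patrons with an active loan
        self.waitlist = deque()                 # waiting patrons, first come first served

    def borrow(self, patron):
        """Return True if the patron has a loan now, False if waitlisted."""
        if patron in self.borrowers:
            return True
        # Under controlled lending, loans never exceed owned copies.
        if self.controlled and len(self.borrowers) >= self.physical_copies:
            if patron not in self.waitlist:
                self.waitlist.append(patron)
            return False
        self.borrowers.add(patron)
        return True

    def give_back(self, patron):
        """End a loan and pass the freed copy to the next waiting patron."""
        self.borrowers.discard(patron)
        if (self.controlled and self.waitlist
                and len(self.borrowers) < self.physical_copies):
            self.borrowers.add(self.waitlist.popleft())


# One owned copy, two patrons asking at the same time.
cdl = LendingDesk(physical_copies=1, controlled=True)
print(cdl.borrow("ann"), cdl.borrow("bob"))              # True False

emergency = LendingDesk(physical_copies=1, controlled=False)
print(emergency.borrow("ann"), emergency.borrow("bob"))  # True True
```

With the cap in place, the second patron waits until the first returns the book. With the cap removed, as under the National Emergency Library, one owned copy can be on loan to any number of patrons at once.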
Before this change, publishers largely looked the other way as IA and a few other libraries experimented with the digital lending concept. Some publishers’ groups condemned the practice, but no one filed a lawsuit over it. Perhaps the publishers feared setting an adverse precedent if the courts ruled that CDL was legal.
But the IA’s emergency lending program was harder for publishers to ignore. So this week, as a number of states have been lifting quarantine restrictions, the publishers sued the Internet Archive.
In an email to Ars Technica, IA founder Brewster Kahle described the lawsuit as “disappointing.”
“As a library, the Internet Archive acquires books and lends them, as libraries have always done,” he wrote. “Publishers suing libraries for lending books, in this case, protected digitized versions, and while schools and libraries are closed, is not in anyone’s interest.”
“The publishers have a pretty strong case”
The publishers’ legal argument is straightforward: the Internet Archive is making and distributing copies of books without permission from copyright holders. That’s generally illegal unless a defendant can show it is authorized by one of copyright law’s various exceptions.
Legal experts tell Ars that the Internet Archive’s best response is to argue that its program is fair use. That’s a flexible legal doctrine that has been used to justify a wide range of copying over the decades, from recording television broadcasts for personal use to quoting a few sentences of a book in a review. Most relevant for our purposes, the courts have held that it is a fair use to scan books for limited purposes such as building a book search engine.
When evaluating a fair use claim, courts consider several factors, including the impact of the use on the market for the original work. A book search engine, for example, is not a substitute for reading books but, rather, helps readers find new books they might want to buy. This is one of the reasons the courts found that book scanning for a search engine was legal under fair use.
But it’s harder to come up with compelling arguments that the Internet Archive’s open-ended lending program is fair use.
James Grimmelmann, a copyright scholar at Cornell University, told Ars that he is withholding judgment until he sees the Internet Archive’s response. However, he said, “it seems like the publishers have a pretty strong case.”
“I think there are arguments for fair use, but they’re not terribly strong arguments,” he said in a Monday phone interview.
A pandemic exception?
The Internet Archive would have had a stronger argument if it had continued to limit the number of copies that could be lent out. In that scenario, IA could argue that the program’s impact on the market was little different from that of a conventional library.
Obviously, a patron who checks out a book from a library is less likely to purchase a copy, undermining the market for the book. On the other hand, libraries themselves buy many books—and the more popular a book is, the more copies libraries must buy. So the overall impact of libraries on demand for books is not clear.
But once the IA stopped buying a copy of a book for every copy it lent out, this argument became a lot weaker. An institution like IA can buy a single copy of a book and then “lend” it to dozens, hundreds, or thousands of people at the same time. There’s little doubt that this has a negative impact on the market for new books.
Instead, the Internet Archive will likely need to make a more novel argument: that the unique circumstances of a pandemic justify allowing types of infringement that would be clearly illegal at other times. Grimmelmann wasn’t able to identify any other cases where courts have made that kind of leap.
I also spoke to John Bergmayer, a copyright expert at the copyright reform group Public Knowledge. He said there was a “pretty strong fair use argument” for both the Internet Archive’s previous controlled digital lending program and its new approach without waiting lists. Bergmayer pointed to the fact that millions of books are currently locked up in libraries that have been closed due to the pandemic. That, he said, creates a unique situation that could justify digital lending activities that would otherwise be illegal.
But like Grimmelmann, Bergmayer couldn’t name any specific court decisions that back up IA’s aggressive interpretation of copyright law.
The stakes are high
While Grimmelmann was fairly bullish on the publishers’ legal prospects, he disagreed with one aspect of the industry’s argument. The Internet Archive is officially a non-profit, but the publishers’ lawsuit portrays the group as effectively a commercial operation profiting from copyright infringement. It points out that IA has earned millions of dollars from contracts to scan books on behalf of partners such as other libraries.
But Grimmelmann told Ars that this fundamentally misunderstands the motivations of Brewster Kahle, the founder of Internet Archive and still its driving force.
“Brewster Kahle is what the Russians might call a holy fool—someone who acts without real regard for himself or for worldly things in the service of a higher calling,” Grimmelmann said. The Internet Archive “is not a commercial venture,” he argued. Grimmelmann believes that Kahle, a 1990s dot-com entrepreneur who has sunk millions of dollars into the Internet Archive, is fundamentally an idealist.
But Kahle’s idealism, or foolishness, might cost him dearly. Copyright law allows statutory damages as high as $150,000 per work for willful infringement. And Grimmelmann tells Ars that if the publishers win the case, they’ll have a strong argument that the infringement was willful.
The Internet Archive has scanned more than a million books that are still under copyright, so a loss could easily lead to billions of dollars in damages—far beyond the non-profit’s ability to pay. So if the publishers win the lawsuit, they could force the Internet Archive out of business. That would be an incalculable loss given the group’s work archiving other types of content, including the early Web.
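To see how the exposure scales, here is a back-of-the-envelope calculation in Python. It is an illustrative ceiling, not a prediction. It assumes exactly one million works in suit (the article’s figure is “more than a million” scans, and the works actually named in the complaint may be far fewer) and applies the per-work amounts in 17 U.S.C. § 504(c): a $750 minimum, a $30,000 ordinary maximum, and a $150,000 maximum for willful infringement. A court would set the actual per-work figure.

```python
# Statutory damages per work under 17 U.S.C. § 504(c), in dollars.
PER_WORK = {
    "statutory minimum": 750,
    "ordinary maximum": 30_000,
    "willful maximum": 150_000,
}

# Assumption for illustration only: one million works found infringed.
ASSUMED_WORKS = 1_000_000


def damages_exposure(works, per_work):
    """Total statutory damages if every work drew the same award."""
    return works * per_work


for label, amount in PER_WORK.items():
    total = damages_exposure(ASSUMED_WORKS, amount)
    print(f"{label}: ${total:,}")
# statutory minimum: $750,000,000
# ordinary maximum: $30,000,000,000
# willful maximum: $150,000,000,000
```

Under these assumptions, even the statutory minimum comes to $750 million, and any award above $1,000 per work crosses into the billions.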
However, the publishers may not be interested in forcing the Internet Archive out of business. Their goal is to get the Internet Archive to stop scanning their books. If they win the lawsuit, they might force the group to shut down its book scanning operation and promise to not start it up again, then allow it to continue its other, less controversial offerings.