Google Books is changing how the world reads.
The project’s start coincided with the emergence of the search engine itself. According to the Google Books Web site, in 1996 Stanford graduate students Sergey Brin and Larry Page were working on a digital libraries project that sought to index book content and enable Web crawlers to analyze cross-references and citations within books. The crawler they built inspired the PageRank algorithm that powers the search engine Internet users know so well. The Google Books project didn’t reach full steam until 2002, however, when the company began actively seeking to digitize as many books as possible.
The company has been pretty successful. Through direct partnerships with authors and publishers or collaborations with public libraries and universities like Columbia, Google Books is rapidly digitizing much of the written word. Books in the public domain (those whose copyrights have expired and which can be distributed for free) are fully on view and downloadable in PDF form. Books still under copyright are often available through limited views. The project has marked advantages: authors and publishers can use limited views to tap into an Internet market they couldn’t access before, and students and scholars can easily search through texts to analyze cross-references and pull quotes and information. But there are also significant problems. Before Google, no company had ever put digitized in-copyright books online, so no company had ever had to examine the legality of doing so.
In 2005, the Authors Guild and the Association of American Publishers independently filed class action lawsuits against Google for “massive copyright infringement” for publishing snippets of in-copyright works without authorial consent. On Oct. 28, 2008, Google, the AG, and the AAP reached a settlement in which authors of in-copyright books were to receive a cut of the profits Google made through Google Books, including online advertising revenues. The corporation also set up a database to allow authors and publishers to opt out of its digitization project, but the AG and AAP were still agitated over Google’s lackluster efforts to directly contact rights holders to establish consent to publish their works online. A court date was set for Oct. 7, 2009, but federal judge Denny Chin has since postponed it indefinitely. And on Friday, Sept. 18, 2009, the Department of Justice released its own highly critical report on the lawsuit, questioning Google’s aggressive plans for so-called “orphan books,” books that, while still under copyright, are out of print and essentially abandoned by the author and publisher. The report also slammed the company for publishing portions of in-copyright books without direct consent from the authors.
Now the situation becomes even more complicated. Columbia is one of the main participating universities in the digitization project. Jim Neal, vice president for information services and university librarian at Columbia, is a strident defender of the University’s stance. “One of our challenges as a library is to put more and more of our content online,” he explains. And while Columbia does have in-house digitization technology “somewhere in the bowels of Butler,” money is a considerable obstacle. Because the University would need to hire staff to operate high-tech digital cameras and to catalogue and preserve the media, costs would inflate rapidly. A much more viable option is outsourcing the work to another company, especially one as heavily invested in massive digitization projects as Google is.
From Neal’s perspective, the program could be the best way to make Butler’s unique collections available online to anyone in the world. The Rare Book and Manuscript Library has already largely been uploaded, and the Columbia University Libraries are in the process of preparing the History of Religion project and the History of Medicine for upload as well. Columbia has also given Google all out-of-copyright books published prior to 1923.
This will be comparable to Google’s agreements with public libraries, and will, Neal suggests, facilitate interlibrary loans. According to some of Google’s opponents, however, the opposite is true. Beyond the AG’s and AAP’s questions about copyright, another group, the Open Book Alliance, is arguing from an antitrust perspective. The alliance, a coalition of individuals, libraries, nonprofits, and companies including Yahoo, Microsoft, and Amazon, launched a formal attack against Google’s advances this past August. Peter Brantley and Gary Reback outline the problem with Google in their manifesto “Opening the Book.” While large, well-funded universities can easily participate in Google’s book-scanning effort, most other institutions will “be forced to pay monopoly prices for access ... creating a system of have and have-nots in our nation’s educational system.” Community libraries would get only a single terminal for the private database, and K-12 public schools would likely get nothing. The digitization, then, rather than spreading knowledge to those with less access to it, “widens the digital divide by limiting access to digital books in financially hard-hit communities that have budget-constrained libraries.”
It will be a long while before these issues are settled. Kenneth Crews, director of the Copyright Advisory Office of Columbia Libraries, predicts that the earliest possible hearing for the lawsuit would take place in late 2010, and that a rehearing would likely follow in 2011. In the meantime, campus resources are being set up to help authors and readers alike understand the Google Books case. The CAO is building a Web site where users will find a page of information and links about the settlement. The student group Free Culture does not have an official stance on the Google Books case, and, according to spokesperson Gabe Schubiner, CC ’10, is not planning on adopting one. Schubiner says that Free Culture wants to provide an outlet for students on all sides of the issue to become informed about and, if they so choose, involved with the case.