Turnitin and Google Book Search: same thing?


The Washington Post reports today on a couple of Virginia high school students who are suing the anti-plagiarism service turnitin.com for copyright infringement. According to press accounts, the service is used by 6,000 schools, including Harvard and Georgetown. Students turn in papers to their teachers by submitting them through Turnitin’s website. Turnitin then compares the submitted papers to a snapshot of the web, to databases of published articles, and to its own database of millions of other student papers. The problem is that the submitted papers are added to the company’s database of student papers without student permission. The plaintiffs in the case specifically marked their papers with a request that they not be archived, but they were archived nonetheless. The students have a website at dontturnitin.com.

What’s striking to me is how similar this is to Google Book Search. It remains to be seen whether Turnitin will make a fair use defense, but their past statements suggest that they will. (Here is a PDF of a legal opinion that Turnitin commissioned.)

Google is copying books without the copyright owners’ consent and storing them in a searchable database, just as Turnitin does with student papers. Google copies the whole book, but argues it’s a fair use because it only displays a “snippet” of the text in search results. Turnitin also copies the whole work and only displays snippets to teachers if there’s a plagiarism match. Both Google and Turnitin make commercial use of the works they copy, and they both arguably serve educational purposes. And if Google’s use doesn’t affect the “potential market” for licensing books to be included in searchable databases, then Turnitin’s use certainly doesn’t affect the potential market for licensing papers to be included in a plagiarism database.

So, can these cases be distinguished? If not, are they both fair use? I’m still thinking about this one, and I’d like to hear what your analysis is.

Cross-posted at TLF. You can leave and read comments there. →

Mar 30, 2007

Google Book Search the new MP3.com?

The New Yorker has a dispatch from Jeffrey Toobin updating us on the Google Book Search case. It’s a good primer if you haven’t been following this issue, and it also fills in some details if you have. Interesting tidbits include the fact that the parties haven’t started witness depositions yet and won’t be able to make motions for summary judgment for another year. More interesting is the fact that both Google and the plaintiffs (authors and publishers) are sure this will settle out of court.

“The suits that have been filed are a business negotiation that happens to be going on in the courts,” [Google's] Marissa Mayer told me. “We think of it as a business negotiation that has a large legal-system component to it.” According to Pat Schroeder, the former congresswoman, who is the president of the Association of American Publishers, “This is basically a business deal. Let’s find a way to work this out. It can be done. Google can license these rights, go to the rights holder of these books, and make a deal.”

Lawrence Lessig points out that while a settlement would be good for both parties, it could create a practical precedent that anyone who wants to start a book-scanning project has to license the books first. That would be a lot like the precedent set by the MP3.com case, which was ultimately settled out of court.

Another interesting bit about the technology itself is how Google plans to rely on linking from the wider web to give the information in books the context its search algorithms need to produce good results:

“Web sites are part of a network, and that’s a significant part of how we rank sites in our search—how much other sites refer to the others.” But, he added, “Books are not part of a network. There is a huge research challenge, to understand the relationship between books. … We just started, and we need to make these books networked, and we need people to help us do that,” [Google's Dan] Clancy said.

Cross-posted at TLF. You can leave and read comments there. →

Jan 30, 2007
