Digimarc, the leading supplier of watermarking technology, announced this week the release of Digimarc Guardian Watermarking for Publishing, a transactional watermarking (a/k/a “social DRM”) scheme that complements its Guardian piracy monitoring service. Launch customers include the “big five” trade publisher HarperCollins, a division of News Corp., and the e-book supply chain company LibreDigital, a division of the printing giant RR Donnelley that distributes e-books for HarperCollins in the US.
With this development, Digimarc finally realizes the synergies inherent in its acquisition of Attributor almost two years ago. Digimarc’s roots are in digital image watermarking, and it has expanded into watermarking technology for music and other media types. Attributor’s original business was piracy monitoring for publishers via a form of fingerprinting — crawling the web in search of snippets of copyrighted text materials submitted by publisher customers.
One of the shortcomings of Attributor’s piracy monitoring technology was the difficulty of determining whether a piece of text that it found online was legitimately licensed or, if not, whether it was likely to be a fair use copy. Attributor could use certain cues from surrounding text or HTML to help make these determinations, but the results were educated guesses and not infallible.
The practical difference between fingerprinting and watermarking is that watermarking requires the publisher to insert something into its material that can be detected later, while fingerprinting doesn’t. But watermarking has two advantages over fingerprinting. One is that it provides a virtually unambiguous signal that the content was lifted wholesale from its source; thus a copy of content with a watermark is more likely to be infringing. The other is that while fingerprinting can be used to determine the identity of the content, watermarking can be used to embed any data at all into it (up to a size limit) — including data about the identity of the user who purchased the file.
The Digimarc Guardian watermark is complementary to the existing Attributor technology; Digimarc has most likely adapted Attributor’s web-crawling system to detect watermarks as well as use fingerprinting pattern-matching techniques to find copyrighted material online.
Digimarc had to develop a new type of watermark for this application, one that’s similar to those of Booxtream and other providers of what Bill McCoy of the International Digital Publishing Forum has called “social DRM.” Watermarks do not restrict or control use of content; they merely serve as forensic markers, so that watermark detection tools can find content in online places (such as cyberlockers or file-sharing services) where it probably shouldn’t be.
A “watermark” in an e-book can consist of text characters that are either plainly visible or hidden among the actual material. The data embedded in a “social DRM” scheme for e-books can likewise take two forms: personal information about the user who purchased the e-book (such as an email address), or an otherwise meaningless ID number that the distributor can use to look up the user or transaction in a database. (The idea behind the term “social DRM” is that users who know their identities are embedded in their files will be deterred from “oversharing” them.) The Digimarc scheme adopted by LibreDigital for HarperCollins uses hidden watermarks containing IDs that don’t reveal personal information by themselves.
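For readers who want a feel for the "hidden ID" variety, here is a rough sketch in Python. Digimarc has not published how its watermark works, so this is emphatically not its method; it assumes a naive encoding in which a random 64-bit ID is written as invisible zero-width Unicode characters, and every name in it (`embed_id`, `extract_id`, `sell_copy`, the `transactions` table) is hypothetical.

```python
import secrets
from typing import Optional

ZERO = "\u200b"  # zero-width space, stands for bit 0
ONE = "\u200c"   # zero-width non-joiner, stands for bit 1
ID_BITS = 64

# Hypothetical distributor-side table: opaque ID -> transaction details.
transactions: dict[int, dict[str, str]] = {}


def embed_id(text: str, transaction_id: int) -> str:
    """Hide the ID as invisible characters right after the first word."""
    bits = format(transaction_id, f"0{ID_BITS}b")
    mark = "".join(ONE if b == "1" else ZERO for b in bits)
    head, sep, tail = text.partition(" ")
    return head + mark + sep + tail


def extract_id(text: str) -> Optional[int]:
    """Recover the ID, or return None if no complete mark is present."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    if len(bits) != ID_BITS:
        return None
    return int(bits, 2)


def sell_copy(text: str, retailer: str, buyer_email: str) -> str:
    """Record a sale and return a copy of the text marked with its ID."""
    tid = secrets.randbits(ID_BITS)
    transactions[tid] = {"retailer": retailer, "buyer": buyer_email}
    return embed_id(text, tid)
```

The file itself carries only the opaque number; the buyer's identity lives in the distributor's database, which is the property the HarperCollins deployment is described as having. A real scheme would also have to survive format conversion and deliberate stripping, which this toy encoding does not.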
In contrast, the tech publisher O’Reilly Media uses users’ email addresses as visible watermarks on its DRM-free e-books. Visible transactional watermarking for e-books dates back to Microsoft’s old Microsoft Reader (.LIT) scheme in the early 2000s, which gave publishers the option of embedding users’ credit card numbers in e-books — information that users surely would rather not “overshare.”
HarperCollins uses watermarks in conjunction with the various DRM schemes under which its e-books are distributed. The scheme is compatible with EPUB, PDF, and MOBI (Amazon Kindle) e-book formats, meaning that it could work with the DRMs used by all of the leading e-book retailers.
However, it’s unclear which retailers’ e-books will actually include the watermarks. The scheme requires that LibreDigital feed individual e-book files to retailers for each transaction, rather than single files that the retailers then copy and distribute to end users; and the companies involved haven’t specified which retailers work with LibreDigital in this particular way. (I’m not betting on Amazon being one of them.) In any case, HarperCollins intends to use the scheme to gather information about which retailers are “leaky,” i.e., which ones distribute e-books that end up in illegal places online.
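The "leaky retailer" analysis is easy to picture once each file carries its own ID. The sketch below is only an illustration under assumed data shapes: `detected_ids` stands for the IDs a crawler pulled out of files found online, `transactions` for the distributor's lookup table, and the function name `leaks_by_retailer` is invented here.

```python
from collections import Counter


def leaks_by_retailer(detected_ids, transactions):
    """Count detections of watermarked copies, grouped by the selling retailer."""
    counts = Counter()
    for tid in detected_ids:
        record = transactions.get(tid)
        if record is not None:  # skip IDs that match no known transaction
            counts[record["retailer"]] += 1
    return counts
```

This counts every detection, so one copy found in three places counts three times. Counting distinct IDs instead, or dividing by each retailer's sales volume, would be reasonable refinements.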
Hollywood routinely uses a combination of transactional watermarks and DRM for high-value content, such as high-definition movies in early release windows. And at least some of the major record labels have used a simpler form of this technique in music downloads for some time: when they send music files to retailers, they embed watermarks that indicate the identity of the retailer, not the end user. HarperCollins is unlikely to be the first publisher to use both “social DRM” watermarks and actual DRM, but it is the first one to be mentioned in a press release. The two technologies are complementary and have been used separately as well as together.