A year ago, Getty Images, one of the world’s largest stock image agencies, reached a licensing deal with a startup called SparkRebel, which I described as “Pinterest for fashionistas, with Buy buttons.” On that site, people would post images of items of clothing they were interested in. An image recognition engine would try to identify each photo and thus the apparel item it showed. If the item was identified and its manufacturer had a deal with SparkRebel, the site would show a Buy button, which users could click to purchase the item. It was a clever use of content identification technology to support licensing of content used for commercial purposes.
SparkRebel used Getty Images’ ImageIRC image recognition technology. ImageIRC uses the concept of fingerprinting: it examines an image, calculates a set of numbers that represent the image, and looks those numbers up in a database of fingerprints to see if it finds a match. Matches needn’t be exact; the fingerprinting algorithm can usually compute the correct fingerprint even if the image has been color-shifted, downsampled, cropped (up to a point), etc. In other words, ImageIRC is to still images as Google’s Content ID is to YouTube videos and Audible Magic is to various sites that host music files.
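To make the fingerprinting idea concrete, here is a minimal sketch in Python of one well-known family of techniques, the "average hash." This is not ImageIRC's algorithm, which Getty Images has not described publicly in this context; the function names, the 8x8 grid, and the distance threshold are all assumptions chosen for illustration. The sketch reduces a grayscale image to 64 bits and treats two images as a match when their fingerprints differ in only a few bits.

```python
from typing import Dict, List, Optional

Pixels = List[List[int]]  # grayscale image as rows of 0-255 values


def shrink(pixels: Pixels, size: int = 8) -> Pixels:
    """Downsample to size x size by averaging blocks (assumes image >= size in each dimension)."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        row = []
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def fingerprint(pixels: Pixels, size: int = 8) -> int:
    """Average hash: one bit per cell, set when the cell is brighter than the mean."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def lookup(fp: int, database: Dict[str, int], max_distance: int = 5) -> Optional[str]:
    """Return the ID of the closest known fingerprint, or None if nothing is close enough."""
    best_id, best_dist = None, max_distance + 1
    for image_id, known in database.items():
        d = hamming(fp, known)
        if d < best_dist:
            best_id, best_dist = image_id, d
    return best_id


# Hypothetical demo data: a synthetic 16x16 gradient stands in for a licensed photo.
original = [[12 * x + 3 * y for x in range(16)] for y in range(16)]
brighter = [[v + 20 for v in row] for row in original]   # uniformly brightened copy
inverted = [[255 - v for v in row] for row in original]  # a very different image

database = {"example-001": fingerprint(original)}  # made-up image ID
match_for_copy = lookup(fingerprint(brighter), database)
match_for_other = lookup(fingerprint(inverted), database)
```

A scheme this simple survives uniform brightness changes and downsampling, because each bit records only whether a region is lighter or darker than the image's own average. It would not survive cropping; commercial systems presumably use more robust features to handle that.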
In Getty Images’ deal with SparkRebel, SparkRebel would pay Getty Images a licensing fee whenever a user posted an image to which Getty owned the rights. Those of us who watched this deal at the time wondered if Getty Images was trying to get Pinterest, the leading site where users posted images of commercial products, to agree to a similar deal. Given Getty Images’ firm “no comment” replies to questions about it, the answer was clearly yes. Many of the photos posted on Pinterest (as opposed to, say, Instagram) are commercial images copied and pasted from other websites, so Getty could have made a case that Pinterest was promoting infringement of its copyrights.
It took a while, but Getty Images did conclude a licensing deal with Pinterest last Friday, a few months after SparkRebel ceased operations. Under the deal, whenever ImageIRC finds a match to an image that a user “pins” on Pinterest, Pinterest will pay Getty Images a licensing fee, just as with SparkRebel. The additional feature of the deal is that Getty Images will send Pinterest metadata about the matched image, which Pinterest can display for the user. The metadata includes the time and location of the photo, the identity of the photographer, the caption, an image ID, and licensing information.
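The flow described above can be sketched roughly as follows. Everything here is hypothetical: neither company has published an API or a fee schedule, so the class names, field names, `handle_pin` function, and fee amount are assumptions made for illustration. For brevity the sketch uses an exact dictionary lookup on the fingerprint, whereas a real system would tolerate near matches.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ImageMetadata:
    """Fields mirror the metadata the deal is said to cover; names are assumptions."""
    image_id: str
    photographer: str
    caption: str
    taken_at: str
    location: str
    licensing: str


@dataclass(frozen=True)
class PinResult:
    fee_owed: float                     # what the site owes the licensor for this pin
    metadata: Optional[ImageMetadata]   # shown to the user when there is a match


def handle_pin(fingerprint: int,
               catalog: Dict[int, ImageMetadata],
               fee_per_match: float) -> PinResult:
    """The pin is accepted either way; a match only adds a fee and metadata."""
    match = catalog.get(fingerprint)
    if match is None:
        return PinResult(fee_owed=0.0, metadata=None)
    return PinResult(fee_owed=fee_per_match, metadata=match)


# Hypothetical catalog with one made-up record.
catalog = {
    0xABCDEF: ImageMetadata(
        image_id="example-001",
        photographer="A. Photographer",
        caption="Sample caption",
        taken_at="2013-01-01",
        location="Sample City",
        licensing="Rights-managed",
    )
}

hit = handle_pin(0xABCDEF, catalog, fee_per_match=0.01)   # fee value is invented
miss = handle_pin(0x123456, catalog, fee_per_match=0.01)
```

Note that nothing in this flow rejects a pin: an unmatched image simply produces no fee and no metadata.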
Neither Getty nor Pinterest has mentioned anything about blocking or flagging images that users aren’t permitted to pin to the site; Pinterest still allows any user photos on the site, regardless of the terms under which Getty normally licenses them. Pinterest continues to follow DMCA 512 policies of responding to takedown notices and terminating the accounts of users who repeatedly violate copyrights.
Pinterest’s announcement of the deal on its blog mentions the license fees, but otherwise does not mention any copyright issues; instead it focuses on “New data to help improve Pinterest.” Putting the fees aside, the deal is a win for Pinterest as well as Getty Images (not to mention Pinterest’s user community).
For Getty Images, this deal establishes an important precedent for image-sharing services that store lots of professional images and use them for commercial purposes. Other services that use images to drive commerce will likely follow Pinterest’s example and make licensing deals with Getty Images. But Getty gets another benefit besides money that could turn out to be just as important: distribution of image metadata.
One of the biggest problems that the stock image industry has with the Internet is that most ways of copying images from one place to another strip metadata away. When photographers and editors prepare images for distribution, they use tools like Adobe Photoshop, which supports Adobe’s XMP (Extensible Metadata Platform) scheme for storing metadata that travels with images. XMP metadata can be stored alongside images on web pages, but it generally doesn’t survive copying and pasting photos through web browsers.
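One rough way to see whether a given image file still carries its XMP metadata is to look for the XMP packet inside the file's bytes, since XMP is embedded as a block of XML. The sketch below is a crude heuristic rather than a proper XMP parser: it assumes a UTF-8 packet wrapped in an `x:xmpmeta` element, and the `extract_xmp` name and the sample bytes are made up for illustration.

```python
from typing import Optional

XMP_START = b"<x:xmpmeta"
XMP_END = b"</x:xmpmeta>"


def extract_xmp(data: bytes) -> Optional[str]:
    """Return the embedded XMP packet as text, or None if no packet is found."""
    start = data.find(XMP_START)
    if start == -1:
        return None
    end = data.find(XMP_END, start)
    if end == -1:
        return None
    return data[start:end + len(XMP_END)].decode("utf-8", errors="replace")


# Made-up byte strings standing in for an image with and without metadata.
with_xmp = (b"\xff\xd8 image data "
            b'<x:xmpmeta xmlns:x="adobe:ns:meta/">placeholder</x:xmpmeta>'
            b" more image data")
stripped = b"\xff\xd8 image data more image data"

packet = extract_xmp(with_xmp)
missing = extract_xmp(stripped)
```

Running a check like this on an original file and on a copy saved from a web page is a quick way to see the stripping problem firsthand.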
It is actually illegal under section 1202 of the Digital Millennium Copyright Act to intentionally remove “copyright management information” from a copyrighted work in order to evade detection of infringement, though there is some ambiguity over issues such as what qualifies as copyright management information. Nevertheless, images that users copy and paste among websites generally have no copyright management information.* Getty Images’ arrangement with Pinterest recovers metadata for images posted to the site that match its database. This certainly won’t solve the image metadata problem in general, but it’s a start.
*Some images may have invisible embedded watermarks that indicate copyright management information. Typically such watermarks will contain IDs that point to entries in image licensors’ databases, which in turn contain things like the photographer’s name, licensing terms, and so on. Whether invisible embedded watermarks qualify as copyright management information under DMCA 1202 is somewhat up in the air. If a high enough court decided that they do, that could make tools for hacking “social DRM” e-book watermarks illegal in the United States.