The Museum of Modern Art in New York uploaded collection data for nearly 124,000 artworks to GitHub, a code-sharing and Git repository hosting service, in July with less fanfare than expected. It’s not the first time a museum has given the public access to this type of information, and MoMA’s release isn’t the most extensive.
So what’s the big deal, anyway?
“They’re MoMA!” said David Newbury, lead developer for Art Tracks, a data project at Carnegie Museum of Art that analyzes the history and origin of artworks. “Because they’re such a reputable institution, because they’re such a powerful force, it makes everyone else’s job easier.”
What he means is that it will be easier for people like him to persuade their museum directors to do the same, and to do so under a Creative Commons Zero license, as MoMA did. The license allows users to “copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.”
That, Newbury said, changes everything.
“It enables people to do things with this data that no one has ever thought of,” he said, citing the Metropolitan Museum of Art MediaLab’s Chrome extension Meow Met as an example. “It’s one of those things that exposes people to art in ways that they wouldn’t otherwise be exposed to it.”
Metadata as art
Fiona Romeo, MoMA’s Director of Digital Content and Strategy, explained in a post on Medium that the museum’s open data, which includes title, artist, date made, medium, dimensions and date acquired, “is primarily intended to be useful to scholars.”
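For readers curious what working with records like these might involve, here is a minimal Python sketch. It assumes a CSV export whose columns mirror the fields Romeo lists. The column names and the three sample rows are hypothetical placeholders, not taken from MoMA's actual files, and the two helper functions are illustrative only.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; NOT real MoMA records.
# Column names are assumptions modeled on the fields named in the article.
SAMPLE_CSV = """Title,Artist,Date,Medium,Dimensions,DateAcquired
Example Work 1,Artist A,1950,Oil on canvas,50 x 40 cm,1960-05-01
Example Work 2,Artist B,1962,Gelatin silver print,20 x 25 cm,1975-03-12
Example Work 3,Artist A,1955,Oil on canvas,60 x 45 cm,1960-11-20
"""


def count_by(rows, field):
    """Count how many records share each value of the given column."""
    return Counter(row[field] for row in rows)


def acquisitions_by_decade(rows):
    """Count records by the decade in which they were acquired."""
    decades = Counter()
    for row in rows:
        # Assumes an ISO-style date (YYYY-MM-DD) in the hypothetical column.
        year = int(row["DateAcquired"][:4])
        decades[year // 10 * 10] += 1
    return decades


# Parse the sample text into one dictionary per record.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

by_medium = count_by(rows, "Medium")
by_artist = count_by(rows, "Artist")
by_decade = acquisitions_by_decade(rows)

print(by_medium.most_common())
print(sorted(by_decade.items()))
```

With a real export, the same two helpers could be pointed at a file opened with `open(...)` instead of the in-memory sample, once the column names are adjusted to match whatever the file actually uses.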
Jer Thorp, a co-founder of data research group The Office for Creative Research, wrote in a post on Medium that people will do what they always do when a significant data release is announced: “they’ll analyze it and visualize it and regress it and cluster it and query it and process it.” But he said that while he wants people to use the data to reach “hopefully illuminating conclusions,” there’s more to it than that.
“This data can be and should be terrain for exploration, forum for interrogation, and substrate for creation,” Thorp wrote. “There is prose and poetry and performance to be made from these rows and columns.”
That’s exactly what The Office for Creative Research did. The OCR collaborated with theater company Elevator Repair Service to create “A Sort of Joy (Thousands of Exhausted Things)” using an earlier version of the dataset.
But don’t discount the value of the metadata release for traditional art research. Newbury said MoMA’s dataset is particularly valuable for linking other publicly available collections. He said the CC0 license has another advantage: it frees researchers from having to seek permissions from various institutions, a step that sometimes halts studies.
MoMA’s metadata release has made it easier for someone to do a color analysis of 16th-century Dutch paintings, for example, Newbury said. Or, as Software Development Times Senior Editor Alex Handy suggested, one could create a map and index of the world’s most important artwork.
“A worldwide searchable database of art would be useful to tourists, researchers and museums alike,” Handy wrote.
Limitations of the data
Unlike data releases by some other museums, MoMA’s doesn’t include any images. Thorp looked beyond the limitations and called it liberating.
“Instead of paying attention to paintings and sculptures and films and design objects, we can focus on the artists, on time periods, on dimensions and materials,” he wrote. “In these fringes we can see things that are often obscured when we’re staring at the artworks directly.”
FiveThirtyEight’s Oliver Roeder took a non-aesthetic approach to some of his analysis of the dataset.
When asked why the museum didn’t include images as part of the metadata release, Romeo said in an email that MoMA’s online collection now has large zoomable images.
“And in our next update to the open data release, we plan to include an image URL for each object record,” she said. “But MoMA has a collection of modern and contemporary art, so much of the work is still protected by copyright, which is managed by the artist, the artist’s estate or organizations like the Artists Rights Society.”
Romeo added that the release was part of a wider collection initiative that includes excerpting and linking from its website to Wikipedia and Getty information for the artists in MoMA’s collection.
A model for metadata releases
The Cooper Hewitt, Smithsonian Design Museum in New York released its metadata on GitHub in February 2012. Within months, Cooper Hewitt published an update describing an application it had built with the data, along with visualizations made by someone not affiliated with the museum.
Since the release, the museum’s in-gallery experience has completely changed, said Seb Chan, who recently left Cooper Hewitt, where he led the metadata release as the museum’s director of digital and emerging media.
“Although it technically runs off the public API that came later than the data release,” he said, referring to an application programming interface. “Without the data release, the API would not have been possible from an institutional perspective.”
Chan, who is now the chief experience officer at the Australian Centre for the Moving Image, said university students had successfully used the dataset in recent years. He said NYU Interactive Telecommunications Program students created everything from games to visualizations. And students at Harvard’s MetaLAB used the dataset for its Beautiful Data project.
As the academic year begins in the U.S. and Europe, Chan said he expected students to undertake similar projects with MoMA’s newly released data.
If that happens, it’s doubtful that MoMA would mind.
“We’re really excited to see what you make of — and with — MoMA’s collection data,” Romeo wrote.