Try out Yale's new research tool

LUX, a free new tool from Yale University, is perfect for research projects where you want to be led down a rabbit hole of infinite connections adjacent to a subject of interest. It’s a central hub that contains 17 million searchable objects across Yale’s museums, archives, and libraries.

The tool works somewhat like a search engine. But where a search engine returns hits that link you onward to other sites, LUX builds relationships between the object you’re searching for and related objects in the collection. It goes beyond the objects themselves and finds obscure connections. For example, if you were searching for a piece of artwork, it would surface other works by the same artist, as well as other art created around the same time or in the same location. Or, if you were to search for meteorites, it would pull up images of actual meteorites from the university’s museums, as well as art and books about meteorites. Previously, you would have had to go to different places (a natural history museum for the meteorites, a library for the books) or Google separate entries and piece together these different resources.

At the heart of LUX is a backend data model called a knowledge graph. Knowledge graphs are usually made up of datasets from different sources, and they organize that information into a network of relationships. You can think of one as the evidence pin-up board detectives use to visualize the connections between people, objects, places, and events. The concept was arguably popularized by Google in 2012. Van Gogh World Wide operates off a similar data model. And the technique is becoming increasingly popular in the art world as more works get digitized.
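To make the pin-up board idea concrete, here is a minimal sketch of a knowledge graph in Python. It is not LUX's actual data model or code: the entity names, the relation labels, and the `build_graph` and `related` helpers are all hypothetical, chosen only to show how storing facts as linked entities lets a search for one item surface others that share a creator or a place.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) facts; not real LUX data.
TRIPLES = [
    ("Painting A", "created_by", "Artist X"),
    ("Painting B", "created_by", "Artist X"),
    ("Painting A", "created_in", "Paris"),
    ("Book C", "about", "Artist X"),
    ("Sculpture D", "created_in", "Paris"),
]


def build_graph(triples):
    """Index each fact in both directions so links can be followed either way."""
    graph = defaultdict(set)
    for subject, relation, obj in triples:
        graph[subject].add((relation, obj))
        graph[obj].add((f"inverse_{relation}", subject))
    return graph


def related(graph, start):
    """Map each entity two hops from `start` to the shared nodes connecting them."""
    found = defaultdict(set)
    for _, shared in graph[start]:
        for _, other in graph[shared]:
            if other != start:
                found[other].add(shared)
    return dict(found)


graph = build_graph(TRIPLES)
links = related(graph, "Painting A")
for entity in sorted(links):
    print(entity, "via", sorted(links[entity]))
```

Starting from "Painting A", the sketch reaches "Painting B" and "Book C" through the shared artist and "Sculpture D" through the shared city. A production system would use a dedicated graph or search database rather than an in-memory dictionary, but the principle of following shared connections is the same.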

“No one likes to search, everyone likes to find,” Robert Sanderson, senior director for digital cultural heritage at Yale University, said at a media briefing Thursday. LUX is able to provide robust context around the object that’s being searched for.

When you enter a term into the search bar, tabs on the page separate the results into different categories: objects, works, people and groups, places, concepts, and events. The advanced search feature and the filters on the side allow you to narrow down your search. When you click through to a page, there might be hyperlinks that lead you to discover cross-connections. For example, if you click through to the page for an artwork, then follow the hyperlink to its painter, you will find more information about concepts influenced by this painter, their production timeline, related people and groups, and other works created by, or created about, the artist.

The project to build this tool has been in the works for the past five years. And Yale hopes that by doing the heavy lifting, it can make it easier for other institutions to build their own version of LUX. As such, the code for LUX will be open-sourced. That means anyone can view the database configurations, as well as all the transformations Yale did on the data. The database that does the searching is proprietary, but can be licensed. There will be a smaller, similar database that will be more widely available for smaller institutions with fewer resources.

Importantly, LUX does not use artificial intelligence. Instead of using large language models, the team relies on human intelligence, meaning that it hired students to build out the depths of the metadata and add identifiers to datasets within the collections over a span of six years.

According to Sanderson, the team did run some experiments with ChatGPT, asking it to find specific objects in collections. The AI would give an accession number and a URL for the query, but the link often didn’t work, and the number led to a completely different object. “The model understands how language works, but it’s not a knowledge model, it’s not a fact model,” he said. “You get answers that are convincing but wrong.”

The LUX that’s available to the public today is still a work in progress. Already, the team has ideas on how to improve it, and new features that they’re thinking of adding. On the results pages, you’ll notice a big blue button for sending feedback if there’s an ethical issue or if the data is wrong for some reason.