A preview of what's next for Google search

What does the future of internet search look like? Google envisions it as looking more like a casual conversation with a friend.

While Google’s search engine has been online for over two decades, the technology that powers it has been constantly evolving. Recently, the company announced a new artificial-intelligence system called MUM, which stands for Multitask Unified Model. MUM is designed to pick up the subtleties and nuances of human language at a global scale, which could help users find information they search for more easily or allow them to ask more abstract questions.

Google has already used MUM in a standalone task to learn more about the different ways people refer to COVID vaccines, but says that the new tech is not yet part of its search system. While there’s currently no set timeline for when the feature will roll out in live search, the team is actively developing other one-off tasks for MUM to complete.

Here’s what to know about what MUM is, how it’s different from what’s come before, and more.

Solving the COVID vaccine name game

When vaccines became available earlier this year, Pandu Nayak, the VP of search at Google, and colleagues designed an “experience” that gave people information about the COVID vaccines (how they work and where to get them) whenever users searched for them. The experience patchworked all this essential and relevant information together and pinned it to the top of the first page of search results. But first, the team needed to program it so that it only popped up when the queries were about COVID vaccines. That was a challenge because people around the world refer to COVID vaccines in different ways, and by different names.

Last year, the team spent hundreds of hours combing through resources to identify all the different names for COVID itself. But this year, they had MUM. “We were able to set up a very simple experiment with MUM that within seconds was able to generate over 800 names for 17 different vaccines in 50 different languages,” Nayak says. “We have a lot of language tasks that need to be solved, whether it’s classification, ranking, information extraction, and a whole host of others. In the short term, we expect to use MUM to improve each of those. Not that it will lead into a new feature or a new experience, rather, existing features and existing experiences will just work that much better.”

Meeting MUM at Google I/O

We first heard about MUM back at the Google I/O developer conference in the spring, when Prabhakar Raghavan, senior vice president at Google, unveiled it.

The new tech is the natural evolution of the machine learning-based search that Google has been refining and modifying over the last decade. Google boasts that MUM is able to acquire deep knowledge of the world, understand language and generate it, and train across 75 languages at once. There are also internal pilots testing whether it can be multimodal, that is, able to simultaneously understand different forms of information like text, images, and video.

All this complexity can be illustrated by a simple example laid out at the conference and via a blog post. Suppose you ask Google, “I’ve hiked Mt. Adams and now want to hike Mt. Fuji next fall, what should I do differently to prepare?” This is the type of search query most people wouldn’t bother typing in today, because users understand that that’s generally not how you search for information online.

“This is a question you would casually ask a friend, but search engines today can’t answer it directly because it’s so conversational and nuanced,” Raghavan explained at I/O. But ideally, MUM would understand you’re looking to compare two mountains, and also understand that “prepare” could include things like fitness training for the terrain and hiking gear for fall weather. It would be able to dissect your question and break it down into a set of queries, learn about each aspect of your problem, then put it back together. Users can click to learn more about search results related to each aspect of the question, and also get an overarching text that explains how the original query was answered.

Experiences like these are the long-term goal of MUM’s engineers, and the time it will take to reach that goal is not yet clear. Working backwards, in the medium term, engineers at Google are training MUM to recognize the relationship between words and images, and it’s going well. Nayak says that when they asked MUM to generate an image for a new piece of text they fed it, like Siberian Husky, it did “quite a remarkable job.”

A brief history of search

Since its inception in 1998, Google has been continuously mapping the web, gathering the overwhelming slew of content out there and creating an index to organize all the information.

You can think of the Google search index as working like the index at the back of a book. It tells you all the pages that a specific word occurs on. With the internet, though, there are two important differences. One is that a book might have 300 to maybe 1,000 pages, which is modest compared to the web’s trillions of pages. The second is that with an index at the back of a book, you look up one word at a time, whereas on the web, you look up combinations of words. “We get billions of queries every day from across the world because of this scale and because of this combinatorial explosion,” Nayak says. “And the remarkable fact here is that 15 percent of the searches that we get every day are ones that we have never seen before. There’s an incredible amount of novelty in the query stream.”
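To make the book-index analogy concrete, here is a minimal sketch in Python of a word-to-pages index that also handles combinations of words. It is a toy illustration, not Google's implementation: the three sample pages and the function names build_index and search are invented for this example.

```python
from collections import defaultdict

# Toy "web": page id -> text. These pages are invented for illustration.
PAGES = {
    1: "hiking mount fuji in the fall",
    2: "fall fashion trends",
    3: "mount adams hiking trail guide",
}

def build_index(pages):
    """Map each word to the set of page ids it occurs on, like a book index."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages for the first word, then keep only the pages
    # that also contain each remaining word.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(sorted(search(index, "hiking mount")))  # [1, 3]
print(sorted(search(index, "fall")))          # [1, 2]
```

Looking up a single word is one dictionary lookup, just like flipping to the back of a book. A multi-word query intersects the page sets, which is where the combinations start to multiply.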

Part of the novelty is attributed to new ways of misspelling words, adds Nayak, and part of it is because the world is constantly changing, and there are new (and sometimes very specific) things that people ask for.

To pare all the possible web information down to the pages that are really relevant to your query, Google uses an algorithm that ranks what it thinks are the most useful pages at the top, using factors like freshness and location, and also how different pages link to one another. “By far, the most important class of factors has to do with language understanding,” says Nayak. “Language understanding is really at the heart of search, because you need to understand what the query means, you need to understand what documents mean, and how those two match each other.”

Of course, software can’t truly understand language the way we do, with all its subtleties and nuances. But programmers can develop various strategies that try to approximate how we understand language. Just over 16 years ago, Google built the first version of its synonym system, which accounted for the fact that different words can mean the same thing in certain contexts. So “change” can mean “adjust” when you’re talking about laptop brightness. Without understanding this, many relevant pages would’ve been excluded from search results due to variations in word choice.
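One way to picture a context-sensitive synonym system is as a lookup that only swaps a word when a particular context word also appears in the query. The sketch below is a simplified assumption for illustration. The SYNONYMS table and the expand_query function are hypothetical, and Google's actual system is not described in this detail.

```python
# Hypothetical table: (word, required context word) -> acceptable synonyms.
SYNONYMS = {
    ("change", "brightness"): ["adjust"],
}

def expand_query(query):
    """Return the original query plus variants with in-context synonyms."""
    words = query.lower().split()
    variants = [" ".join(words)]
    for i, word in enumerate(words):
        for (term, context), alternatives in SYNONYMS.items():
            # Only substitute when the context word is also in the query.
            if word == term and context in words:
                for alt in alternatives:
                    variants.append(" ".join(words[:i] + [alt] + words[i + 1:]))
    return variants

print(expand_query("change laptop brightness"))
# ['change laptop brightness', 'adjust laptop brightness']
print(expand_query("change for a dollar"))
# ['change for a dollar']
```

The second query is left alone because “change” there has nothing to do with adjusting a setting, so pages about adjusting things would not be relevant.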

Then, about a decade ago, the company created the knowledge graph. The idea behind it was that words, in queries or in documents, aren’t just strings of characters, but can refer to people, places, or things in the world. “If you don’t understand the reference of what a particular string of characters mean, then you haven’t fully understood what that word means,” Nayak explains. Entities such as people, places, things, and companies were put into a database, and the knowledge graph links the relationships between them. It also compiles a quick summary of the need-to-know facts on an entity like a celebrity or a landmark.

For example, if you search for “Marie Curie,” Google’s knowledge graph can tell you when and where she was born, who she was married to, who her children were, where she went to university, and what she was known for. It’s a way of conveniently showcasing information outside of just the list of page results that Google displays after a search.
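A common way to represent this kind of data is as a list of (subject, relation, object) triples. The sketch below uses that general technique with a handful of well-known facts about Marie Curie. The relation names and the facts_about function are made up for this example and are not taken from Google's knowledge graph.

```python
# Each fact is a (subject, relation, object) triple linking two entities
# or an entity and a value. Relation names here are illustrative only.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "birth_year", "1867"),
    ("Marie Curie", "spouse", "Pierre Curie"),
    ("Marie Curie", "child", "Irène Joliot-Curie"),
    ("Marie Curie", "child", "Ève Curie"),
    ("Marie Curie", "educated_at", "University of Paris"),
    ("Marie Curie", "known_for", "research on radioactivity"),
]

def facts_about(entity, triples):
    """Group every fact about an entity by relation, like a summary panel."""
    summary = {}
    for subject, relation, obj in triples:
        if subject == entity:
            summary.setdefault(relation, []).append(obj)
    return summary

for relation, values in facts_about("Marie Curie", TRIPLES).items():
    print(f"{relation}: {', '.join(values)}")
```

Because each object can itself be an entity with its own triples, following the links from one entity to the next is what turns a flat database into a graph.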

Machine learning heats up

About six years ago, Google launched its first version of machine learning-based search. It then continued to improve on that system, drawing on mounting research in the deep learning community around natural language algorithms that can look at the context in which a word is used to understand its meaning and figure out which parts of the context to pay attention to. In 2019, Google introduced the BERT architecture for search. Its training algorithm was effectively an array of “fill in the blanks” exercises. You would take a common phrase, block out random words, and ask the network to predict what those words are. This approach is called masked language modeling.
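The “fill in the blanks” setup can be sketched in a few lines of Python. This only shows how a training example might be built, not how a network learns from it. The function name make_masked_example, the [MASK] placeholder string, and the mask_rate values are assumptions chosen for illustration.

```python
import random

MASK = "[MASK]"

def make_masked_example(sentence, mask_rate=0.15, rng=None):
    """Hide a random subset of words and return (masked words, answers)."""
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return [], {}
    # Always hide at least one word so the exercise is never trivial.
    n_masked = max(1, round(len(words) * mask_rate))
    positions = sorted(rng.sample(range(len(words)), n_masked))
    masked = list(words)
    answers = {}
    for pos in positions:
        answers[pos] = words[pos]  # what the model is asked to predict
        masked[pos] = MASK         # what the model actually sees
    return masked, answers

masked, answers = make_masked_example(
    "can you get medicine for someone at the pharmacy",
    mask_rate=0.2,
    rng=random.Random(0),
)
print(" ".join(masked))
print(answers)
```

During training, a model would see the masked sentence and be scored on how well it recovers the hidden words, which forces it to use the surrounding context.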

For a query like, “can you get medicine for someone at the pharmacy,” previously, a searcher would get a result about picking up prescriptions at the pharmacy. BERT understood that it’s not only picking up a prescription, but it’s picking up a prescription for someone else, like a friend or a family member. “We were able to surface a more relevant result because it picked up some subtlety in the question that previously we were not able to handle,” Nayak says.

Moving forward, MUM is able not only to understand language like BERT, but also to generate it. MUM is much larger than BERT and has more capabilities (Google says that it’s about 1,000 times more powerful). MUM is trained on a high-quality subset of the public web corpus across all the different languages that Google serves. The search team removes low-quality content, adult content, explicit content, and hate speech, so the kind of language MUM learns is, in a sense, good (hopefully). Because it is trained on all the languages simultaneously, it’s able to generalize from languages with oodles of data to languages with less, filling in the gaps where little training data is available.

But Nayak acknowledges that there are definitely challenges with large language models like MUM that the team is actively working to resolve. “One, for example, is the question of bias. Because this is trained off the web corpus, there’s this concern of whether it reflects or reinforces biases that are there in the web,” Nayak says. The fact that it’s trained on a high-quality subset of the corpus, Nayak hopes, will eliminate some of the most egregious biases. Google continues to use search quality raters and other evaluation processes to check over its results and look for patterns of problems. “It doesn’t solve all problems, but it is a significant mitigation.”

MUM is building on an assembly of innovative features that Google has been experimenting with to make search better. “Today, when people come to search, it’s not like they come with fully formed queries in their heads. They come to search with some broad intent about something happening in their lives,” Nayak says. “You have to take this fuzzy need that you have, convert it into one or more queries that you can issue to Google, learn about different aspects of the problem and put it together.”

Features like autocomplete have, to an extent, tried to help make the search process easier, but MUM could open up a new set of possibilities. “The real question I think with all search tools,” Nayak says, “because they are tools, is: Even if it’s not perfect, is it useful?”