By Dave Gershgorn
In November 2007, Google laid the groundwork to dominate the mobile market by releasing Android, an open source operating system for phones. Eight years later to the month, Android has an 80 percent market share, and Google is using the same trick, this time with artificial intelligence.
Today Google is announcing TensorFlow, its open source platform for machine learning, giving anyone with a computer and internet connection (and a casual background in deep learning algorithms) access to one of the most powerful machine learning platforms ever created. More than 50 Google products have adopted TensorFlow to harness deep learning (machine learning using deep neural networks) as a tool, from identifying you and your friends in the Photos app to refining its core search engine. Google has become a machine learning company. Now they’re taking what makes their services special, and giving it to the world.
Introducing TensorFlow, the Android of AI
TensorFlow is a library of files that allows researchers and computer scientists to build systems that break down data, like photos or voice recordings, and have the computer make future decisions based on that information. This is the basis of machine learning: computers understanding data, and then using it to make decisions. When scaled to be very complex, machine learning is a stab at making computers smarter. That's the broader, and more ill-defined field of artificial intelligence. TensorFlow is extraordinarily complex, because of its precision and speed in digesting and outputting data, and can unequivocally be placed in the realm of artificial intelligence tools.
Here are the nitty-gritty details: the TensorFlow system uses data flow graphs. In this system, data with multiple dimensions (values) are passed along from mathematical computation to mathematical computation. Those complex bits of data are called tensors. The math-y bits are called nodes, and the way the data changes from node to node tells the overall system relationships in the data. These tensors flow through the graph of nodes, and that's where the name TensorFlow comes from.
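To make the idea concrete, here is a minimal sketch of a data flow graph in plain Python. It is not TensorFlow's API: the names `Node`, `constant`, and `run` are hypothetical, and the "tensors" are just nested lists. It only illustrates how data flows from one computation node to the next.

```python
# Toy data flow graph in plain Python. This is NOT TensorFlow's API;
# Node, constant, and run are hypothetical names used for illustration.

class Node:
    """A computation step: applies `op` to the outputs of its input nodes."""
    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = list(inputs)

def constant(value):
    """A source node that feeds a fixed tensor into the graph."""
    return Node(lambda: value)

def run(node):
    """Evaluate a node by first evaluating everything that flows into it."""
    args = [run(parent) for parent in node.inputs]
    return node.op(*args)

def matmul(a, b):
    """Multiply two 2-D tensors stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two 2-D tensors with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Build the graph: result = (inputs x weights) + bias
inputs = constant([[1.0, 2.0]])      # 1x2 tensor
weights = constant([[3.0], [4.0]])   # 2x1 tensor
bias = constant([[0.5]])             # 1x1 tensor
product = Node(matmul, [inputs, weights])
result = Node(add, [product, bias])

print(run(result))  # [[11.5]]
```

Nothing is computed while the graph is being built. The math only happens when `run` is called on the final node, which pulls the tensors through every node upstream of it.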
Open-sourcing TensorFlow allows researchers and even grad students the opportunity to work with professionally-built software, sure, but the real effect is the potential to inform every machine learning company’s research across the board. Now organizations of all sizes—from small startups to huge companies on par with Google—can take the TensorFlow system, adapt it to their own needs, and use it to compete directly against Google itself. More than anything, the release gives the world’s largest internet company authority in artificial intelligence.
Stanford computer science professor Christopher Manning was given TensorFlow a little more than three months ago, and his students had the opportunity to tinker with the system. After just a few weeks of using it himself, Manning decided that he’s going to implement it into his curriculum.
Besides Android, he also likens the platform to Gmail, Google’s ubiquitous email application. There are competitors, but Gmail is cleaner and makes more sense in most applications.
“It’s not that before this there weren’t any high level libraries available for deep learning,” Manning says. “But in general these other libraries are things by three academics and a grad student.”
While the others, most notably Torch and Theano, do have small groups updating them, it’s nothing like the full force of the developers working on Google’s machine learning infrastructure. Manning says that while TensorFlow is a huge gift to the community (one capable of reducing time spent optimizing the neural networks by 100 times), Google might indirectly benefit from open-sourcing its tools.
"A very small amount of companies have been trying to hire up a very large percentage of the talented people in artificial intelligence in general, and deep learning in particular,” Manning says. “Google is not a charity, I’m sure it’s also occurred to them that by ceding this, we will have a lot of Ph.D students who will be in universities and already liking Google deep learning tools.”
Jeff Dean, one of Google’s top engineers and one of the two people who could be listed as an author for TensorFlow (the other is Rajat Monga), is cautious about estimating the adoption in the community. He says that while it’s something Google has found immensely useful in their own work, the real test is whether the community will find it as capable. The idea is to provide a tool so the whole community can move beyond ideas to actual implementations more rapidly.
“We’re hoping, basically, to accelerate machine learning research and deployment,” Dean says. And while this is a big gift to the community, the ideal scenario is that the community gives back, and shares what they’ve made with other researchers (and Google). “The machine learning community has been really good at polishing ideas, and that’s a really good thing, but it’s not the same thing as polishing working code associated with research ideas,” Dean says.
He also mentions that TensorFlow will help Google interns when they return to their schools, because they can now access the once-proprietary systems on projects they might not have finished during their time at the company.
The TensorFlow system is a pretty complete package for an individual researcher. The system is a complete, standalone library associated with tools and an Apache 2.0 license, so it can be used in commercial settings. It can be compiled on desktops or laptops, or deployed on mobile (Android first, naturally, and then iOS to come later). It also comes with tutorials and documentation on how to modify and play with the platform.
Manning suggests that the ability to run deep learning algorithms on mobile devices will be an important factor that separates TensorFlow from other open-source systems.
For those who want to use the system as-is, Google is providing a version that researchers can start using right now (as pre-built binaries). There’s also an application programming interface (API), for software developers to train and control their TensorFlow models. And this isn’t a knockoff—it’s the literal system used in the Google app, and more than 50 other products.
Inside Google's Artificial Intelligence Lab
Google is opening this platform to the world, which gives us an equal opportunity to peek in and see how the company thinks about developing machine learning systems.
Internally, Google has spent the last three years building a massive platform for artificial intelligence and now they’re unleashing it on the world. Google, though, would prefer you call it machine intelligence. They feel that the term artificial intelligence carries too many connotations, and fundamentally, they’re trying to create genuine intelligence, just in machines.
It’s the model that they’ve used within the company for years: where any engineer who wants to play with an artificial neural network can fork it off the system and tinker. That’s the kind of open structure that allows 100 teams within a company to build powerful machine learning techniques.
"Machine learning is a core, transformative way by which we’re re-thinking how we’re doing everything," Google CEO Sundar Pichai said on the company’s earnings call in October 2015. "We are thoughtfully applying it across all our products, be it search, ads, YouTube, or Play. And we're in early days, but you will see us — in a systematic way — apply machine learning in all these areas."
Welcome to Google, where everything is AI and AI is everything
It’s difficult to lay out a concrete diagram of machine intelligence research at Google, because it’s always changing, and saturates nearly every team in the company.
Google’s VP of engineering, John Giannandrea, calls this an “embedded model.” I met him at one of the many sleek modern buildings at Google’s headquarters in sunny Mountain View, California, in the fall of 2015.
I was on a floor technically not open to the public, and when I was left unattended for a moment, an engineer came up to me, noticing I wasn’t wearing an employee badge. He asked who I was, and saying I was a writer didn’t smooth the situation over. Google prides itself on making its research open to the public, but work in the labs is kept under heavy wraps.
For me, Google’s embedded model meant a lot of walking. The Googleplex contains 3.5 million square feet of office space over about seven acres of land. Google staff ride bikes between buildings, which are surrounded by well-groomed parks where Googlers sit with laptops, undoubtedly grappling with complex computer science conundrums or playing Minecraft during their lunch break. Different teams work in different buildings, and embedded machine intelligence researchers switch buildings when they switch teams.
Inside, most of what I saw looks like a normal office building. There are cubicles, computers with loads of monitors, and people discussing work in hushed tones while glancing nervously towards the journalist. There are holes cut in the wall to catch a quick nap—you know, office things.
Organizationally, there’s a pool of researchers always working on general machine intelligence problems, and that work feeds back into Google's core products, such as the Photos app, Voice Search, and Search itself. There are some projects that start as just something Google wants to get better at. Giannandrea suggests handwriting as an example.
“We, as a company, want to understand how people would write a word. So that’s something we would invest in forever, even if we didn’t have a product,” he says.
But because Google is so vast in its offerings, there’s usually a tool that can use each research element. (Handwriting ended up in Google Keep, the note-taking software.)
When that use is figured out, the researcher hops onto the product team to help with implementation. Product teams develop specific applications that we all use, like the Photos app or Google Translate.
In general research, the teams are divided by their area of interest. There’s a team focused on teaching computers to see, a team working to understand language, a team looking at better voice recognition, and so on.
“There’s no world in which Google doesn’t want to have better speech recognition, language translation, language understanding— so these frontiers of research in computer science are things we invest in all the time,” Giannandrea says.
There are more than 1000 researchers at Google working on these machine intelligence applications, constantly rotating between applied and theoretical research. Some of these researchers work on simpler problems that wouldn't be considered artificial intelligence, in the strictest sense of the word, but are more statistical methods of prediction.
Google’s new parent company, Alphabet, doesn’t make a big impact on the way Google’s machine intelligence research will continue, according to Google spokesman Jason Freidenfelds. While the research team will stay within Google proper, there won’t be any barriers to working with Life Sciences or Google [x] on machine learning applications.
The Voice of the future
A rising star in Google’s catalog of tools is Voice Search. You’ve probably run into it before even if you didn’t know exactly what it was: it’s the little microphone icon in the main Google search bar, which when pressed, lets you speak your search query to Google instead of typing it in. That same little microphone appears in Google’s Search app for iPhone and Android, and can be found within the Android search bar itself on many smartphones.
Although superficially thought of as a rival to Siri, Google Voice search has actually become a secondary gateway to Google’s vast knowledge base, and to the language recognition team’s delight, it’s finally getting more popular.
While Google doesn’t release the percentage of voice searches in relation to text, it does provide a veritable rabbit hole of statistics: mobile search is now more popular than desktop, mobile voice search has doubled in the last year, about 50 percent of American phone and tablet users know they can ask Google questions, and a third of them actually do it.
That’s a long sentence saying that while Google won’t say how many voice searches are made, Google’s press team assures me it’s a lot.
Besides a few hundred iterations of the algorithm per year, Search has worked pretty much the same for years. But getting people confident enough to speak with their devices has been a struggle.
Senior researcher Françoise Beaufays works on developing the voice recognition engine behind Voice Search, and says that increased adoption is because the feature just works better now.
“When we started doing speech recognition, users weren’t fully confident. They were using it, but you could tell there was hesitation, the technology wasn’t as good as it is now,” Beaufays says. “Fast forward to nowadays, people are comfortable doing anything possible by voice in their office.”
Beaufays speaks quickly with a French accent, and is trilingual, on top of her fluency in neural network architecture. She led the Speech team that just ripped out the service’s old engine for recognizing sounds, and replaced it with a new, more advanced system that uses a new brand of recurrent neural networks.
For a machine to understand speech, it needs to first learn what words and phrases sound like. That means audio files, and a lot of them. These files are processed by the algorithm, which creates a huge graph of which sounds correlate to which sounds, words, and phrases. When an audio clip is presented to the computer, it analyzes the clip by pushing the audio waveform through the graph, in an attempt to find a path that best explains the audio.
“That path in the end will say, ‘We went through this sequence of sounds, and that maps to this sequence of words, and that makes this sentence,’” Beaufays says.
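The article doesn't say which search method Google's engine uses to find that path. One classic textbook way to find the most likely sequence of sounds is the Viterbi algorithm, sketched below on a toy model. The two "sounds," the frame labels, and every probability here are made up for illustration.

```python
def best_path(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence (Viterbi)."""
    # best[s] holds the probability and path of the best route ending in state s.
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        step = {}
        for s in states:
            # Pick the predecessor that makes arriving in state s most likely.
            prob, path = max(
                ((best[prev][0] * trans_p[prev][s], best[prev][1]) for prev in states),
                key=lambda candidate: candidate[0],
            )
            step[s] = (prob * emit_p[s][obs], path + [s])
        best = step
    return max(best.values(), key=lambda candidate: candidate[0])

# Hypothetical toy model: two sounds and two kinds of audio frame.
states = ["h", "i"]
start_p = {"h": 0.8, "i": 0.2}
trans_p = {"h": {"h": 0.3, "i": 0.7}, "i": {"h": 0.1, "i": 0.9}}
emit_p = {
    "h": {"breathy": 0.9, "voiced": 0.1},
    "i": {"breathy": 0.2, "voiced": 0.8},
}

frames = ["breathy", "voiced", "voiced"]
probability, path = best_path(frames, states, start_p, trans_p, emit_p)
print(path)  # ['h', 'i', 'i']
```

A real recognizer works over a vastly larger graph and maps sound sequences on to words and sentences, but the principle of keeping only the best-scoring route to each point is the same.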
But all this relies on those initial audio files, which are called training data. This training data is actually made of millions of real voice searches by Google users. Whenever you make a voice search, the audio is uploaded to Google servers, and if you’ve opted into letting Google use it, can be integrated into the bank of clips used to train the machine.
But before it’s used, the data goes through a few steps. First (and most importantly to you), it’s scrubbed of all your information. That means timestamps, location data, your user profile, everything. The raw waveform is then sent to a human transcriber, because the algorithm needs reliable text to associate with the clip. Every clip needs this metadata, and a “bad” clip is really just one that isn’t properly transcribed. There are even instances where researchers add in artificial noise, in order for the machine to understand what different words sound like in different situations.
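Adding artificial noise is a common form of what researchers call data augmentation. The sketch below shows the general idea on a made-up waveform, using Gaussian noise from Python's standard library. The function name `add_noise` and the noise level are assumptions for illustration, not details of Google's pipeline.

```python
import random

def add_noise(waveform, noise_level, seed=None):
    """Return a copy of `waveform` with Gaussian noise mixed into every sample."""
    rng = random.Random(seed)  # a seeded generator keeps the result repeatable
    return [sample + rng.gauss(0.0, noise_level) for sample in waveform]

# A hypothetical clean clip: a handful of audio samples between -1 and 1.
clean = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

# Several noisy variants of the same clip, all sharing the same transcript.
noisy_versions = [add_noise(clean, noise_level=0.05, seed=s) for s in range(3)]

for version in noisy_versions:
    print([round(sample, 2) for sample in version])
```

Each variant keeps the original transcription, so one human-labeled clip can stand in for several listening conditions.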
Beaufays stresses that this program is opt-in. This is important, given the privacy concerns—which are rational—that regularly bubble up as Google continues amassing more information about the world and our lives. But if you don’t want Google to use your voice, you don’t have to let it. Also, there are ways of deleting your searches after the fact.
But these techniques have made voice search more effective. According to Google, two years ago the error rate was 25 percent, which means one of every four searches was wrong. Now, that number is down to 8 percent.
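For a sense of scale, the arithmetic behind those two figures can be worked through in a few lines. The only inputs are the 25 percent and 8 percent error rates quoted above. The helper name `relative_reduction` is made up for this example.

```python
def relative_reduction(old_rate, new_rate):
    """Fraction of the old errors that have been eliminated."""
    return (old_rate - new_rate) / old_rate

old_rate, new_rate = 0.25, 0.08

print(round(1 / old_rate))     # 4: about one wrong search in every 4
print(round(1 / new_rate, 1))  # 12.5: about one wrong search in every 12 or 13
print(round(relative_reduction(old_rate, new_rate) * 100))  # 68 (percent fewer errors)
```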
But what happens when Google can’t train on your data?
The intelligent Inbox
Last week, Google announced that it’s beginning to use machine learning in your email (if you use the Inbox app, which is separate from Gmail), and yes, it’s built on TensorFlow, according to Alex Gawley, product director for Gmail.
“We started to see some of the power of the neural nets our research team was building,” Gawley says. “That it might just be possible for us to help with more than just understanding and organizing. It might just be possible for us to help with things like writing mail.”
The feature is called Smart Reply, and basically one recurrent neural network reads your email and hands it off to a second, which generates three potential responses. You choose, and the email is sent. But email is just as sensitive as photos, if not more in some cases.
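The two-stage shape of that pipeline, one component that reads and a second that proposes three replies, can be sketched without any neural networks at all. The toy below swaps in simple word overlap for both stages. The function names, the canned replies, and their trigger words are all invented for illustration and say nothing about how Smart Reply actually scores responses.

```python
def read_email(text):
    """Stage one: reduce the email to a set of lowercase words."""
    return {word.strip(".,!?").lower() for word in text.split()}

def suggest_replies(email_words, candidates, count=3):
    """Stage two: rank canned replies by trigger-word overlap, keep the top `count`."""
    def score(item):
        _reply, triggers = item
        return len(email_words & triggers)
    ranked = sorted(candidates.items(), key=score, reverse=True)
    return [reply for reply, _triggers in ranked[:count]]

# Hypothetical canned replies, each with the words that should trigger it.
CANDIDATES = {
    "Sounds good, see you then.": {"meet", "lunch", "tomorrow"},
    "Sorry, I can't make it.": {"meet", "tomorrow"},
    "What time works for you?": {"meet", "lunch", "time"},
    "Thanks for the update.": {"update", "report"},
}

email = "Want to meet for lunch tomorrow?"
replies = suggest_replies(read_email(email), CANDIDATES)
for reply in replies:
    print(reply)
```

In the real feature both stages are learned from data rather than hand-written, which is why it can propose replies nobody scripted in advance.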
No person at Google reads your emails, which is important to keep in mind. However, data on which choice you made does get sent back to inform the global model. That’s how it learns. From that, researchers can ask the machine to answer certain questions, and from there understand what might need to be fixed in the neural networks. The software is the same for everybody, too, which is something
Smart Reply also gives us a peek into how machine learning products are built within Google. The Inbox team deployed this feature internally, to test and feed the machine some ideas of what was right and wrong, a process called dogfooding. (The phrase comes from the idea of eating your own dog food, and is an example of why tech is bizarre.)
The whole team uses it, and documents bugs, and gives it more and more information to feed from. When the app behaves correctly in the controlled environment, and can be scaled, it’s released.
Internal testing gives researchers a chance to predict potential bugs when the neural nets are exposed to mass quantities of data. For instance, at first Smart Reply wanted to tell everyone “I love you.” But that was just because on personal emails, “I love you” was a very common phrase, so the machine thought it was important.
All this is an attempt to make your work easier. That’s what most of the company’s products are aiming to do, especially Google Now, the personal assistant of the Google world. The team’s catch phrase is “just the right information at the right time.” Aparna Chennapragada, the head of Google Now, says that machine intelligence needs to be thoughtfully considered when being built into the platform, in order to complement the human brain.
“You want to pick problems that are hard for humans, and easy for machines, not the other way around,” Chennapragada says. “It’s about making the technology do the heavy lifting for you, rather than doing it yourself.”
At the moment, the product is really just exploring how to use these methods to make your life easier. Chennapragada likens it to where research sat with voice recognition 5 years ago—it was okay, but didn’t work every single time.
They’re looking now at how to leverage three different kinds of data to serve you with tidbits of information. They see the phone as a “partial attention device” and an ideal service shouldn’t overload you with information.
“If you look at how each of us uses the phone, it’s between things that you’re doing in your life. It’s bite sized piece of information that you’re looking for,” Chennapragada says. “One of the things we think about is how we can work on your behalf, proactively, all the time.”
That’s the end goal in a smartphone with machine intelligence: the true digital personal assistant, ultimately predictive and vastly knowledgeable—the part of your brain you’re not born with.
So to get there, your phone needs data about you: your schedule, what you search for, what music you listen to, and where you go. This is the easiest kind of information to get, because it’s already on the device.
But when you combine that personal information with knowledge about the world, through Google’s KnowledgeGraph (more on this later), and data being sourced from other users, the world is brought to your fingertips. You might not know how to navigate an airport, but your phone does.
Another example of the way Google uses data from lots of people is gauging road traffic. By pulling anonymous location data from phones on the highway, Google can tell that cars are moving slower than usual. The same goes for being able to tell when a restaurant or coffee shop is busy.
Google Now represents the way Google approaches machine intelligence. They’re aware that a general intelligence model that can translate and tell you what’s in a picture is years and years away, so in the meantime, they’re creating a mosaic of tools that act in harmony to provide the best experience possible.
Organizing the world’s information
Okay, so I mentioned that Google Now works with KnowledgeGraph. What’s that?
John Giannandrea, the head of Google’s research from earlier, was brought into Google in 2010. He founded a company called Metaweb, which related text and objects on the internet. It was a logical parallel to search—not only finding things, but finding similar bits and pieces of information. He had worked on this issue even before that, when he was the CTO of Netscape. (Remember Netscape?)
But this all manifested in KnowledgeGraph, which debuted in 2012 as the bits of information and text that automatically pop up when you search for facts. If you search “When was Popular Science founded?” Google will supply the answer (which is 1872).
This is Google’s way of not only cataloging the internet, but making it more accessible and useful to its users. It was also the first leak of artificial intelligence into the main product, search. Since then, Google has handed 15 percent of its daily search traffic to an artificial intelligence model called RankBrain. This system is the common sense of search: it’s meant to catch the queries that traditional algorithms can’t figure out.
Beyond integration into its core search algorithms, and the expansion into products, Google also has a few moonshots in the works. For that, they rely on Geoff Hinton.
Hinton is one of the foremost thinkers in artificial intelligence. He’s often listed in the same sentence as other high-level researchers like Yann LeCun at Facebook, Baidu’s Andrew Ng, and Yoshua Bengio. (In fact, LeCun, Hinton, and Bengio wrote a review in Nature this May on deep learning, which reads like the literal textbook on AI.)
Speaking with Hinton is like talking to someone who lives five years in the future. Our conversation centered around turning documents into thought vectors, so that machines could understand and remember lengthy versions, and reverse engineering the algorithm our brain uses to learn.
Many computer programs today, for example, brute force the problem of analyzing what a text document means by looking up the dictionary definitions of words in the document, and the grammar. But in order to understand the document like a human, a computer would ideally be able to break the document down into a series of distinct thoughts.
“Google would love to be able to take a document, and figure out what the reasoning is, what the document is saying, and how one thought follows from the previous thoughts,” Hinton says. “If we could start doing that, then it could give you much better answers to queries, because it actually reads the documents and understands them.”
When asked why we aren’t doing this already, Hinton says if we’re trying to match comprehension to the brain, it’s a matter of scale. The artificial neural networks researchers use now just don’t have the complexity of our brain, even when scaled to their current limits. The best ones we have might have hundreds of millions of weights that can be manipulated. (LeCun uses the analogy of a black box with a million knobs on the outside to explain fiddling with weights.) But Hinton explains that our brains have 100 trillion: that’s 100,000 times more information.
In the face of being dwarfed by scale, Hinton is still optimistic that this streak of artificial intelligence research won’t fizzle out as it has in the past. (Artificial intelligence research has seen “winters,” where progress hasn’t matched expectations and investment has faded.) A large factor of this is the increasingly popular idea of thought vectors, as mentioned earlier. But the most comforting thing to Hinton is the progress in the last five years, especially object recognition and speech. These problems were often seen as too complex in the past, and now the error rate has drastically decreased on standardized tests.
“They’re getting close to human-level performance. Not in all aspects, but things like object recognition. A few years ago, computer vision people would have told you ‘no, you’re not going to get to that level in many years.’ So that’s a reason for being optimistic,” Hinton says.
But no matter how well a machine may complement or emulate the human brain, it doesn’t mean anything if the average person can’t figure out how to use it. That’s Google’s plan to dominate artificial intelligence: making it as simple as possible. While the machinations behind the curtains are complex and dynamic, the end results are ubiquitous tools that work, and the means to improve those tools if you’re so inclined.
“There’s a thin line between magic and mystery,” Google Now’s Chennapragada says. “And we want to be on the right side of it.”