Here’s How Mark Zuckerberg Made His Own AI Assistant

When new engineers join Facebook–no matter whether they’re just out of college or VP-level veterans–they spend their first six weeks in Bootcamp, an intensive program designed to help them learn the ins and outs of the company’s massive code base and the always-evolving set of programming tools at their disposal.

Mark Zuckerberg, Facebook’s original engineer, contributed more to that code than anyone else in the early years of its existence. But the 32-year-old CEO never went through the Bootcamp program, which was launched in 2006, two years after he founded the company in his Harvard dorm room.

Last January, Zuckerberg announced that he planned to build an AI system to run his home using Facebook tools, in the latest of the personal-growth challenges he gives himself each year. An exciting exploration of the state of the art of AI–a technology field essential to Facebook’s future–the project also forced him to refresh his command of the company’s programming tools and processes. That in turn has reconnected him to the daily experience of the thousands of engineers he manages and the engineering culture that’s at the heart of one of the world’s most important technology companies.

But being CEO of Facebook is not the kind of job you can abandon for six weeks in the interest of continuing education. “I didn’t go through a formal Bootcamp,” Zuckerberg told me last week in the spacious living room of his classic 113-year-old wood-frame Palo Alto, California, home, where I had come for a Jarvis demo and the first interview he has given about this year’s personal-challenge project. “But when I ask people questions, you can imagine that they respond pretty quickly.”

Mark Zuckerberg turns the lights off with Jarvis, his personal AI system.

Zuckerberg has always enjoyed what he calls the “deterministic” nature of engineering–the element of being able to sit down and build something that does exactly what you want it to do. For all the wildly ambitious things he can accomplish as the head of a company of more than 15,000 people that has billions of users across Messenger, WhatsApp, Instagram, and Facebook itself, he missed that pleasurable certainty.

That’s why he has continued to work on small programming projects in his rare spare time, and why his personal challenge back in 2012 was to code every day. He has participated in several company hackathons over the years and, as an exercise, once wrote a system that paired Facebook’s org chart and the internal social graph to see which groups within the company were most socially connected.

Often, Zuckerberg told me, he emerges from a coding session feeling much like he does when he studies Mandarin, the language he learned as his 2010 challenge. He feels like his brain is activated, on fire.

Facebook’s engineering culture, though, mandates that if your work breaks, you have to stop what you’re doing and fix it. That’s just not practical for the hyper-busy, globetrotting CEO. “I’m either going to get pulled out of meetings, or someone is going to have to fix my code, which is kind of a big no-no,” he says. So it’s been quite some time since he actually checked in any code at work.

Over the last year, though, Zuckerberg has spent between 100 and 150 hours on his home project. Though it’s named for Tony Stark’s futuristic Jarvis AI in the Iron Man movies, it’s more akin to a homemade, highly personal version of something like Amazon’s Alexa service, letting him and his wife Priscilla Chan use a custom iPhone app or a Facebook Messenger bot to turn lights on and off, play music based on personal tastes, open the front gate for friends, make toast, and even wake up their one-year-old daughter Max with Mandarin lessons.

Morgan Freeman is the voice of Jarvis.

AN OFF-HOURS EXPERIMENT

When you visit Mark Zuckerberg’s house, set well back on a 17,000-square-foot lot on a quiet, leafy street in a very posh Silicon Valley neighborhood, Jarvis recognizes you and automatically alerts him that you’ve arrived. But one of the weirdest things is that–once you’ve made your way through a wooden gate and along a walkway of citrus and maple–Zuck himself comes out to greet you.

That wouldn’t be so odd except that he looks exactly the same in person, down to the short brown hair and gray T-shirt and jeans, as he does in countless photos and videos. It takes a moment to be certain it’s not an avatar standing at his door welcoming you.

At work, the last few weeks have been especially eventful for Zuckerberg, who has been grappling with three separate and substantial controversies: questions about whether Facebook was a prime driver of fake news in the lead-up to the presidential election; scrutiny of his communications with venture capitalist (and Facebook board member) Marc Andreessen as the board considered Zuckerberg’s request to maintain voting control of the company even if he sells most of his stock; and concern from advertisers over errors in how Facebook measures video viewership.

Talking about something like Jarvis is surely an easier task. Sitting on a dark green couch in his living room with Beast, his dreadlocked Hungarian Sheepdog, by his side, Zuckerberg seemed relaxed as he explained how the system he’d built over the last year has made things easier–and occasionally harder–for him, Priscilla, and Max.

In his January post announcing the Jarvis project, Zuckerberg wrote that he’d set out to build a system allowing him to control everything in the house, including music, lights, and temperature, with his voice. He also wanted Jarvis to let his friends in the house just by looking at their faces when they arrive and to alert him to anything important going on in Max’s room. And he hoped to design the system to “visualize data in VR to help me build better services and lead my organizations [at Facebook] more efficiently.”

Now, in December, he has achieved all of that, save for the bit about VR. And it works. However, when he showed off the system to me in person, I learned that it sometimes needs a little coddling.

Zuckerberg began by demoing the Messenger bot he’d built as a front end for the system. Using his iPhone, he typed simple commands to turn the lights off and on, and sure enough, they went off and then on.

On the other hand, he also built the system to respond to voice commands, via a custom iOS app he’d created, and there, the results were decidedly more inconsistent. He had to tell the system four times to turn the lights off before it got dark.

“Wow, that’s like the most fails that it’s ever had,” he said, embarrassed.

Getting the system to play music was more successful. “Play us some music,” he commanded, and a couple of seconds later, David Guetta’s “Would I Lie to You” began playing–very quietly–over the living room’s speakers. “Turn the volume up,” he said, twice, and it did. He also had to tell it twice to stop playing the music before it quit.

One of the Jarvis features that Zuckerberg is most proud of is its ability to learn his and Priscilla’s musical tastes so that when she asks it to play something, it can select a song taking her preferences into consideration rather than his. At the same time, he designed it so it would respond to requests like playing a certain style of music–light, or family, for example–or music similar to that of specific artists.

“Play something like the Red Hot Chili Peppers,” Zuckerberg told Jarvis. A couple of seconds later, it pumped Nirvana’s “Smells Like Teen Spirit” into the living room. “That’s a reasonably close analogy, would you say?” he posited.

Zuckerberg also wanted Jarvis to be able to understand a certain degree of linguistic nuance. “When you’re thinking about music, if you’re telling it to ‘play something,’” he says, “that something can be a song, it could be a set of songs, it could be an artist, it could be an album, [or] it could be a recommendation.”

Jarvis incorporates voice control for music playback.

One case he found challenging was getting Jarvis to parse very similar phrases. Adele provided a perfect example. “Saying ‘play “Someone Like You”’ means play that specific song,” he explains. “Saying ‘play someone like Adele’ means asking it to find a recommendation for an artist like Adele and play some of their good songs. Saying ‘play some Adele’ means go find some of her best songs and make a playlist.”

“And those phrases, ‘Someone like you,’ ‘someone like Adele,’ and ‘some Adele’ are very similar but mean completely different things. So having the range to be able to both do a lot of different things, not just turn the lights up and down, but be able to discern the difference through getting feedback, that was interesting to work on.”
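
To make the distinction concrete, here is a minimal Python sketch of how the three Adele phrases could be routed to different actions. It is an illustration only, not Zuckerberg’s code: he describes Jarvis learning these differences through feedback, while this sketch relies on hand-written rules. The catalog sets, the function name parse_music_command, and the intent labels are all hypothetical.

```python
import re

# Hypothetical catalog; a real system would query a music service instead.
KNOWN_SONGS = {"someone like you"}
KNOWN_ARTISTS = {"adele", "red hot chili peppers"}

# Phrase prefixes mapped to (hypothetical) intents, checked in order.
PREFIX_INTENTS = (
    ("someone like ", "recommend_similar_artist"),
    ("something like ", "recommend_similar_artist"),
    ("some ", "play_artist_playlist"),
)


def parse_music_command(text):
    """Return (intent, target) for a 'play ...' command, or None if it is not one."""
    cleaned = text.lower().strip().strip(".!?")
    match = re.match(r"play\s+(.+)", cleaned)
    if not match:
        return None
    # Drop any quotation marks wrapped around the request.
    rest = match.group(1).strip("'\"“”‘’ ")

    # An exact song title wins, so "Someone Like You" is not read as
    # "someone like <artist>".
    if rest in KNOWN_SONGS:
        return ("play_song", rest)

    for prefix, intent in PREFIX_INTENTS:
        if rest.startswith(prefix):
            target = rest[len(prefix):]
            # Tolerate a leading article, as in "the Red Hot Chili Peppers".
            if target.startswith("the "):
                target = target[len("the "):]
            if target in KNOWN_ARTISTS:
                return (intent, target)

    return ("unknown", rest)
```

Checking for an exact song title before looking at phrase prefixes is the design choice that keeps “Someone Like You” from being misread as a request for artists similar to “you.” A learned system would replace both the catalog lookup and the prefix rules with models trained on feedback.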

“A GOOD WAY TO MAKE YOUR WIFE MAD AT YOU”

Getting the right kind of music to play was one thing. Making sure Jarvis doesn’t piss off Priscilla is quite another.

Even asking the system to turn lights on or off or play music can introduce a surprising amount of ambiguity, if it’s unsure about where it’s supposed to do so. For example, Zuckerberg and his wife sometimes use different phrases for things–he says “living room,” while she calls it the “family room.” So Jarvis needed to understand synonyms. But Zuck didn’t want to just program in the different phrases; teaching Jarvis to learn them, and other contextual nuances, was a much more interesting problem.

The system lets Zuckerberg use a Messenger bot to welcome friends to his home.

“You’ll run into things like, I’ll just say ‘turn on the lights in this room,’ and then they’ll be on too bright, so Priscilla will [say] ‘make it dimmer,’” he says. “But she didn’t say what room to make it dimmer in, so it needs to know where we are, and…where we get the context wrong, and I’m like, ‘play some music,’ it’ll just start playing in Max’s room because…that’s where we were before.”

If Max happens to be napping when that happens? “That’s a huge bummer. That’s a good way to make your wife mad at you.”

Another example of the importance of location: As part of its regimen for creating an optimal TV-watching experience, Jarvis can turn the lights off. “One of the rooms that is adjacent to the [TV] room is…Priscilla’s office,” Zuckerberg says, “so we had this funny thing for a while where…we’re going to watch TV, and [Jarvis] would just turn off all the lights downstairs, and she’d be trying to work, and she’d be like, ‘MARK!’”
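
The location problems described above come down to two steps: mapping each person’s name for a room onto one canonical name, and falling back on the speaker’s last known room when a command doesn’t name one. The Python sketch below shows one way that could look. It is an assumption-laden illustration rather than Jarvis’s actual design: the alias table is written by hand here, whereas Zuckerberg wanted Jarvis to learn synonyms, and the RoomContext class, its method names, and the do-not-disturb set are hypothetical.

```python
class RoomContext:
    """Tracks room synonyms and each speaker's most recent room (hypothetical design)."""

    def __init__(self, aliases):
        # Maps an alternate room name to its canonical name.
        self.aliases = {k.lower(): v.lower() for k, v in aliases.items()}
        # Most recent room per speaker.
        self.last_room = {}
        # Rooms that should never be chosen by inference alone,
        # for example a room where a child is napping.
        self.do_not_disturb = set()

    def canonical(self, name):
        """Normalize a spoken room name to its canonical form."""
        name = name.lower().strip()
        return self.aliases.get(name, name)

    def resolve(self, speaker, room=None):
        """Return the room a command applies to, or None if the caller should ask."""
        if room is not None:
            # An explicit room always wins and updates the speaker's context.
            resolved = self.canonical(room)
            self.last_room[speaker] = resolved
            return resolved
        inferred = self.last_room.get(speaker)
        if inferred in self.do_not_disturb:
            # Refuse to guess a protected room; the caller should ask instead.
            return None
        return inferred
```

With an alias table of {"family room": "living room"}, a command from Priscilla that names the “family room” resolves to “living room,” and a follow-up “make it dimmer” with no room resolves to the same place. If the last known room is marked do-not-disturb, resolve returns None so the system can ask rather than start music beside a sleeping child.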

EASIER THAN EXPECTED. BUT…

While Zuckerberg usually picks just one annual personal challenge, he selected two in 2016–the second being to run 365 miles. That was so he wouldn’t be too sedentary while building Jarvis, as he had been while reading a book every two weeks for 2015’s challenge.

In fact, though, Jarvis took less time to build than the running did, in large part thanks to the collection of Facebook tools he was able to leverage for tasks such as image and voice recognition.

What he didn’t expect, though, was that the majority of the project would be figuring out how to connect Jarvis to the various systems in his house–a Crestron home-automation system for lights, doors, and temperature; a Samsung TV; security systems; Sonos streaming boxes and Spotify for music–that he wanted to be able to control.

Strictly speaking, Zuckerberg’s home network is part of Facebook’s corporate infrastructure. Protecting it requires that anything connected to it has a Facebook security certificate–essentially a digital authentication key that ensures a specified device is safe.

That imposed limits on what he could control. Internet-connected fridges, for example, don’t come with Facebook security certificates. That’s not a problem for most people, but most people aren’t Mark Zuckerberg. Keeping his security at home airtight was a primary concern.

One way Zuckerberg has found to securely control certain appliances is through internet-connected switches that let him at least turn the power on and off remotely. He wanted to be able to have Jarvis make breakfast toast using slices of bread he’d stuck in a toaster earlier. But no modern toaster will let you push bread down when the toaster is powered off–a safety concern. So he bought a low-tech one from the 1950s to be able to toast on command.

Ultimately, getting everything connected the way he wanted required many hours reverse-engineering the software hooks offered by the products and services he adopted–and that was before he even began programming the actual AI.

Zuckerberg gets a Messenger notification that Jarvis has opened the gate as he controls his Sonos music system.

“IT’S NOT A PRODUCTION SYSTEM THAT’S READY TO GO”

Notwithstanding Jarvis’s inability to perform perfectly in front of a journalist, Zuckerberg is proud of what he achieved with the project, and he’s willing to compare his work favorably to systems anyone can buy, like Amazon’s Echo (powered by Alexa) and Google’s Home (powered by the Google Assistant).

“It’s not a production system that’s ready to go to other people,” he stresses. “But if I couldn’t build a system that can at least do what [Echo and Home can], I probably would have been pretty disappointed in myself.”

He hastens to add that building systems like those from Amazon and Google, designed to let millions of people control a multitude of devices, is much harder than designing an AI for a single home, and that he was in no way dismissing what those companies have done. There are also no plans to make this a Facebook product.

But, he says, “if I wasn’t able to extend what the AI could do around music recommendation or using face rec in different ways, or understanding context as I go around the house, then I would have thought I wasn’t really pushing this forward that much.”

In fact, he said, he plans to publish a summary of what he worked on, and he’d be pleased if some of his conclusions are eventually integrated into publicly available systems. That approach reflects Facebook’s general philosophy about open-sourcing much of its work, especially in AI.

One such learning has to do with the way we interact with text and voice. Speaking to Jarvis and having it talk back makes sense for playing music. (In the demo I got, Jarvis speaks in a garden-variety synthesized female voice not unlike that of Siri or Alexa; Freeman had yet to record his Jarvis lines.) But Zuckerberg found that in many other cases, text was more desirable, especially when there were other people around.

“If I’m letting someone in at the gate…that’s not relevant to the people around me,” he says, “so I’d much rather just text it.”

The name “Jarvis” alludes to Tony Stark’s amazing AI helper in the Iron Man movies.

Even if he speaks a command, he often prefers Jarvis to text back to him or “display what it’s going to do rather than speak out loud,” he says. “Because when it’s speaking, it’s demanding the floor, and that’s kind of an annoying thing.”

That said, there are definitely times when voice is the way to go. “Once you can speak to it, and it can speak back, it just feels much more–I don’t want to say part of the family, because that’s too much–but it just feels more embodied, so Max just loves it,” Zuckerberg says.

Zuckerberg has no illusions that what he built in less than 150 hours could come close to what Facebook’s AI professionals–including some of the best AI talent in the industry–can do in the thousand or more hours one engineer might devote to a single project in a year.

Still, after nearly a year of widespread curiosity, he has gotten Jarvis to the point where he’s ready to show it to the world. He’ll keep on tinkering with it, he says, because he uses it every day and there will always be small fixes to be made or new functions to add. But he’s pretty happy with what he and his family have at their disposal.

A conversational interface lets Zuckerberg type to perform tasks such as turning lights off for the night.

“It’s kind of neat to be able to wake up in the morning and just tell it ‘good morning,’ or wake up and then have the house wake up,” he says. “And similarly, to be able to get into bed at night and not have to turn everything off before that, just be like, ‘good night,’ and have it shut down the house and make sure the doors are locked.”

Zuckerberg, of course, isn’t just a husband and father looking to make his family’s life better at home. He’s also the leader of a company whose destiny will be shaped, as much as anything else, by how effective it is at enabling technical people to create great things. And one of the best things about working on the Jarvis project was the renewed exposure it gave him to the Facebook engineering experience.

“Because I spent so much time coding with Facebook’s tools, and I don’t normally get to do that as CEO of the company,” he says, “I feel like I got the full experience of what a new engineer coming to Facebook and ramping up would get. And I just have so much more appreciation and direct experience with all these internal tools that we’ve built that are such a big part of the culture.”