Facebook is researching AI systems that see, hear, and remember everything you do

© Illustration by Alex Castro / The Verge

Facebook is pouring a lot of time and money into augmented reality, including building its own AR glasses with Ray-Ban. Right now, these gadgets can only record and share imagery, but what does the company think such devices will be used for in the future?

A new research project led by Facebook’s AI team suggests the scope of the company’s ambitions. It imagines AI systems that constantly analyze people’s lives using first-person video, recording what they see, do, and hear in order to help them with everyday tasks. Facebook’s researchers have outlined a series of skills they want these systems to develop, including “episodic memory” (answering questions like “where did I leave my keys?”) and “audio-visual diarization” (remembering who said what when).

Right now, the tasks outlined above cannot be achieved reliably by any AI system, and Facebook stresses that this is a research project rather than a commercial development. However, it’s clear that the company sees functionality like this as the future of AR computing. “Definitely, thinking about augmented reality and what we’d like to be able to do with it, there’s possibilities down the road that we’d be leveraging this kind of research,” Facebook AI research scientist Kristen Grauman told The Verge.

Such ambitions have huge privacy implications. Privacy experts are already worried about how Facebook’s AR glasses allow wearers to covertly record members of the public. Such concerns will only be exacerbated if future versions of the hardware not only record footage, but analyze and transcribe it, turning wearers into walking surveillance machines.

© Photo by Amanda Lopez for The Verge

Facebook’s first pair of commercial AR glasses can only record and share videos and pictures, not analyze them.

The name of Facebook’s research project is Ego4D, which refers to the analysis of first-person, or “egocentric,” video. It consists of two major components: an open dataset of egocentric video and a series of benchmarks that Facebook thinks AI systems should be able to tackle in the future.

Facebook helped collect 3,205 hours of first-person footage from around the world

The dataset is the biggest of its kind ever created, and Facebook partnered with 13 universities around the world to collect the data. In total, some 3,205 hours of footage were recorded by 855 participants living in nine different countries. The universities, rather than Facebook, were responsible for collecting the data. Participants, some of whom were paid, wore GoPro cameras and AR glasses to record video of unscripted activity. This ranges from construction work to baking to playing with pets and socializing with friends. All footage was de-identified by the universities, which included blurring the faces of bystanders and removing any personally identifiable information.

Grauman says the dataset is the “first of its kind in both scale and diversity.” The nearest comparable project, she says, contains 100 hours of first-person footage shot entirely in kitchens. “We’ve opened up the eyes of these AI systems to more than just kitchens in the UK and Sicily, but [to footage from] Saudi Arabia, Tokyo, Los Angeles, and Colombia.”

The second component of Ego4D is a series of benchmarks, or tasks, that Facebook wants researchers around the world to try to solve using AI systems trained on its dataset. The company describes these as:

Episodic memory: What happened when (e.g., “Where did I leave my keys?”)?

Forecasting: What am I likely to do next (e.g., “Wait, you’ve already added salt to this recipe”)?

Hand and object manipulation: What am I doing (e.g., “Teach me how to play the drums”)?

Audio-visual diarization: Who said what when (e.g., “What was the main topic during class?”)?

Social interaction: Who is interacting with whom (e.g., “Help me better hear the person talking to me at this noisy restaurant”)?

Right now, AI systems would find tackling any of these problems incredibly difficult, but creating datasets and benchmarks is a tried-and-tested method to spur development in the field of AI.

Indeed, the creation of one particular dataset and an associated annual competition, known as ImageNet, is often credited with kickstarting the recent AI boom. The ImageNet dataset consists of pictures of a huge variety of objects, which researchers trained AI systems to identify. In 2012, the winning entry in the competition used a particular method of deep learning to blast past rivals, inaugurating the current era of research.

© Image: Facebook

Facebook’s Ego4D dataset should help spur research into AI systems that can analyze first-person data.

Facebook is hoping its Ego4D project will have similar effects for the world of augmented reality. The company says systems trained on Ego4D might one day be used not only in wearable cameras but also in home assistant robots, which also rely on first-person cameras to navigate the world around them.

“The project has the chance to really catalyze work in this field in a way that hasn’t really been possible yet,” says Grauman. “To move our field from the ability to analyze piles of photos and videos that were human-taken with a very special purpose, to this fluid, ongoing first-person visual stream that AR systems, robots, need to understand in the context of ongoing activity.”

Facebook’s development of AI surveillance systems will worry many

Although the tasks that Facebook outlines certainly seem practical, the company’s interest in this area will worry many. Facebook’s record on privacy is abysmal, spanning data leaks and a $5 billion fine from the FTC. It’s also been shown repeatedly that the company values growth and engagement above users’ well-being in many domains. With this in mind, it’s worrying that the benchmarks in the Ego4D project do not include prominent privacy safeguards. For example, the “audio-visual diarization” task (transcribing what different people say) never mentions removing data about people who don’t want to be recorded.

When asked about these issues, a spokesperson for Facebook told The Verge that it expected that privacy safeguards would be introduced further down the line. “We expect that to the extent companies use this dataset and benchmark to develop commercial applications, they will develop safeguards for such applications,” said the spokesperson. “For example, before AR glasses can enhance someone’s voice, there could be a protocol in place that they follow to ask someone else’s glasses for permission, or they could limit the range of the device so it can only pick up sounds from the people with whom I am already having a conversation or who are in my immediate vicinity.”

For now, such safeguards are only hypothetical.
