Microsoft removes face-recognition database from internet

A collection of 10 million pictures of faces, gathered without consent, has been quietly removed.

Facial recognition software is becoming more popular and widespread, leading to privacy concerns [Thomas Peter/Reuters]

Microsoft’s database of pictures of 10 million faces, gathered without consent and used to train military and commercial facial recognition software around the world, has been taken offline, the Financial Times has reported.

The images had been scraped from search engines and were published as a dataset named MS Celeb in 2016.

“The site was intended for academic purposes,” Microsoft said in a statement. “It was run by an employee that is no longer with Microsoft and has since been removed.”

The Financial Times has reported on other datasets which have also subsequently been removed, notably one named Brainwash, built by Stanford University researchers, and another compiled by researchers at Duke University, named Duke MTMC.


Microsoft’s huge dataset had been used by IBM, Panasonic, Alibaba, Nvidia, Hitachi, SenseTime and Megvii, according to an investigation by Adam Harvey of Megapixels, in research cited by the FT.

In 2018, the European Union’s General Data Protection Regulation (GDPR) came into effect. Microsoft said, however, that the dataset had been taken down not because of legal obligations but because “the research challenge is over”.

Source: News Agencies