Published: 25th July 2020
Google launches non-profit collaborative fund to address missing data for key research
The fund aims to unlock the power of machine learning by providing data scientists, researchers, and social entrepreneurs in low- and middle-income communities around the world with resources to produce labeled datasets
Google has launched Lacuna Fund, the world's first collaborative nonprofit effort to directly address missing labeled data in fields ranging from languages to health and agriculture. There is currently a lack of relevant, labeled data to represent and address the challenges that face much of the world's population.
"To help close this gap, Google.org is making a $2.5 million grant alongside The Rockefeller Foundation, Canada's International Development Research Centre (IDRC) and Germany's GiZ FAIR Forward to launch Lacuna Fund," said Daphne Luong, Director, Google AI. The fund aims to unlock the power of machine learning by providing data scientists, researchers, and social entrepreneurs in low- and middle-income communities around the world with resources to produce labeled datasets that address urgent problems, Google said in a statement this week.
Machine learning has shown enormous promise for social good, whether in helping respond to global health pandemics or in reaching citizens before natural disasters hit. But even as machine learning technology becomes increasingly accessible, social innovators still face significant barriers in their efforts to use this technology to unlock new solutions. Labeled data is a particular type of data that is useful for training machine learning models.
Google said that the fund will provide resources and support to produce new labeled datasets, as well as augment or update existing ones to be more representative, relevant and sustainable. To create a labeled dataset, example data is systematically "tagged" by knowledgeable humans with one or more of the concepts or entities that each example represents. For example, a researcher might label short videos of insects with their type; images of fungi with whether or not they are harmful to plants around them; or passages of Swahili text with the parts of speech that each word represents.
In turn, these datasets could enable biologists to track insect migration, farmers to accurately identify threats to their crops, and Swahili speakers to use an automated text messaging service to get vital health information. "Thanks in part to the rise of cloud computing, in particular services like Cloud AutoML and libraries like TensorFlow, AI is increasingly able to help address society's most pressing issues," said Luong.
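For readers curious what a labeled dataset looks like in practice, the fungi example described above can be sketched in a few lines of Python. This is only an illustration: the file names, the label values, and the `LabeledExample` and `label_counts` names are all hypothetical and do not come from Google or Lacuna Fund.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    item: str   # hypothetical file name standing in for the raw data
    label: str  # tag assigned by a knowledgeable human annotator


# Made-up records for illustration only, not drawn from any real dataset.
fungi_dataset = [
    LabeledExample("fungus_001.jpg", "harmful"),
    LabeledExample("fungus_002.jpg", "not_harmful"),
    LabeledExample("fungus_003.jpg", "harmful"),
]


def label_counts(dataset):
    """Count how many examples carry each label."""
    return Counter(example.label for example in dataset)


print(label_counts(fungi_dataset))
```

Counting the examples under each label is a common first check on whether a dataset covers the cases it is meant to represent.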