This article was originally published on Medium: World of Opportunity.
Disaster risk specialists have an oft-repeated phrase: There is nothing “natural” in natural disasters. That is, the hazards are natural, but disasters can be reduced or averted with planning and investment. Disaster risk starts when people, and the things they use and own, are in situations that expose them to hazards. For example, the location and construction material of someone’s home can tell us how likely it is to be damaged by a flood or landslide.
We need detailed geographic data about populations and their built environment to understand this exposure and to inform disaster reduction investments like early warning systems, risk financing mechanisms, and public services management. This becomes even more important as the gendered and compounding risks associated with COVID-19 are projected to fall on the most marginalized communities.
For example, take Ghana, a country with which the World Bank works closely to reduce disaster risk. In 2015, flooding from heavy rains displaced 50,000 people living in the Odaw river basin of Greater Accra, Ghana’s capital. Today, Accra’s leaders reflect on their past as they invest in actions that will build a better future. What would the aftermath of the 2015 rains have been if government and civil society had the data to pinpoint who and what was most vulnerable?
Artificial intelligence scales up local knowledge
Getting those data is easier said than done. Part of the challenge is knowing what data are needed. Many features pertinent to urban service delivery and urban risk profiles (such as the status of waste management, health and education services, condition of drainage networks, and stability of slopes and embankments) are well known to ward-level leaders and communities, but often difficult for the city government to monitor.
Collecting data accurately and to scale can also be trying. Field-based community mapping efforts like Open Cities Africa and the Resilience Academy coordinate with local leadership to create rich, locally validated details about these vulnerable places. Through Open Cities Africa, Accra’s government has worked with residents to map thousands of features including 35,000 buildings into OpenStreetMap.
Even though on-the-ground data collection is affordable and sorely needed, field methods by themselves cannot always keep up with the increasing density and sprawl of urban growth. Advances in machine learning (ML), an application of artificial intelligence (AI), can help scale up data collection efforts. Where community mapping meets ML, local data collection and validation efforts can cover a much larger geographic area than field methods could alone.
But we’re also learning that AI without ethical safeguards can have damaging consequences. The Labs team at the Global Facility for Disaster Reduction and Recovery (GFDRR) explored what that means for vulnerable populations in the recently completed Open Cities AI Challenge in partnership with DrivenData and Azavea.
The Challenge featured locally collected high-resolution drone imagery and manually labeled geographic data from more than a dozen African cities. This extensive and geographically diverse dataset enabled experts to develop ML models that automatically identify which pixels in drone imagery belong to building footprints (a task known as “building footprint segmentation”).
Sample outputs reflecting predictions from the winning building footprint segmentation model from the Open Cities AI Challenge (red) compared with ground truth labels (black). Left image from Lusaka, Zambia, right image from Zanzibar, Tanzania.
Machine learning done responsibly
The Challenge offered an unprecedented opportunity to use locally produced training data to map African cities with AI assistance. It also shed light on ways that bias and error in AI techniques can misrepresent marginalized communities.
In addition to the algorithmic challenge of improving building footprint segmentation, Challenge participants engaged in a Responsible AI for Disaster Risk Management (Responsible AI for DRM) track to explore ethical considerations and consequences of bias in data collection and ML applications for mapping. The three winning Responsible AI entries draw common threads in recognizing biases and downstream consequences in ML systems, maintaining ethical oversight and safeguards, and promoting data stewardship, especially where the most vulnerable members of society are impacted.
“It’s commonly said that algorithms learn the inherent bias in the data that they are trained on: ‘bias in, bias out’. It’s clear how this could work for case studies such as loan acceptance or fraud detection, where models are often trained on individual personal data, past outcomes or decision-making records from previous cases. But in the case of machine learning on aerial imagery for mapping, we’re ultimately just working with pixels. How can a model learn a prejudice purely from pixels?”
Catherine Inness, a Responsible AI Challenge track winner
Energy and climate data consultant Chris Arderne illustrates how seemingly benign technical decisions, such as the choice of accuracy metric used to evaluate a model, can lead to unintended real-world consequences like ML systems that under-recognize small and informally constructed buildings.
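To make the point about metric choice concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the toy scene, the helper names (`rectangle`, `pixel_iou`, `building_recall`), and the 50% overlap threshold are assumptions, not the Challenge's scoring method or any entrant's code. It compares a score computed over all pixels with a score computed per building, for a hypothetical model that finds one large building and misses four small ones.

```python
def rectangle(top, left, height, width):
    """Return the set of (row, col) pixels covered by a rectangular footprint."""
    return {(r, c) for r in range(top, top + height) for c in range(left, left + width)}

# Hypothetical ground truth: one large building and four small structures.
truth_buildings = [
    rectangle(0, 0, 20, 20),   # 400 pixels
    rectangle(30, 0, 2, 2),    # 4 pixels each
    rectangle(30, 5, 2, 2),
    rectangle(30, 10, 2, 2),
    rectangle(30, 15, 2, 2),
]

# Hypothetical model output: the large building only.
predicted_pixels = rectangle(0, 0, 20, 20)

def pixel_iou(truth, predicted):
    """Intersection over union, counted across all building pixels."""
    truth_pixels = set().union(*truth)
    return len(truth_pixels & predicted) / len(truth_pixels | predicted)

def building_recall(truth, predicted, min_overlap=0.5):
    """Share of buildings with at least min_overlap of their pixels predicted."""
    found = sum(1 for b in truth if len(b & predicted) / len(b) >= min_overlap)
    return found / len(truth)

iou = pixel_iou(truth_buildings, predicted_pixels)
recall = building_recall(truth_buildings, predicted_pixels)
print(f"pixel IoU: {iou:.3f}, building recall: {recall:.2f}")
# prints: pixel IoU: 0.962, building recall: 0.20
```

In this toy scene the pixel-level score looks excellent even though four of the five buildings, all of them small, are missing from the map. A metric that counts buildings rather than pixels surfaces that gap.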
Catherine Inness, a master’s student in data science at University College London (UCL), discusses the tendency for ML model improvements to optimize for the majority population, leaving the burden of error to fall disproportionately on groups that are often already vulnerable, and reviews possible safeguard measures across the ML development pipeline.
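One simple safeguard of this kind is to report model performance separately for each group rather than as a single aggregate. The sketch below is a hypothetical illustration only: the validation records, the group labels, and the function name `detection_rate_by_group` are all assumptions, and it is not taken from the winning entry.

```python
from collections import defaultdict

# Hypothetical validation records: (neighborhood type, was the building detected?).
records = (
    [("formal", True)] * 90 + [("formal", False)] * 5
    + [("informal", True)] * 2 + [("informal", False)] * 3
)

def detection_rate_by_group(rows):
    """Return {group: share of buildings detected} for each group in rows."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, detected in rows:
        totals[group] += 1
        hits[group] += int(detected)
    return {group: hits[group] / totals[group] for group in totals}

overall = sum(detected for _, detected in records) / len(records)
by_group = detection_rate_by_group(records)

print(f"overall: {overall:.2f}")
for group, rate in by_group.items():
    print(f"{group}: {rate:.2f}")
```

With these invented numbers, the overall detection rate is 0.92 while the rate for the smaller "informal" group is 0.40. The aggregate is dominated by the majority group, so the disaggregated view is what reveals who carries the error.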
Lastly, data scientists Thomas Kavanagh and Alex Weston propose an ethical framework for the use of contributed geographic information and highlight the responsibility of ML developers as data stewards: identifying, acknowledging, and reducing biases and potential abuses that come with any data collection and decision-making system. Read more from all three entries on OpenDRI.org.
What does ethical data stewardship look like?
“We take community participation very seriously. If the people are engaged, the impact is felt at the end of the day, because they were involved from Day 1. After all, the project is for them.”
Jamila Salihu, Assistant Development Planning Officer, Ayawaso East Municipal Assembly, Accra, Ghana
Guided by Responsible AI principles, local data stewardship must happen at many stages:
- Community leadership should have a voice when determining the data model, i.e. the types of information to collect.
- Residents should also be empowered in data collection and in validation, providing recommendations for tough labelling situations where an ML system could misinterpret the image.
- Communities deserve to access, use, and take ownership over the data collected about them.
Open Cities projects include a consultation phase with local residents and community stakeholders to refine the data model. Left: Men from Ngaoundéré, Cameroon, identify flood and rockslide risk areas on a map. Photo credit: Michel Tchotsoua, ACAGER. Right: Neighborhood leader Mrs. Pay Pay speaks on behalf of her flood- and erosion-prone district of Delapaix at an Open Cities civil society workshop in Kinshasa, Democratic Republic of Congo. Photo credit: OSFAC.
We need advances in geospatial data and ML systems to scale our efforts to protect populations. But we need these advances to be grounded in the communities they aim to protect.
Responsible AI for disaster risk and other sectors of development is a growing area of research, and the World Bank and GFDRR continue to engage with these ethical considerations through community mapping and youth digital employment and learning programs in Africa and around the globe.
Reach out to the Responsible AI for DRM working group to learn more about activities around this emerging body of work.