Transferability of AI-Based Hazard Models to Data-Scarce Regions

Summary
In this research line, we explore how to build machine learning (ML) based hazard models for regions where data is scarce. ML models need large amounts of training data, but for some hazards data is limited or unavailable in certain regions. To address this, we develop methodologies to train models in well-studied areas where data is available and then transfer them to data-scarce regions. Currently, we focus on hailstorms, a phenomenon that is rare, difficult to measure, and responsible for significant economic losses, particularly in agriculture. Hailstorm data is substantially lacking worldwide.
In our approach, we train models on the relatively extensive hailstorm datasets available in the US and transfer them to Europe, where data is much less abundant, especially in some regions. Limited data from the target region can be incorporated into the workflow to support model adaptation and validation.
The top image shows a diagram of our main model workflow. Using data from our selected US region (top left), we build a model to predict the probability of hailstorm occurrence in Europe (top right). Such models can generate hazard maps for agricultural applications. The bottom images show an example of this application: the modeled probability of experiencing at least one hailstorm per year over mainland Spain, with separate maps for vineyard areas (bottom left) and fruit tree areas (bottom right).
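The quantity shown in the hazard maps, the probability of at least one hailstorm per year, can be illustrated with a minimal sketch. It assumes that a model outputs one occurrence probability per day for a given location and that days are independent. Both are simplifying assumptions made here for illustration; the summary does not state how the annual probability is actually derived. The function name `annual_probability` and the example values are hypothetical.

```python
import math


def annual_probability(daily_probs):
    """Probability of at least one event, given per-day event probabilities.

    Assumption (illustrative only): days are statistically independent, so
    P(at least one) = 1 - prod(1 - p_d).
    """
    log_no_event = 0.0
    for p in daily_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        if p == 1.0:
            return 1.0  # an event is certain on this day
        # Accumulate log(1 - p) rather than multiplying, for numerical stability.
        log_no_event += math.log1p(-p)
    # 1 - exp(x), computed accurately when x is close to zero.
    return -math.expm1(log_no_event)


# Hypothetical input: a 120-day hail season with a constant daily probability.
season = [0.002] * 120
p_year = annual_probability(season)
```

Evaluating this for every grid cell of a crop mask (for example vineyard cells only) would yield a map of the kind described above, under the same independence assumption.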
Objectives
- Build ML-based models of natural hazards such as hailstorms.
- Study how to use data from regions outside the area of interest to build models.
- Apply novel ML techniques for domain shift adaptation.
- Use the resulting models to construct hazard maps.
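
As an illustration of what adapting to domain shift can mean in practice, the sketch below applies per-domain feature standardization: each region's predictors are rescaled with that region's own mean and standard deviation, so that a model trained on the source region sees target inputs on a comparable scale. This is a simple, well-known baseline chosen here for illustration, not necessarily one of the techniques used in this research line. The function name and the feature values are hypothetical.

```python
from statistics import mean, pstdev


def standardize_per_domain(rows):
    """Z-score each feature column using this domain's own statistics."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical predictors (two features per sample) for each region.
source_us = [[1000.0, 3.0], [2000.0, 4.0], [3000.0, 5.0]]
target_eu = [[500.0, 2.5], [900.0, 3.0], [1300.0, 3.5]]

source_z = standardize_per_domain(source_us)
target_z = standardize_per_domain(target_eu)
```

After this step both regions have zero mean and unit variance per feature. This removes differences in location and scale only; shifts in the relationship between predictors and hail occurrence would need stronger methods and, ideally, some labeled data from the target region.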