In this post, you will learn about MOSAIKS (Multi-Task Observation using Satellite Imagery & Kitchen Sinks), a framework that can be used to build linear regression models for climate change problems. Here is a list of prediction tasks that have already been tested with MOSAIKS and found to perform well:
- Forest cover
- Population density
- Nighttime lights
- Road length
- Housing price
- Crop yields
- Poverty mapping
What is MOSAIKS?
MOSAIKS provides a set of features created from satellite imagery. The scale of that imagery is large: about 90TB of data is gathered per day from 700+ satellites. These features can be combined with machine learning algorithms to address global challenges by remotely estimating socioeconomic and environmental conditions in data-poor regions. Combining satellite imagery with machine learning is also known as the SIML approach.
The features generated by MOSAIKS can be merged spatially with your labels. You can then run a linear regression of your labels on the MOSAIKS features, measure its performance, and use the model to make predictions in your area of interest.
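To make the regression step concrete, here is a minimal sketch that fits a ridge regression on a train split and reports R^2 on a held-out test split. Everything in it is an assumption for illustration: the features and labels are synthetic stand-ins rather than real MOSAIKS data, the penalty `lam` is an arbitrary value, and the closed-form solver written with NumPy stands in for whatever regression library you would normally use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 500 locations with K = 64 features each (not real MOSAIKS data).
n, k = 500, 64
X = rng.random((n, k))
true_w = rng.normal(size=k)
y = X @ true_w + 0.1 * rng.normal(size=n)

# Random 80/20 train/test split.
idx = rng.permutation(n)
train, test = idx[:400], idx[400:]

def fit_ridge(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + lam * I) w = Xc^T yc for the weights.
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

w, b = fit_ridge(X[train], y[train], lam=1e-3)  # lam is an illustrative choice
test_r2 = r2(y[test], X[test] @ w + b)
print(f"held-out R^2: {test_r2:.3f}")
```

In practice you would usually pick the penalty by cross-validation on the training split rather than fixing it by hand.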
The algorithmic component of the MOSAIKS system is built upon the random convolutional features (RCF) algorithm.
The MOSAIKS features facilitate a generalizable and accessible approach to machine learning with global satellite imagery. The picture below shows how MOSAIKS can be used to perform different prediction tasks on socioeconomic and environmental problems.
Pay attention to the following in the picture above:
- A K-dimensional feature set is created by drawing a fixed sample of K patches from the satellite imagery, convolving each patch across each image, and passing the result through a nonlinear activation function. Averaging each of the K nonlinear activation maps yields the K-dimensional feature vector for that image.
- The feature set is merged with user-provided labels. Different problem statements call for different labels. The picture below shows how the unsupervised features are merged with labels before a regression algorithm is applied.
- Split the labelled data set into training and test sets.
- Train a regression model, such as ridge regression, and evaluate its performance. This differs from training a convolutional neural network (CNN), which learns features specific to the image domain and task, whereas MOSAIKS features are computed without labels and reused across tasks.
- Use the trained model to make predictions.
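The featurization step described above can be sketched in a few lines of NumPy. This is a simplified illustration of random convolutional features, not the MOSAIKS implementation: the function names `sample_patches` and `featurize`, the patch size, the `bias` parameter, and the synthetic "images" are all assumptions, and the sketch skips any patch preprocessing a production pipeline may apply.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sample_patches(images, k, patch_size, rng):
    """Draw k random patches from a stack of images with shape (N, H, W, C)."""
    n, h, w, _ = images.shape
    patches = []
    for _ in range(k):
        i = rng.integers(n)
        r = rng.integers(h - patch_size + 1)
        c = rng.integers(w - patch_size + 1)
        patches.append(images[i, r:r + patch_size, c:c + patch_size, :])
    return np.stack(patches)  # shape (k, p, p, C)

def featurize(image, patches, bias=0.0):
    """Slide each patch over the image, apply ReLU, and average each activation map."""
    p = patches.shape[1]
    # All p x p windows of the image: shape (H-p+1, W-p+1, p, p, C).
    windows = sliding_window_view(image, (p, p, image.shape[2]))[:, :, 0]
    # Dot product of every window with every patch (cross-correlation, as in CNNs).
    responses = np.tensordot(windows, patches, axes=([2, 3, 4], [1, 2, 3]))
    activations = np.maximum(responses - bias, 0.0)  # ReLU nonlinearity
    return activations.mean(axis=(0, 1))  # one value per patch: shape (k,)

rng = np.random.default_rng(0)
images = rng.standard_normal((10, 32, 32, 3))  # synthetic stand-in for normalized tiles
patches = sample_patches(images, k=16, patch_size=3, rng=rng)
X = np.stack([featurize(img, patches) for img in images])
print(X.shape)  # (10, 16)
```

Because the patches are fixed once and never trained, the same feature matrix `X` can be reused for every prediction task. Only the regression on top changes.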
MOSAIKS has been shown to achieve performance comparable to a fine-tuned ResNet-18 at a fraction of the computational cost. The picture below compares MOSAIKS-trained regression models against ResNet-18 and a pre-trained CNN.
Access to the skills, data, and compute needed to understand and process satellite imagery is limited. Transforming satellite imagery into relevant statistics is costly and requires expertise that many people do not have. This is where MOSAIKS comes in as a boon. It converts the satellite imagery into a K-dimensional feature set, which can be merged with user-chosen labels and used to train regression models for making predictions.
How to use MOSAIKS for solving problems related to Climate Change?
As mentioned above, MOSAIKS provides a K-dimensional feature set that can be used to train regression models for different socioeconomic and environmental problems. Here are the steps you can follow to work on problems related to climate change:
- First and foremost, identify the problem. The best way to do this is to ask questions about current climate change problems. For example, how can I reverse deforestation?
- Download MOSAIKS features from the MOSAIKS API for the areas where you have labels.
- Merge the features spatially with your labels.
- Run a regression of your labels on the MOSAIKS features.
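The spatial merge is the step that most often needs custom code, so here is a minimal sketch of one way to do it with pandas: snap both tables to a common grid and join on the grid cell. The column names (`lat`, `lon`, `X_0` to `X_3`, `forest_cover`), the 0.01 degree resolution, and the synthetic data are all assumptions for illustration, and the files you download from the API may be laid out differently.

```python
import numpy as np
import pandas as pd

RES = 0.01  # assumed grid resolution in degrees
rng = np.random.default_rng(0)

def add_cell_index(df, res=RES):
    """Attach integer grid-cell indices computed from lat/lon."""
    out = df.copy()
    out["row"] = np.floor(out["lat"] / res).astype(int)
    out["col"] = np.floor(out["lon"] / res).astype(int)
    return out

# Hypothetical feature table: one row per grid cell centre, K = 4 features.
rows, cols = np.meshgrid(np.arange(1000, 1005), np.arange(2000, 2005), indexing="ij")
features = pd.DataFrame({
    "lat": (rows.ravel() + 0.5) * RES,
    "lon": (cols.ravel() + 0.5) * RES,
})
for j in range(4):
    features[f"X_{j}"] = rng.random(len(features))

# Hypothetical labels observed at arbitrary points inside the same area.
labels = pd.DataFrame({
    "lat": rng.uniform(10.001, 10.049, size=30),
    "lon": rng.uniform(20.001, 20.049, size=30),
    "forest_cover": rng.random(30),
})

# Average the labels that fall in the same cell, then join on the cell index.
labels_cell = (
    add_cell_index(labels)
    .groupby(["row", "col"], as_index=False)["forest_cover"]
    .mean()
)
merged = add_cell_index(features).merge(labels_cell, on=["row", "col"], how="inner")

# Feature matrix and label vector, ready for the regression step.
X = merged[[f"X_{j}" for j in range(4)]].to_numpy()
y = merged["forest_cover"].to_numpy()
print(X.shape, y.shape)
```

The inner join keeps only cells that have both features and at least one label, which is what you want for training. Cells with features but no labels are the ones you would predict on later.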
Here are some important pages which can help you get started:
- arXiv paper – A generalizable and accessible approach to machine learning with global satellite imagery
- GitHub repository consisting of code and data
- Tutorial – How to use MOSAIKS
- Slide deck representing findings of Esther Rolf
Here is a great YouTube video on MOSAIKS