Research · Geo-statistical ML

GRAMIK

Geospatial Risk Analysis of Mosquitoes in Kenya (GRAMIK) is a computational tool that quantifies the risk of mosquito occurrence across geographic locations. I built it during a software engineering role with Civicom Aid in Mombasa.

Python · Flask · KNN regression · Leaflet.js · Public health · 40 localities

Summary

GRAMIK predicts mosquito-occurrence risk on a 1 (low) to 3 (high) scale across Kenyan localities, adapting to the way risk shifts month to month with the seasons. It pairs K-Nearest-Neighbors regression with spatio-temporal data and environmental factors such as elevation and proximity to water bodies, producing a fine-grained risk surface that public-health teams can act on.

The data

The dataset characterizes 40 Kenyan localities, each with geographic coordinates, elevation, Köppen climate classification, distance from the nearest water body, and a monthly risk level for all twelve months. It integrates endemic-region and species data from NCBI, vector-density studies, seasonal-abundance research, and occurrence records from the GBIF database.
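One way to picture a single row of this dataset is as a small typed record. The sketch below is an assumption about shape only: the class name `Locality`, the field names, the units (metres, kilometres), and the January-first ordering are hypothetical, and the example values are placeholders rather than GRAMIK data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Locality:
    """Hypothetical record for one of the 40 localities."""
    name: str
    latitude: float            # decimal degrees
    longitude: float           # decimal degrees
    elevation_m: float         # metres above sea level (unit assumed)
    koppen: str                # Köppen code, e.g. "Af" or "BSh"
    water_distance_km: float   # distance to nearest water body (unit assumed)
    monthly_risk: List[float]  # twelve values, January first, each in 1..3

    def __post_init__(self):
        # Enforce the two constraints the text states: twelve months, 1-3 scale.
        if len(self.monthly_risk) != 12:
            raise ValueError("monthly_risk needs one value per month")
        if any(not 1 <= r <= 3 for r in self.monthly_risk):
            raise ValueError("risk levels must lie between 1 and 3")

    def risk_in(self, month: int) -> float:
        """Risk level for a calendar month (1 = January, 12 = December)."""
        return self.monthly_risk[month - 1]


# Placeholder values for illustration only; not taken from the GRAMIK dataset.
example = Locality("Example locality", -1.0, 37.0, 1200.0, "BSh", 4.5,
                   [1, 1, 2, 3, 3, 2, 1, 1, 1, 2, 3, 2])
```

Validating the record at construction time keeps malformed rows (a missing month, an out-of-range level) from reaching the model.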

The method

  • Spatial modeling. KNN regression trained on latitude/longitude predicts elevation and distance-from-water at unmeasured points, using the five nearest neighbors and relevant climate features.
  • Temporal weighting. The algorithm accepts a date, derives the current and next month (handling the December→January transition), and blends the two months' risk levels with weights proportional to the days remaining and elapsed in the current month.
  • Climate modulation. Köppen classifications adjust the temporal weightings. A tropical-rainforest locality stays consistently high, while hot semi-arid zones swing with water availability.
  • Risk calculation. A weighted average of monthly risk is modulated by sub-linear transforms of elevation and water distance, then clipped to the 1–3 scale for interpretability.
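The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not GRAMIK's actual code. The function names (`knn_predict`, `month_weights`, `risk_score`), the Euclidean distance in degrees, the linear day-based blend, the square-root and logarithm transforms, and the coefficients `0.3` and `0.2` are all hypothetical choices. The direction of the adjustment (lower risk at higher elevation and farther from water) is likewise assumed. Köppen-based climate modulation is left out because its rules are not specified here.

```python
import calendar
import math
from datetime import date


def knn_predict(train, query, k=5):
    """Mean target of the k training points nearest to `query`.

    `train` is a list of ((lat, lon), value) pairs. Distance is plain
    Euclidean distance in degrees, a simplification for illustration.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(value for _, value in nearest) / len(nearest)


def month_weights(day):
    """Return (current_month, next_month, w_current, w_next) for a date."""
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    w_next = (day.day - 1) / days_in_month   # share of the month elapsed
    next_month = day.month % 12 + 1          # December (12) wraps to January (1)
    return day.month, next_month, 1.0 - w_next, w_next


def risk_score(monthly_risk, day, elevation_m, water_km):
    """Blend two months of risk, apply assumed penalties, clip to 1..3."""
    cur, nxt, w_cur, w_nxt = month_weights(day)
    blended = w_cur * monthly_risk[cur - 1] + w_nxt * monthly_risk[nxt - 1]
    # Hypothetical sub-linear penalties: sqrt of elevation (km), log of distance.
    penalty = (0.3 * math.sqrt(max(elevation_m, 0.0) / 1000.0)
               + 0.2 * math.log1p(max(water_km, 0.0)))
    return min(3.0, max(1.0, blended - penalty))


# Placeholder inputs for illustration only; not GRAMIK data.
monthly = [1, 1, 2, 3, 3, 2, 1, 1, 1, 2, 3, 2]
print(risk_score(monthly, date(2024, 12, 16), elevation_m=1200.0, water_km=4.5))
```

In a full pipeline, `knn_predict` would supply `elevation_m` and `water_km` for a point that has no measurements, and `risk_score` would then be evaluated for the requested date.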

Why it matters

The result is an adaptive, spatially aware method for quantifying mosquito risk that uses both temporal and spatial signals. It produces fine-grained risk levels that help target vector-control resources, and the same framework carries over to other epidemiological studies. Outputs are mapped on an interactive Leaflet.js view.