dineshjagai/CIS-520-Final-Project

CIS520 FINAL PROJECT!

Dengue Vector

Team-members

Project Title

Predicting the spread of dengue virus in San Juan and Iquitos over a five-year period

Goal

To predict the number of dengue cases each week (in each location) based on environmental variables describing changes in temperature, precipitation, vegetation, and more.

PROJECT DESCRIPTION

Objectives

  1. Motivation (adapted from DengAI)
  • Dengue fever is a mosquito-borne disease that occurs in tropical and sub-tropical parts of the world. In mild cases, symptoms are similar to the flu: fever, rash, and muscle and joint pain. In severe cases, dengue fever can cause severe bleeding, low blood pressure, and even death.
  • Because it is carried by mosquitoes, the transmission dynamics of dengue are related to climate variables such as temperature and precipitation. Although the relationship to climate is complex, a growing number of scientists argue that climate change is likely to produce distributional shifts that will have significant public health implications worldwide.
  • In recent years dengue fever has been spreading. Historically, the disease has been most prevalent in Southeast Asia and the Pacific islands. These days, many of the nearly half a billion cases per year occur in Latin America.
  • Our goal is to predict the number of dengue cases each week (in each location) based on environmental variables describing changes in temperature, precipitation, vegetation, and more.
  • DATASET
    • Sets of features (N = 1456, p = 22)
      - City and date indicators
      city – City abbreviations: sj for San Juan and iq for Iquitos
      week_start_date – Date given in yyyy-mm-dd format
      - NOAA's GHCN daily climate data weather station measurements
      station_max_temp_c – Maximum temperature
      station_min_temp_c – Minimum temperature
      station_avg_temp_c – Average temperature
      station_precip_mm – Total precipitation
      station_diur_temp_rng_c – Diurnal temperature range
      - PERSIANN satellite precipitation measurements (0.25x0.25 degree scale)
      precipitation_amt_mm – Total precipitation
      - NOAA's NCEP Climate Forecast System Reanalysis measurement (0.5x0.5 degree scale)
      reanalysis_sat_precip_amt_mm – Total precipitation
      reanalysis_dew_point_temp_k – Mean dew point temperature
      reanalysis_air_temp_k – Mean air temperature
      reanalysis_relative_humidity_percent – Mean relative humidity
      reanalysis_specific_humidity_g_per_kg – Mean specific humidity
      reanalysis_precip_amt_kg_per_m2 – Total precipitation
      reanalysis_max_air_temp_k – Maximum air temperature
      reanalysis_min_air_temp_k – Minimum air temperature
      reanalysis_avg_temp_k – Average air temperature
      reanalysis_tdtr_k – Diurnal temperature range
      - Satellite vegetation - Normalized difference vegetation index (NDVI) - NOAA's CDR Normalized Difference Vegetation Index (0.5x0.5 degree scale) measurements
      ndvi_se – Pixel southeast of city centroid
      ndvi_sw – Pixel southwest of city centroid
      ndvi_ne – Pixel northeast of city centroid
      ndvi_nw – Pixel northwest of city centroid
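
A minimal sketch of how rows with these columns might be read and grouped by city, using only the Python standard library. The sample rows and their values are made up for illustration, only a subset of the 22 columns is shown, and the helper names (`load_rows`, `split_by_city`, `kelvin_to_celsius`) are hypothetical rather than taken from the project code.

```python
import csv
import io
from datetime import date

# Hypothetical two-row sample with a subset of the columns listed above.
# The values are invented for illustration only.
SAMPLE_CSV = """city,week_start_date,station_avg_temp_c,reanalysis_air_temp_k,ndvi_ne
sj,1990-04-30,25.4,297.57,0.12
iq,2000-07-01,26.4,296.74,0.19
"""


def load_rows(text):
    """Parse CSV text into dicts with a typed date and float measurements."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {
            "city": raw["city"],
            "week_start_date": date.fromisoformat(raw["week_start_date"]),
        }
        for name, value in raw.items():
            if name not in row:
                # Empty cells are missing measurements; keep them as None.
                row[name] = float(value) if value != "" else None
        rows.append(row)
    return rows


def split_by_city(rows):
    """Group rows by the city abbreviation (sj or iq)."""
    groups = {}
    for row in rows:
        groups.setdefault(row["city"], []).append(row)
    return groups


def kelvin_to_celsius(kelvin):
    """Convert a reanalysis temperature (K) to the station unit (C)."""
    return kelvin - 273.15


rows = load_rows(SAMPLE_CSV)
by_city = split_by_city(rows)
print(sorted(by_city))  # ['iq', 'sj']
print(round(kelvin_to_celsius(rows[0]["reanalysis_air_temp_k"]), 2))  # 24.42
```

Grouping by city first is a design choice in this sketch: the two locations have different climates, so it can be convenient to handle each one separately.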
  1. Related Work

  2. Problem Formulation

    • Predict the number of dengue cases each week in San Juan and Iquitos over a five-year period (2008 - 2013), using the given environmental variables describing changes in temperature, precipitation, vegetation, and more, recorded from 1990 onward.
  3. Methods

    • Use an imputation method to fill in missing values and clean the data
    • Use simple regression methods to predict the number of cases based on our 22 features
    • Use deep learning with neural networks for a more robust approach.
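
The first two steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the imputation method is not specified here, so forward fill is assumed as one common choice for weekly time series, and the regression uses a single feature with closed-form least squares. All function names and data values are hypothetical.

```python
def forward_fill(values):
    """Replace each None with the most recent observed value.

    Assumption: forward fill stands in for the unspecified imputation
    method. Leading Nones (nothing observed yet) are left as None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Raises ZeroDivisionError if every x value is identical.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict whole, non-negative case counts."""
    # Case counts cannot be negative, so clip at zero after rounding.
    return [max(0, round(intercept + slope * xi)) for xi in x]


# Made-up weekly values for illustration only.
temps = [25.0, None, 27.0, 28.0, None]
cases = [4, 6, 8, 10, 10]

temps = forward_fill(temps)  # [25.0, 25.0, 27.0, 28.0, 28.0]
intercept, slope = fit_simple_regression(temps, cases)
print(predict(intercept, slope, [26.0, 29.0]))
```

A single-feature fit is only a baseline; extending it to all 22 features would normally be done with a linear algebra or machine learning library.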
  4. Evaluation

  • Performance is evaluated according to the mean absolute error (MAE).
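
For reference, the mean absolute error over n weekly predictions is the average of |actual - predicted|. A minimal sketch of that computation, with a hypothetical function name and made-up numbers:

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up weekly case counts for illustration only.
actual = [4, 5, 4, 3]
predicted = [3, 5, 6, 3]
print(mean_absolute_error(actual, predicted))  # 0.75
```

Because the errors are not squared, a single badly mispredicted outbreak week affects MAE less than it would affect a squared-error metric.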
  1. Project Plan
    • Week 1 11/4
      - Clean the data
      - Impute the missing data
      - Visualize the data

    • Week 2 11/11
      - Work on minimizing the loss on the training set using different regression methods

    • Week 3 11/18
      - Work on minimizing the loss on the test set using different regression methods
      - Use different cross-validation methods to select any needed parameters
      - Research different deep learning methods

    • Week 4 11/25
      - Use a Neural Network (deep learning)

    • Week 5 12/2
      - Work on Presentation
