Data Science – Final Project


Predictive Analysis of U.S. Wildfires

As part of the final project requirement for CSCI 1951A (Data Science), I worked in a team of four to predict fire size or fire size category using data derived from a Kaggle dataset: https://www.kaggle.com/rtatman/188-million-us-wildfires
This dataset contains 1.88 million rows of wildfire records from 1992 to 2015.

Our feature set consisted of the fire year, the cause of the fire, the state, the month in which the fire occurred, the average temperature of that month, and the average temperature of that year.
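To make the feature set concrete, here is a minimal sketch of how one record could be turned into a numeric vector. This post does not describe how the features were actually encoded, so everything here is an assumption for illustration: the field names (`fire_year`, `month`, `avg_temp_month`, `avg_temp_year`, `cause`, `state`), the small vocabularies, and the choice of one-hot encoding for the categorical columns are all hypothetical.

```python
from typing import Dict, List, Sequence, Union

Record = Dict[str, Union[str, int, float]]


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def encode_record(record: Record,
                  causes: Sequence[str],
                  states: Sequence[str]) -> List[float]:
    """Concatenate numeric features with one-hot encoded categorical ones.

    Field names are hypothetical; they are not taken from the dataset.
    """
    numeric = [
        float(record["fire_year"]),
        float(record["month"]),
        float(record["avg_temp_month"]),
        float(record["avg_temp_year"]),
    ]
    return (numeric
            + one_hot(str(record["cause"]), causes)
            + one_hot(str(record["state"]), states))


# Tiny illustrative vocabularies (assumed, not the full lists from the data).
CAUSES = ["Lightning", "Debris Burning", "Arson"]
STATES = ["CA", "GA", "TX"]

example = {
    "fire_year": 2005,
    "month": 7,
    "avg_temp_month": 24.5,
    "avg_temp_year": 15.2,
    "cause": "Lightning",
    "state": "CA",
}
vector = encode_record(example, CAUSES, STATES)
```

One-hot encoding is only one option; it keeps the categorical columns usable by models such as logistic regression that expect numeric input.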

We used three ML algorithms in total. Because the dataset includes both fire size (in acres, a continuous variable) and fire size class (seven classes based on fire size, a categorical variable), we treated each as a separate outcome variable and applied algorithms suited to its type. By comparing the approaches with each other, we hoped to find the models that predict the outcome best. Specifically, we used a regression tree for fire size, and a classification tree and logistic regression for fire size class.
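The relationship between the two outcome variables can be sketched as a simple binning function. The thresholds below are an assumption: they follow the standard A through G wildfire size classes (in acres) as I understand them from the dataset description, and the handling of values that fall exactly on a boundary is my own choice. The function name `fire_size_class` is hypothetical.

```python
from bisect import bisect_right

# Assumed class boundaries in acres:
#   A: (0, 0.25]   B: up to 10     C: up to 100   D: up to 300
#   E: up to 1000  F: up to 5000   G: 5000 and above
UPPER_BOUNDS = [10.0, 100.0, 300.0, 1000.0, 5000.0]
CLASSES_ABOVE_A = "BCDEFG"


def fire_size_class(acres: float) -> str:
    """Map a continuous fire size in acres to one of seven classes."""
    if acres <= 0:
        raise ValueError("fire size must be positive")
    if acres <= 0.25:
        return "A"
    # bisect_right counts how many boundaries are <= acres, which is
    # exactly the index of the class among B..G.
    return CLASSES_ABOVE_A[bisect_right(UPPER_BOUNDS, acres)]
```

For example, `fire_size_class(150)` falls in class D under these assumed thresholds. Because the classes are a coarse binning of the continuous size, the classification models predict a much simpler target than the regression tree does.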

We summarize all our findings and provide additional visualizations in a dashboard built with the React framework.

The dashboard can be accessed here: https://uswildfiresanalysis.web.app/
