By Jay Jhaveri, Computer Engineering, VESIT
As you all know, our world is currently in the middle of a crisis! People have been stuck in their homes due to the COVID-19 pandemic and are eagerly waiting for a vaccine to be discovered so that they can go back to their normal lives.
But the question is: once a vaccine is developed, will the general public willingly take it? To answer this question, we look back at the previous pandemic, the 2009 H1N1 pandemic.
With leadingindia.ai and Bennett University, we had a wonderful opportunity this summer to work under Dr. Kuldeep Chaurasia on building a model to predict H1N1 and seasonal flu vaccination.
WHAT IS A VACCINE?
It is a substance used to stimulate the production of antibodies and provide immunity against one or several diseases, prepared from the causative agent of a disease, its products, or a synthetic substitute, treated to act as an antigen without inducing the disease.
THE H1N1 VIRUS
The H1N1, or swine flu, virus first emerged in the spring of 2009 in Mexico and then in the United States, and quickly spread across the globe.
This novel H1N1 virus contained a unique combination of influenza genes that had not previously been identified in humans or animals.
This contagious novel virus had a powerful impact on the whole world: on June 11, 2009, the World Health Organization (WHO) declared that a pandemic of 2009 H1N1 flu, or swine flu, had begun.
According to the CDC, the first and foremost step in protecting oneself from this virus is a yearly flu vaccination. Various factors, such as the age and health status of an individual, affect the ability of the vaccination to protect the person who is vaccinated.
Several activities were performed using various social media platforms and broadcasting networks such as Twitter to track the levels of disease activity and the concern of the public towards this pandemic situation.
In this study, we use the data obtained from the National 2009 H1N1 Flu Survey to predict how likely people were to get the H1N1 and seasonal flu vaccines. With this, we can hope for a better-immunized society.
The dataset provides us with details like flu-related behaviors, opinions about flu vaccine safety and effectiveness, recent respiratory illness, and pneumococcal vaccination status in addition to H1N1 and seasonal flu vaccination status of adults and children.
The main aim of our project is to find the probability that a person will receive the H1N1 and seasonal flu vaccinations, based on many parameters. The data obtained from the National 2009 H1N1 Flu Survey (NHFS) comes as 3 CSV files: the training set features, the training set labels, and the test set features. The data was collected from over 53,000 people, of which around 26,000 observations form the training set and the rest form the test set. We first have to perform data cleansing and preparation:
- Taking Care of Missing Data
- Encoding Categorical Data
- Splitting the Dataset
- Hyperparameter Tuning
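The first three steps above can be sketched in a few lines of plain Python. This is only an illustration, not the code used in the project: the column names (`concern`, `age_group`, `got_vaccine`), the sample rows, and the choice of median/mode as fill values are all assumptions made for the example. Libraries such as pandas and scikit-learn offer equivalent, more convenient tools.

```python
import random
from collections import Counter
from statistics import median

# Hypothetical survey rows (not from the NHFS data); None marks a missing answer.
rows = [
    {"concern": 2, "age_group": "18-34", "got_vaccine": 0},
    {"concern": None, "age_group": "65+", "got_vaccine": 1},
    {"concern": 3, "age_group": None, "got_vaccine": 1},
    {"concern": 1, "age_group": "18-34", "got_vaccine": 0},
    {"concern": 0, "age_group": "35-64", "got_vaccine": 0},
]

def impute(rows, column, numeric):
    """Fill missing values with the median (numeric) or the mode (categorical)."""
    present = [r[column] for r in rows if r[column] is not None]
    fill = median(present) if numeric else Counter(present).most_common(1)[0][0]
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}_{c}"] = int(r[column] == c)
        encoded.append(new_row)
    return encoded

def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

clean = impute(rows, "concern", numeric=True)       # missing data
clean = impute(clean, "age_group", numeric=False)
clean = one_hot(clean, "age_group")                 # categorical encoding
train, test = train_test_split(clean, test_fraction=0.2, seed=42)  # splitting
```

In a real pipeline, the fill values would normally be computed from the training portion only, so that nothing about the held-out rows leaks into training.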
We considered various methodologies and compared different Machine Learning and Artificial Neural Network models to predict the required probability. Machine Learning algorithms such as Multiple Linear Regression, Support Vector Regression, Random Forest Regression and Logistic Regression were used.
Artificial Neural Networks (ANNs) with different optimizers such as Adam, RMSprop and SGD were also used to predict the probability for the test set features.
Among the default models, the best-performing method on the dataset was the Artificial Neural Network with 2 hidden layers, selu as the hidden-layer activation function, SGD as the optimizer, and the sigmoid function as the activation in the output layer. The other machine learning algorithms also yielded comparatively good results, except for logistic regression, which was the worst-performing model with accuracy below 70% in both H1N1 flu and seasonal flu vaccination prediction. The accuracy obtained with the ANN, along with a comparison of all the methods used during implementation, is shown in Table I.
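To make the architecture concrete, here is a minimal sketch of the forward pass of such a network: two hidden layers with selu, followed by a single sigmoid output that yields a probability. The layer sizes, the weights, and the helper names (`dense`, `predict_probability`) are made up for illustration; the weights are untrained, and training with SGD is not shown.

```python
import math

# Standard SELU constants.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit, used in the hidden layers."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))

def sigmoid(x):
    """Squashes any real number into (0, 1), used in the output layer."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W.x + b), one weight row per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def predict_probability(features, layers):
    """Pass the features through each (weights, biases, activation) layer in turn."""
    values = features
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values[0]

# Illustrative, untrained weights: 3 inputs -> 2 units -> 2 units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], selu),
    ([[0.7, -0.4], [-0.6, 0.9]], [0.05, -0.05], selu),
    ([[1.2, -0.7]], [0.0], sigmoid),
]

probability = predict_probability([1.0, 0.0, 1.0], layers)
```

Because the last layer is a sigmoid, the output always lies strictly between 0 and 1 and can be read as the predicted probability of vaccination.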
Results have also been plotted as ROC curves, with the corresponding AUC. In Graph 1, Graph 2, Graph 3 and Graph 4 we can observe the performance of the various models on the dataset. From these it can be concluded that the Artificial Neural Network performed the best, with accuracy over 82% in H1N1 flu vaccination prediction and 86% in seasonal flu vaccination prediction.
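For readers unfamiliar with the metric, the area under the ROC curve (AUC) can be read as the chance that a randomly chosen vaccinated person gets a higher predicted probability than a randomly chosen unvaccinated one. The sketch below computes it directly from that definition. The function name `roc_auc` and the sample labels and scores are assumptions for illustration, not project results. Note that AUC and accuracy are different metrics.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up labels (1 = vaccinated) and predicted probabilities.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, predicted))  # 0.75
```

This pairwise version is quadratic in the number of samples, so it is meant for understanding rather than for large datasets.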
Conclusion and Future Research
Prediction of H1N1 vaccination is done best by the SVM model with an RBF kernel after hyperparameter tuning, while seasonal flu vaccination prediction is done best by the Artificial Neural Network. It has been observed that hyperparameter tuning yielded better results for H1N1 flu vaccination prediction. For seasonal flu vaccination prediction, hyperparameter tuning did not reach the expected accuracy, hence we conclude that the ANN worked best for seasonal flu vaccination prediction. In the future, with advancements in technology, the quality and quantity of data could increase, which could result in better performance and analysis of the issue. More information about the seasons, especially non-pandemic seasons, could be very helpful for our analysis. We also look forward to exploring more machine learning algorithms, methods and deep learning techniques so that we can obtain better results.
We can hence give a firm answer to the question asked at the beginning of this blog: NO. Many people will still not take vaccines, but they will be protected by herd immunity.
We will be glad to share our work in future articles that will give a deep dive into what we have built.
Till then, any suggestions and recommendations are most welcome!
Original Blog Link: Pandemics! A Harsh Reality 🙁