Covid-19 Analysis

Analysis of Covid-19

On 31 December 2019, the WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province, China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people. The virus behind COVID-19 was identified as the cause of an outbreak of respiratory illness first detected in Wuhan. Early on, many of the patients in the Wuhan outbreak reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly had no exposure to animal markets, indicating that person-to-person spread was occurring.

In India, the first COVID-19 case was reported on 30 January in a student who had arrived in Kerala state from Wuhan. Two more cases were reported in Kerala over the next two days. For almost a month, no new cases were reported in India. However, on 8 March, five new cases of coronavirus were reported in Kerala, and since then the cases have been rising, affecting 14 states.

Business Understanding

The goal is to use exploratory data analysis (EDA) to find out how COVID-19 has spread in China (Wuhan in particular), in the rest of the world, and in India along with its neighbouring countries. More specifically, for India: how many cases are reported per day, the travel history of patients, how many hospital beds are available, and how many people have so far tested positive, recovered, or lost their lives. The analysis also aims to understand the spread of COVID-19 in India, the reasons behind it, and the hypotheses prevailing nowadays. In short, it is an in-depth analysis meant to touch every aspect of the virus.


The data used in the analysis has been gathered from various sources. A brief description of the data is given below, and a detailed description is given in the data dictionary. All the data mentioned below is updated every day, on an hourly basis.

  1. covid_19_clean_complete.csv

The file contains the cumulative counts of confirmed cases, deaths, and recoveries for COVID-19 from different countries, starting from 22 January 2020.

  1. covid_19_data.csv

Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the associated Google Sheets and made available here. The GitHub link for the data is this, and the link for the dashboard is this.

This dataset has daily-level information on the number of affected cases, deaths, and recoveries from the 2019 novel coronavirus. Please note that this is time series data, so the number of cases on any given day is the cumulative number.
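Because the counts are cumulative, per-day figures have to be derived by differencing consecutive days. The sketch below illustrates the idea with plain Python; the function name `daily_from_cumulative` and the sample numbers are hypothetical and are not taken from the dataset. It assumes the count was zero before the first recorded day.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative series into per-day increments.

    Assumes the count was zero before the first recorded day, so the
    first increment equals the first cumulative value.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Hypothetical cumulative confirmed counts for five consecutive days.
cumulative_confirmed = [1, 3, 3, 8, 15]
new_cases = daily_from_cumulative(cumulative_confirmed)
print(new_cases)  # [1, 2, 0, 5, 7]
```

A negative increment would indicate that a cumulative figure was revised downward, which is worth checking before plotting daily cases.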

  1. COVID_open_line_list_data.csv: This file is obtained from here and this is individual-level data

  2. COVID19_line_list_data.csv: This file is obtained from here and this is individual-level data

  3. COVID19 India Complete Dataset April 2020.xlsx: Has the complete details of the COVID-19 situation in India, from the patients' travel history to testing counts, growth rate, etc. The link for the data is this

  4. IndividualDetails.csv: Individual-level details are present in this file, which is obtained from here

  5. Covid cases in India.csv: Details of cases in India, State/UT wise, from here and here

  6. per_day_cases.xlsx: Daily cases in India, from here and here

  7. population_by_country_2020.csv: The data has been scraped from Worldometer


The main objective of this analysis is to get a detailed overview of COVID-19 in India and in the rest of the world by touching on each scenario, to create a report on a weekly basis, and to try to build a machine learning model to forecast COVID-19.
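As a rough illustration of what a forecasting baseline could look like, the sketch below fits a straight line to the logarithm of the case counts (a log-linear trend) and extrapolates it. This is not the project's model: the technique, the function names `fit_exponential` and `forecast`, and the sample counts are all assumptions chosen for illustration, and the fit requires every count to be positive.

```python
import math


def fit_exponential(counts):
    """Least-squares fit of log(count) = a + b * day; returns (a, b).

    Days are indexed 0, 1, 2, ... Every count must be positive and at
    least two observations are required.
    """
    n = len(counts)
    days = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast(counts, days_ahead):
    """Extrapolate the fitted trend days_ahead days past the last observation."""
    a, b = fit_exponential(counts)
    day = len(counts) - 1 + days_ahead
    return math.exp(a + b * day)


# Hypothetical cumulative counts that double every day.
counts = [10, 20, 40, 80, 160]
print(round(forecast(counts, 2)))  # 640
```

A trend like this only holds while growth stays roughly exponential, so it is best treated as a reference point against which a more careful model can be compared.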


