This project covers data cleaning and exploratory data analysis (EDA) in Python on a dataset containing data about YouTube videos from various countries.
Dataset link: https://www.kaggle.com/datasnaek/youtube-new
Libraries used: 1. pandas 2. NumPy 3. seaborn 4. Matplotlib
Cleaning:
- Corrected datatypes for various attributes.
- Checked for missing values.
- Fixed date and time formats.
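
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`trending_date`, `publish_time`, `category_id`, `views`, `description`), the `yy.dd.mm` trending-date format, and the sample rows are all assumptions made for the example, and `clean` is a hypothetical helper.

```python
import pandas as pd

# Tiny made-up sample standing in for one country file (assumed column names).
raw = pd.DataFrame({
    "video_id": ["a1", "b2", "c3"],
    "trending_date": ["17.14.11", "17.15.11", "17.16.11"],  # assumed yy.dd.mm
    "publish_time": [
        "2017-11-13T17:13:01.000Z",
        "2017-11-10T08:00:00.000Z",
        "2017-11-15T12:30:00.000Z",
    ],
    "category_id": ["22", "24", "10"],
    "views": ["1000", "2500", "400"],
    "description": ["first video", None, "third video"],
})


def clean(df):
    """Hypothetical helper: fix dtypes, dates, and missing text values."""
    out = df.copy()
    # Fix date and time formats.
    out["trending_date"] = pd.to_datetime(out["trending_date"], format="%y.%d.%m")
    out["publish_time"] = pd.to_datetime(out["publish_time"], utc=True)
    # Correct datatypes: category IDs are labels, view counts are numbers.
    out["category_id"] = out["category_id"].astype("category")
    out["views"] = pd.to_numeric(out["views"])
    # Replace missing descriptions with an empty string.
    out["description"] = out["description"].fillna("")
    return out


# Check for missing values before cleaning, then clean.
missing_before = raw.isna().sum()
cleaned = clean(raw)

print(missing_before)
print(cleaned.dtypes)
```

Filling missing descriptions with an empty string is one possible choice; dropping those rows is another, depending on the analysis.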
EDA:
- Ratio of likes and dislikes between various categories of videos.
- User preferences based on categories.
- Trending videos in each country.
- Relationship between likes and whether or not a video is trending.
- Maximum number of days taken for a video to become trending.
- User comments based on category.
- Frequently occurring words in tags and description of videos.
- Correlation between likes, dislikes, views, and comments.
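
A few of the EDA questions above could be approached along these lines. This is a hedged sketch on made-up data, not the project's actual analysis: the column names, the pipe-separated `tags` format, and every value in the sample frame are assumptions for illustration.

```python
import pandas as pd

# Made-up sample of already-cleaned rows (assumed column names).
videos = pd.DataFrame({
    "category_id": [10, 10, 24, 24],
    "views": [1000, 3000, 500, 1500],
    "likes": [100, 300, 20, 60],
    "dislikes": [10, 30, 10, 30],
    "comment_count": [5, 15, 2, 6],
    "tags": ["music|live", "music|cover", "comedy|live", "comedy"],
    "publish_time": pd.to_datetime(
        ["2017-11-10", "2017-11-12", "2017-11-13", "2017-11-01"]
    ),
    "trending_date": pd.to_datetime(
        ["2017-11-14", "2017-11-13", "2017-11-14", "2017-11-15"]
    ),
})

# Ratio of likes to dislikes per category, from per-category totals.
# A category with zero dislikes would give inf and needs separate handling.
totals = videos.groupby("category_id")[["likes", "dislikes"]].sum()
like_dislike_ratio = totals["likes"] / totals["dislikes"]

# Maximum number of days between publication and trending.
days_to_trend = (videos["trending_date"] - videos["publish_time"]).dt.days
max_days_to_trend = days_to_trend.max()

# Frequently occurring tags, assuming tags are separated by "|".
tag_counts = videos["tags"].str.split("|").explode().value_counts()

# Pairwise Pearson correlation between the engagement metrics.
corr = videos[["views", "likes", "dislikes", "comment_count"]].corr()

print(like_dislike_ratio)
print(max_days_to_trend)
print(tag_counts)
print(corr)
```

The correlation matrix can be passed to `seaborn.heatmap` for a visual summary, and the per-category ratio plots naturally as a bar chart.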