A dashboard built on the Kaggle dataset "Netflix Shows and Movies", with the data stored in AWS S3 to enable real-time access to Netflix content data. It displays insights from the dataset such as rating distributions and top genres.
- Install Python 3.9 or later
- Install AWS CLI
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment:
- macOS:
source venv/bin/activate
- Go to Kaggle and download the "Netflix Shows and Movies" dataset.
- Create an AWS account if you don't have one.
- Create an S3 bucket to store the dataset.
- Create an IAM user with appropriate permissions (S3 access).
- Save the access key and secret key.
- Place the following content in a .env file:
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=your_region
- Install dependencies:
pip install -r requirements.txt
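Before going further, it can help to confirm that the .env file from the setup steps contains all three keys. The following is a minimal sketch using only the standard library; the helper names `parse_env` and `missing_keys` are hypothetical and not part of this project, which may load the file differently (for example with a dotenv package).

```python
from pathlib import Path

# The three keys listed in the setup steps.
REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(values)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Running the script from the project root prints which keys, if any, still need to be filled in.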
Upload the dataset to your S3 bucket:
aws s3 cp netflix_titles.csv s3://your-bucket-name/
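The same upload can also be done from Python. This is a rough sketch that assumes the third-party boto3 package is installed; whether this project uses boto3 is an assumption, and `upload_dataset` is a hypothetical helper. Credentials are resolved through boto3's default lookup chain.

```python
from pathlib import Path


def upload_dataset(csv_path, bucket, client=None):
    """Upload csv_path to the bucket root and return the resulting S3 URI."""
    path = Path(csv_path)
    if client is None:
        # Imported lazily so that only real uploads require boto3.
        import boto3

        client = boto3.client("s3")
    key = path.name  # store the object under the file's own name
    client.upload_file(str(path), bucket, key)
    return f"s3://{bucket}/{key}"
```

Calling `upload_dataset("netflix_titles.csv", "your-bucket-name")` mirrors the `aws s3 cp` command above. The optional `client` argument lets a stub be passed in for testing without touching AWS.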
Build and run the Docker image locally:
docker build -t netflix-dashboard .
docker run -p 8050:8050 netflix-dashboard
- Install the EB CLI:
pip install awsebcli
- Initialize the EB project:
eb init -p docker netflix-dashboard
- Create an Elastic Beanstalk environment:
eb create netflix-dashboard-env
- Local: Visit http://localhost:8050
- After EB deployment: Use the provided EB URL
The dashboard will display:
- Distribution of movies vs TV shows
- Content added by year
- Top 10 genres
- Rating distribution
- Average duration by type
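The five views above could be computed roughly as follows. This is a sketch using only the standard library, not the dashboard's actual code. It assumes the CSV has the columns `type`, `date_added`, `rating`, `duration`, and `listed_in`, that `date_added` ends with a four-digit year, and that `duration` starts with a number (for example "90 min" or "2 Seasons"). The function names `summarize` and `load_rows` are hypothetical.

```python
import csv
from collections import Counter, defaultdict


def summarize(rows):
    """Aggregate the dashboard's five views from an iterable of row dicts."""
    type_counts, rating_counts = Counter(), Counter()
    genre_counts, added_by_year = Counter(), Counter()
    durations = defaultdict(list)
    for row in rows:
        kind = (row.get("type") or "").strip()
        if kind:
            type_counts[kind] += 1
        rating = (row.get("rating") or "").strip()
        if rating:
            rating_counts[rating] += 1
        # listed_in is assumed to hold comma-separated genres.
        for genre in (row.get("listed_in") or "").split(","):
            if genre.strip():
                genre_counts[genre.strip()] += 1
        # date_added is assumed to end with the year, e.g. "September 25, 2021".
        date_added = (row.get("date_added") or "").strip()
        if date_added[-4:].isdigit():
            added_by_year[date_added[-4:]] += 1
        # duration is assumed to start with a number.
        parts = (row.get("duration") or "").split()
        if kind and parts and parts[0].isdigit():
            durations[kind].append(int(parts[0]))
    return {
        "type_counts": dict(type_counts),
        "added_by_year": dict(added_by_year),
        "top_genres": genre_counts.most_common(10),
        "rating_counts": dict(rating_counts),
        "avg_duration": {k: sum(v) / len(v) for k, v in durations.items()},
    }


def load_rows(path):
    """Read the dataset CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `summarize(load_rows("netflix_titles.csv"))` returns one dictionary with an entry per chart. Under the assumptions above, the average duration is in minutes for movies and in seasons for TV shows, so the two values are not directly comparable.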