module-2-exercise-2

Course Title: Econ 106 Computer Programming for Economics
Instructor: Christopher Llones
Exercise: Netflix Dataset Analysis in R
Due Date: 4 March 2026

Objective

This exercise assesses your ability to apply R programming skills, specifically the dplyr package and the pipe operator (%>%), to explore and analyze a real-world dataset. You will work with the Netflix Movies & TV Shows dataset and answer each question with code.

Instructions

Use R and the dplyr package to answer each question.
Submit your R script file (.R) with your code and outputs.
Use the pipe operator (%>%) for all data manipulations.
You may use additional packages like tidyr or stringr if needed.
Ensure your code is clean, commented, and reproducible.

Dataset and files

Access the dataset and R script template from the econ106-exercise-2 folder.
Upload your completed R script file (.R) by the due date using this link: Submission Link.
You may temporarily save your script here.

Questions

Part 1: Data exploration

How many rows and columns are in the dataset?
List all unique types of content (e.g., Movie, TV Show).
How many titles were released in 2020?
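
As a rough illustration of the exploration pattern, the sketch below uses a tiny made-up data frame rather than the exercise data. The columns type and release_year are named in the questions; the title column and every value are assumptions for illustration only.

```r
library(dplyr)

# Toy data for illustration only; this is NOT the Netflix dataset.
toy <- data.frame(
  title = c("A", "B", "C", "D"),            # hypothetical column
  type = c("Movie", "TV Show", "Movie", "Movie"),
  release_year = c(2019, 2020, 2020, 2021),
  stringsAsFactors = FALSE
)

# Number of rows and columns
toy_dim <- toy %>% dim()

# Unique values in one column
unique_types <- toy %>% distinct(type)

# Number of rows that match a condition
n_2020 <- toy %>%
  filter(release_year == 2020) %>%
  nrow()
```

The same three steps should carry over to the real data once it is loaded, provided the column names match.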

Part 2: Filtering and summarising

Filter the dataset to show only TV Shows released in India. How many are there?
Find the top 5 most common ratings.
Which year had the most titles added to Netflix?
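
One possible shape for these tasks is filter() followed by a row count, and count() with sort = TRUE for a "most common" ranking. The sketch below runs on made-up data; the country and rating column names are assumptions and may differ in the actual file.

```r
library(dplyr)

# Toy data for illustration only; column names other than `type` are assumed.
toy <- data.frame(
  type = c("Movie", "TV Show", "TV Show", "Movie", "TV Show"),
  country = c("India", "India", "Japan", "India", "India"),
  rating = c("TV-14", "TV-MA", "TV-14", "TV-14", "PG"),
  stringsAsFactors = FALSE
)

# Keep rows that satisfy both conditions, then count them
n_tv_india <- toy %>%
  filter(type == "TV Show", country == "India") %>%
  nrow()

# Tabulate a column, sort by frequency, keep the top rows
top_ratings <- toy %>%
  count(rating, sort = TRUE) %>%
  slice_head(n = 2)
```

Note that an exact match with == only finds cells that contain a single value. If a cell lists several values, a pattern match may be needed instead.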

Part 3: Grouping and aggregation

Group the data by type and count how many entries each type has.
Group the data by release_year and summarize the number of titles released per year.
Which country has produced the most content on Netflix?
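
The grouping questions follow a group_by() then summarise() pattern. Here is a minimal sketch on made-up data; the result column name n_titles is an arbitrary choice, not something the exercise requires.

```r
library(dplyr)

# Toy data for illustration only; this is NOT the Netflix dataset.
toy <- data.frame(
  type = c("Movie", "TV Show", "Movie", "Movie"),
  release_year = c(2019, 2020, 2020, 2020),
  stringsAsFactors = FALSE
)

# Count entries per group
by_type <- toy %>%
  group_by(type) %>%
  summarise(n_titles = n())

# Count per year, then sort so the largest group comes first
by_year <- toy %>%
  group_by(release_year) %>%
  summarise(n_titles = n()) %>%
  arrange(desc(n_titles))

# The first row after sorting is the group with the most entries
top_year <- by_year %>% slice(1)
```

Sorting in descending order and taking the first row is one way to answer a "which group has the most" question. Ties, if any, would need separate handling.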

Part 4: Advanced filtering

Filter the dataset to show all Movies with a duration longer than 100 minutes.
Find all titles directed by ‘Steven Spielberg’.
List all titles whose genre contains ‘Documentary’.
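
Filters like these often need a numeric value extracted from text, or a partial text match. The sketch below assumes a duration column stored as text such as "125 min" and a genre column holding comma-separated labels. Both column names and formats are assumptions, so check the actual data first.

```r
library(dplyr)

# Toy data for illustration only; `duration` and `genre` are assumed names.
toy <- data.frame(
  title = c("A", "B", "C"),
  type = c("Movie", "Movie", "TV Show"),
  duration = c("90 min", "125 min", "2 Seasons"),
  genre = c("Dramas", "Documentary, Sports", "Comedies"),
  stringsAsFactors = FALSE
)

# Strip non-digits from the text, convert to a number, then filter
long_movies <- toy %>%
  filter(type == "Movie") %>%
  mutate(minutes = as.numeric(gsub("[^0-9]", "", duration))) %>%
  filter(minutes > 100)

# Partial match: keep rows where the text contains a given substring
docs <- toy %>%
  filter(grepl("Documentary", genre, fixed = TRUE))
```

Filtering by type before converting the duration keeps values like "2 Seasons" from being read as minutes. stringr::str_detect() is an alternative to grepl() for the substring match.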