Top Ten List

Our ETL project provides a pipeline that lets entertainment fans find the top ten movies for a given year from 2008 to 2018, ranked by gross box-office sales, and the top ten music albums, ranked by number sold. These rankings are then matched with review and rating information.
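
The ranking and matching steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `top_ten` and `attach_reviews`, the field names (`title`, `year`, `gross`, `critic_score`, `user_score`), and the sample rows are all assumptions made up for the example.

```python
from typing import Dict, List


def top_ten(records: List[dict], year: int, key: str) -> List[dict]:
    """Return up to ten records for one year, highest value of `key` first."""
    in_year = [r for r in records if r["year"] == year]
    return sorted(in_year, key=lambda r: r[key], reverse=True)[:10]


def attach_reviews(ranked: List[dict], reviews: Dict[str, dict]) -> List[dict]:
    """Copy each ranked record and add review fields matched on lower-cased title."""
    merged = []
    for record in ranked:
        review = reviews.get(record["title"].strip().lower(), {})
        merged.append({
            **record,
            "critic_score": review.get("critic_score"),  # None when no match
            "user_score": review.get("user_score"),
        })
    return merged


# Made-up sample rows, only to show the shape of the data (hypothetical values).
movies = [
    {"title": "Movie A", "year": 2018, "gross": 300},
    {"title": "Movie B", "year": 2018, "gross": 500},
    {"title": "Movie C", "year": 2017, "gross": 900},
]
reviews = {"movie b": {"critic_score": 80, "user_score": 7.5}}

result = attach_reviews(top_ten(movies, 2018, "gross"), reviews)
```

Matching on a normalized title is the simplest possible join; titles that differ between sources (subtitles, punctuation) would need extra cleanup before they line up.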


GitHub Repository
Technologies Used
Data Sources
Example Code
Technologies Used
  • Languages
    • Python
    • JavaScript
    • HTML
    • CSS
  • Data Extraction and Munging
    • Jupyter Notebook
    • pandas
    • NumPy
    • requests
    • splinter
    • BeautifulSoup
  • Database
    • MongoDB
    • pymongo

Data Sources
  • BoxOfficeMojo: box-office revenue for movies; affiliated with IMDb and available for general public use.
  • Billboard: popularity of music albums and songs, based on sales.
  • Metacritic: user and critic reviews of movies and music.

Example Code
  • Scraping
    • BoxOfficeMojo
    • Billboard
    • Metacritic
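
The original scraping code is not reproduced in this text, so the following is only a sketch of the general idea: pull the cells out of a ranked HTML table and turn them into records. To keep it runnable without third-party packages, it uses the standard library's `html.parser` as a stand-in for BeautifulSoup. The class `TableRows`, the helper `parse_gross`, and the `SAMPLE_HTML` snippet are hypothetical and do not reflect the real markup of BoxOfficeMojo, Billboard, or Metacritic.

```python
from html.parser import HTMLParser
from typing import List


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: List[str] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []            # start a fresh row
        elif tag == "td":
            self._in_cell = True
            self._row.append("")      # start a fresh cell

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
            self._row[-1] = self._row[-1].strip()
        elif tag == "tr" and self._row:
            self.rows.append(self._row)   # header rows have no <td>, so they are skipped
            self._row = []

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data


def parse_gross(text: str) -> int:
    """Turn a string such as '$1,234' into the integer 1234."""
    return int(text.replace("$", "").replace(",", ""))


# Invented markup, used only so the example runs offline.
SAMPLE_HTML = """
<table>
  <tr><th>Rank</th><th>Title</th><th>Gross</th></tr>
  <tr><td>1</td><td>Movie B</td><td>$500,000</td></tr>
  <tr><td>2</td><td>Movie A</td><td>$300,000</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)
movies = [
    {"rank": int(rank), "title": title, "gross": parse_gross(gross)}
    for rank, title, gross in parser.rows
]
```

In a live run, the HTML would presumably come from `requests.get(url).text` or from a splinter browser's `html` attribute instead of a hard-coded string, and BeautifulSoup could replace the hand-written parser class.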

  • Database Connection and Setup
    • MongoDB
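
The database step might look something like the sketch below, which connects with pymongo and upserts one document per title and year. The connection URI, the database name `top_ten_db`, the collection name `movies`, and the helper names `get_collection` and `load` are assumptions for illustration; the project's actual setup is not shown in this text.

```python
from typing import Iterable


def get_collection(uri: str = "mongodb://localhost:27017",
                   db_name: str = "top_ten_db",      # hypothetical database name
                   coll_name: str = "movies"):       # hypothetical collection name
    """Open a connection and return a single collection."""
    # Imported lazily so that load() can be used and tested without pymongo installed.
    from pymongo import MongoClient
    return MongoClient(uri)[db_name][coll_name]


def load(collection, documents: Iterable[dict]) -> int:
    """Upsert each document keyed on (title, year); return how many were sent."""
    count = 0
    for doc in documents:
        collection.update_one(
            {"title": doc["title"], "year": doc["year"]},  # match an existing entry
            {"$set": doc},                                 # overwrite its fields
            upsert=True,                                   # insert if none exists
        )
        count += 1
    return count
```

With a MongoDB server running locally, `load(get_collection(), records)` would write the records. Upserting rather than inserting means the scrape can be rerun without creating duplicate entries.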