This post walks through a demo analysis of flights data in R. The source data covers over 7.2 million scheduled non-stop U.S. domestic flights during 2018. Because the dataset is so large, the DBI package will be used to access the SQLite database where it is stored. That way,...
[Read More]
Scrape Glassdoor Company Reviews
in R Using the gdscrapeR Package
Most things on the web can be scraped, and there are many methods to do so, but did you know R has these capabilities? A few folks have asked me how, and since I’ve automated the web scraping of Glassdoor company reviews with the gdscrapeR package, I’ll demo its usage as...
[Read More]
Predictive Analytics with (Cacao) Decision Tree Algorithms
Part Two of Chocolate & Machine Learning with Python
Here’s part two of a two-part series in which I take chocolate nerd-dom to the next level with a dive into chocolate bar rating prediction using Python. After exploring and preparing the chocolate dataset in part one, we’ll apply a Decision Tree classifier algorithm to it and perform some tuning...
[Read More]
Bittersweet Exploration through Data Preparation
Part One of Chocolate & Machine Learning with Python
This is part one of a two-part series where we’ll get a high-level overview of a chocolate dataset and prepare it for a predictive model. It will be just as much about exploring chocolate as it is about cleaning and conditioning data, so if you also have a sweet tooth,...
[Read More]
Text Mining Company Reviews (in R)
Case of MBB Consulting
This post applies basic text mining tasks to Glassdoor company reviews to find out what employees write about most when describing their workplace experiences, and whether those experiences tend to be expressed in a negative, positive or neutral way. Final results are shown as visuals, both tables...
[Read More]