Trees in pivot table terminology

Recently, non-data-mining colleagues asked me to explain how Classification and Regression Trees work. While a detailed explanation with examples exists in my co-authored textbook Data Mining for Business Intelligence, I found that the following explanation worked well with people who are familiar with Excel’s Pivot Tables. [Figure: Classification tree for predicting vulnerability to famine] Suppose the goal is to generate predictions for some variable, numerical or categorical, given a set of predictors. The idea behind trees is to create groups of records with similar profiles in terms of their predictors, and then average the outcome variable of interest to … Continue reading Trees in pivot table terminology
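For readers who prefer code to pivot tables, here is a minimal Python sketch of the "group similar records, then average the outcome" idea. The records, predictor names, and the helper `group_means` are all hypothetical and invented for illustration. The excerpt above is cut off, so treating each group average as the prediction for that group is my assumption here. Note also that a real tree chooses its own splits from the data, whereas this sketch fixes the grouping by hand, the way a pivot table does.

```python
from collections import defaultdict

# Hypothetical records: (rainfall, irrigation, outcome), where outcome is
# 1 if the household was classified as vulnerable and 0 otherwise.
records = [
    ("low", "no irrigation", 1),
    ("low", "no irrigation", 1),
    ("low", "no irrigation", 0),
    ("low", "irrigation", 0),
    ("low", "irrigation", 1),
    ("high", "no irrigation", 0),
    ("high", "no irrigation", 0),
    ("high", "irrigation", 0),
]

def group_means(rows):
    """Average the outcome (last field) within each predictor profile."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for *profile, outcome in rows:
        key = tuple(profile)  # records sharing a profile land in one group
        totals[key] += outcome
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}

means = group_means(records)
for profile, mean in sorted(means.items()):
    print(profile, round(mean, 2))
```

With a 0/1 outcome, each group average is simply the proportion of vulnerable records in that group, which corresponds to what a pivot table shows with the predictors in the Row and Column areas and the average of the outcome in the Data area.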

Histograms in Excel

Histograms are very useful charts for displaying the distribution of a numerical measurement. The idea is to bucket the numerical measurement into intervals, and then to display the frequency (or percentage) of records in each interval. Two ways to generate a histogram in Excel are: Create a pivot table, with the measurement of interest in the Column area, and Count of that measurement (or any measurement) in the Data area. Then right-click the Column area and choose “Group and Show Detail > Group” to create the intervals. Now simply click the chart wizard to create the matching chart. You will still … Continue reading Histograms in Excel
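The bucketing step itself is easy to see outside Excel. Below is a small Python sketch, not the Excel procedure, that assigns each value to a fixed-width interval and counts the records per interval. The function name `histogram_counts`, the bin width, the starting point, and the sample values are all assumptions made up for illustration.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count how many values fall in each interval [lo, lo + bin_width)."""
    counts = Counter()
    for v in values:
        # Index of the interval containing v, measured from `start`.
        index = int((v - start) // bin_width)
        lo = start + index * bin_width
        counts[(lo, lo + bin_width)] += 1
    return dict(sorted(counts.items()))

# Hypothetical measurements, invented for illustration only.
ages = [23, 27, 31, 35, 38, 41, 44, 52, 58, 63]

counts = histogram_counts(ages, bin_width=10, start=20)
percentages = {interval: 100 * n / len(ages) for interval, n in counts.items()}

for (lo, hi), n in counts.items():
    # A crude text bar chart: one '#' per record in the interval.
    print(f"{lo}-{hi}: {'#' * n} ({percentages[(lo, hi)]:.0f}%)")
```

Each interval includes its lower bound and excludes its upper bound, so a value of exactly 30 would be counted in 30-40 rather than 20-30.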

Simpson’s Paradox in Bhutan

This year I am on academic sabbatical, hence the lower rate of postings. Moreover, postings this year might have an interesting twist, since I am in Bhutan volunteering at an IT Institute. As part of the effort, I am conducting workshops on various topics at the interface of IT and data analysis. IT is still in its infancy here in Bhutan, which makes me assess and use IT very differently from what I am used to. My first posting is about Simpson’s paradox arising in a Bhutanese context (I will post separately on Simpson’s Paradox in the future): The Bhutan Survey … Continue reading Simpson’s Paradox in Bhutan