Data Exploration Celebration: The ENBIS 2009 Challenge

The European Network for Business and Industrial Statistics (ENBIS) has released the 2009 ENBIS Challenge. The challenge this time is to use an exploratory data analysis (EDA) tool to answer a set of 11 questions about sales of laptop computers in London. The data, covering nearly 200,000 transactions, come in three files: sales data (one record per computer sold, with time stamps and the zipcode locations of customer and store), computer configuration information, and geographic information linking zipcodes to GIS coordinates. The challenge is sponsored by JMP (by SAS), who are obviously

Histograms in Excel

Histograms are very useful charts for displaying the distribution of a numerical measurement. The idea is to bucket the numerical measurement into intervals, and then to display the frequency (or percentage) of records in each interval. Two ways to generate a histogram in Excel are: Create a pivot table, with the measurement of interest in the Column area, and Count of that measurement (or any measurement) in the Data area. Then, right-click the column area and choose "Group and Show Detail > Group" to create the intervals. Now simply click the chart wizard to create the matching chart. You will still
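For readers who want to see the bucketing idea outside Excel, here is a minimal sketch in Python. It is not part of the Excel procedure described above; the function name histogram_counts and the price values are made up for illustration. It assigns each value to an equal-width interval and counts the records per interval, which is roughly what grouping a pivot table does.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0.0):
    """Count how many values fall into each interval [lo, lo + bin_width)."""
    counts = Counter()
    for v in values:
        # Index of the equal-width interval that contains v.
        k = int((v - start) // bin_width)
        counts[k] += 1
    # Map each non-empty interval's lower bound to its count, ascending.
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

# Hypothetical laptop prices, invented for this example only.
prices = [310, 450, 455, 520, 610, 615, 640, 980]
table = histogram_counts(prices, bin_width=100)

# Crude text histogram: one '#' per record in the interval.
for lo, n in table.items():
    print(f"{lo:.0f}-{lo + 100:.0f}: {'#' * n} ({n})")
```

Note that this sketch only lists intervals that contain at least one record, so empty intervals (700-800 and 800-900 here) do not appear in the output.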

Data conversion and open-source software

Recently I was trying to open a data file that was created in the statistical software SPSS. SPSS is widely used in the social sciences (a competitor to SAS), and appears to have gained some ground here in Bhutan. Being in Bhutan with a slow and erratic internet connection, though, I've failed time and again to use the software through our school's portal. Finding the local SPSS representative seemed a bit surreal, and so I went off trying to solve the problem in another way. First stop: Googling "convert .sav to .csv" led me nowhere. SPSS and SAS both have an annoying