Randomized experiments (or randomized controlled trials, RCTs) are a powerful tool for testing causal relationships. Their main principle is random assignment, where subjects or items are assigned at random to one of the experimental conditions. A classic example is a clinical trial with one or more treatment groups and a no-treatment (control) group, where individuals are assigned at random to one of these groups.

Story 1: (Internet) experiments in industry

Internet experiments have now become a major activity in giant companies such as Amazon, Google, and Microsoft, in smaller web-based companies, and among academic researchers in management and the social sciences. … Continue reading Key challenges in online experiments: where are the statisticians?
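The random assignment described in the excerpt above can be illustrated with a minimal sketch. The function name `assign_randomly`, the subject IDs, and the condition labels are all hypothetical and chosen only for illustration; real trials often use more elaborate schemes (blocking, stratification) that this sketch does not cover.

```python
import random


def assign_randomly(subjects, conditions, seed=None):
    """Assign each subject to one condition at random.

    Subjects are shuffled and then dealt out round-robin, so group
    sizes differ by at most one. `seed` makes the result reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {
        subject: conditions[i % len(conditions)]
        for i, subject in enumerate(shuffled)
    }


# Hypothetical example: ten subjects, one treatment group and one control group.
assignment = assign_randomly(range(10), ["treatment", "control"], seed=42)
for subject in sorted(assignment):
    print(subject, assignment[subject])
```

Shuffling first and then dealing subjects out in turn is one simple way to keep the groups balanced while still leaving each individual's group to chance.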
I recently watched an interesting webinar, "Seeking the Magic Optimization Metric: When Complex Relationships Between Predictors Lead You Astray," by Kelly Uphoff, manager of experimental analytics at Netflix. The presenter mentioned that Netflix is a heavy user of A/B testing for experimentation, and the talk focused on the goal of optimizing retention. In ideal A/B testing, the company would test the effect of an intervention of choice (such as displaying a promotion on their website) on retention, by assigning it to a random sample of users, and then comparing retention of the intervention group to that of a control … Continue reading Predictive relationships and A/B testing
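As a rough illustration of the comparison step in an ideal A/B test, here is a minimal sketch of a two-proportion z-test on retention rates. This is a standard textbook test, not a method attributed to the webinar or to Netflix; the function name `retention_difference` and all counts below are hypothetical.

```python
import math


def retention_difference(retained_a, n_a, retained_b, n_b):
    """Compare retention in group A (intervention) with group B (control).

    Returns the difference in retention rates, the z statistic from a
    pooled two-proportion z-test, and the two-sided p-value.
    """
    rate_a = retained_a / n_a
    rate_b = retained_b / n_b
    pooled = (retained_a + retained_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("retention is 0% or 100% in both groups; z is undefined")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a - rate_b, z, p_value


# Hypothetical counts: 450 of 1000 users retained with the promotion,
# 400 of 1000 retained without it.
diff, z, p_value = retention_difference(450, 1000, 400, 1000)
print(f"difference={diff:.3f}, z={z:.2f}, p={p_value:.4f}")
```

The test relies on users having been assigned to the two groups at random; without that, a difference in retention cannot be read as the effect of the intervention.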
I find it illuminating to read statistics “bibles” in various fields, which not only open my eyes to different domains, but also present the statistical approach and methods somewhat differently and consider unique domain-specific issues that cause “hmmm” moments. The 4th edition of Fundamentals of Clinical Trials, whose authors combine extensive practical experience at NIH and in academia, is full of hmmm moments. In one, the authors mention an important issue related to sampling that I have not encountered in other fields. In clinical trials, the gold standard is to allocate participants to either an intervention or a non-intervention (baseline) … Continue reading Statistical considerations and psychological effects in clinical trials
Image from http://www.slews.de

Spatial data are inherently important in environmental applications. An example is collecting data from air or water quality sensors. Such data collection mechanisms introduce dependence in the collected data due to the sensors' spatial proximity/distance. This dependence must be taken into account not only in the data analysis stage (and there is a good statistical literature on spatial data analysis methods), but also in the design of experiments stage. One example of a design question is where to locate the sensors and how many sensors are needed. Where does explain vs. predict come into the picture? An interesting 2006 … Continue reading Designing an experiment on a spatial network: To Explain or To Predict?