It’s one thing to collect and analyze behavioral big data (BBD) and another to understand what it means to be the subject of that data. To really understand. Yes, we’re all aware that our social network accounts and IoT devices share our private information with large and small companies and other organizations. And although we complain about our privacy, we are forgiving about sharing it, most likely because we really appreciate the benefits. So, I decided to check out my data sharing in a way that I cannot ignore: I started wearing a fitness band. I bought one of the … Continue reading Experimenting with quantified self: two months hooked up to a fitness band
The interest in using pre-diagnostic data for the early detection of disease outbreaks has evolved in interesting ways in the last 10 years. In the early 2000s, I was involved in an effort to explore the potential of non-traditional data sources, such as over-the-counter pharmacy sales and web searches on medical websites, which might give earlier signs of a disease outbreak than confirmed diagnostic data (lab tests, doctor diagnoses, etc.). The pre-diagnostic data sources that we looked at were not only expected to have an earlier footprint of the outbreak compared to traditional diagnostic data, but they were also collected … Continue reading Early detection of what?
I find it illuminating to read statistics “bibles” in various fields, which not only open my eyes to different domains, but also present the statistical approach and methods somewhat differently and consider unique domain-specific issues that cause “hmmm” moments. The 4th edition of Fundamentals of Clinical Trials, whose authors combine extensive practical experience at NIH and in academia, is full of hmmm moments. In one, the authors mention an important issue related to sampling that I have not encountered in other fields. In clinical trials, the gold standard is to allocate participants to either an intervention or a non-intervention (baseline) … Continue reading Statistical considerations and psychological effects in clinical trials
Image from KDnuggets.com. While debates over privacy issues related to electronic health records are still ongoing, predictive analytics are beginning to be used with administrative health data (available to health insurance companies, aka, “health provider networks”). One such venue is large data mining contests. Let me describe a few and then get to my point about their contribution to public health, medicine, and data mining research. The latest and grandest is the ongoing $3 million prize contest by Heritage Provider Network, which opened in 2010 and lasts 2 years. The contest’s stated goal is to create “an algorithm that … Continue reading Mining health-related data: How to benefit scientific research