Bad Methods: Just choose a technique that gives the answer you want

Randall Maas 11/3/2013 8:34:09 AM

A bad method spotted in a computer science / "big data" paper:

"We performed initial experiments with different machine learning algorithms and found that gradient tree boosting out-performed logistic regression, as well as other tree-based methods. Thus, all of our in-depth analysis is conducted with this algorithm."
Lars Backstrom and Jon Kleinberg, "Romantic Partnerships and the Dispersion of Social Ties: A Network Analysis of Relationship Status on Facebook", 2013 Oct 24

The paper offers no theoretical reason for the difference in predictive performance among these techniques. Nope.

(A better approach would have been to justify the techniques' relative merits on theoretical grounds first, and then present these results as experimental confirmation.)