Wednesday, October 6, 2010

Why does Ensembling Work?

A new and popular practice in the data mining world is ensembling, in which you combine the output of several models to arrive at the final result. To be honest, it is not that new anymore; it has been around for several years, but is only now starting to be deployed on a broader scale.

For those not very familiar with the topic, applications of ensembling can be very straightforward, where you take some simple measure of the model outputs (like max, min, or average), or more complex, where the combination itself is developed using regressions or neural networks. I have also seen applications where decision trees are used to develop a segment-level strategy for ensembling.
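To make the straightforward case concrete, here is a minimal sketch of combining scores with max, min, or average. The function name `combine_scores`, the `how` parameter, and the example scores are all made up for illustration; this is just one simple way to write it, not a reference implementation.

```python
from statistics import mean


def combine_scores(scores_per_model, how="average"):
    """Combine per-model scores into one ensemble score per record.

    scores_per_model: one list of scores per model, all of the same length,
    where position i in every list refers to the same record.
    (Hypothetical helper, for illustration only.)
    """
    combiners = {"max": max, "min": min, "average": mean}
    if how not in combiners:
        raise ValueError(f"unknown combiner: {how!r}")
    combine = combiners[how]
    # zip(*...) regroups the scores so that each tuple holds
    # one record's scores across all models.
    return [combine(record_scores) for record_scores in zip(*scores_per_model)]


# Illustrative scores from two hypothetical models for three records.
model_a = [0.25, 0.5, 0.75]
model_b = [0.75, 0.5, 0.25]

averaged = combine_scores([model_a, model_b], how="average")  # [0.5, 0.5, 0.5]
highest = combine_scores([model_a, model_b], how="max")       # [0.75, 0.5, 0.75]
```

The more complex variants replace the fixed combiner with a fitted model (a regression or neural network) that takes the individual model scores as its inputs.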

A question that often gets raised in this context is: why does ensembling work? You will very likely get asked about this when you are proposing an ensembling solution to a business partner, since an ensemble is so much harder to interpret than a single model. I am not sure I have a complete answer to this question, but intuitively it makes sense. If more models are telling you that somebody is a good prospect, then it is more likely to be true. I often give the movie analogy: if one person tells you that a movie is great, you might or might not like it, depending on your tastes; however, when a few hundred people tell you the same thing, your chance of not liking the movie is very low. Again, I am not sure there is a full one-to-one correspondence between the workings of ensembling and this analogy, but I find it does appeal to people's intuition.
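The movie intuition can be put into numbers under some strong simplifying assumptions that are mine, not something real models guarantee: every model is right with the same probability, the models make their mistakes independently, and the ensemble goes with the strict majority. A rough sketch (the function name is hypothetical):

```python
from math import comb


def majority_vote_accuracy(n_models, p_correct):
    """Probability that a strict majority of n_models is right.

    Assumes each model is right with probability p_correct and that
    the models err independently of each other (an idealization).
    """
    needed = n_models // 2 + 1  # smallest number of votes that is a strict majority
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


single = majority_vote_accuracy(1, 0.7)    # one opinion: just 0.7
trio = majority_vote_accuracy(3, 0.7)      # three independent opinions: about 0.784
crowd = majority_vote_accuracy(101, 0.7)   # a large crowd: higher still
```

Real models are rarely independent, so the gain in practice will be smaller than this idealized calculation suggests; the sketch only shows why agreement among many reasonably good opinions is more trustworthy than any single one.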

One aspect of ensembling that is not captured in the above analogy is that of complementarity. Do the models complement each other and reinforce each other's strengths, very much like an ensemble of musical instruments in an orchestra, which complement each other's sounds to create a rich experience? Some of this definitely happens, because it is very common for an ensemble of different modeling techniques to work very well together. I have personally seen, and have heard others report, that an ensemble of a neural net and a logistic regression will probably give better results than two different logistic regressions or two different neural networks.
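One possible way to see why different techniques might complement each other better, offered as my own illustration rather than a full explanation: when you average models whose errors have the same variance and the same pairwise correlation, the variance of the averaged error shrinks less the more correlated the errors are. Two logistic regressions plausibly make more similar mistakes than a logistic regression and a neural net. The function name and the correlation values below are assumptions chosen for illustration.

```python
def averaged_error_variance(error_variance, correlation, n_models):
    """Variance of the mean of n_models error terms.

    Assumes every model's error has the same variance and every pair of
    models has the same error correlation (a textbook simplification).
    """
    return error_variance * (1 + (n_models - 1) * correlation) / n_models


# Two similar models whose errors are highly correlated (assumed 0.9):
similar_pair = averaged_error_variance(1.0, 0.9, 2)   # about 0.95, barely better than one model

# Two dissimilar models whose errors are only weakly correlated (assumed 0.3):
diverse_pair = averaged_error_variance(1.0, 0.3, 2)   # about 0.65
```

With perfectly correlated errors the ensemble gains nothing over a single model, which is one reason pairing two near-identical models tends to disappoint.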

Appreciate your perspectives.

-- datamining_guy
