When it comes to classification trees, there are three major algorithms used in practice: CART (“Classification and Regression Trees”), C4.5, and CHAID.
All three algorithms create classification rules by constructing a tree-like structure of the data. However, they are different in a few important ways.
The main difference is in the tree construction process. In order to avoid over-fitting the data, all methods try to limit the size of the resulting tree. CHAID (and variants of CHAID) achieve this by using a statistical stopping rule that discontinues tree growth. In contrast, both CART and C4.5 first grow the full tree and then prune it back. The pruning is done by examining the performance of the tree on a holdout dataset and comparing it to the performance on the training set. The tree is pruned until the performance is similar on both datasets, thereby indicating that there is no over-fitting of the training set. This highlights another difference between the methods: CHAID and C4.5 use a single dataset to arrive at the final tree, whereas CART uses a training set to build the tree and a holdout set to prune it.
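The grow-then-prune approach can be sketched with scikit-learn, whose decision tree is a CART-style implementation offering cost-complexity pruning. This is an illustrative sketch, not the exact procedure of any one textbook: the dataset, the train/holdout split, and the choice of accuracy as the pruning criterion are all assumptions. The full tree is grown on the training set, and the pruning strength (`ccp_alpha`) is then chosen by performance on the holdout set.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: any labeled dataset would do
X, y = load_breast_cancer(return_X_y=True)
X_train, X_hold, y_train, y_hold = train_test_split(X, y, random_state=0)

# Step 1: grow the full tree on the training set only
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Step 2: enumerate candidate prunings via cost-complexity pruning
path = full.cost_complexity_pruning_path(X_train, y_train)

# Step 3: keep the pruning level that performs best on the holdout set
best_alpha, best_score = 0.0, -1.0
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    score = tree.score(X_hold, y_hold)
    if score > best_score:
        best_alpha, best_score = alpha, score

pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=best_alpha)
pruned.fit(X_train, y_train)
```

Because `ccp_alpha = 0` (the unpruned tree) is always among the candidates, the selected tree can never do worse on the holdout set than the full tree, while typically being much smaller.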
A difference between CART and the other two is that the CART splitting rule allows only binary splits (e.g., “if Income