Top 10 Data Mining Algorithms – C4.5

What does C4.5 do?

C4.5 constructs a classifier in the form of a decision tree. It is a supervised learning method: the training data set must already be labeled with classes.
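To make "a classifier in the form of a decision tree" concrete, here is a minimal sketch of what such a tree looks like once it has been built. The tree is hand-written for illustration, not learned by C4.5, and the attribute names (`outlook`, `windy`), the class labels and the `classify` helper are all hypothetical.

```python
# Hypothetical toy tree: internal nodes test one attribute,
# leaves hold a class label. Written by hand, not produced by C4.5.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"label": "play"},
        "rainy": {
            "attribute": "windy",
            "branches": {
                "yes": {"label": "stay"},
                "no": {"label": "play"},
            },
        },
    },
}


def classify(node, sample):
    """Walk from the root to a leaf, following the branch that matches the sample."""
    while "label" not in node:
        value = sample[node["attribute"]]
        node = node["branches"][value]
    return node["label"]


prediction = classify(tree, {"outlook": "rainy", "windy": "yes"})
print(prediction)
```

Classifying a new sample is just a walk from the root to a leaf, which is why the result is easy to explain to a person.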

How is C4.5 different from other decision tree systems?

  1. C4.5 uses information gain (in its normalized form, the gain ratio) to decide which attribute to split on at each decision node
    1. Information gain measures, as a reduction in entropy, how informative an attribute is about the class.
  2. It uses a single-pass pruning process to mitigate over-fitting
  3. It can work with both continuous and discrete data
  4. Finally, it deals with incomplete data (missing attribute values) in its own way
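The attribute-selection step can be sketched in a few lines. This is a simplified illustration for discrete attributes only, not the full C4.5 procedure (it leaves out pruning, continuous thresholds and missing values). It computes entropy, information gain, and the gain ratio that C4.5 uses to normalize information gain. The toy data set and the function names are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def gain_ratio(rows, labels, attribute):
    """Information gain divided by the entropy of the attribute's own values."""
    split_info = entropy([row[attribute] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return information_gain(rows, labels, attribute) / split_info


# Hypothetical toy data: "outlook" determines the class, "windy" tells us nothing.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

best = max(["outlook", "windy"], key=lambda a: gain_ratio(rows, labels, a))
print(best)
```

On this toy data `outlook` separates the classes perfectly while `windy` leaves them evenly mixed, so `outlook` would be chosen as the decision node.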

Why use C4.5?

Arguably, the best selling point of decision trees is their ease of interpretation and explanation. They are also quite fast and quite popular, and their output is human-readable.


Top 10 data mining algorithms in plain English

