Decision Trees Definition |
 |
Decision tree is a widely used data mining method. In decision theory, a decision tree
is a graph of decisions and their possible consequences, represented in form of
branches and nodes.
This data mining
method is
been used in various fields in business and science for many years and has given
outstanding results.
Decision Tree Structure
Decision trees offer a symbolic decision-making model with high
level of interpretability.
A decision tree is a special form of tree structure. The tree consists of nodes where a logical
decision has to be made, and connecting branches that are chosen according to the
result of this decision.
The nodes and branches that are followed constitute a
sequential path through a decision tree that reaches a final decision in the
end. (see our examples of decision
trees for more information on their structure).
Decision Trees Creation
Decision trees are generated from the training data in a top-down direction.
The root node of a decision tree is the trees initial state - the first
decision node. Each node in a tree contains some data.
On a basis of an algorithm some calculations are completed, and the decision tree node is
been split into
two or more branches. In some cases, the node cannot be split, in this case
it will be the final decision node.
The process is repeated until obtaining a completely
discriminating tree. At this very point the decision tree might have nodes that
are too specific to noise, that might be present in the training data. This is
called over-fitting. To avoid
over-fitting, a decision tree is generalized, by eliminating sub-trees.
Once a decision tree solution is generated from the learning data, it can be
used for predictive analysis or estimating the best decision.
Application of a decision tree to a data example is a straightforward
top-down decision-making process. The process can be controlled by starting it
in the root node, taking the appropriate branch, and terminating when a leaf
node is reached.
Decision trees can be created manually or with the help of
data mining
software.
Next >
|