Chapter 6 – Decision Trees
In this chapter, we introduce an algorithm that can be used for both classification and regression: decision trees. Tree-based methods are popular because they require little pre-processing to generate reliable models. As always, the chapter begins with a lecture that explains the concept of a decision tree, followed by tutorials.
For those of you in Sociology 1205, this chapter corresponds to the Doing Data Science module for unit 6. You will find all the scripts in our R Studio Cloud workspace.
Learning Objectives
In this chapter, we cover the following topics:
- understanding decision trees as a form of supervised learning;
- understanding decision trees as a method for both classification and regression;
- identifying when a decision tree is the appropriate model to use;
- interpreting the output of a decision tree;
- generating decision tree models and outputs in R.
Lecture
Tutorial 1 – Part 1
Tutorial 1 – Part 2
Tutorial 2 – Part 1
Tutorial 2 – Part 2
Tutorial 2 – Part 3
Key functions used in this chapter
- rpart(): the function that creates a decision tree model;
- prp(): the function that plots an rpart model.
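As a quick preview of how these two functions fit together, here is a minimal sketch. It is not one of the tutorial scripts: it assumes the rpart and rpart.plot packages are installed (prp() comes from rpart.plot) and uses R's built-in iris data set purely for illustration.

```r
# Assumption: the rpart and rpart.plot packages are already installed.
library(rpart)       # provides rpart()
library(rpart.plot)  # provides prp()

# Fit a classification tree that predicts species from the four
# flower measurements in the built-in iris data set.
# method = "class" asks for a classification tree; for a numeric
# outcome, method = "anova" would fit a regression tree instead.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Print the splits chosen by the algorithm, one node per line.
print(fit)

# Draw the fitted tree.
prp(fit)

# Predict a class for each row and compare with the true species.
pred <- predict(fit, newdata = iris, type = "class")
print(table(predicted = pred, actual = iris$Species))
```

The same pattern should carry over to the data sets used in the tutorials: change the formula and the data argument, and pick the method that matches the type of outcome variable.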