Chapter 5 – Logistic Regression

In this chapter, we turn to our second regression algorithm: logistic regression. Here we predict either the class of a binary outcome variable or the probability that an observation belongs to a given class, based on a set of predictors. Logistic regression raises some complexities of its own, which we address in the lecture and the tutorials.

For those of you in Sociology 1205, this chapter corresponds to the Doing Data Science module for Unit 5. You will find all the scripts in our RStudio Cloud workspace.

Learning Objectives

In this chapter, we cover the following topics:

  • understanding logistic regression as a form of supervised learning;
  • understanding logistic regression as a form of non-linear regression and a form of classification;
  • understanding the conditions where logistic regression is the proper algorithm to use;
  • interpreting logistic regression model outputs;
  • evaluating a logistic regression model using different measures;
  • interpreting confusion matrices;
  • extracting evaluation metrics from a confusion matrix;
  • running a logistic regression analysis in R.
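
As a preview of the confusion-matrix objectives above, here is a minimal sketch of how evaluation metrics can be read off a confusion matrix in base R. The `actual` and `predicted` vectors are made-up illustrative values, not data from the course scripts, and the variable names are assumptions chosen for this example.

```r
# Hypothetical observed and predicted classes for ten cases (1 = positive class).
actual    <- factor(c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0), levels = c(0, 1))
predicted <- factor(c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0), levels = c(0, 1))

# Confusion matrix: rows are predicted classes, columns are actual classes.
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

# Pull the four cells out by name.
tp <- cm["1", "1"]  # true positives
tn <- cm["0", "0"]  # true negatives
fp <- cm["1", "0"]  # false positives (predicted 1, actually 0)
fn <- cm["0", "1"]  # false negatives (predicted 0, actually 1)

# Common evaluation metrics derived from those cells.
accuracy    <- as.numeric((tp + tn) / sum(cm))
sensitivity <- as.numeric(tp / (tp + fn))  # also called recall
specificity <- as.numeric(tn / (tn + fp))
precision   <- as.numeric(tp / (tp + fp))
```

With a fitted model, `predicted` would typically come from applying a cutoff (often 0.5) to the predicted probabilities.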

Lecture – Part 1

Lecture – Part 2

Tutorial – Part 1

Tutorial – Part 2

Tutorial – Part 3

Key functions used in this chapter

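  • glm(): the function that fits a generalized linear model; with family = binomial, it produces a logistic regression model;
  • step(): the function that selects a model by stepwise search, using AIC as its default criterion;
  • anova(): the function that compares nested models; for glm() models, it produces an analysis of deviance table.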
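
The three functions above can be sketched together as follows. This is a minimal illustration, not one of the course scripts: it assumes R's built-in mtcars data set, with the binary variable am (0 = automatic, 1 = manual) as the outcome and wt and hp as predictors chosen only for the example.

```r
# Built-in example data; am is a binary outcome (0 = automatic, 1 = manual).
data(mtcars)

# glm() with family = binomial fits a logistic regression model.
null_model <- glm(am ~ 1, data = mtcars, family = binomial)
full_model <- glm(am ~ wt + hp, data = mtcars, family = binomial)
summary(full_model)

# Predicted probabilities that am == 1 for each car.
probs <- predict(full_model, type = "response")

# step() searches for a model by AIC; here, forward selection from the
# intercept-only model, with wt and hp as the candidate predictors.
selected <- step(null_model, scope = ~ wt + hp,
                 direction = "forward", trace = 0)
formula(selected)

# anova() compares the nested models with a chi-squared test on the deviance.
comparison <- anova(null_model, full_model, test = "Chisq")
print(comparison)
```

Setting trace = 0 only silences the step-by-step printout; leaving it at its default shows the AIC at each step of the search.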

License


Modeling Data with R Copyright © 2022 by Christine A Monnier is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
