Chapter 4 – Working with Grouped Variables

In this chapter, our last one focusing on the dplyr package, we will group observations by the levels of a variable and then examine the characteristics of the groups, looking for similarities and differences across them.

For those of you in Sociology 1205, this chapter covers the data wrangling module for unit 4. You will find the tutorial and exercise scripts in your RStudio Cloud environment.

Grouping is the process of gathering observations that share a common category of a factor variable. For instance, if you have a variable “sex” with two levels (male and female), grouping your observations by sex allows you to compare men and women with respect to your other variables. Any factor variable can be used to group observations and allow such comparisons. Grouping is both very common and very useful.
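To make the idea concrete, here is a minimal sketch of a comparison across groups. The survey tibble and its columns (sex, age, income) are invented for illustration and do not come from the course data; the example also assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical survey data, invented for illustration only
survey <- tibble(
  sex    = factor(c("male", "female", "female", "male", "female")),
  age    = c(34, 29, 41, 52, 38),
  income = c(42000, 51000, 48000, 39000, 60000)
)

# Group the observations by sex, then compute one summary row per group
by_sex <- survey %>%
  group_by(sex) %>%
  summarize(
    n           = n(),          # number of observations in each group
    mean_age    = mean(age),
    mean_income = mean(income)
  )

print(by_sex)
```

The result has one row per level of sex, so the two groups can be compared side by side on every summarized variable.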

Learning Objectives

In this chapter, we will cover the following topics:

  • grouping observations with the group_by() function;
  • combining group_by() with other dplyr functions for deeper data exploration.

Part 1 – Introduction

In this first part, we examine the principles of grouping and what it does to a dataset.
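As a rough illustration of what grouping does to a dataset, the sketch below compares a tibble before and after group_by(). The data are hypothetical and invented for this example. The point is that grouping leaves the rows and columns untouched and only attaches grouping information to the tibble.

```r
library(dplyr)

# Hypothetical data, invented for illustration only
survey <- tibble(
  sex = factor(c("male", "female", "female", "male")),
  age = c(34, 29, 41, 52)
)

grouped <- group_by(survey, sex)

# Same number of rows and columns: grouping does not alter the data itself
dim(survey)
dim(grouped)

# What changes is the grouping information attached to the tibble
is_grouped_df(survey)    # is the original grouped?
is_grouped_df(grouped)   # is the new object grouped?
group_vars(grouped)      # name(s) of the grouping variable(s)
n_groups(grouped)        # how many groups were formed
```

When a grouped tibble is printed in the console, dplyr adds a "Groups:" line to the header, which is the easiest way to check whether a dataset is currently grouped.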

Part 2 – Grouping with group_by()

Part 3 – More Grouping

The mechanics of grouping are not very complicated and the output of grouping is pretty easy to read.

Key functions used in this chapter

  • group_by(): the function to group observations by levels of a factor variable;
  • ungroup(): the function that undoes the grouping done by group_by();
  • str(): the function that prints the structure of the dataset in the console;
  • summary(): a function that provides a summary of a dataset (not to be confused with summarize()) in the console.
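The four functions listed above can be tried together in a short sketch. The data are again hypothetical, and the object names (grouped, group_means, flat, overall) are chosen only for this example.

```r
library(dplyr)

# Hypothetical data, invented for illustration only
survey <- tibble(
  sex = factor(c("male", "female", "female", "male")),
  age = c(34, 29, 41, 52)
)

grouped <- group_by(survey, sex)

# str() prints the structure of the dataset, including its grouping attribute
str(grouped)

# summary() describes each column of the whole dataset, regardless of groups
summary(grouped)

# summarize() respects the groups and returns one row per group
group_means <- summarize(grouped, mean_age = mean(age))

# ungroup() removes the grouping, so later steps act on all rows at once
flat <- ungroup(grouped)
overall <- summarize(flat, mean_age = mean(age))

print(group_means)
print(overall)
```

Comparing group_means with overall shows why ungroup() matters: the same summarize() call returns one row per group on the grouped data and a single row once the grouping is removed.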

Before moving on to the next chapter or the exercise (for those of you taking Sociology 1205), test your understanding with the quiz below.

License

Data Wrangling with R Copyright © 2022 by Christine A Monnier is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
