Introduction

This workbook is a brief introduction to data wrangling. Data wrangling is the process through which raw data is cleaned, transformed, and reshaped to get it ready for analysis and modeling. Data wrangling also constitutes the bulk of the work with data. It is not as glamorous as sophisticated statistical work or modeling, or even data visualization, but it is the indispensable step that makes all the other work possible.

In this book, we introduce core concepts in data wrangling using the R language, the R Studio interface, and the tidyverse, a set of packages dedicated to getting us to tidy data.

This book is composed of eight chapters. For those of you taking my Sociology 1205 class, each chapter corresponds to one unit of the data wrangling modules.

In each chapter, you will find tutorial videos explaining the different steps in data wrangling using tidyverse packages such as dplyr, tidyr, and stringr, among others. For those of you taking my class, you will find the code scripts in your R Studio Cloud environment, as well as the required exercise scripts for the class assignments.

You should not just watch the videos passively. I highly recommend taking notes, running the code along with the tutorial, and stopping when necessary to examine the code output, so that you understand the mechanics and results of each procedure.

This book builds upon the skills presented in the R and R Studio for Absolute Beginners book. If you have not yet read that book and watched the tutorial videos, please do so before proceeding. The present book assumes mastery of the basics of R and R Studio.

Good luck as you develop your data wrangling skills.

License

Data Wrangling with R, Copyright © 2022 by Christine A Monnier, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
