This workbook is a brief introduction to data visualization and exploratory data analysis (EDA). Data visualization is the process of mapping numeric and text data into visual representations to help our audience understand the story we are trying to tell with data. In contrast, EDA is the process through which data analysts get to know their data by creating many different visualizations in order to identify patterns of interest. Where data visualization is deliberate, with a goal in mind, EDA does not start with a preconceived goal.

In this book, we introduce core concepts in data visualization and EDA, and we use the R language, the R Studio interface, and the ggplot2 package from the tidyverse to create our data visualizations and explore our data.

This book is composed of eight chapters. For those of you taking my Sociology 1205 class, each chapter corresponds to a unit of the EDA modules. The EDA modules focus on the practicalities of composing plots of varying complexity and of looking for patterns in the data. In the dataviz modules, constructed around The Truthful Art, by Alberto Cairo, you will focus more on a conceptual understanding of both EDA and data visualization. You can expect some overlap, though.

In each chapter, you will find some tutorial videos explaining the different steps in EDA, again using the tidyverse package ggplot2, among others. For those of you taking my class, you will find the code scripts in your R Studio Cloud environment as well as the required exercise scripts for the class assignments.

You should not just passively watch the videos. I highly recommend taking notes, running the code along with the tutorial, and stopping when necessary to examine the code output, so that you understand the mechanics and results of each procedure.

This book builds upon the skills presented in the R and R Studio for Absolute Beginners book. If you have not yet read that book and watched the tutorial videos, please do so before proceeding. The present book assumes mastery of the basics of R and R Studio.

Good luck as you develop your data visualization and EDA skills.

License

Visualizing Data with R Copyright © 2022 by Christine Monnier is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
