From 593b6e56fb0723fc752b0ed100f75c8ce5da9bc2 Mon Sep 17 00:00:00 2001
From: hadley
Date: Sun, 24 Jul 2016 15:04:41 -0500
Subject: [PATCH] Tibble tweaks

---
 model-basics.Rmd |  1 -
 tibble.Rmd       | 31 ++++++++++++++++++++++++-------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/model-basics.Rmd b/model-basics.Rmd
index 56a056a..3998291 100644
--- a/model-basics.Rmd
+++ b/model-basics.Rmd
@@ -52,7 +52,6 @@ library(broom)

 # Modelling requires plently of visualisation and data manipulation
 library(ggplot2)
-library(tibble)
 library(dplyr)
 library(tidyr)

diff --git a/tibble.Rmd b/tibble.Rmd
index bb710d1..969d871 100644
--- a/tibble.Rmd
+++ b/tibble.Rmd
@@ -2,11 +2,13 @@

 ## Introduction

-Throughout this book we work with "tibbles" instead of the traditional data frame. Tibbles _are_ data frames, but tweak some older behaviours to make life a littler easier. R is an old language, and some things that were true 10 or 20 years ago no longer apply. It's difficult to change base R without breaking existing code, so most innovation occurs in packages. Here we will describe the tibble package, which provides opinionated data frames that make working in the tidyverse a little easier. You can learn more about tibbles in the accompanying vignette: `vignette("tibble")`.
+Throughout this book we work with "tibbles" instead of the traditional data frame. Tibbles _are_ data frames, but tweak some older behaviours to make life a littler easier. R is an old language, and some things that were useful 10 or 20 years ago now get in your way. It's difficult to change base R without breaking existing code, so most innovation occurs in packages. Here we will describe the tibble package, which provides opinionated data frames that make working in the tidyverse a little easier.
+
+If this chapter leaves you wanting to learn even more about tibbles, you can read more about them in the accompanying vignette: `vignette("tibble")`.

 ### Prerequisites

-In this chapter we'll specifically explore the tibble package. Most of the chapters won't load this explicitly, as the packages (like readr, tidyr, and dplyr) will load it for you. You'll only need if you are creating tibbles "by hand".
+In this chapter we'll specifically explore the tibble package. Most chapters don't load tibble explicitly, because most of the functions you'll use from tibble are automatically provided by dplyr. You'll only need if you are creating tibbles "by hand".

 ```{r setup}
 library(tibble)
@@ -20,7 +22,7 @@ The majority of the functions that you'll use in this book already produce tibbl
 as_tibble(iris)
 ```

-`as_tibble()` knows how to convert data frames, lists (provided the elements are equal length vectors), matrices, and tables.
+`as_tibble()` knows how to convert data frames, lists (provided the elements are vectors with the same length), matrices, and tables.

 You can create a new tibble from individual vectors with `tibble()`:

@@ -28,7 +30,7 @@ You can create a new tibble from individual vectors with `tibble()`:
 tibble(x = 1:5, y = 1, z = x ^ 2 + y)
 ```

-`tibble()` automatically recycles inputs of length 1, and you can refer to variables that you just created. Compared to `data.frame()`, `tibble()` does much less: it never changes the type of the inputs (e.g. it never converts strings to factors!), it never changes the names of variables, and it never creates `row.names()`.
+`tibble()` automatically recycles inputs of length 1, and you can refer to variables that you just created. Compared to `data.frame()`, `tibble()` does much less: it never changes the type of the inputs (e.g. it never converts strings to factors!), it never changes the names of variables, and it never creates row names.

 Another way to create a tibble is with `frame_data()`, which is customised for data entry in R code.
Column headings are defined by formulas (`~`), and entries are separated by commas:

@@ -40,6 +42,12 @@ frame_data(
 )
 ```

+### Exercises
+
+1. What function tells you if an object is a tibble?
+
+1. What does `enframe()` do? When might you use it?
+
 ## Tibbles vs. data frames

 There are two main differences in the usage of a data frame vs a tibble: printing, and subsetting.
@@ -97,15 +105,24 @@ Tibbles clearly delineate `[` and `[[`: `[` always returns another tibble, `[[`
 ```{r}
 # With data frames, [ sometimes returns a data frame, and sometimes returns
 # a vector
-df[, 1]
+df[, "abc"]

 # With tibbles, [ always returns another tibble
-tb[, 1]
+tb[, "abc"]

 # To extract a single element, you should always use [[
-tb[[1]]
+tb[["abc"]]
 ```

+This is useful to know if you want to extract a single column at the end of dplyr pipeline.
+
+### Exercises
+
+1. How can you print all rows of a tibble?
+
+1. What option controls how many additional column names are printed
+   at the footer of a tibble?
+
 ## Interacting with legacy code

 Some older functions don't work with tibbles because they expect `df[, 1]` to return a vector, not a data frame. If you encounter one of these functions, use `as.data.frame()` to turn a tibble back to a data frame: