Clarify str_view() color, closes #1479
parent 336b70bbc4
commit 10558e4d57
29
strings.qmd
@@ -129,9 +129,12 @@ x
 str_view(x)
 ```
 
-Note that `str_view()` uses a blue background for tabs to make them easier to spot.
+Note that `str_view()` uses curly braces for tabs to make them easier to spot[^strings-3].
 One of the challenges of working with text is that there's a variety of ways that white space can end up in the text, so this background helps you recognize that something strange is going on.
 
+[^strings-3]: `str_view()` also uses color to bring tabs, spaces, matches, etc. to your attention.
+The colors don't currently show up in the book, but you'll notice them when running code interactively.
+
 ### Exercises
 
 1. Create strings that contain the following values:
@@ -189,9 +192,9 @@ df |>
 
 ### `str_glue()` {#sec-glue}
 
-If you are mixing many fixed and variable strings with `str_c()`, you'll notice that you type a lot of `"`s, making it hard to see the overall goal of the code. An alternative approach is provided by the [glue package](https://glue.tidyverse.org) via `str_glue()`[^strings-3]. You give it a single string that has a special feature: anything inside `{}` will be evaluated like it's outside of the quotes:
+If you are mixing many fixed and variable strings with `str_c()`, you'll notice that you type a lot of `"`s, making it hard to see the overall goal of the code. An alternative approach is provided by the [glue package](https://glue.tidyverse.org) via `str_glue()`[^strings-4]. You give it a single string that has a special feature: anything inside `{}` will be evaluated like it's outside of the quotes:
 
-[^strings-3]: If you're not using stringr, you can also access it directly with `glue::glue()`.
+[^strings-4]: If you're not using stringr, you can also access it directly with `glue::glue()`.
 
 ```{r}
 df |> mutate(greeting = str_glue("Hi {name}!"))
@@ -211,9 +214,9 @@ df |> mutate(greeting = str_glue("{{Hi {name}!}}"))
 
 `str_c()` and `str_glue()` work well with `mutate()` because their output is the same length as their inputs.
 What if you want a function that works well with `summarize()`, i.e. something that always returns a single string?
-That's the job of `str_flatten()`[^strings-4]: it takes a character vector and combines each element of the vector into a single string:
+That's the job of `str_flatten()`[^strings-5]: it takes a character vector and combines each element of the vector into a single string:
 
-[^strings-4]: The base R equivalent is `paste()` used with the `collapse` argument.
+[^strings-5]: The base R equivalent is `paste()` used with the `collapse` argument.
 
 ```{r}
 str_flatten(c("x", "y", "z"))
@@ -344,11 +347,11 @@ df4 |>
 
 ### Diagnosing widening problems
 
-`separate_wider_delim()`[^strings-5] requires a fixed and known set of columns.
+`separate_wider_delim()`[^strings-6] requires a fixed and known set of columns.
 What happens if some of the rows don't have the expected number of pieces?
 There are two possible problems, too few or too many pieces, so `separate_wider_delim()` provides two arguments to help: `too_few` and `too_many`. Let's first look at the `too_few` case with the following sample dataset:
 
-[^strings-5]: The same principles apply to `separate_wider_position()` and `separate_wider_regex()`.
+[^strings-6]: The same principles apply to `separate_wider_position()` and `separate_wider_regex()`.
 
 ```{r}
 #| error: true
@@ -463,9 +466,9 @@ You'll learn how to find the length of a string, extract substrings, and handle
 str_length(c("a", "R for data science", NA))
 ```
 
-You could use this with `count()` to find the distribution of lengths of US babynames and then with `filter()` to look at the longest names, which happen to have 15 letters[^strings-6]:
+You could use this with `count()` to find the distribution of lengths of US babynames and then with `filter()` to look at the longest names, which happen to have 15 letters[^strings-7]:
 
-[^strings-6]: Looking at these entries, we'd guess that the babynames data drops spaces or hyphens and truncates after 15 letters.
+[^strings-7]: Looking at these entries, we'd guess that the babynames data drops spaces or hyphens and truncates after 15 letters.
 
 ```{r}
 babynames |>
@@ -547,9 +550,9 @@ readr uses UTF-8 everywhere.
 This is a good default but will fail for data produced by older systems that don't use UTF-8.
 If this happens, your strings will look weird when you print them.
 Sometimes just one or two characters might be messed up; other times, you'll get complete gibberish.
-For example here are two inline CSVs with unusual encodings[^strings-7]:
+For example here are two inline CSVs with unusual encodings[^strings-8]:
 
-[^strings-7]: Here I'm using the special `\x` to encode binary data directly into a string.
+[^strings-8]: Here I'm using the special `\x` to encode binary data directly into a string.
 
 ```{r}
 #| eval: false
@@ -630,10 +633,10 @@ str_to_upper(c("i", "ı"))
 str_to_upper(c("i", "ı"), locale = "tr")
 ```
 
-Sorting strings depends on the order of the alphabet, and the order of the alphabet is not the same in every language[^strings-8]!
+Sorting strings depends on the order of the alphabet, and the order of the alphabet is not the same in every language[^strings-9]!
 Here's an example: in Czech, "ch" is a compound letter that appears after `h` in the alphabet.
 
-[^strings-8]: Sorting in languages that don't have an alphabet, like Chinese, is more complicated still.
+[^strings-9]: Sorting in languages that don't have an alphabet, like Chinese, is more complicated still.
 
 ```{r}
 str_sort(c("a", "c", "ch", "h", "z"))