common is now words
parent fa8940ecab
commit e56032efe3
14 strings.Rmd

````diff
@@ -418,18 +418,18 @@ Remember that when you use a logical vector in a numeric context, `FALSE` become
 
 ```{r}
 # How many common words start with t?
-sum(str_detect(common, "^t"))
+sum(str_detect(words, "^t"))
 # What proportion of common words end with a vowel?
-mean(str_detect(common, "[aeiou]$"))
+mean(str_detect(words, "[aeiou]$"))
 ```
 
 When you have complex logical conditions (e.g. match a or b but not c unless d) it's often easier to combine multiple `str_detect()` calls with logical operators, rather than trying to create a single regular expression. For example, here are two ways to find all words that don't contain any vowels:
 
 ```{r}
 # Find all words containing at least one vowel, and negate
-no_vowels_1 <- !str_detect(common, "[aeiou]")
+no_vowels_1 <- !str_detect(words, "[aeiou]")
 # Find all words consisting only of consonants (non-vowels)
-no_vowels_2 <- str_detect(common, "^[^aeiou]+$")
+no_vowels_2 <- str_detect(words, "^[^aeiou]+$")
 all.equal(no_vowels_1, no_vowels_2)
 ```
 
@@ -438,8 +438,8 @@ The results are identical, but I think the first approach is significantly easie
 A common use of `str_detect()` is to select the elements that match a pattern. You can do this with logical subsetting, or the convenient `str_subset()` wrapper:
 
 ```{r}
-common[str_detect(common, "x$")]
-str_subset(common, "x$")
+words[str_detect(words, "x$")]
+str_subset(words, "x$")
 ```
 
 A variation on `str_detect()` is `str_count()`: rather than a simple yes or no, it tells you how many matches there are in a string:
@@ -449,7 +449,7 @@ x <- c("apple", "banana", "pear")
 str_count(x, "a")
 
 # On average, how many vowels per word?
-mean(str_count(common, "[aeiou]"))
+mean(str_count(words, "[aeiou]"))
 ```
 
 Note that matches never overlap. For example, in `"abababa"`, how many times will the pattern `"aba"` match? Regular expressions say two, not three:
````
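The diff context stops before the chunk that follows this sentence, so here is a minimal sketch of how the non-overlapping claim could be checked. It assumes the stringr package is installed. The variable names and the lookahead variant are illustrative additions and are not part of this commit.

```r
library(stringr)

x <- "abababa"

# Matching proceeds left to right and each match consumes its characters,
# so "aba" is found at positions 1-3 and 5-7: two matches, as the text says.
non_overlapping <- str_count(x, "aba")
non_overlapping

# Show where those matches start and end.
str_locate_all(x, "aba")

# Illustrative variant (not in the source): a zero-width lookahead consumes
# no characters, so the match starting at position 3 is counted as well.
overlapping <- str_count(x, "(?=aba)")
overlapping
```

If overlapping occurrences are what you actually need to count, the lookahead form is one way to get them. The plain pattern follows the rule stated above.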