More on strings/regexps

* Polish parens/precedence section
* Eliminate engine section
* Move flags later
Hadley Wickham
2022-01-17 22:49:41 -06:00
parent 785760bfc7
commit fc869d3e75
2 changed files with 86 additions and 123 deletions

@@ -382,6 +382,15 @@ There are three ways we could fix this:
This is pretty typical when working with strings --- there are often multiple ways to reach your goal, either by making your pattern more complicated or by doing some preprocessing on your string.
If you get stuck trying one approach, it can often be useful to switch gears and tackle the problem from a different perspective.
Note that regular expression matches never overlap, and `str_count()` only starts looking for a new match after the end of the last match.
For example, in `"abababa"`, how many times will the pattern `"aba"` match?
Regular expressions say two, not three:
```{r}
str_count("abababa", "aba")
str_view_all("abababa", "aba")
```
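To see why the count is two, it can help to look at where each match starts and ends. The sketch below uses `str_locate_all()` for that. The second half is an assumption that goes beyond the text above: it uses a lookahead, `(?=aba)`, which matches a position without consuming any characters, as one possible way to count overlapping occurrences.

```{r}
library(stringr)

# Each row is one match: the first covers characters 1-3, the second 5-7.
# The "aba" that begins at character 3 is skipped because that "a"
# was already consumed by the first match.
locs <- str_locate_all("abababa", "aba")[[1]]
locs

# Assumption: a lookahead matches a position without consuming characters,
# so this counts every position where "aba" begins, overlaps included.
overlapping <- str_count("abababa", "(?=aba)")
overlapping
```

The lookahead version is only worth reaching for when overlapping occurrences are what you actually want to count; for most tasks the default non-overlapping behaviour is the sensible one.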
### Replace matches
Sometimes there are inconsistencies in the formatting that are easier to fix before you start extracting: it is often easier to make the data more regular and check your work than to come up with a more complicated regular expression in `str_*` and friends.
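As a minimal sketch of that idea, the chunk below cleans up a small vector before matching. The input vector `x` is made up for illustration. `str_replace_all()` turns the inconsistent separators into spaces, so a plain literal pattern is enough afterwards.

```{r}
library(stringr)

# Hypothetical input: the same phrase written with three different separators
x <- c("apple-pie", "apple_pie", "apple pie")

# Preprocess: turn hyphens and underscores into spaces
clean <- str_replace_all(x, "[-_]", " ")
clean

# A simple literal pattern now matches every element
str_detect(clean, "apple pie")
```

Without the preprocessing step, the pattern itself would have to allow for every separator, for example `"apple[-_ ]pie"`, and that grows quickly as more inconsistencies turn up.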