factors.Rmd
@@ -336,22 +336,21 @@ gss_cat %>%
 ```
 
 Sometimes you just want to lump together all the small groups to make a plot or table simpler.
-That's the job of `fct_lump()`:
+That's the job of the `fct_lump_*()` family of functions.
+`fct_lump_lowfreq()` is a simple starting point that progressively lumps the smallest groups categories into "Other", always keeping "Other" as the smallest category.
 
 ```{r}
 gss_cat %>%
-  mutate(relig = fct_lump(relig)) %>%
+  mutate(relig = fct_lump_lowfreq(relig)) %>%
   count(relig)
 ```
 
-The default behaviour is to progressively lump together the smallest groups, ensuring that the aggregate is still the smallest group.
-In this case it's not very helpful: it is true that the majority of Americans in this survey are Protestant, but we've probably over collapsed.
-
-Instead, we can use the `n` parameter to specify how many groups (excluding other) we want to keep:
+In this case it's not very helpful: it is true that the majority of Americans in this survey are Protestant, but we'd probably like to see some more details!
+Instead, we can use the `fct_lump_n()` to specify that we want exactly 10 groups:
 
 ```{r}
 gss_cat %>%
-  mutate(relig = fct_lump(relig, n = 10)) %>%
+  mutate(relig = fct_lump_n(relig, n = 10)) %>%
   count(relig, sort = TRUE) %>%
   print(n = Inf)
 ```
@@ -360,7 +359,8 @@ gss_cat %>%
 
 1.  How have the proportions of people identifying as Democrat, Republican, and Independent changed over time?
 
-1.  How could you collapse `rincome` into a small set of categories?
-
-1.  Notice there are 9 groups (excluding other) in the `fct_lump` example above. Why not 10? (Hint: type `?fct_lump`, and find the default for the argument `other_level` is "Other".)
+2.  How could you collapse `rincome` into a small set of categories?
 
+3.  Notice there are 9 groups (excluding other) in the `fct_lump` example above.
+    Why not 10?
+    (Hint: type `?fct_lump`, and find the default for the argument `other_level` is "Other".)