---
title: "Data Visualization"
subtitle: |
  《区域水环境污染数据分析实践》
  Data analysis practice of regional water environment pollution
author: |
  苏命、王为东
  中国科学院大学资源与环境学院
  中国科学院生态环境研究中心
date: today
lang: zh
format:
  revealjs:
    theme: dark
    slide-number: true
    chalkboard:
      buttons: true
    preview-links: auto
    lang: zh
    toc: true
    toc-depth: 1
    toc-title: Outline
    logo: ./_extensions/inst/img/ucaslogo.png
    css: ./_extensions/inst/css/revealjs.css
    pointer:
      key: "p"
      color: "#32cd32"
      pointerSize: 18
revealjs-plugins:
  - pointer
filters:
  - d2
---

```{r}
#| echo: false
# Show code by default, load shared course settings, then attach packages
knitr::opts_chunk$set(echo = TRUE)
source("../../coding/_common.R")
library(tidyverse)
library(palmerpenguins)
library(ggthemes)
```

## Inspecting the data

```{r}
penguins
```

## Inspecting the data

```{r}
glimpse(penguins)
```

## Final result

```{r}
#| echo: false
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins, with a
#|   best fit line of the relationship between these two variables
#|   overlaid. The plot displays a positive, fairly linear, and relatively
#|   strong relationship between these two variables. Species (Adelie,
#|   Chinstrap, and Gentoo) are represented with different colors and
#|   shapes. The relationship between body mass and flipper length is
#|   roughly the same for these three species, and Gentoo penguins are
#|   larger than penguins from the other two species.

penguins |>
  ggplot(aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  labs(
    title = "Body mass and flipper length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Species",
    shape = "Species"
  ) +
  scale_color_colorblind()
```

## Building the plot layer by layer

```{r}
#| fig-alt: |
#|   A blank, gray plot area.

ggplot(data = penguins)
```

## Building the plot layer by layer

```{r}
#| fig-alt: |
#|   The plot shows flipper length on the x-axis, with values that range from
#|   170 to 230, and body mass on the y-axis, with values that range from 3000
#|   to 6000.

ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
)
```

## Building the plot layer by layer

```{r}
#| fig-alt: |
#|   A scatterplot of body mass vs.
#|   flipper length of penguins. The plot
#|   displays a positive, linear, and relatively strong relationship between
#|   these two variables.

ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point()
```

## Building the plot layer by layer

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. The plot
#|   displays a positive, fairly linear, and relatively strong relationship
#|   between these two variables. Species (Adelie, Chinstrap, and Gentoo)
#|   are represented with different colors.

ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point()
```

## Building the plot layer by layer

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. Overlaid
#|   on the scatterplot are three smooth curves displaying the
#|   relationship between these variables for each species (Adelie,
#|   Chinstrap, and Gentoo). Different penguin species are plotted in
#|   different colors for the points and the smooth curves.

# color is mapped globally, so both the points and the fitted lines
# are split by species
ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point() +
  geom_smooth(method = "lm")
```

## Building the plot layer by layer

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. Overlaid
#|   on the scatterplot is a single line of best fit displaying the
#|   relationship between these variables for each species (Adelie,
#|   Chinstrap, and Gentoo). Different penguin species are plotted in
#|   different colors for the points only.

# color is mapped only inside geom_point(), so geom_smooth() fits
# one line to all penguins
ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(mapping = aes(color = species)) +
  geom_smooth(method = "lm")
```

## Building the plot layer by layer

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. Overlaid
#|   on the scatterplot is a single line of best fit displaying the
#|   relationship between these variables for each species (Adelie,
#|   Chinstrap, and Gentoo).
#|   Different penguin species are plotted in
#|   different colors and shapes for the points only.

ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(mapping = aes(color = species, shape = species)) +
  geom_smooth(method = "lm")
```

## Building the plot layer by layer

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins, with a
#|   line of best fit displaying the relationship between these two variables
#|   overlaid. The plot displays a positive, fairly linear, and relatively
#|   strong relationship between these two variables. Species (Adelie,
#|   Chinstrap, and Gentoo) are represented with different colors and
#|   shapes. The relationship between body mass and flipper length is
#|   roughly the same for these three species, and Gentoo penguins are
#|   larger than penguins from the other two species.

ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  labs(
    title = "Body mass and flipper length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Species",
    shape = "Species"
  ) +
  scale_color_colorblind()
```

## Building the plot layer by layer

```{r}
#| echo: false
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins, colored
#|   by bill depth. A smooth curve of the relationship between body mass
#|   and flipper length is overlaid. The relationship is positive,
#|   fairly linear, and moderately strong.

ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = bill_depth_mm)) +
  geom_smooth()
```

## Building the plot layer by layer

```{r}
ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g, color = island)
) +
  geom_point() +
  geom_smooth(se = FALSE)
```

## Building the plot layer by layer

```{r}
# Data and mappings set in ggplot() are inherited by every layer
ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point() +
  geom_smooth()

# Equivalent plot: the same data and mappings repeated in each layer
ggplot() +
  geom_point(
    data = penguins,
    mapping = aes(x = flipper_length_mm, y = body_mass_g)
  ) +
  geom_smooth(
    data = penguins,
    mapping = aes(x = flipper_length_mm, y = body_mass_g)
  )
```

## Building the plot layer by layer

```{r}
ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point()
```

## Building the plot layer by layer

```{r}
# Same plot with the argument names data and mapping omitted
ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()
```

## Bar chart

```{r}
#| fig-alt: |
#|   A bar chart of frequencies of species of penguins: Adelie
#|   (approximately 150), Chinstrap (approximately 90), Gentoo
#|   (approximately 125).

ggplot(penguins, aes(x = species)) +
  geom_bar()
```

## Bar chart

```{r}
#| fig-alt: |
#|   A bar chart of frequencies of species of penguins, where the bars are
#|   ordered in decreasing order of their heights (frequencies): Adelie
#|   (approximately 150), Gentoo (approximately 125), Chinstrap
#|   (approximately 90).

# fct_infreq() reorders the species levels by decreasing frequency
ggplot(penguins, aes(x = fct_infreq(species))) +
  geom_bar()
```

## Histogram

```{r}
#| warning: false
#| fig-alt: |
#|   A histogram of body masses of penguins. The distribution is unimodal
#|   and right skewed, ranging from approximately 2500 to 6500 grams.

ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(binwidth = 200)
```

## Histogram

```{r}
#| warning: false
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: |
#|   Two histograms of body masses of penguins, one with binwidth of 20
#|   (left) and one with binwidth of 2000 (right). The histogram with binwidth
#|   of 20 shows lots of ups and downs in the heights of the bins, creating a
#|   jagged outline.
#|   The histogram with binwidth of 2000 shows only three bins.

# binwidth too small: a jagged outline
ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(binwidth = 20)

# binwidth too large: the shape of the distribution is lost
ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(binwidth = 2000)
```

## Density plot

```{r}
#| fig-alt: |
#|   A density plot of body masses of penguins. The distribution is unimodal
#|   and right skewed, ranging from approximately 2500 to 6500 grams.

ggplot(penguins, aes(x = body_mass_g)) +
  geom_density()
```

## Box plot

```{r}
#| warning: false
#| fig-alt: |
#|   Side-by-side box plots of distributions of body masses of Adelie,
#|   Chinstrap, and Gentoo penguins. The distributions of Adelie and
#|   Chinstrap penguins' body masses appear to be symmetric with
#|   medians around 3750 grams. The median body mass of Gentoo penguins
#|   is much higher, around 5000 grams, and the distribution of the
#|   body masses of these penguins appears to be somewhat right skewed.

ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_boxplot()
```

## Grouping

```{r}
#| warning: false
#| fig-alt: |
#|   A density plot of body masses of penguins by species of penguins. Each
#|   species (Adelie, Chinstrap, and Gentoo) is represented with different
#|   colored outlines for the density curves.

ggplot(penguins, aes(x = body_mass_g, color = species)) +
  geom_density(linewidth = 0.75)
```

## Grouping

```{r}
#| warning: false
#| fig-alt: |
#|   A density plot of body masses of penguins by species of penguins. Each
#|   species (Adelie, Chinstrap, and Gentoo) is represented in different
#|   colored outlines for the density curves. The density curves are also
#|   filled with the same colors, with some transparency added.

ggplot(penguins, aes(x = body_mass_g, color = species, fill = species)) +
  geom_density(alpha = 0.5)
```

## Grouping

```{r}
#| fig-alt: |
#|   Bar plots of penguin species by island (Biscoe, Dream, and Torgersen).

ggplot(penguins, aes(x = island, fill = species)) +
  geom_bar()
```

## Grouping

```{r}
#| fig-alt: |
#|   Bar plots of penguin species by island (Biscoe, Dream, and Torgersen).
#|   The bars are scaled to the same height, making it a relative frequency
#|   plot.

# position = "fill" rescales every bar to height 1 to compare proportions
ggplot(penguins, aes(x = island, fill = species)) +
  geom_bar(position = "fill")
```

## Grouping

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. The plot
#|   displays a positive, linear, relatively strong relationship between
#|   these two variables.

ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()
```

## Grouping

```{r}
#| warning: false
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. The plot
#|   displays a positive, linear, relatively strong relationship between
#|   these two variables. The points are colored based on the species of the
#|   penguins and the shapes of the points represent islands (round points are
#|   Biscoe island, triangles are Dream island, and squares are Torgersen
#|   island). The plot is very busy and it's difficult to distinguish the shapes
#|   of the points.

ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = island))
```

## Faceting

```{r}
#| warning: false
#| fig-width: 8
#| fig-asp: 0.33
#| fig-alt: |
#|   A scatterplot of body mass vs. flipper length of penguins. The shapes and
#|   colors of points represent species. Penguins from each island are on a
#|   separate facet. Within each facet, the relationship between body mass and
#|   flipper length is positive, linear, relatively strong.

# facet_wrap(~island) draws one panel per island
ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = species)) +
  facet_wrap(~island)
```

## Faceting

```{r}
#| warning: false

ggplot(
  data = penguins,
  mapping = aes(
    x = bill_length_mm, y = bill_depth_mm,
    color = species, shape = species
  )
) +
  geom_point() +
  labs(color = "Species")
```

## Exercises

```{r}
#| echo: false
#| layout-ncol: 2

ggplot(penguins, aes(x = island, fill = species)) +
  geom_bar(position = "fill")
ggplot(penguins, aes(x = species, fill = island)) +
  geom_bar(position = "fill")
```

## Exercises

```{r}
#| echo: false

ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()
```

## Exercises

```{r}
#| echo: false

ggplot(mpg, aes(x = class)) +
  geom_bar()
```

## Exercises

```{r}
#| echo: false

ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point()
```

## Exercises

```{r}
#| echo: false

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
```

## Discussion welcome! {.center}

`r rmdify::slideend(wechat = FALSE, type = "public", tel = FALSE, thislink = "https://drwater.rcees.ac.cn/course/public/RWEP/@PUB/SD/")`