R Visualization: Basic Principles and Usage of ggplot2

Python017


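ggplot2 is a third-party visualization package for R, and to some extent it has largely replaced base R graphics. The package was created by Hadley Wickham, chief scientist at RStudio, during his PhD, and its powerful plotting logic has made it one of the most popular R packages. For more shared material, visit https://zouhua.top/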

ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same few components: a data set, a set of geoms (visual marks that represent data points), and a coordinate system.

To display data values, map variables in the data set to aesthetic properties of the geom, such as size, color, and x and y locations.
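The three components can be seen together in a minimal sketch. It assumes the built-in mtcars data set and a scatter plot; the variable choices (wt, mpg, cyl) are only for illustration.

```r
library(ggplot2)

# Data set: the built-in mtcars data frame.
# Mapping: wt -> x position, mpg -> y position, cyl (as a factor) -> colour.
# Geom: points. Coordinate system: Cartesian (the default, written out here).
p_basic <- ggplot(data = mtcars,
                  mapping = aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 3) +
  coord_cartesian()

print(p_basic)
```

Writing coord_cartesian() explicitly is optional, since it is the default coordinate system; it is spelled out here only to make all three components visible.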

Aesthetics map variables in the data to graphical properties; the mappings control the relationship between the data and those properties.

An aesthetic is "something you can see", for example position, color, size, or shape.

Each type of geom accepts only a subset of all aesthetics; refer to the geom help pages to see which mappings each geom accepts. Aesthetic mappings are set with the aes() function.
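A short sketch of aes() in use, assuming the mpg data set that ships with ggplot2. It also contrasts mapping an aesthetic to a variable with setting it to a fixed value, a distinction the text above does not spell out but which often trips up new users.

```r
library(ggplot2)

# Mapped aesthetic: colour varies with a variable, so it goes inside aes().
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(colour = class))

# Set aesthetic: one fixed value for every point, so it goes outside aes().
p_fixed <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(colour = "steelblue")
```

Only the mapped version produces a legend, because only there does colour carry information from the data.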

Scales map values in the data space to values in the aesthetic space (color, size, shape, ...). Scales are reported on the plot using axes and legends, and they control how each aesthetic mapping is rendered.

Scales are modified with a series of functions that follow a scale_<aesthetic>_<type> naming scheme.

The following arguments are common to most scales in ggplot2:
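For example, most scale functions accept name, limits, breaks, and labels. The sketch below applies them to two scales that follow the scale_<aesthetic>_<type> pattern; the data set, axis title, and colours are illustrative choices, not taken from the original course material.

```r
library(ggplot2)

p_scaled <- ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point() +
  # scale_<aesthetic>_<type>: the x aesthetic, continuous type
  scale_x_continuous(name = "Weight (1000 lbs)",  # axis title
                     limits = c(1, 6),            # minimum and maximum of the scale
                     breaks = 1:6) +              # where tick marks appear
  # the colour aesthetic, gradient type
  scale_colour_gradient(name = "Horsepower",      # legend title
                        low = "grey80", high = "darkred",
                        breaks = c(100, 200, 300),
                        labels = c("100 hp", "200 hp", "300 hp"))
```

The same four arguments mean the same thing on position scales (where they shape the axis) and on non-position scales (where they shape the legend).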

Geometric objects (geoms) are the actual marks we put on a plot.

A plot must have at least one geometric object, and there is no upper limit. Add a geom to a plot with the + operator.
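As a sketch of layering, the following adds two geoms to one plot with +. The choice of a linear fit for the second layer is an assumption made for the example.

```r
library(ggplot2)

p_layers <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # layer 1: the raw observations
  geom_point() +
  # layer 2: a straight line fitted to the same data
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Both layers inherit the x and y mappings from the ggplot() call, so they do not need to be repeated in each geom.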

It's often useful to transform your data before plotting, and that's what statistical transformations do.

Every geom function has a default statistic:
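For instance, geom_bar() defaults to stat_count(), geom_histogram() to stat_bin(), and geom_point() to stat_identity(). The sketch below, which assumes the mpg data set, shows the geom_bar() default and how to override it.

```r
library(ggplot2)

# geom_bar() uses stat_count() by default: it counts the rows in each class.
p_count <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# The same plot written from the stat side.
p_count2 <- ggplot(mpg, aes(x = class)) +
  stat_count(geom = "bar")

# Overriding the default: stat = "identity" plots pre-computed values as they are.
class_counts <- as.data.frame(table(class = mpg$class))
p_identity <- ggplot(class_counts, aes(x = class, y = Freq)) +
  geom_bar(stat = "identity")
```

All three plots draw the same bars; they differ only in whether ggplot2 or the user does the counting.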

The ggplot2 theme system handles non-data plot elements such as titles, axis labels, fonts, backgrounds, gridlines, and legends.

Built-in themes include:
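Among them are theme_gray() (the default), theme_bw(), theme_classic(), and theme_minimal(). A minimal sketch of switching themes and then adjusting individual non-data elements with theme(); the title text and the specific tweaks are illustrative only.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Fuel economy vs. weight")

# Swap the complete theme.
p_gray    <- p + theme_gray()     # the default theme
p_bw      <- p + theme_bw()
p_classic <- p + theme_classic()

# Start from a built-in theme, then override single elements.
p_custom <- p + theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14),
        panel.grid.minor = element_blank(),
        legend.position = "bottom")
```

Themes change only how the plot looks; the data, mappings, and geoms stay exactly the same.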

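This course series assumes some basic knowledge of R. If you are starting from zero, we recommend first taking the instructor's "Essential Bioinformatics Skills: R Language Basics" tutorial. The course starts with the most basic plots and works up, in plain terms, to understanding and using the powerful and flexible ggplot2 package. Topics include how to draw scatter plots, line charts, and bar charts with ggplot2, add annotations, and modify axes and legends.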

The companion book for this course is R Graphics Cookbook.

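Beyond the basic plots above, the instructor will also cover box plots, violin plots, heatmaps, volcano plots, bubble charts, Sankey diagrams, PCA plots, and other plots commonly used in bioinformatics. Bookmark this series and follow along at your own pace!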

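Bar charts are perhaps the most commonly used kind of data visualization. They are typically used to display numeric values (on the y-axis) for different categories (on the x-axis). For example, a bar chart is a good way to show the prices of four different items. A bar chart is generally a poor fit for showing prices over a period of time, because time is a continuous variable.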

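When making bar charts, keep one important distinction in mind: sometimes the bar heights represent the number of cases in the data set, and sometimes they represent values in the data set. This can be a source of confusion, because the two have very different relationships to the data, yet the same term is used for both.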
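The two cases can be sketched side by side. The first uses the built-in mtcars data set; the item names and prices in the second are made up for illustration.

```r
library(ggplot2)

# Case 1: bar height = number of cases. One row per car; geom_bar() counts them.
p_cases <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(x = "Cylinders", y = "Number of cars")

# Case 2: bar height = a value stored in the data. One row per item.
# The items and prices below are hypothetical.
prices <- data.frame(
  item  = c("Apple", "Bread", "Coffee", "Milk"),
  price = c(1.2, 2.5, 4.0, 0.9)
)
p_values <- ggplot(prices, aes(x = item, y = price)) +
  geom_col() +
  labs(x = "Item", y = "Price")
```

geom_bar() needs only an x mapping because it computes the heights itself, while geom_col() needs both x and y because the heights come straight from the data.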

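Extension: the position argument. Here, position refers to fine adjustment of how the marks are placed. Its most common use is in grouped bar charts, where the bars within a group can be either stacked or not stacked (placed side by side).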
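A minimal sketch of the main position settings for grouped bars, assuming the mpg data set with drv (drive type) as the grouping variable.

```r
library(ggplot2)

p_base <- ggplot(mpg, aes(x = class, fill = drv))

# Stacked within each class (the default for geom_bar).
p_stack <- p_base + geom_bar(position = "stack")

# Not stacked: one bar per drive type, placed side by side.
p_dodge <- p_base + geom_bar(position = "dodge")

# Stacked and rescaled so every class sums to 1 (proportions).
p_fill <- p_base + geom_bar(position = "fill")
```

Stacking emphasizes the class totals, dodging makes the groups within a class easy to compare, and fill shows only the relative composition.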