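1. A CSV file can be read with the fread function (from the data.table package); name the result dd.
2. To stack the data into a single column: if there is no ID column and every column is a trait, run melt(dd).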
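The two steps above can be sketched as follows. This is a minimal example under assumptions: the trait names (height, weight, yield) and their values are invented for illustration, and the CSV is supplied inline through fread's text argument instead of a real file. measure.vars is spelled out so that melt does not have to guess which columns to stack.

```r
library(data.table)

# Hypothetical trait table with no ID column; in practice this would be
# something like dd <- fread("your_file.csv")
dd <- fread(text = "height,weight,yield\n1.2,30.5,5.1\n1.5,35.2,6.3\n")

# Stack every column: 'variable' holds the trait name, 'value' the measurement
long <- melt(dd, measure.vars = names(dd))

# The single column of values, taken column by column
long$value
```

If only the values are needed and the trait labels are not, unlist(dd, use.names = FALSE) gives the same vector.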
1. Create a data frame:
a <- data.frame("geneid1" = rep("TabHLH1", 3), "geneid2" = c("TabHLH2.1", "TabHLH2.2", "TabHLH2.3"), "geneid3" = rep("TabHLH3", 3))
The result:
  geneid1   geneid2 geneid3
1 TabHLH1 TabHLH2.1 TabHLH3
2 TabHLH1 TabHLH2.2 TabHLH3
3 TabHLH1 TabHLH2.3 TabHLH3
Load the packages:
library(dplyr)
library(tidyr)
Split the second column on ".":
b <- a %>% separate(geneid2, into = c("gene", "id"), sep = "[.]")
The result:
  geneid1    gene id geneid3
1 TabHLH1 TabHLH2  1 TabHLH3
2 TabHLH1 TabHLH2  2 TabHLH3
3 TabHLH1 TabHLH2  3 TabHLH3
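The separator is written as "[.]" and not "." because separate treats a character sep as a regular expression, where a bare dot matches any character. The sketch below uses base R's strsplit (a stand-in chosen here so the example needs no packages) to show the difference.

```r
x <- "TabHLH2.1"

# A bare "." is a regex wildcard: every character is a separator,
# so only empty strings are left
wildcard <- strsplit(x, ".")[[1]]

# "[.]" matches a literal dot
bracket <- strsplit(x, "[.]")[[1]]

# fixed = TRUE turns off regex interpretation altogether
literal <- strsplit(x, ".", fixed = TRUE)[[1]]

bracket
```

Escaping the dot as "\\." behaves the same way as "[.]".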
# By default, an R matrix is filled column by column
df <- matrix(1:10, nrow = 5)
df
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10
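For contrast, here is a small sketch of the same call with byrow = TRUE, which fills the matrix row by row. The variable names m_col and m_row are chosen here for illustration. Flattening with as.vector always walks down the columns, whichever way the matrix was filled.

```r
m_col <- matrix(1:10, nrow = 5)                # default: filled down each column
m_row <- matrix(1:10, nrow = 5, byrow = TRUE)  # filled across each row

m_col[1, ]        # first row of the column-filled matrix
m_row[1, ]        # first row of the row-filled matrix

as.vector(m_col)  # recovers 1:10 in the original order
as.vector(m_row)  # column order, so the values come out interleaved
```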
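# If the data is a matrix, call as.vector on it directly. The resulting vector can be turned into a 10000 x 1 matrix with as.matrix (probably unnecessary unless you need to continue with matrix operations)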
df <- matrix(sample(1:10,100*100,replace = TRUE), nrow = 100)
df_numeric <- as.vector(df)
df_numeric
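A short sketch of the round trip mentioned in the comment above, assuming the same 100 x 100 shape. The names m, v, col_mat and restored are chosen here for illustration, and set.seed is added only to make the run reproducible.

```r
set.seed(1)
m <- matrix(sample(1:10, 100 * 100, replace = TRUE), nrow = 100)

v <- as.vector(m)          # plain vector of length 10000, no dimensions
col_mat <- as.matrix(v)    # a 10000 x 1 matrix

length(v)
dim(col_mat)

# The original shape comes back by refilling column by column
restored <- matrix(v, nrow = 100)
identical(restored, m)
```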
# If the data is a data.frame, convert it to a matrix first and then to a vector
df <- data.frame(x1 = sample(c("Normal", "Unnormal"), 10, replace = TRUE),
                 x2 = sample(c("a", "b"), 10, replace = TRUE),
                 x3 = sample(c(1, 2), 10, replace = TRUE))
df_char <- as.vector(as.matrix(df))
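df_char # If the data contains any character column, everything is character after the conversion
# The crudest approach: preallocate a vector of length 10000 and fill it one column at a time (do not use append here)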
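The coercion can be checked on a small data frame. This is a sketch with made-up values; the names small, mixed and numeric_only are chosen here for illustration. Restricting the conversion to numeric columns keeps the result numeric.

```r
small <- data.frame(x1 = c("Normal", "Unnormal", "Normal"),
                    x3 = c(1, 2, 1))

# One character column is enough to make the whole matrix character
mixed <- as.vector(as.matrix(small))
is.character(mixed)

# Converting only the numeric column(s) keeps the numbers as numbers
numeric_only <- as.vector(as.matrix(small["x3"]))
is.numeric(numeric_only)
```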
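A minimal sketch of that brute-force approach, assuming a 100 x 100 integer matrix like the one above; the names m, out and idx are chosen here for illustration. The output vector is allocated once and each column is written into its own slice. Growing the vector with append inside the loop would copy it on every iteration, which is the likely reason to avoid it.

```r
set.seed(1)
m <- matrix(sample(1:10, 100 * 100, replace = TRUE), nrow = 100)

n_row <- nrow(m)
n_col <- ncol(m)

# Allocate the full-length result up front
out <- integer(n_row * n_col)

for (j in seq_len(n_col)) {
  # Positions in 'out' that belong to column j
  idx <- ((j - 1) * n_row + 1):(j * n_row)
  out[idx] <- m[, j]
}

# Same result as the one-step conversion
identical(out, as.vector(m))
```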