R: "embedded nulls" when reading a file

Python014


If the data file was edited on Windows, it may have been saved as UTF-16, and reading it in R can then fail as shown below. There are three ways to fix this.

> sampInfo=read.table("/media/xxx/sampInfo_origin.txt", na.strings=c("", "NA"), sep="\t", header=T)
Error in make.names(col.names, unique = TRUE) :
  invalid multibyte string at '<ff><fe>R'
In addition: Warning messages:
1: In read.table("/media/xxx/sampInfo_origin.txt", :
  line 1 appears to contain embedded nulls
2: In read.table("/media/xxx/sampInfo_origin.txt", :
  line 2 appears to contain embedded nulls
3: In read.table("/media/xxx/sampInfo_origin.txt", :
  line 3 appears to contain embedded nulls
4: In read.table("/media/xxx/sampInfo_origin.txt", :
  line 4 appears to contain embedded nulls
5: In read.table("/media/albert/xxx/sampInfo_origin.txt", :
  line 5 appears to contain embedded nulls
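The `<ff><fe>` in the error message is the UTF-16 little-endian byte order mark (BOM), and the "embedded nulls" are the 00 bytes that UTF-16 stores next to every ASCII character. A minimal sketch of how the first bytes of a file can be checked before reading it; the temporary file and its contents are made up for the demonstration and stand in for the real sampInfo_origin.txt:

```r
# Hypothetical demo file: a UTF-16LE BOM (ff fe) followed by UTF-16LE text.
tmp <- tempfile(fileext = ".txt")
body <- iconv("Run\tage\nS1\t66\n", from = "UTF-8", to = "UTF-16LE",
              toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), tmp)

# Read the first bytes in binary mode and compare them with the known BOMs.
head_bytes <- readBin(tmp, what = "raw", n = 4)
is_utf16le <- identical(head_bytes[1:2], as.raw(c(0xff, 0xfe)))
is_utf16be <- identical(head_bytes[1:2], as.raw(c(0xfe, 0xff)))

print(head_bytes)   # BOM, then "R" stored as two bytes, the second being 00
print(c(utf16le = is_utf16le, utf16be = is_utf16be))
```

If the first two bytes are ff fe, the file is most likely UTF-16LE and one of the solutions below applies.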

Solution 1: pass fileEncoding="UTF16LE" or fileEncoding="UTF16" to read.table

> sampInfo=read.table("/media/xxx/sampInfo_origin.txt", fileEncoding="UTF16LE", sep="\t", header=T)
> sampInfo=read.table("/media/xxx/sampInfo_origin.txt", fileEncoding="UTF16", sep="\t", header=T)
> head(sampInfo)
       Run Sample_Name age ancestry arthropathymeds biologics das_score
1 SRRxxx72    GSMxxx25  66     <NA>            <NA>      <NA>        NA
2 SRRxxx73    GSMxxx26  72     <NA>            <NA>      <NA>        NA
3 SRRxxx75    GSMxxx28  61     <NA>            <NA>      <NA>        NA
4 SRRxxx74    GSMxxx27  72     <NA>            <NA>      <NA>        NA
5 SRRxxx76    GSMxxx29  50     <NA>            <NA>      <NA>        NA
6 SRRxxx77    GSMxxx30  59     <NA>            <NA>      <NA>        NA
  disease_activity donor gender leflumide nsaids othermeds phenotype
1             <NA>  C137   male      <NA>   <NA>      <NA>   Healthy
2             <NA>  C141   male      <NA>   <NA>      <NA>   Healthy
3             <NA>  C383   male      <NA>   <NA>      <NA>   Healthy
4             <NA>  C148 female      <NA>   <NA>      <NA>   Healthy
5             <NA>  C391 female      <NA>   <NA>      <NA>   Healthy
6             <NA>  C392 female      <NA>   <NA>      <NA>   Healthy
  classification status plaquenil rituximab steroids sulfasalazine tissue
1              H      H      <NA>      <NA>     <NA>          <NA>  Blood
2              H      H      <NA>      <NA>     <NA>          <NA>  Blood
3              H      H      <NA>      <NA>     <NA>          <NA>  Blood
4              H      H      <NA>      <NA>     <NA>          <NA>  Blood
5              H      H      <NA>      <NA>     <NA>          <NA>  Blood
6              H      H      <NA>      <NA>     <NA>          <NA>  Blood
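Solution 1 can be tried on a small file without touching the real data. The sketch below is only an illustration: the file contents are invented, and it uses the hyphenated spelling "UTF-16LE", which is an assumption on my part (the set of accepted encoding names depends on the platform's iconv; iconvlist() shows what is available). According to ?file, R drops a leading BOM for this encoding name.

```r
# Build a small tab-separated file encoded as UTF-16LE with a BOM,
# similar to what a Windows editor would save (contents are made up).
tmp <- tempfile(fileext = ".txt")
txt <- "Run\tage\tphenotype\nS1\t66\tHealthy\nS2\t72\tHealthy\n"
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), tmp)

# Tell read.table how the bytes are encoded so that it decodes them
# before parsing the header and the fields.
demo <- read.table(tmp, fileEncoding = "UTF-16LE", sep = "\t", header = TRUE)
print(demo)
```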

Solution 2: open the file in Excel and save it as a CSV file.

> sampInfo=read.csv("/media/xxx/sampInfo_origin.csv", comment.char = "#", sep=",", header=T)
> head(sampInfo)
       Run Sample_Name age ancestry arthropathymeds biologics das_score
1 SRRxxx72    GSMxxx25  66     <NA>            <NA>      <NA>        NA
2 SRRxxx73    GSMxxx26  72     <NA>            <NA>      <NA>        NA
3 SRRxxx75    GSMxxx28  61     <NA>            <NA>      <NA>        NA
4 SRRxxx74    GSMxxx27  72     <NA>            <NA>      <NA>        NA
5 SRRxxx76    GSMxxx29  50     <NA>            <NA>      <NA>        NA
6 SRRxxx77    GSMxxx30  59     <NA>            <NA>      <NA>        NA
  disease_activity donor gender leflumide nsaids othermeds phenotype
1             <NA>  C137   male      <NA>   <NA>      <NA>   Healthy
2             <NA>  C141   male      <NA>   <NA>      <NA>   Healthy
3             <NA>  C383   male      <NA>   <NA>      <NA>   Healthy
4             <NA>  C148 female      <NA>   <NA>      <NA>   Healthy
5             <NA>  C391 female      <NA>   <NA>      <NA>   Healthy
6             <NA>  C392 female      <NA>   <NA>      <NA>   Healthy
  classification status plaquenil rituximab steroids sulfasalazine tissue
1              H      H      <NA>      <NA>     <NA>          <NA>  Blood
2              H      H      <NA>      <NA>     <NA>          <NA>  Blood
3              H      H      <NA>      <NA>     <NA>          <NA>  Blood
4              H      H      <NA>      <NA>     <NA>          <NA>  Blood
5              H      H      <NA>      <NA>     <NA>          <NA>  Blood
6              H      H      <NA>      <NA>     <NA>          <NA>  Blood

Solution 3: on Linux, open sampInfo_origin.txt in gedit and save it as sampInfo_origin01.txt, changing "Character Encoding" to UTF-8 and "Line ending" to "Unix/Linux".

>sampInfo=read.table("/media/xxx
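The same conversion can also be scripted instead of done by hand in gedit. This is a sketch of that alternative, not the method used above: it decodes a UTF-16LE file with Windows line endings and writes it back as UTF-8 with Unix line endings. The temporary files src and dst are hypothetical stand-ins for sampInfo_origin.txt and sampInfo_origin01.txt, and the contents are invented.

```r
# Hypothetical input: UTF-16LE with BOM and CRLF line endings.
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".txt")
body <- iconv("Run\tage\r\nS1\t66\r\n", from = "UTF-8", to = "UTF-16LE",
              toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), src)

# Decode while reading; readLines() also strips the line terminators.
con_in <- file(src, open = "r", encoding = "UTF-16LE")
lines <- readLines(con_in)
close(con_in)

# Write UTF-8 bytes through a binary connection so that "\n" is kept
# as a plain LF on every platform.
con_out <- file(dst, open = "wb")
writeLines(enc2utf8(lines), con_out, sep = "\n", useBytes = TRUE)
close(con_out)

# The converted file can now be read without any encoding argument.
demo <- read.table(dst, sep = "\t", header = TRUE)
print(demo)
```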

(Notes for my own reference: graphical parameters of par())

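adj: text justification (0 = left, 0.5 = centered, 1 = right)

ask: whether to prompt before a new page is drawn

bg: background color

bty: style of the box drawn around the plot: "o" draws all four sides, "l" the left and bottom, "7" the top and right, "c" the top, left and bottom

cex: size of points and text; cex.axis scales the axis tick labels, cex.lab the axis titles, cex.main the main title, cex.sub the subtitle. col: color.

family: font family

fg: foreground color

font: font face: 1 = plain, 2 = bold, 3 = italic, 4 = bold italic

las: orientation of the axis labels relative to the axis (0 = parallel to the axis, 1 = always horizontal, 2 = perpendicular to the axis, 3 = always vertical)

lend: style of line ends

lty: line type (solid, dashed, ...)

lwd: line width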

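mai, mar, mex: margin sizes: mai gives the margins in inches, mar in lines of text, and mex is the character expansion factor used for margin coordinates

mfcol, mfrow: split the canvas into a grid so that several figures fit on it, given as c(rows, columns); both do the same thing, except that mfcol fills the grid by column and mfrow by row

pch: shape of the plotting symbol; the integers 0 to 25 select the built-in symbols

srt: rotation angle of text in the plot, in degrees

tck: length of the axis tick marks (the size of the tick labels is set with cex.axis)

xaxt / yaxt: axis type; "n" suppresses the axis

xlog / ylog: whether the x axis or y axis uses a log scale

xpd: clipping: FALSE clips drawing to the plot region, TRUE to the figure region, NA to the whole device

fig: position of the figure on the canvas, given as c(x1, x2, y1, y2)

new: if TRUE, the next plot is drawn on top of the existing one without clearing it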
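A short sketch of how a few of these parameters fit together. The values are arbitrary and chosen only for illustration; pdf(NULL) is used so that the example runs without opening a window or writing a file.

```r
pdf(NULL)  # null device: nothing is written to disk

# par() returns the previous values, so they can be restored afterwards.
old <- par(mfrow = c(1, 2),        # one row, two figures side by side
           mar = c(4, 4, 2, 1),    # margins in lines: bottom, left, top, right
           las = 1,                # axis labels always horizontal
           bty = "l",              # box on the left and bottom only
           cex.axis = 0.8)         # slightly smaller tick labels

plot(1:10, pch = 16, col = "steelblue", main = "pch = 16")
plot(1:10, type = "l", lty = 2, lwd = 2, main = "lty = 2, lwd = 2")

layout_used <- par("mfrow")        # query a parameter while it is active
par(old)                           # restore the previous settings
dev.off()
```

Saving the return value of par() and passing it back is a common way to avoid leaking settings into later plots.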

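Typing the name of a function object shows its source code directly. For example, to see the code of fivenum, just type the name fivenum in the Console, which gives the following: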

function (x, na.rm = TRUE)
{
    xna <- is.na(x)
    if (na.rm)
        x <- x[!xna]
    else if (any(xna))
        return(rep.int(NA, 5))
    x <- sort(x)
    n <- length(x)
    if (n == 0)
        rep.int(NA, 5)
    else {
        n4 <- floor((n + 3)/2)/2
        d <- c(1, n4, (n + 1)/2, n + 1 - n4, n)
        0.5 * (x[floor(d)] + x[ceiling(d)])
    }
}
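To see what the last few lines of the function do, the same steps can be repeated by hand on a small vector and compared with the built-in result. The input vector is made up; with n = 8 the hinge position n4 is 2.5, so each hinge is the average of two neighbouring sorted values.

```r
x <- c(8, 3, 1, 6, 2, 7, 5, 4)      # n = 8, unsorted on purpose

xs <- sort(x)
n <- length(xs)
n4 <- floor((n + 3) / 2) / 2         # position of the lower hinge
d <- c(1, n4, (n + 1) / 2, n + 1 - n4, n)   # min, hinge, median, hinge, max

# A fractional position averages the two neighbouring sorted values.
manual <- 0.5 * (xs[floor(d)] + xs[ceiling(d)])
builtin <- fivenum(x)

print(d)
print(manual)
print(builtin)
```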