Coding Systems for Categorical Variables in Regression Analysis
stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis/
All of the examples below are based on the dataset generated as follows:
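# Note: the code highlighter here does not seem to support R; Ruby looked like the closest option, so it is used as a stopgap. Pointers on how to fix this are welcome.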
set.seed(999)
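# Generate the data: three classes of 30 observations each, with normally distributed grades
df <- data.frame(
  class = c(rep("low", 30), rep("mid", 30), rep("high", 30)),
  grade = c(rnorm(30, 65, sd = 10), rnorm(30, 75, sd = 5), rnorm(30, 85, sd = 5))
)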
# Summary of the data
## Mean grade of each group (low, mid, high)
mean(df$grade[df$class == "low"]) # result: 61.58283
mean(df$grade[df$class == "mid"]) # result: 74.68185
mean(df$grade[df$class == "high"]) # result: 85.07108
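## A more compact way to get the same three means (a sketch, not part of the
## original post): tapply() applies mean() to grade within each class.
## `group_means` is a name introduced here purely for illustration.
group_means <- tapply(df$grade, df$class, mean)
## tapply() orders the groups alphabetically (high, low, mid), so index by
## name to print them in the low, mid, high order used above.
print(group_means[c("low", "mid", "high")])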
## Differences between group means, with the low group as the reference
print(mean(df$grade[df$class == "high"]) - mean(df$grade[df$class == "low"])) # result: 23.48825
print(mean(df$grade[df$class == "mid"]) - mean(df$grade[df$class == "low"])) # result: 13.09902
## Grand mean, and the high group's deviation from it
print(mean(df$grade)) # result: 73.77858
print(mean(df$grade[df$class == "high"]) - mean(df$grade)) # result: 11.2925
print(mean(df$grade[df$class