Hey everyone,
I recently bought a book that covers some basic features of R. A few of the chapters are available for free in the attachment below. If you would like to download R, use this link:
http://cran.r-project.org/bin/windows/base/
SUMMARY STATISTICS
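To get the mean of every variable in a data frame (called mydata here), type the following. The na.rm=TRUE argument tells mean() to ignore missing values: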
sapply(mydata, mean, na.rm=TRUE)
You should get:
 weight      BP disease
 102.03   40.03    0.52
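If you want to try this without the original data set, here is a minimal sketch. The data frame is made up (the values are hypothetical; only the column names match the ones above), and one weight is set to NA to show what na.rm=TRUE changes.

```r
# Hypothetical toy data; NOT the data set used in this post
mydata <- data.frame(
  weight  = c(60, 72, NA, 95, 110),
  BP      = c(20, 31, 45, 52, 64),
  disease = c(0, 1, 0, 1, 1)
)

# Without na.rm, a column that contains NA gets a mean of NA
means_with_na <- sapply(mydata, mean)

# With na.rm = TRUE, missing values are dropped before averaging
means_no_na <- sapply(mydata, mean, na.rm = TRUE)

print(means_with_na)
print(means_no_na)
```

sapply() passes any extra arguments (here na.rm) on to the function it applies to each column, which is why the call stays so short.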
There are also numerous R functions designed to provide a range of descriptive statistics at once. For example,
summary(mydata)
     weight            BP           disease
 Min.   : 30.0   Min.   :13.00   Min.   :0.00
 1st Qu.: 69.0   1st Qu.:25.00   1st Qu.:0.00
 Median : 86.0   Median :36.50   Median :1.00
 Mean   :102.0   Mean   :40.03   Mean   :0.52
 3rd Qu.:136.2   3rd Qu.:55.00   3rd Qu.:1.00
 Max.   :195.0   Max.   :79.00   Max.   :1.00
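The Hmisc package has a describe() function that adds counts of missing and unique values plus a wider set of percentiles:
install.packages("Hmisc")
library(Hmisc)
describe(mydata)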
3 Variables      100 Observations
------------------------------------------------------------------------------------------------
weight
       n missing  unique    Info    Mean     .05     .10     .25     .50     .75     .90     .95
     100       0      73       1     102   43.75   51.90   69.00   86.00  136.25  178.30  190.00
lowest :  30  33  36  39  44, highest: 188 190 191 194 195
------------------------------------------------------------------------------------------------
BP
       n missing  unique    Info    Mean     .05     .10     .25     .50     .75     .90     .95
     100       0      51       1   40.03    17.0    18.0    25.0    36.5    55.0    68.0    72.1
lowest : 13 14 17 18 19, highest: 70 72 74 77 79
------------------------------------------------------------------------------------------------
disease
       n missing  unique    Info     Sum    Mean
     100       0       2    0.75      52    0.52
------------------------------------------------------------------------------------------------
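The percentile columns (.05 through .95) and the missing/unique counts can also be approximated in base R, which helps when you only need a few of these numbers. The sketch below uses a short made-up vector (hypothetical values, not the weight column). I am assuming the default quantile() method is close enough to what describe() reports; the .25 and .75 values above do agree with the quartiles from summary(), but treat the rest as an assumption.

```r
# Hypothetical values standing in for one column of mydata
x <- c(30, 44, 58, 69, 86, 102, 136, 160, 178, 195)

# The same percentiles that appear as columns in the describe() output
probs <- c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95)
pct <- quantile(x, probs = probs, na.rm = TRUE)

# Counts comparable to the "missing" and "unique" columns
n_missing <- sum(is.na(x))
n_unique  <- length(unique(x[!is.na(x)]))

print(pct)
print(c(missing = n_missing, unique = n_unique))
```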
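The pastecs package has stat.desc(), which also reports the standard error, the confidence interval of the mean, the variance, and the coefficient of variation:
install.packages("pastecs")
library(pastecs)
stat.desc(mydata)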
                   weight           BP      disease
nbr.val      1.000000e+02  100.0000000 100.00000000
nbr.null     0.000000e+00    0.0000000  48.00000000
nbr.na       0.000000e+00    0.0000000   0.00000000
min          3.000000e+01   13.0000000   0.00000000
max          1.950000e+02   79.0000000   1.00000000
range        1.650000e+02   66.0000000   1.00000000
sum          1.020300e+04 4003.0000000  52.00000000
median       8.600000e+01   36.5000000   1.00000000
mean         1.020300e+02   40.0300000   0.52000000
SE.mean      4.619403e+00    1.8337716   0.05021167
CI.mean.0.95 9.165897e+00    3.6386006   0.09963085
var          2.133888e+03  336.2718182   0.25212121
std.dev      4.619403e+01   18.3377157   0.50211673
coef.var     4.527494e-01    0.4580993   0.96560910
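In case the last few rows look mysterious, the sketch below re-derives them for weight from the n, mean, and std.dev values printed above. The formulas are my reading of the output rather than something taken from the pastecs documentation: the numbers are consistent with SE.mean being std.dev/sqrt(n), CI.mean.0.95 being the half-width of a 95% t-interval, and coef.var being std.dev/mean.

```r
# Values copied from the weight column of the stat.desc() output above
n       <- 100        # nbr.val
avg     <- 102.03     # mean
std_dev <- 46.19403   # std.dev

# Standard error of the mean
se_mean <- std_dev / sqrt(n)

# Half-width of the 95% confidence interval, using the t distribution
# with n - 1 degrees of freedom (so the interval is avg +/- ci_half)
ci_half <- qt(0.975, df = n - 1) * se_mean

# Coefficient of variation: spread relative to the mean
coef_var <- std_dev / avg

print(c(SE.mean = se_mean, CI.mean.0.95 = ci_half, coef.var = coef_var))
```

With your own data you would replace the three constants with sum(!is.na(x)), mean(x, na.rm=TRUE), and sd(x, na.rm=TRUE).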
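The psych package has its own describe(), which adds the trimmed mean, median absolute deviation (mad), skew, and kurtosis. Note that it has the same name as the Hmisc function: whichever package was loaded last wins, so if both are loaded you can write psych::describe(mydata) or Hmisc::describe(mydata) to be explicit.
install.packages("psych")
library(psych)
describe(mydata)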
        vars   n   mean    sd median trimmed   mad min max range  skew kurtosis   se
weight     1 100 102.03 46.19   86.0   98.80 43.74  30 195   165  0.57    -0.85 4.62
BP         2 100  40.03 18.34   36.5   38.83 20.02  13  79    66  0.49    -0.93 1.83
disease    3 100   0.52  0.50    1.0    0.52  0.00   0   1     1 -0.08    -2.01 0.05
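The trimmed, mad, and se columns are the ones people usually ask about, so here is a minimal base R sketch of them on a made-up vector (hypothetical values, not the data above). I am assuming the psych defaults here, namely a 10% trim and the scaled mad() from base R.

```r
# Hypothetical values standing in for one column of mydata
x <- c(30, 44, 58, 69, 86, 102, 136, 160, 178, 195)

# Trimmed mean: drop the lowest and highest 10% of values, then average
trimmed <- mean(x, trim = 0.1)

# Median absolute deviation; base R scales it by 1.4826 so that it is
# comparable to the standard deviation for normally distributed data
mad_x <- mad(x)

# Standard error of the mean
se <- sd(x) / sqrt(length(x))

print(c(trimmed = trimmed, mad = mad_x, se = se))
```

The scaling also explains why mad is 0.00 for disease above: more than half of the values equal the median of 1, so the median absolute deviation is zero.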