Combined standard error function in R
Basically, I have several experiments (sites) spanning the course of several years, each year with its own mean and standard error (based on several replicates each), and I want to calculate the grand mean and standard error for each site. The grand mean seems straightforward (average the means?), but the grand standard error is less intuitive to me. How can I create a function to calculate the grand SE and use it in dplyr? A simplified version of my data is below:
> print(tbl_df(df), n=40)
Source: local data frame [76 x 8]

            site year myc  co2     n      anpp   anpp.se nyears
 1   placerville 1991 ecm elev  nlow    0.8100   0.14000      3
 2   placerville 1991 ecm  amb  nlow    0.5400   0.07000      3
 3   placerville 1992 ecm elev  nlow   53.1200  11.83000      3
 4   placerville 1992 ecm  amb  nlow   26.9000   3.28000      3
 5   placerville 1993 ecm elev  nlow 1068.3000 183.80000      3
 6   placerville 1993 ecm  amb  nlow  619.0000 118.90000      3
 7   placerville 1991 ecm elev nhigh    1.5700   0.26000      3
 8   placerville 1991 ecm  amb nhigh    1.2800   0.17000      3
 9   placerville 1992 ecm elev nhigh   75.4300  10.29000      3
10   placerville 1992 ecm  amb nhigh   56.2700   7.34000      3
11   placerville 1993 ecm elev nhigh 2118.9000 696.10000      3
12   placerville 1993 ecm  amb nhigh 1235.8000 260.40000      3
13   jasper_face 1999      amb  nlow  386.3371  34.92557      5
14   jasper_face 2000      amb  nlow  551.2848 124.64485      5
15   jasper_face 2001      amb  nlow  552.1139  56.65156      5
16   jasper_face 2002      amb  nlow  410.7524  27.64737      5
17   jasper_face 2003      amb  nlow  503.6037  57.68552      5
18   jasper_face 1999      amb nhigh  680.8551  67.99471      5
19   jasper_face 2000      amb nhigh  480.5723  33.52034      5
20   jasper_face 2001      amb nhigh  744.5131 125.32998      5
21   jasper_face 2002      amb nhigh  603.6049  62.19760      5
22   jasper_face 2003      amb nhigh  711.5993 142.04351      5
23   jasper_face 1999     elev  nlow  488.5912  61.47564      5
24   jasper_face 2000     elev  nlow  406.2773  32.90862      5
25   jasper_face 2001     elev  nlow  543.3647  55.28956      5
26   jasper_face 2002     elev  nlow  480.7108  65.24701      5
27   jasper_face 2003     elev  nlow  473.6844  52.01606      5
28   jasper_face 1999     elev nhigh  638.0252  58.34743      5
29   jasper_face 2000     elev nhigh  505.2054 171.62024      5
30   jasper_face 2001     elev nhigh  655.1032 130.01279      5
31   jasper_face 2002     elev nhigh  677.7134  98.84845      5
32   jasper_face 2003     elev nhigh  926.3433 143.26525      5
33 merrit_island 1997 ecm  amb  nlow  137.0940  22.20700      4
34 merrit_island 1998 ecm  amb  nlow  296.4870  53.32100      4
35 merrit_island 1999 ecm  amb  nlow  350.9470  57.85000      4
36 merrit_island 2000 ecm  amb  nlow  494.6030  66.70200      4
37 merrit_island 1997 ecm elev  nlow  203.7970  26.63300      4
38 merrit_island 1998 ecm elev  nlow  467.8080  62.33200      4
39 merrit_island 1999 ecm elev  nlow  586.8180  91.26500      4
40 merrit_island 2000 ecm elev  nlow  866.3460 126.77000      4
I need to implement a function in R that I can call from dplyr to calculate the grand mean and grand SE for each group, like this:
tempse <- df %>% group_by(site,co2,n,nyears) %>% summarise(anpp=mean(anpp), sd=grand.sd(anpp.se))
Edit: in case the answer involves an equation that includes the sample size: in my dataset, the column nyears is the number of years, i.e. the number of measurements per site and co2 treatment that I need to average over. On the other hand, within each year, each anpp mean and anpp.se is based on a number of replicates or plots. That sample size is contained in the SE but is not specified in any column. Which of these two types of sample size do I need?

Thanks
If you don't know the sample sizes, it is impossible to calculate the grand mean or the grand standard error. Here is a small example: coin flipping, counting "heads" as 1 and "tails" as 0. The mean of our first sample is 0.45, and the mean of the second sample is 0.65. If the two samples have the same size, the grand mean is 0.55. If the sample sizes are 900 and 100, respectively, we have 405+65 "heads", so the grand mean is 0.47.

If the sample sizes are known, the grand mean can be computed as follows:
- Multiply each individual mean by the corresponding sample size.
- Sum these numbers.
- Divide the sum by the sum of the individual sample sizes.
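
The three steps above can be sketched in base R as follows. The function name `grand.mean` and its arguments `m` (the individual means) and `n` (the corresponding sample sizes) are illustrative choices, not something from the question; the numbers are the coin-flipping example.

```r
# Grand mean of several sample means, weighted by sample size.
# m: vector of individual means; n: vector of matching sample sizes.
# (Both argument names are hypothetical.)
grand.mean <- function(m, n) {
  stopifnot(length(m) == length(n), all(n > 0))
  sum(m * n) / sum(n)
}

# Coin-flipping example: sample means 0.45 and 0.65
equal_sizes   <- grand.mean(c(0.45, 0.65), c(100, 100))  # 0.55
unequal_sizes <- grand.mean(c(0.45, 0.65), c(900, 100))  # 0.47
```

With equal sample sizes this reduces to the plain average of the means, which is why `mean(anpp)` is only correct when every year has the same number of replicates.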
To compute the standard error, proceed as follows:
- Multiply the square of each individual standard error by the corresponding sample size. (This gives the variance of each sample.)
- To each of these numbers, add the square of the corresponding mean.
- Multiply each of these numbers by the corresponding sample size. (These are the sums of squares of the sampled values.)
- Sum these numbers. (Now you have the total sum of squares.)
- Divide the sum by the sum of the individual sample sizes. (This gives the mean of squares.)
- Subtract the square of the grand mean. (This gives the variance.)
- Take the square root of this number. (This gives the standard deviation.)
- Divide this number by the square root of the sum of the individual sample sizes.
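
A minimal base R sketch of the steps above, for illustration only. The name `grand.se` and the arguments `m`, `se` and `n` are hypothetical; `n` is assumed to be the number of replicates behind each mean and standard error. The steps treat each variance as a sum of squared deviations divided by n, so if the standard errors come from the usual n-1 formula the result is an approximation.

```r
# Grand standard error from per-sample means, standard errors and sizes.
# m: individual means; se: their standard errors; n: their sample sizes.
# (All three argument names are hypothetical.)
grand.se <- function(m, se, n) {
  stopifnot(length(m) == length(se), length(m) == length(n), all(n > 0))
  N  <- sum(n)                      # total sample size
  gm <- sum(m * n) / N              # grand mean
  ss <- n * (se^2 * n + m^2)        # sum of squares within each sample
  v  <- sum(ss) / N - gm^2          # variance of the pooled values
  sqrt(v) / sqrt(N)                 # standard deviation / sqrt(total size)
}

# Coin-flipping check: for 0/1 data the SE of a mean p is sqrt(p * (1 - p) / n)
m  <- c(0.45, 0.65)
n  <- c(900, 100)
se <- sqrt(m * (1 - m) / n)
pooled_se <- grand.se(m, se, n)     # should match sqrt(0.47 * 0.53 / 1000)
```

Inside `summarise()` this could be called as `grand.se(anpp, anpp.se, nrep)`, where `nrep` is a hypothetical column holding the number of replicates per year; the data shown in the question has no such column. Because `summarise()` lets later expressions see earlier results, the call would need to come before `anpp = mean(anpp)` overwrites the per-year means.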
Writing an R function for this should be straightforward. You need the sample sizes, at least up to a common factor.