Combined standard error function in R


Basically, I have several experiments (sites) spanning the course of several years. Each year has its own mean and standard error (based on several replicates each), and I want to calculate the grand mean and standard error for each site. The grand mean seems straightforward (the average of the means?), but the grand standard error is less intuitive to me. How can I create a function to calculate the grand SE and use it with dplyr? A simplified version of my data is below:

    > print(tbl_df(df), n=40)
    Source: local data frame [76 x 8]

                site year myc  co2     n      anpp   anpp.se nyears
    1    placerville 1991 ecm elev  nlow    0.8100   0.14000      3
    2    placerville 1991 ecm  amb  nlow    0.5400   0.07000      3
    3    placerville 1992 ecm elev  nlow   53.1200  11.83000      3
    4    placerville 1992 ecm  amb  nlow   26.9000   3.28000      3
    5    placerville 1993 ecm elev  nlow 1068.3000 183.80000      3
    6    placerville 1993 ecm  amb  nlow  619.0000 118.90000      3
    7    placerville 1991 ecm elev nhigh    1.5700   0.26000      3
    8    placerville 1991 ecm  amb nhigh    1.2800   0.17000      3
    9    placerville 1992 ecm elev nhigh   75.4300  10.29000      3
    10   placerville 1992 ecm  amb nhigh   56.2700   7.34000      3
    11   placerville 1993 ecm elev nhigh 2118.9000 696.10000      3
    12   placerville 1993 ecm  amb nhigh 1235.8000 260.40000      3
    13   jasper_face 1999      amb  nlow  386.3371  34.92557      5
    14   jasper_face 2000      amb  nlow  551.2848 124.64485      5
    15   jasper_face 2001      amb  nlow  552.1139  56.65156      5
    16   jasper_face 2002      amb  nlow  410.7524  27.64737      5
    17   jasper_face 2003      amb  nlow  503.6037  57.68552      5
    18   jasper_face 1999      amb nhigh  680.8551  67.99471      5
    19   jasper_face 2000      amb nhigh  480.5723  33.52034      5
    20   jasper_face 2001      amb nhigh  744.5131 125.32998      5
    21   jasper_face 2002      amb nhigh  603.6049  62.19760      5
    22   jasper_face 2003      amb nhigh  711.5993 142.04351      5
    23   jasper_face 1999     elev  nlow  488.5912  61.47564      5
    24   jasper_face 2000     elev  nlow  406.2773  32.90862      5
    25   jasper_face 2001     elev  nlow  543.3647  55.28956      5
    26   jasper_face 2002     elev  nlow  480.7108  65.24701      5
    27   jasper_face 2003     elev  nlow  473.6844  52.01606      5
    28   jasper_face 1999     elev nhigh  638.0252  58.34743      5
    29   jasper_face 2000     elev nhigh  505.2054 171.62024      5
    30   jasper_face 2001     elev nhigh  655.1032 130.01279      5
    31   jasper_face 2002     elev nhigh  677.7134  98.84845      5
    32   jasper_face 2003     elev nhigh  926.3433 143.26525      5
    33 merrit_island 1997 ecm  amb  nlow  137.0940  22.20700      4
    34 merrit_island 1998 ecm  amb  nlow  296.4870  53.32100      4
    35 merrit_island 1999 ecm  amb  nlow  350.9470  57.85000      4
    36 merrit_island 2000 ecm  amb  nlow  494.6030  66.70200      4
    37 merrit_island 1997 ecm elev  nlow  203.7970  26.63300      4
    38 merrit_island 1998 ecm elev  nlow  467.8080  62.33200      4
    39 merrit_island 1999 ecm elev  nlow  586.8180  91.26500      4
    40 merrit_island 2000 ecm elev  nlow  866.3460 126.77000      4

I need to implement a function in R that I can specify within dplyr to calculate the grand mean and the grand SE for each group, like this:

    tempse <- df %>%
      group_by(site, co2, n, nyears) %>%
      summarise(anpp = mean(anpp),
                sd = grand.sd(anpp.se))

Edit: in case the answer involves an equation that includes the sample size: in my dataset, the column nyears is the number of years, i.e. the number of measurements per site and co2 treatment that I need to average over. On the other hand, within each year, each anpp mean and anpp.se is based on a number of replicates or plots, so that sample size is contained in the SE but is not specified in any column. Which of these two types of sample size do I need?

Thanks

If you don't know the sample sizes, it is impossible to calculate the grand mean or the grand standard error. Here is a small example: coin flipping, counting "heads" as 1 and "tails" as 0. The mean of our first sample is 0.45, and the mean of the second sample is 0.65. If the two samples have the same size, the grand mean is 0.55. If the sample sizes are 900 and 100, respectively, we have 405+65 "heads", so the grand mean is 0.47. If the sample sizes are known, the grand mean can be computed as follows:

  1. Multiply each individual mean by the corresponding sample size.
  2. Sum these numbers.
  3. Divide the sum by the sum of the individual sample sizes.
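
As a minimal sketch, the three steps above could be written in R as follows. The name grand.mean and its arguments are illustrative choices, not part of the original question; the example values are the coin-flipping numbers from the paragraph above.

```r
# Grand mean of several samples, given each sample's mean and size.
# 'grand.mean' is a hypothetical helper name chosen for illustration.
grand.mean <- function(means, n) {
  stopifnot(length(means) == length(n), all(n > 0))
  # Steps 1-2: weight each mean by its sample size and sum;
  # step 3: divide by the total sample size.
  sum(means * n) / sum(n)
}

means <- c(0.45, 0.65)
grand.mean(means, n = c(1, 1))      # equal sample sizes: 0.55
grand.mean(means, n = c(900, 100))  # sizes 900 and 100: 0.47
```

This is the same quantity that base R's weighted.mean(means, n) returns, so the built-in function could be used instead.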

To compute the standard error, proceed as follows:

  1. Multiply the square of each individual standard error by the corresponding sample size. (This gives the variance of each sample.)
  2. To each of these numbers, add the square of the corresponding mean.
  3. Multiply each of these numbers by the corresponding sample size. (These are the sums of squares of the sampled values.)
  4. Sum these numbers. (Now you have the total sum of squares.)
  5. Divide the sum by the sum of the individual sample sizes. (This gives the mean of squares.)
  6. Subtract the square of the grand mean. (-> variance)
  7. Take the square root of this number. (-> standard deviation)
  8. Divide this number by the square root of the sum of the individual sample sizes.
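
The eight steps above can be sketched in R roughly as follows. The name grand.se and its arguments are assumptions made for illustration. The recipe treats se^2 * n as each sample's variance with divisor n, so if the standard errors came from the usual n - 1 standard deviation the result is only approximate. It also pools all observations into one sample, which may or may not be what a given design calls for.

```r
# Grand standard error from per-sample means, standard errors and sizes.
# 'grand.se' is a hypothetical helper name chosen for illustration.
grand.se <- function(means, se, n) {
  stopifnot(length(means) == length(se),
            length(se) == length(n),
            all(n > 0))
  var.each <- se^2 * n            # step 1: variance of each sample
  mean.sq  <- var.each + means^2  # step 2: mean of squares of each sample
  total.sq <- sum(mean.sq * n)    # steps 3-4: total sum of squares
  N        <- sum(n)              # total sample size
  gm       <- sum(means * n) / N  # grand mean
  # steps 5-6: pooled variance (clamped at 0 against rounding error)
  variance <- max(total.sq / N - gm^2, 0)
  sqrt(variance) / sqrt(N)        # steps 7-8: SD, then standard error
}

# Coin-flipping check: 900 flips with mean 0.45, 100 flips with mean 0.65.
means <- c(0.45, 0.65)
n     <- c(900, 100)
se    <- sqrt(means * (1 - means) / n)  # SE of a 0/1 sample (divisor n)
grand.se(means, se, n)                  # about 0.0158
sqrt(0.47 * 0.53 / 1000)                # same value from the pooled sample
```

In the coin example the pooled sample has 470 heads in 1000 flips, so the result can be checked directly against the standard error of that single sample.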

Writing an R function should be straightforward. But you need the sample sizes, at least up to a common factor.
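
To connect this back to the dplyr pipeline in the question, here is one possible sketch. It assumes a column nrep holding the number of replicates behind each yearly mean; that column does not exist in the data shown, and the value 8 below is a made-up placeholder, not a real replicate count. The helper grand.se is a hypothetical name, and .groups = "drop" assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical helper: grand SE from per-year means, SEs and replicate counts.
grand.se <- function(means, se, n) {
  N  <- sum(n)
  gm <- sum(means * n) / N
  sqrt(max(sum((se^2 * n + means^2) * n) / N - gm^2, 0) / N)
}

# Four rows from the question (merrit_island, amb, nlow).
# nrep is NOT in the original data: 8 is a placeholder assumption.
df <- data.frame(
  site    = "merrit_island",
  year    = 1997:2000,
  co2     = "amb",
  n       = "nlow",
  anpp    = c(137.094, 296.487, 350.947, 494.603),
  anpp.se = c(22.207, 53.321, 57.850, 66.702),
  nyears  = 4,
  nrep    = 8
)

tempse <- df %>%
  group_by(site, co2, n, nyears) %>%
  summarise(
    # Compute the SE first: once 'anpp' is overwritten by its summary,
    # later expressions would see the grand mean, not the yearly means.
    anpp.grand.se = grand.se(anpp, anpp.se, nrep),
    anpp          = weighted.mean(anpp, nrep),
    .groups       = "drop"
  )

print(tempse)
```

The order of the two summaries matters here because summarise() lets later expressions see columns created earlier in the same call.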

