r - What are the internals of merge followed by aggregation with data.table?


One common use case for data.table in my work is to merge two tables and then aggregate the result, where the intermediary result is not needed further on.

Example:

    library(data.table)

    set.seed(200)

    # a.dt: one row per (k1, k2) pair with a random count in val
    a.dt <- data.table(k1 = 2000 + sample.int(12, 50, replace = TRUE))[
      , k2 := month.abb[sample(12, .N, replace = TRUE)]][
      , .(val = rnbinom(1, 50, .8))
      , by = .(k1, k2)]

    # b.dt: six k3 levels per k1, with two adjustment factors
    b.dt <- a.dt[
      , .(k3 = letters[7:12]), by = k1][
      , adj1 := 1/rbeta(1, 2, 5)
      , by = k3][
      , adj2 := rbeta(.N, 5, .5)]

    setkey(a.dt, k1)
    setkey(b.dt, k1)

    # merge on k1, then aggregate the merged rows
    result.dt <- a.dt[b.dt, allow.cartesian = TRUE][
      , .(weighted = sum(val * adj1 * adj2))
      , by = .(k3, k1, k2)]

My question is whether the intermediary data.table "a.dt[b.dt, allow.cartesian = TRUE]" is first allocated in full, or whether it is piped directly into the following aggregation.

Yours, Alexander

