r - What are the internals of merge followed by aggregation with data.table?
One common use case for data.table in my work is to merge two tables and then aggregate, where the intermediary result is not needed further on.
Example:
    library(data.table)
    set.seed(200)

    # a.dt: one row per (k1, k2) pair with a random value
    a.dt <- data.table(k1 = 2000 + sample.int(12, 50, replace = TRUE))[
      , k2 := month.abb[sample(12, .N, replace = TRUE)]][
      , .(val = rnbinom(1, 50, .8))
      , by = .(k1, k2)]

    # b.dt: six k3 levels per k1, with two adjustment factors
    b.dt <- a.dt[ , .(k3 = letters[7:12]), by = k1][
      , adj1 := 1 / rbeta(1, 2, 5)
      , by = k3][
      , adj2 := rbeta(.N, 5, .5)]

    setkey(a.dt, k1)
    setkey(b.dt, k1)

    # merge on k1, then aggregate; the joined table itself is not kept
    result.dt <- a.dt[b.dt, allow.cartesian = TRUE][
      , .(weighted = sum(val * adj1 * adj2))
      , by = .(k3, k1, k2)]
My question is whether the intermediary data.table "a.dt[b.dt, allow.cartesian = TRUE]" is first allocated in full, or whether it is piped directly into the following aggregation.
Yours, Alexander