R - Calculate aggregated variations from expected values (not std deviations)


I am trying to model the pricing accuracy of a group of sellers in a network.

My data set (pricing) looks like this:

      transactionid sellerid expectedprice actualprice pricediff
    1          1001      251           200         210        10
    2          1002      101           200         300       100
    3          1003      251           400         190      -210
    4          1004      251           300         300         0
    5          1005      101           250         250         0
    6          1006      350           200         210        10
    7          1007      401           400         400         0

Note: I am not trying to do a standard deviation calculation, since I am not trying to calculate the variance from a mean, but rather the variance from an expected value column that can differ depending on the transaction.

I am comfortable inserting a new column of absolute variances from the expected value into the table using:

pricing$diffabs <- abs(pricing$pricediff) 

which results in the following:

    transactionid  sellerid  expectedprice  actualprice  pricediff  diffabs
    1001           251       200            210          10         10
    1002           101       200            300          100        100
    1003           251       400            190          -210       210
    1004           251       300            300          0          0
    1005           101       250            250          0          0
    1006           350       200            210          10         10
    1007           401       400            400          0          0

How can one calculate a variance score for each seller, which would be:

The sum of abs(pricing$pricediff) grouped by "sellerid", divided by the number of observations (count) of that "sellerid" in the data.

The output I expect is the following:

    sellerid  count  sumofdiffabs  variation
    251       3      220           73.33333333
    101       2      100           50
    350       1      10            10
    401       1      0             0
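As a sanity check on the numbers above, the definition (sum of absolute differences per seller divided by that seller's count) can be sketched in base R with tapply(). This is only an illustration under the assumption that pricediff is actualprice minus expectedprice, which matches the sample rows; the name `check` is made up for this sketch.

```r
# Rebuild the example data from the question.
pricing <- data.frame(
  transactionid = 1001:1007,
  sellerid      = c(251, 101, 251, 251, 101, 350, 401),
  expectedprice = c(200, 200, 400, 300, 250, 200, 400),
  actualprice   = c(210, 300, 190, 300, 250, 210, 400)
)
# Assumption: pricediff is actual minus expected (consistent with the sample).
pricing$pricediff <- pricing$actualprice - pricing$expectedprice
pricing$diffabs   <- abs(pricing$pricediff)

# Per-seller count and sum of the absolute differences.
count        <- tapply(pricing$diffabs, pricing$sellerid, length)
sumofdiffabs <- tapply(pricing$diffabs, pricing$sellerid, sum)

# "check" is a hypothetical name for the summary table.
check <- data.frame(
  sellerid     = as.numeric(names(count)),
  count        = as.vector(count),
  sumofdiffabs = as.vector(sumofdiffabs),
  variation    = as.vector(sumofdiffabs / count)
)
check
```

Note that tapply() orders the groups by sellerid, so the rows come out sorted (101, 251, 350, 401) rather than in order of first appearance.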

The other topics that deal with variances in R at an aggregated level seem to deal with standard deviation or variances from the mean, such as this:

Calculating grouped variance from a frequency table in R

The aggregate function works for me when using a simple function such as standard deviation, but I have not been able to figure out how to insert a count function. What is throwing me off is that the variance is a deviation not from the mean, but from a column, and that the count should appear as a column in the resulting table.
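One possible way to get a count out of aggregate() is to pass a function that returns several named values at once. The sketch below assumes the diffabs column already exists; `stats` and `agg` are hypothetical names, and the flattening step is needed because aggregate() stores a multi-value result as a single matrix column.

```r
# Minimal version of the example data (only the columns needed here).
pricing <- data.frame(
  sellerid  = c(251, 101, 251, 251, 101, 350, 401),
  pricediff = c(10, 100, -210, 0, 0, 10, 0)
)
pricing$diffabs <- abs(pricing$pricediff)

# Hypothetical helper: count, sum and mean of one seller's values.
stats <- function(x) {
  c(count = length(x), sumofdiffabs = sum(x), variation = mean(x))
}

agg <- aggregate(diffabs ~ sellerid, data = pricing, FUN = stats)

# The three statistics sit in one matrix column; spread them into
# ordinary columns and give them plain names.
agg <- do.call(data.frame, agg)
names(agg) <- c("sellerid", "count", "sumofdiffabs", "variation")
agg
```

Because the deviation is taken from the expected price before aggregating (via diffabs), the per-group mean here is the mean absolute difference from the expected value, not a deviation from the group mean.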

    # Rebuild the example data from the question
    m = matrix(c(1001, 251, 200, 210,   10,
                 1002, 101, 200, 300,  100,
                 1003, 251, 400, 190, -210,
                 1004, 251, 300, 300,    0,
                 1005, 101, 250, 250,    0,
                 1006, 350, 200, 210,   10,
                 1007, 401, 400, 400,    0),
               ncol = 5, nrow = 7, byrow = TRUE)
    colnames(m) = c("transactionid", "sellerid", "expectedprice", "actualprice", "pricediff")
    pricing = as.data.frame(m)
    pricing$diffabs <- abs(pricing$pricediff)
    pricing

      transactionid sellerid expectedprice actualprice pricediff diffabs
    1          1001      251           200         210        10      10
    2          1002      101           200         300       100     100
    3          1003      251           400         190      -210     210
    4          1004      251           300         300         0       0
    5          1005      101           250         250         0       0
    6          1006      350           200         210        10      10
    7          1007      401           400         400         0       0

So here is the result:

    library(data.table)
    pricing = as.data.table(pricing)
    # Helper that returns the group size as a named list element
    f <- function(x) {list(count = length(x))}
    result <- pricing[, c(f(diffabs),
                          sumofdiffabs = sum(diffabs),
                          variation = mean(diffabs)),
                      by = sellerid]
    result

       sellerid count sumofdiffabs variation
    1:      251     3          220  73.33333
    2:      101     2          100  50.00000
    3:      350     1           10  10.00000
    4:      401     1            0   0.00000
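A slightly shorter variant of the same idea, offered only as a sketch: data.table provides the special symbol .N for the number of rows in each group, so the helper function f may not be needed. The data is rebuilt inline so the snippet runs on its own.

```r
library(data.table)

# Minimal version of the example data (only the columns needed here).
pricing <- data.table(
  sellerid  = c(251, 101, 251, 251, 101, 350, 401),
  pricediff = c(10, 100, -210, 0, 0, 10, 0)
)
pricing[, diffabs := abs(pricediff)]

# .N is the per-group row count; sum() and mean() work on each group's diffabs.
result <- pricing[, .(count        = .N,
                      sumofdiffabs = sum(diffabs),
                      variation    = mean(diffabs)),
                  by = sellerid]
result
```

With by = sellerid the groups are returned in order of first appearance (251, 101, 350, 401), which matches the ordering of the expected output in the question.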
