I am trying to model the pricing accuracy of a group of sellers in a network.
My data set (pricing) looks like this:
  transactionid sellerid expectedprice actualprice pricediff
1          1001      251           200         210        10
2          1002      101           200         300       100
3          1003      251           400         190      -210
4          1004      251           300         300         0
5          1005      101           250         250         0
6          1006      350           200         210        10
7          1007      401           400         400         0
Note: I am not trying to do a standard deviation calculation, since I am not trying to calculate the variance from the mean, but rather the variance from the expected value column, which will differ depending on the transaction.
I am comfortable inserting a new column with the absolute variances from the expected value into the table using:
pricing$diffabs <- abs(pricing$pricediff)
which results in the following:
transactionid sellerid expectedprice actualprice pricediff diffabs
         1001      251           200         210        10      10
         1002      101           200         300       100     100
         1003      251           400         190      -210     210
         1004      251           300         300         0       0
         1005      101           250         250         0       0
         1006      350           200         210        10      10
         1007      401           400         400         0       0
How can one calculate a variance score for each seller, which would be:
the sum of abs(pricing$pricediff)
grouped by "sellerid", divided by the number of observations (count) of "sellerid" in the data?
The output I expect is the following:
sellerid count sumofdiffabs   variation
     251     3          220 73.33333333
     101     2          100          50
     350     1           10          10
     401     1            0           0
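Written out as a formula, the score described above appears to be a per-seller mean absolute deviation from the expected price. The symbols here are my own notation, not something from the original data: T_s is assumed to be the set of transactions belonging to seller s, and n_s the number of transactions in that set.

```latex
\[
\text{variation}_s
  = \frac{1}{n_s} \sum_{i \in T_s}
    \bigl\lvert \text{actualprice}_i - \text{expectedprice}_i \bigr\rvert
\]
% Worked example for seller 251, using the three rows in the table:
\[
\text{variation}_{251} = \frac{10 + 210 + 0}{3} = \frac{220}{3} \approx 73.33
\]
```

Since this is just the sum divided by the count, it should coincide with the mean of the diffabs column within each seller.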
The other topics that deal with variances in R at an aggregated level seem to deal with the standard deviation or with variances from the mean, such as this one:
Calculating grouped variance from a frequency table in R
The aggregate function works for me when using a simple function such as the standard deviation, but I have not been able to figure out how to insert a count function. What is throwing me off is that the variance is a deviation not from the mean, but from a column in the table.
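One way a count could be folded into aggregate is to have the function return several named values at once. The following is only a sketch under that assumption: the data frame is rebuilt by hand from the table in the question, and the name result_base is made up for illustration.

```r
# Rebuild the example data (values copied from the table in the question).
pricing <- data.frame(
  transactionid = 1001:1007,
  sellerid      = c(251, 101, 251, 251, 101, 350, 401),
  expectedprice = c(200, 200, 400, 300, 250, 200, 400),
  actualprice   = c(210, 300, 190, 300, 250, 210, 400)
)
pricing$pricediff <- pricing$actualprice - pricing$expectedprice
pricing$diffabs   <- abs(pricing$pricediff)

# Return count, sum and mean together for each seller.
agg <- aggregate(
  diffabs ~ sellerid,
  data = pricing,
  FUN  = function(x) c(count = length(x),
                       sumofdiffabs = sum(x),
                       variation = mean(x))
)

# aggregate() stores the three values as one matrix column;
# do.call(data.frame, ...) flattens it into ordinary columns.
result_base <- do.call(data.frame, agg)
names(result_base) <- c("sellerid", "count", "sumofdiffabs", "variation")
result_base
```

Note that aggregate sorts the groups by sellerid, so the rows come back in a different order from the expected output shown earlier.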
# Rebuild the example data as a 7 x 5 matrix, filled row by row
m = matrix(c(1001,251,200,210,10,
             1002,101,200,300,100,
             1003,251,400,190,-210,
             1004,251,300,300,0,
             1005,101,250,250,0,
             1006,350,200,210,10,
             1007,401,400,400,0),
           ncol = 5, nrow = 7, byrow = TRUE)
colnames(m) = c("transactionid","sellerid","expectedprice","actualprice","pricediff")
pricing = as.data.frame(m)
# Absolute deviation of the actual price from the expected price
pricing$diffabs <- abs(pricing$pricediff)
pricing

transactionid sellerid expectedprice actualprice pricediff diffabs
         1001      251           200         210        10      10
         1002      101           200         300       100     100
         1003      251           400         190      -210     210
         1004      251           300         300         0       0
         1005      101           250         250         0       0
         1006      350           200         210        10      10
         1007      401           400         400         0       0
So here is the result:
library(data.table)
pricing = as.data.table(pricing)
f <- function(x) {list(count = length(x))}
result <- pricing[, c(f(diffabs),
                      sumofdiffabs = sum(diffabs),
                      variation = mean(diffabs)),
                  by = sellerid]
result

   sellerid count sumofdiffabs variation
1:      251     3          220  73.33333
2:      101     2          100  50.00000
3:      350     1           10  10.00000
4:      401     1            0   0.00000
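As a possible simplification, data.table's built-in .N symbol gives the group size directly, so the helper function f may not be needed. This is a sketch rather than a drop-in replacement: it rebuilds the data by hand from the table in the question, computes the absolute difference inline instead of using a diffabs column, and the name result2 is made up for illustration.

```r
library(data.table)

# Example data copied from the table in the question.
pricing <- data.table(
  sellerid      = c(251, 101, 251, 251, 101, 350, 401),
  expectedprice = c(200, 200, 400, 300, 250, 200, 400),
  actualprice   = c(210, 300, 190, 300, 250, 210, 400)
)

# .N is the number of rows in each group, so no count helper is required.
result2 <- pricing[, .(
  count        = .N,
  sumofdiffabs = sum(abs(actualprice - expectedprice)),
  variation    = mean(abs(actualprice - expectedprice))
), by = sellerid]
result2
```

Grouping with by keeps the sellers in order of first appearance (251, 101, 350, 401), which matches the row order of the expected output.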