R - Creating Groups with dplyr's "group_by" then Using stringr or Set Operations to Find Differences Between Groups


I would like to use dplyr and stringr if possible, or at least stay within the tidyverse, to achieve the following:

I need to group the data by caseworker and client, then compare "task" and "task2" to find the categories in "task2" that are not in "task", along with the total time associated with each such "task2" category.

"task" can have categories that are not in "task2", but I'm only interested in finding the categories in "task2" that are not in "task". It would be great to be able to create new columns that show the specific entries in "task2" that are not in "task", along with the associated "time" value.

The end result should show 4 new columns for client chris: one for "iron shirt" and one for its associated "time" of 45, plus a column for "do homework" and a column for its "time" of 21. There would be 2 new columns for client eric: one for "iron shirt" and one for its associated time of 12.

    caseworker <- c("john", "john", "john", "john", "john", "john", "john", "john", "john", "kim", "kim")
    client <- c("chris", "chris", "chris", "chris", "chris", "chris", "chris", "chris", "chris", "eric", "eric")
    task <- c("feed cat", "feed cat", "feed cat", "make dinner", "make dinner", "make dinner", "buy groceries", "buy groceries", "buy groceries", "do homework", "do homework")
    task2 <- c("feed cat", "iron shirt", "iron shirt", "do homework", "do homework", "do homework", "make dinner", "feed cat", "feed cat", "do homework", "iron shirt")
    time <- c(20, 34, 11, 10, 5, 6, 55, 30, 20, 10, 12)
    df <- data.frame(caseworker, client, task, task2, time)

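We can try the following, which filters each group down to the "task2" values that never appear in "task", sums their time, and spreads the result into one column per category: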

    library(dplyr)
    library(tidyr)

    df %>%
        group_by(caseworker, client) %>%
        # keep rows whose task2 value never appears in task within the group
        filter(task2 %in% setdiff(task2, task)) %>%
        # add task2 to the existing grouping (dplyr >= 1.0.0 names this argument .add)
        group_by(task2, add = TRUE) %>%
        summarise(time = sum(time)) %>%
        spread(task2, time)
    #   caseworker client `do homework` `iron shirt`
    # *     <fctr> <fctr>         <dbl>        <dbl>
    # 1       john  chris            21           45
    # 2        kim   eric            NA           12
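For readers on newer package versions, the same idea can be sketched with `.add = TRUE` and `pivot_wider()`. This is only an illustrative variant, assuming dplyr >= 1.0.0 and tidyr >= 1.0.0 are installed; the name `result` is introduced here for the example and is not part of the original answer. Note that `setdiff(task2, task)` is asymmetric: it returns values found in "task2" but not in "task", which is exactly the direction the question asks for.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df <- data.frame(
  caseworker = c(rep("john", 9), rep("kim", 2)),
  client     = c(rep("chris", 9), rep("eric", 2)),
  task       = c("feed cat", "feed cat", "feed cat",
                 "make dinner", "make dinner", "make dinner",
                 "buy groceries", "buy groceries", "buy groceries",
                 "do homework", "do homework"),
  task2      = c("feed cat", "iron shirt", "iron shirt",
                 "do homework", "do homework", "do homework",
                 "make dinner", "feed cat", "feed cat",
                 "do homework", "iron shirt"),
  time       = c(20, 34, 11, 10, 5, 6, 55, 30, 20, 10, 12)
)

# `result` is a hypothetical name used only in this sketch
result <- df %>%
  group_by(caseworker, client) %>%
  # per group: keep rows whose task2 value is absent from task
  filter(task2 %in% setdiff(task2, task)) %>%
  # add task2 to the existing grouping instead of replacing it
  group_by(task2, .add = TRUE) %>%
  summarise(time = sum(time), .groups = "drop") %>%
  # one column per remaining task2 category, filled with the summed time
  pivot_wider(names_from = task2, values_from = time)

print(result)
```

Categories that do not occur for a given client, such as "do homework" for eric, come out as `NA` in the widened table, as in the output above.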
