python - Add successive rows in Pandas if they match on some columns

- August 15, 2015

I have a DataFrame like the following one:

id  url         seconds
1   email       9
1   email       3
1   app         5
1   app         9
1   faceboook   50
1   faceboook   7
1   faceboook   39
1   faceboook   10
1   email       39
1   email       5
1   email       57
1   faceboook   7
1   faceboook   32
1   faceboook   3
2   app         11
2   app         10
2   email       56
2   faceboook   9
2   faceboook   46
2   faceboook   16
2   email       21

I want to sum the 'seconds' column over successive views of the same url by the same id. This is the result I'm looking for:

id  url         seconds
1   email       12
1   app         14
1   faceboook   106
1   email       101
1   faceboook   42
2   app         21
2   email       56
2   faceboook   71
2   email       21

df.groupby(['id', 'url']).sum() does not work in this case, because it sums all rows with the same url and the same id, not only the successive ones.

Any ideas?

You can group by a Series built by comparing the url column with its shifted copy using ne, then taking the cumsum of the resulting boolean mask:

print(df.url.ne(df.url.shift()).cumsum())

0     1
1     1
2     2
3     2
4     3
5     3
6     3
7     3
8     4
9     4
10    4
11    5
12    5
13    5
14    6
15    6
16    7
17    8
18    8
19    8
20    9
Name: url, dtype: int32

print(df['seconds'].groupby([df.url.ne(df.url.shift()).cumsum(), df.id, df.url]).sum())

url  id  url
1    1   email         12
2    1   app           14
3    1   faceboook    106
4    1   email        101
5    1   faceboook     42
6    2   app           21
7    2   email         56
8    2   faceboook     71
9    2   email         21
Name: seconds, dtype: int64

print(df['seconds'].groupby([df.url.ne(df.url.shift()).cumsum(), df.id, df.url])
                   .sum()
                   .reset_index(level=0, drop=True)
                   .reset_index())

   id        url  seconds
0   1      email       12
1   1        app       14
2   1  faceboook      106
3   1      email      101
4   1  faceboook       42
5   2        app       21
6   2      email       56
7   2  faceboook       71
8   2      email       21
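For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same shift/ne/cumsum idea. The sample frame is rebuilt from the question's data. The names `changed`, `run` and `result` are made up for illustration. Unlike the snippet above, it also starts a new run when id changes, which is an extra assumption on my part; the answer gets the same effect by including df.id in the groupby keys.

```python
import pandas as pd

# Sample data rebuilt from the question (the url spelling is kept as posted).
df = pd.DataFrame({
    "id": [1] * 14 + [2] * 7,
    "url": (["email"] * 2 + ["app"] * 2 + ["faceboook"] * 4 + ["email"] * 3
            + ["faceboook"] * 3 + ["app"] * 2 + ["email"]
            + ["faceboook"] * 3 + ["email"]),
    "seconds": [9, 3, 5, 9, 50, 7, 39, 10, 39, 5, 57, 7, 32, 3,
                11, 10, 56, 9, 46, 16, 21],
})

# True wherever id or url differs from the previous row (the first row is
# always True because it is compared against NaN).
changed = df["id"].ne(df["id"].shift()) | df["url"].ne(df["url"].shift())

# The running total of the mask gives every block of successive rows its own label.
run = changed.cumsum().rename("run")

# Sum seconds per block, then drop the helper label and restore flat columns.
result = (
    df.groupby([run, "id", "url"], sort=False)["seconds"]
      .sum()
      .reset_index(level="run", drop=True)
      .reset_index()
)

print(result)
```

Giving the helper Series its own name ("run") avoids ending up with two index levels both called url, so the helper level can be dropped by name rather than by position.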
