arrays - Gawk distinct and sum column


I am new to Linux and to using awk, and I couldn't find an answer to the following question:

I want to use awk on a file structured like this:

date id size
2016-11-09 688 47
2016-11-09 688 56
2016-11-09 31640 55

Now I want to sum the size for every line that has the same date and id, and export the result to a .csv file. The file should look like this:

date,id,size
2016-11-09,688,103
2016-11-09,31640,55

I need help because I could not figure this out on my own. Thank you.

If your input is sorted by date and id as in your sample, you should use this:

$ cat tst.awk
BEGIN { OFS="," }
NR==1 { $1=$1; print; next }
{ curr = $1 OFS $2 }
(curr != prev) && (NR > 2) { print prev, sum; sum=0 }
{ prev = curr; sum += $3 }
END { print prev, sum }

$ awk -f tst.awk file
date,id,size
2016-11-09,688,103
2016-11-09,31640,55
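Since you want a .csv file rather than terminal output, redirecting the command's output is enough (the file name result.csv is just an example):

$ awk -f tst.awk file > result.csv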

This processes the input line by line rather than saving the whole file in memory. Note that this approach produces the output in the same order as the input, whereas a for .. in .. loop in the END section would print the output in random (hash) order.
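For comparison, here is a minimal sketch of that array-based alternative (assuming unsorted input is possible and that the unspecified iteration order of for .. in .. is acceptable; the file name tst_array.awk is just a placeholder):

$ cat tst_array.awk
BEGIN { OFS="," }
NR==1 { $1=$1; print; next }                   # rebuild and print the header as CSV
{ sum[$1 OFS $2] += $3 }                       # accumulate size per date,id key
END { for (key in sum) print key, sum[key] }   # key order is unspecified

This version keeps one array entry per distinct date,id pair, so it uses more memory than the streaming script above, but it does not require the input to be sorted.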

