scala - Spark saveAsTextFile goes into an endless loop


I'm running Spark in standalone mode on a single machine. I have an RDD named productuservectors, which looks like this:

[("11342",map(..)),("21435",map(..)),...]

The number of rows in normalisedvectors is 8164. I want all possible pair combinations between the rows of this RDD, and to compute a score based on the maps in each row. I used cartesian to get all possible pairs, and I'm filtering them as shown below:

scala> val normalisedvectors = productuservector.map(line => utilinst.normalisevector(line)).sortBy(_._1.toInt)
scala> val combinedrdd = normalisedvectors.cartesian(normalisedvectors).filter(line => line._1._1.toInt > line._2._1.toInt && utilinst.filterstyleatp(line._1._1, line._2._1))
scala> val scoresrdd = combinedrdd.map(line => utilinst.getscore(line)).filter(line => line._3 > 0)
scala> val finalrdd = scoresrdd.map(line => (line._1, List((line._2, line._3)))).reduceByKey(_ ++ _)
scala> finalrdd.saveAsTextFile(outputpath)
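To get a feel for how much work the cartesian step above implies, the pair counts can be worked out directly. This is a minimal plain-Scala sketch, not part of the original job: the object and method names are hypothetical, and the second count assumes every row id is unique, which the post does not state.

```scala
// Hypothetical helper (not from the original job) for sizing the pair space.
object PairCount {
  /** Number of (left, right) pairs that rdd.cartesian(rdd) emits for n rows. */
  def cartesianPairs(n: Long): Long = n * n

  /** Pairs that survive a "left id > right id" filter.
    * Assumption: all ids are distinct. */
  def strictlyOrderedPairs(n: Long): Long = n * (n - 1) / 2

  def main(args: Array[String]): Unit = {
    val rows = 8164L // row count reported by the count() in the logs
    println(s"cartesian pairs: ${cartesianPairs(rows)}")
    println(s"pairs with left > right: ${strictlyOrderedPairs(rows)}")
  }
}
```

For 8164 rows this gives 66,650,896 pairs before filtering and 33,321,366 after the id comparison, each of which still has to go through filterstyleatp and, if kept, getscore.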

I have set the driver memory to 8 GB and the executor memory to 2 GB. Here, utilinst and its functions are used to filter the pairs resulting from the cartesian product of the original RDD. However, the output shows that it goes into an endless loop, as shown in the logs below:

16/11/17 18:50:14 info configuration.deprecation: mapred.tip.id deprecated. instead, use mapreduce.task.id
16/11/17 18:50:14 info configuration.deprecation: mapred.task.id deprecated. instead, use mapreduce.task.attempt.id
16/11/17 18:50:14 info configuration.deprecation: mapred.task.is.map deprecated. instead, use mapreduce.task.ismap
16/11/17 18:50:14 info configuration.deprecation: mapred.task.partition deprecated. instead, use mapreduce.task.partition
16/11/17 18:50:14 info configuration.deprecation: mapred.job.id deprecated. instead, use mapreduce.job.id
16/11/17 18:50:31 info executor.executor: finished task 3.0 in stage 0.0 (tid 3). 1491 bytes result sent driver
16/11/17 18:50:31 info executor.executor: finished task 5.0 in stage 0.0 (tid 5). 1491 bytes result sent driver
16/11/17 18:50:31 info scheduler.tasksetmanager: finished task 5.0 in stage 0.0 (tid 5) in 17339 ms on localhost (1/6)
16/11/17 18:50:31 info scheduler.tasksetmanager: finished task 3.0 in stage 0.0 (tid 3) in 17346 ms on localhost (2/6)
16/11/17 18:50:31 info executor.executor: finished task 1.0 in stage 0.0 (tid 1). 1491 bytes result sent driver
16/11/17 18:50:31 info scheduler.tasksetmanager: finished task 1.0 in stage 0.0 (tid 1) in 17423 ms on localhost (3/6)
16/11/17 18:50:32 info executor.executor: finished task 0.0 in stage 0.0 (tid 0). 1491 bytes result sent driver
16/11/17 18:50:32 info executor.executor: finished task 2.0 in stage 0.0 (tid 2). 1491 bytes result sent driver
16/11/17 18:50:32 info scheduler.tasksetmanager: finished task 0.0 in stage 0.0 (tid 0) in 18092 ms on localhost (4/6)
16/11/17 18:50:32 info scheduler.tasksetmanager: finished task 2.0 in stage 0.0 (tid 2) in 18063 ms on localhost (5/6)
16/11/17 18:50:32 info executor.executor: finished task 4.0 in stage 0.0 (tid 4). 1491 bytes result sent driver
16/11/17 18:50:32 info scheduler.tasksetmanager: finished task 4.0 in stage 0.0 (tid 4) in 18073 ms on localhost (6/6)
16/11/17 18:50:32 info scheduler.taskschedulerimpl: removed taskset 0.0, tasks have completed, pool
16/11/17 18:50:32 info scheduler.dagscheduler: shufflemapstage 0 (union @ iterateusers.scala:84) finished in 18.125 s
16/11/17 18:50:32 info scheduler.dagscheduler: looking newly runnable stages
16/11/17 18:50:32 info scheduler.dagscheduler: running: set()
16/11/17 18:50:32 info scheduler.dagscheduler: waiting: set(resultstage 1)
16/11/17 18:50:32 info scheduler.dagscheduler: failed: set()
16/11/17 18:50:32 info scheduler.dagscheduler: submitting resultstage 1 (shuffledrdd[11] @ reducebykey @ iterateusers.scala:87), has no missing parents
16/11/17 18:50:32 info memory.memorystore: block broadcast_2 stored values in memory (estimated size 2.9 kb, free 4.1 gb)
16/11/17 18:50:32 info memory.memorystore: block broadcast_2_piece0 stored bytes in memory (estimated size 1819.0 b, free 4.1 gb)
16/11/17 18:50:32 info storage.blockmanagerinfo: added broadcast_2_piece0 in memory on 127.0.0.1:60497 (size: 1819.0 b, free: 4.1 gb)
16/11/17 18:50:32 info spark.sparkcontext: created broadcast 2 broadcast @ dagscheduler.scala:1012
16/11/17 18:50:32 info scheduler.dagscheduler: submitting 6 missing tasks resultstage 1 (shuffledrdd[11] @ reducebykey @ iterateusers.scala:87)
16/11/17 18:50:32 info scheduler.taskschedulerimpl: adding task set 1.0 6 tasks
16/11/17 18:50:32 info scheduler.tasksetmanager: starting task 0.0 in stage 1.0 (tid 6, localhost, partition 0, any, 5126 bytes)
16/11/17 18:50:32 info scheduler.tasksetmanager: starting task 1.0 in stage 1.0 (tid 7, localhost, partition 1, any, 5126 bytes)
16/11/17 18:50:32 info scheduler.tasksetmanager: starting task 2.0 in stage 1.0 (tid 8, localhost, partition 2, any, 5126 bytes)
16/11/17 18:50:32 info scheduler.tasksetmanager: starting task 3.0 in stage 1.0 (tid 9, localhost, partition 3, any, 5126 bytes)
16/11/17 18:50:32 info scheduler.tasksetmanager: starting task 4.0 in stage 1.0 (tid 10, localhost, partition 4, any, 5126 bytes)
16/11/17 18:50:32 info scheduler.tasksetmanager: starting task 5.0 in stage 1.0 (tid 11, localhost, partition 5, any, 5126 bytes)
16/11/17 18:50:32 info executor.executor: running task 0.0 in stage 1.0 (tid 6)
16/11/17 18:50:32 info executor.executor: running task 5.0 in stage 1.0 (tid 11)
16/11/17 18:50:32 info executor.executor: running task 1.0 in stage 1.0 (tid 7)
16/11/17 18:50:32 info executor.executor: running task 3.0 in stage 1.0 (tid 9)
16/11/17 18:50:32 info executor.executor: running task 2.0 in stage 1.0 (tid 8)
16/11/17 18:50:32 info executor.executor: running task 4.0 in stage 1.0 (tid 10)
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 6 ms
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 5 ms
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 5 ms
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 5 ms
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 6 ms
16/11/17 18:50:32 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 5 ms
16/11/17 18:50:32 info executor.executor: finished task 3.0 in stage 1.0 (tid 9). 1512 bytes result sent driver
16/11/17 18:50:32 info executor.executor: finished task 1.0 in stage 1.0 (tid 7). 1512 bytes result sent driver
16/11/17 18:50:32 info executor.executor: finished task 4.0 in stage 1.0 (tid 10). 1512 bytes result sent driver
16/11/17 18:50:32 info scheduler.tasksetmanager: finished task 3.0 in stage 1.0 (tid 9) in 277 ms on localhost (1/6)
16/11/17 18:50:32 info scheduler.tasksetmanager: finished task 1.0 in stage 1.0 (tid 7) in 283 ms on localhost (2/6)
16/11/17 18:50:32 info scheduler.tasksetmanager: finished task 4.0 in stage 1.0 (tid 10) in 279 ms on localhost (3/6)
16/11/17 18:50:37 info executor.executor: finished task 2.0 in stage 1.0 (tid 8). 1512 bytes result sent driver
16/11/17 18:50:37 info executor.executor: finished task 0.0 in stage 1.0 (tid 6). 1512 bytes result sent driver
16/11/17 18:50:37 info scheduler.tasksetmanager: finished task 0.0 in stage 1.0 (tid 6) in 5120 ms on localhost (4/6)
16/11/17 18:50:37 info scheduler.tasksetmanager: finished task 2.0 in stage 1.0 (tid 8) in 5114 ms on localhost (5/6)
16/11/17 18:50:37 info executor.executor: finished task 5.0 in stage 1.0 (tid 11). 1512 bytes result sent driver
16/11/17 18:50:37 info scheduler.tasksetmanager: finished task 5.0 in stage 1.0 (tid 11) in 5241 ms on localhost (6/6)
16/11/17 18:50:37 info scheduler.taskschedulerimpl: removed taskset 1.0, tasks have completed, pool
16/11/17 18:50:37 info scheduler.dagscheduler: resultstage 1 (count @ iterateusers.scala:88) finished in 5.254 s
16/11/17 18:50:37 info scheduler.dagscheduler: job 0 finished: count @ iterateusers.scala:88, took 23.534860 s
8164
16/11/17 18:50:37 info rdd.unionrdd: removing rdd 10 persistence list
16/11/17 18:50:37 info storage.blockmanager: removing rdd 10
16/11/17 18:50:37 info spark.sparkcontext: starting job: sortby @ iterateusers.scala:91
16/11/17 18:50:37 info spark.mapoutputtrackermaster: size of output statuses shuffle 0 191 bytes
16/11/17 18:50:37 info scheduler.dagscheduler: got job 1 (sortby @ iterateusers.scala:91) 6 output partitions
16/11/17 18:50:37 info scheduler.dagscheduler: final stage: resultstage 3 (sortby @ iterateusers.scala:91)
16/11/17 18:50:37 info scheduler.dagscheduler: parents of final stage: list(shufflemapstage 2)
16/11/17 18:50:37 info scheduler.dagscheduler: missing parents: list()
16/11/17 18:50:37 info scheduler.dagscheduler: submitting resultstage 3 (mappartitionsrdd[15] @ sortby @ iterateusers.scala:91), has no missing parents
16/11/17 18:50:37 info memory.memorystore: block broadcast_3 stored values in memory (estimated size 4.4 kb, free 4.1 gb)
16/11/17 18:50:37 info memory.memorystore: block broadcast_3_piece0 stored bytes in memory (estimated size 2.5 kb, free 4.1 gb)
16/11/17 18:50:37 info storage.blockmanagerinfo: added broadcast_3_piece0 in memory on 127.0.0.1:60497 (size: 2.5 kb, free: 4.1 gb)
16/11/17 18:50:37 info spark.sparkcontext: created broadcast 3 broadcast @ dagscheduler.scala:1012
16/11/17 18:50:37 info scheduler.dagscheduler: submitting 6 missing tasks resultstage 3 (mappartitionsrdd[15] @ sortby @ iterateusers.scala:91)
16/11/17 18:50:37 info scheduler.taskschedulerimpl: adding task set 3.0 6 tasks
16/11/17 18:50:37 info scheduler.tasksetmanager: starting task 0.0 in stage 3.0 (tid 12, localhost, partition 0, any, 5210 bytes)
16/11/17 18:50:37 info scheduler.tasksetmanager: starting task 1.0 in stage 3.0 (tid 13, localhost, partition 1, any, 5210 bytes)
16/11/17 18:50:37 info scheduler.tasksetmanager: starting task 2.0 in stage 3.0 (tid 14, localhost, partition 2, any, 5210 bytes)
16/11/17 18:50:37 info scheduler.tasksetmanager: starting task 3.0 in stage 3.0 (tid 15, localhost, partition 3, any, 5210 bytes)
16/11/17 18:50:37 info scheduler.tasksetmanager: starting task 4.0 in stage 3.0 (tid 16, localhost, partition 4, any, 5210 bytes)
16/11/17 18:50:37 info scheduler.tasksetmanager: starting task 5.0 in stage 3.0 (tid 17, localhost, partition 5, any, 5210 bytes)
16/11/17 18:50:37 info executor.executor: running task 0.0 in stage 3.0 (tid 12)
16/11/17 18:50:37 info executor.executor: running task 4.0 in stage 3.0 (tid 16)
16/11/17 18:50:37 info executor.executor: running task 3.0 in stage 3.0 (tid 15)
16/11/17 18:50:37 info executor.executor: running task 1.0 in stage 3.0 (tid 13)
16/11/17 18:50:37 info executor.executor: running task 2.0 in stage 3.0 (tid 14)
16/11/17 18:50:37 info executor.executor: running task 5.0 in stage 3.0 (tid 17)
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:50:37 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:50:38 info executor.executor: finished task 5.0 in stage 3.0 (tid 17). 1818 bytes result sent driver
16/11/17 18:50:38 info executor.executor: finished task 4.0 in stage 3.0 (tid 16). 1818 bytes result sent driver
16/11/17 18:50:38 info executor.executor: finished task 3.0 in stage 3.0 (tid 15). 1728 bytes result sent driver
16/11/17 18:50:38 info executor.executor: finished task 0.0 in stage 3.0 (tid 12). 1724 bytes result sent driver
16/11/17 18:50:38 info executor.executor: finished task 2.0 in stage 3.0 (tid 14). 1727 bytes result sent driver
16/11/17 18:50:38 info executor.executor: finished task 1.0 in stage 3.0 (tid 13). 1734 bytes result sent driver
16/11/17 18:50:38 info scheduler.tasksetmanager: finished task 5.0 in stage 3.0 (tid 17) in 117 ms on localhost (1/6)
16/11/17 18:50:38 info scheduler.tasksetmanager: finished task 4.0 in stage 3.0 (tid 16) in 120 ms on localhost (2/6)
16/11/17 18:50:38 info scheduler.tasksetmanager: finished task 3.0 in stage 3.0 (tid 15) in 123 ms on localhost (3/6)
16/11/17 18:50:38 info scheduler.tasksetmanager: finished task 0.0 in stage 3.0 (tid 12) in 130 ms on localhost (4/6)
16/11/17 18:50:38 info scheduler.tasksetmanager: finished task 2.0 in stage 3.0 (tid 14) in 128 ms on localhost (5/6)
16/11/17 18:50:38 info scheduler.tasksetmanager: finished task 1.0 in stage 3.0 (tid 13) in 130 ms on localhost (6/6)
16/11/17 18:50:38 info scheduler.taskschedulerimpl: removed taskset 3.0, tasks have completed, pool
16/11/17 18:50:38 info scheduler.dagscheduler: resultstage 3 (sortby @ iterateusers.scala:91) finished in 0.133 s
16/11/17 18:50:38 info scheduler.dagscheduler: job 1 finished: sortby @ iterateusers.scala:91, took 0.154474 s
16/11/17 18:50:38 info rdd.shuffledrdd: removing rdd 11 persistence list
16/11/17 18:50:38 info storage.blockmanager: removing rdd 11
16/11/17 18:50:44 info storage.blockmanagerinfo: removed broadcast_3_piece0 on 127.0.0.1:60497 in memory (size: 2.5 kb, free: 4.1 gb)
16/11/17 18:50:44 info storage.blockmanagerinfo: removed broadcast_2_piece0 on 127.0.0.1:60497 in memory (size: 1819.0 b, free: 4.1 gb)
16/11/17 18:51:37 info storage.blockmanagerinfo: removed broadcast_1_piece0 on 127.0.0.1:60497 in memory (size: 3.1 kb, free: 4.1 gb)
16/11/17 18:52:48 info output.fileoutputcommitter: file output committer algorithm version 1
16/11/17 18:52:48 info spark.sparkcontext: starting job: saveastextfile @ iterateusers.scala:99
16/11/17 18:52:48 info scheduler.dagscheduler: registering rdd 13 (sortby @ iterateusers.scala:91)
16/11/17 18:52:48 info scheduler.dagscheduler: registering rdd 22 (map @ iterateusers.scala:98)
16/11/17 18:52:48 info scheduler.dagscheduler: got job 2 (saveastextfile @ iterateusers.scala:99) 36 output partitions
16/11/17 18:52:48 info scheduler.dagscheduler: final stage: resultstage 7 (saveastextfile @ iterateusers.scala:99)
16/11/17 18:52:48 info scheduler.dagscheduler: parents of final stage: list(shufflemapstage 6)
16/11/17 18:52:48 info scheduler.dagscheduler: missing parents: list(shufflemapstage 6)
16/11/17 18:52:48 info scheduler.dagscheduler: submitting shufflemapstage 5 (mappartitionsrdd[13] @ sortby @ iterateusers.scala:91), has no missing parents
16/11/17 18:52:50 info memory.memorystore: block broadcast_4 stored values in memory (estimated size 33.5 mb, free 4.1 gb)
16/11/17 18:52:50 info memory.memorystore: block broadcast_4_piece0 stored bytes in memory (estimated size 4.0 mb, free 4.1 gb)
16/11/17 18:52:50 info storage.blockmanagerinfo: added broadcast_4_piece0 in memory on 127.0.0.1:60497 (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:52:50 info memory.memorystore: block broadcast_4_piece1 stored bytes in memory (estimated size 4.0 mb, free 4.1 gb)
16/11/17 18:52:50 info storage.blockmanagerinfo: added broadcast_4_piece1 in memory on 127.0.0.1:60497 (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:52:50 info memory.memorystore: block broadcast_4_piece2 stored bytes in memory (estimated size 4.0 mb, free 4.0 gb)
16/11/17 18:52:50 info storage.blockmanagerinfo: added broadcast_4_piece2 in memory on 127.0.0.1:60497 (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:52:50 info memory.memorystore: block broadcast_4_piece3 stored bytes in memory (estimated size 2.9 mb, free 4.0 gb)
16/11/17 18:52:50 info storage.blockmanagerinfo: added broadcast_4_piece3 in memory on 127.0.0.1:60497 (size: 2.9 mb, free: 4.1 gb)
16/11/17 18:52:50 info spark.sparkcontext: created broadcast 4 broadcast @ dagscheduler.scala:1012
16/11/17 18:52:50 info scheduler.dagscheduler: submitting 6 missing tasks shufflemapstage 5 (mappartitionsrdd[13] @ sortby @ iterateusers.scala:91)
16/11/17 18:52:50 info scheduler.taskschedulerimpl: adding task set 5.0 6 tasks
16/11/17 18:52:50 info scheduler.tasksetmanager: starting task 0.0 in stage 5.0 (tid 18, localhost, partition 0, any, 5207 bytes)
16/11/17 18:52:50 info scheduler.tasksetmanager: starting task 1.0 in stage 5.0 (tid 19, localhost, partition 1, any, 5207 bytes)
16/11/17 18:52:50 info scheduler.tasksetmanager: starting task 2.0 in stage 5.0 (tid 20, localhost, partition 2, any, 5207 bytes)
16/11/17 18:52:50 info scheduler.tasksetmanager: starting task 3.0 in stage 5.0 (tid 21, localhost, partition 3, any, 5207 bytes)
16/11/17 18:52:50 info scheduler.tasksetmanager: starting task 4.0 in stage 5.0 (tid 22, localhost, partition 4, any, 5207 bytes)
16/11/17 18:52:50 info scheduler.tasksetmanager: starting task 5.0 in stage 5.0 (tid 23, localhost, partition 5, any, 5207 bytes)
16/11/17 18:52:50 info executor.executor: running task 0.0 in stage 5.0 (tid 18)
16/11/17 18:52:50 info executor.executor: running task 1.0 in stage 5.0 (tid 19)
16/11/17 18:52:50 info executor.executor: running task 2.0 in stage 5.0 (tid 20)
16/11/17 18:52:50 info executor.executor: running task 3.0 in stage 5.0 (tid 21)
16/11/17 18:52:50 info executor.executor: running task 4.0 in stage 5.0 (tid 22)
16/11/17 18:52:50 info executor.executor: running task 5.0 in stage 5.0 (tid 23)
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 2 ms
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:02 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:53:02 info executor.executor: finished task 2.0 in stage 5.0 (tid 20). 1883 bytes result sent driver
16/11/17 18:53:02 info executor.executor: finished task 0.0 in stage 5.0 (tid 18). 1883 bytes result sent driver
16/11/17 18:53:02 info scheduler.tasksetmanager: finished task 2.0 in stage 5.0 (tid 20) in 12006 ms on localhost (1/6)
16/11/17 18:53:02 info scheduler.tasksetmanager: finished task 0.0 in stage 5.0 (tid 18) in 12011 ms on localhost (2/6)
16/11/17 18:53:02 info executor.executor: finished task 5.0 in stage 5.0 (tid 23). 1883 bytes result sent driver
16/11/17 18:53:02 info scheduler.tasksetmanager: finished task 5.0 in stage 5.0 (tid 23) in 12019 ms on localhost (3/6)
16/11/17 18:53:02 info executor.executor: finished task 4.0 in stage 5.0 (tid 22). 1883 bytes result sent driver
16/11/17 18:53:02 info scheduler.tasksetmanager: finished task 4.0 in stage 5.0 (tid 22) in 12027 ms on localhost (4/6)
16/11/17 18:53:02 info executor.executor: finished task 3.0 in stage 5.0 (tid 21). 1883 bytes result sent driver
16/11/17 18:53:02 info scheduler.tasksetmanager: finished task 3.0 in stage 5.0 (tid 21) in 12044 ms on localhost (5/6)
16/11/17 18:53:02 info executor.executor: finished task 1.0 in stage 5.0 (tid 19). 1883 bytes result sent driver
16/11/17 18:53:02 info scheduler.tasksetmanager: finished task 1.0 in stage 5.0 (tid 19) in 12059 ms on localhost (6/6)
16/11/17 18:53:02 info scheduler.taskschedulerimpl: removed taskset 5.0, tasks have completed, pool
16/11/17 18:53:02 info scheduler.dagscheduler: shufflemapstage 5 (sortby @ iterateusers.scala:91) finished in 12.061 s
16/11/17 18:53:02 info scheduler.dagscheduler: looking newly runnable stages
16/11/17 18:53:02 info scheduler.dagscheduler: running: set()
16/11/17 18:53:02 info scheduler.dagscheduler: waiting: set(shufflemapstage 6, resultstage 7)
16/11/17 18:53:02 info scheduler.dagscheduler: failed: set()
16/11/17 18:53:02 info scheduler.dagscheduler: submitting shufflemapstage 6 (mappartitionsrdd[22] @ map @ iterateusers.scala:98), has no missing parents
16/11/17 18:53:05 info memory.memorystore: block broadcast_5 stored values in memory (estimated size 33.5 mb, free 4.0 gb)
16/11/17 18:53:05 info memory.memorystore: block broadcast_5_piece0 stored bytes in memory (estimated size 4.0 mb, free 4.0 gb)
16/11/17 18:53:05 info storage.blockmanagerinfo: added broadcast_5_piece0 in memory on 127.0.0.1:60497 (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:53:05 info memory.memorystore: block broadcast_5_piece1 stored bytes in memory (estimated size 4.0 mb, free 4.0 gb)
16/11/17 18:53:05 info storage.blockmanagerinfo: added broadcast_5_piece1 in memory on 127.0.0.1:60497 (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:53:05 info memory.memorystore: block broadcast_5_piece2 stored bytes in memory (estimated size 4.0 mb, free 4.0 gb)
16/11/17 18:53:05 info storage.blockmanagerinfo: added broadcast_5_piece2 in memory on 127.0.0.1:60497 (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:53:05 info memory.memorystore: block broadcast_5_piece3 stored bytes in memory (estimated size 2.9 mb, free 4.0 gb)
16/11/17 18:53:05 info storage.blockmanagerinfo: added broadcast_5_piece3 in memory on 127.0.0.1:60497 (size: 2.9 mb, free: 4.1 gb)
16/11/17 18:53:05 info spark.sparkcontext: created broadcast 5 broadcast @ dagscheduler.scala:1012
16/11/17 18:53:05 info scheduler.dagscheduler: submitting 36 missing tasks shufflemapstage 6 (mappartitionsrdd[22] @ map @ iterateusers.scala:98)
16/11/17 18:53:05 info scheduler.taskschedulerimpl: adding task set 6.0 36 tasks
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 0.0 in stage 6.0 (tid 24, localhost, partition 0, any, 5411 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 1.0 in stage 6.0 (tid 25, localhost, partition 1, any, 5420 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 2.0 in stage 6.0 (tid 26, localhost, partition 2, any, 5420 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 3.0 in stage 6.0 (tid 27, localhost, partition 3, any, 5420 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 4.0 in stage 6.0 (tid 28, localhost, partition 4, any, 5420 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 5.0 in stage 6.0 (tid 29, localhost, partition 5, any, 5420 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 6.0 in stage 6.0 (tid 30, localhost, partition 6, any, 5420 bytes)
16/11/17 18:53:05 info scheduler.tasksetmanager: starting task 7.0 in stage 6.0 (tid 31, localhost, partition 7, any, 5411 bytes)
16/11/17 18:53:05 info executor.executor: running task 1.0 in stage 6.0 (tid 25)
16/11/17 18:53:05 info executor.executor: running task 0.0 in stage 6.0 (tid 24)
16/11/17 18:53:05 info executor.executor: running task 4.0 in stage 6.0 (tid 28)
16/11/17 18:53:05 info executor.executor: running task 2.0 in stage 6.0 (tid 26)
16/11/17 18:53:05 info executor.executor: running task 3.0 in stage 6.0 (tid 27)
16/11/17 18:53:05 info executor.executor: running task 5.0 in stage 6.0 (tid 29)
16/11/17 18:53:05 info executor.executor: running task 6.0 in stage 6.0 (tid 30)
16/11/17 18:53:05 info executor.executor: running task 7.0 in stage 6.0 (tid 31)
16/11/17 18:53:13 info storage.blockmanagerinfo: removed broadcast_4_piece0 on 127.0.0.1:60497 in memory (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:53:13 info storage.blockmanagerinfo: removed broadcast_4_piece3 on 127.0.0.1:60497 in memory (size: 2.9 mb, free: 4.1 gb)
16/11/17 18:53:13 info storage.blockmanagerinfo: removed broadcast_4_piece2 on 127.0.0.1:60497 in memory (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:53:13 info storage.blockmanagerinfo: removed broadcast_4_piece1 on 127.0.0.1:60497 in memory (size: 4.0 mb, free: 4.1 gb)
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 0 ms
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: started 0 remote fetches in 1 ms
16/11/17 18:53:30 info storage.shuffleblockfetcheriterator: getting 6 non-empty blocks out of 6 blocks

It gets stuck endlessly in the last storage.shuffleblockfetcheriterator phase while storing finalrdd as a text file. I have no idea why this is happening. Any help resolving this would be highly appreciated.
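As a rough gauge of how much work the stage in question carries, the figures printed in the logs can be combined by hand. The sketch below is plain Scala with hypothetical names, and it rests on three assumptions: the 36 tasks of stage 6 come from pairing a 6-partition RDD with itself (6 × 6), rows are spread evenly across partitions, and the eight tasks started at 18:53:05 reflect eight available task slots. It is an estimate of workload, not a diagnosis of the hang.

```scala
// Hypothetical helper (not from the original job) for estimating stage workload.
object StageWorkload {
  /** Partition count of a cartesian product of two RDDs. */
  def cartesianPartitions(left: Int, right: Int): Int = left * right

  /** Pairs each task evaluates, assuming rows are evenly distributed. */
  def pairsPerTask(rows: Long, tasks: Int): Long = rows * rows / tasks

  /** Rounds of execution needed when only `slots` tasks run at once. */
  def waves(tasks: Int, slots: Int): Int = (tasks + slots - 1) / slots

  def main(args: Array[String]): Unit = {
    val tasks = cartesianPartitions(6, 6) // matches "36 missing tasks" in the log
    println(s"tasks: $tasks")
    println(s"pairs per task: ${pairsPerTask(8164L, tasks)}")
    println(s"waves with 8 slots: ${waves(tasks, 8)}")
  }
}
```

Under those assumptions each task evaluates about 1.85 million pairs and the stage needs five rounds of eight tasks, so a long silence after the last shuffleblockfetcheriterator line may simply mean the first round is still computing rather than looping.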

