Spark: Container killed by YARN for exceeding memory limits


I'm running Spark on AWS EMR. It's a simple PageRank computation, and the dataset is 8 GB.

I use a cluster of 6 m3.xlarge instances, each with 16 GB of memory.

Here is the configuration:

    spark.executor.instances   4
    spark.executor.cores       8
    spark.driver.memory        10473m
    spark.executor.memory      9658m
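For context on how these numbers relate to the YARN limit: YARN sizes each executor container as the executor heap plus an off-heap overhead. The sketch below works through that arithmetic in plain Scala. It assumes the overhead is the larger of 384 MB and 10% of the executor memory, which I believe is the default for Spark on YARN in many releases (the exact factor and the property name vary by version, so check the documentation for yours). `ContainerSizing` and its members are made-up names for illustration.

```scala
// Illustrative only: ContainerSizing and its members are hypothetical names,
// and the 10% / 384 MB overhead rule is an assumed default, not read from the cluster.
object ContainerSizing {
  val MinOverheadMb = 384
  val OverheadFactor = 0.10

  /** Off-heap overhead added on top of the executor heap, in MB. */
  def overheadMb(executorMemoryMb: Int): Int =
    math.max(MinOverheadMb, (executorMemoryMb * OverheadFactor).toInt)

  /** Total memory requested from YARN for one executor container, in MB. */
  def containerMb(executorMemoryMb: Int): Int =
    executorMemoryMb + overheadMb(executorMemoryMb)

  def main(args: Array[String]): Unit = {
    val executorMemoryMb = 9658 // spark.executor.memory from the configuration
    println(s"overhead:  ${overheadMb(executorMemoryMb)} MB")
    println(s"container: ${containerMb(executorMemoryMb)} MB")
  }
}
```

If the executor process grows past the container total, for example because off-heap usage exceeds the overhead allowance, YARN kills the container. The per-node ceiling is set by `yarn.nodemanager.resource.memory-mb`.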

I'm new to Spark and have no feel for how memory-hungry it is. Is my cluster too small? Is memory a hard limit, or is a small amount of memory OK as long as I accept a slower computation?

Here is the code (for our homework we're not allowed to use GraphX, only plain functions):

    import org.apache.spark.SparkConf
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    object PageRank {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("pagerank")
        val sc = new SparkContext(conf)

        val iterNum = 10
        val file = sc.textFile("hdfs:///input")

        // Every node that appears on either side of an edge.
        val all = file.flatMap { line => line.split("\t") }.distinct()
        // Nodes with at least one outgoing edge.
        val contributor = file.map(line => line.split("\t")(0)).distinct()
        // Nodes with no outgoing edges.
        val dangling = all.subtract(contributor)

        // Link every dangling node to every node.
        val graphDangling = dangling.cartesian(all).groupByKey()

        val graph0 = file.map { line =>
          val temp = line.split("\t")
          (temp(0), temp(1))
        }.distinct().groupByKey()
        val graph = graph0.union(graphDangling)
        graph.cache()

        var ranks = graph.mapValues { x => 1.0 }

        for (i <- 0 until iterNum) {
          // Each node splits its current rank evenly among its followees.
          val contriReceive = graph.join(ranks).values.flatMap {
            case (followees, rank) =>
              val size = followees.size
              followees.map(followee => (followee, rank / size))
          }
          ranks = contriReceive.reduceByKey(_ + _).mapValues { x => 0.15 + 0.85 * x }
        }

        // Format as "user<TAB>rank" and write the formatted lines.
        val result = ranks.map { case (user, rank) => user + "\t" + rank }
        result.saveAsTextFile("hdfs:///pagerank-output")

        sc.stop()
      }
    }
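One step in the code that may be worth sizing is `dangling.cartesian(all)`, which pairs every dangling node with every node. The sketch below uses plain Scala collections (no Spark) to show how the number of pairs grows as the product of the two counts. The node names and the counts are invented for illustration, not taken from the dataset, and `CartesianSize` is a hypothetical name.

```scala
// Toy illustration with plain Scala collections; names and counts are invented.
object CartesianSize {
  /** Number of (dangling, node) pairs a full cartesian product produces. */
  def pairCount(danglingCount: Long, allCount: Long): Long =
    danglingCount * allCount

  /** The same pairing as RDD.cartesian, on in-memory sequences. */
  def cartesian[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] =
    for (l <- left; r <- right) yield (l, r)

  def main(args: Array[String]): Unit = {
    val all = Seq("a", "b", "c", "d")
    val dangling = Seq("c", "d")
    val pairs = cartesian(dangling, all)
    println(s"toy graph pairs: ${pairs.size}")

    // Hypothetical counts, only to show the order of magnitude involved.
    println(s"pairs at scale:  ${pairCount(100000L, 1000000L)}")
  }
}
```

Counting `dangling` and `all` on the real input (for example with `count()`) would show whether this product is what fills the executors.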

I'm not sure about RDDs and memory management. There are many intermediate RDDs; do I have to release them explicitly to free resources? If so, how? Do I just assign null and let the GC deal with it?
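On the release question, the RDD API has `unpersist()`, which drops the stored blocks of an RDD that was previously marked with `cache()` or `persist()`. A minimal sketch of using it inside an iterative loop follows. It assumes a local master and toy data so it can run outside a cluster; `UnpersistSketch`, the data, and the iteration count are invented for illustration and are not the PageRank job itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Sketch only: UnpersistSketch, the local master, and the toy data are assumptions.
object UnpersistSketch {
  def run(): Seq[(String, Double)] = {
    val conf = new SparkConf().setAppName("unpersist-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      var ranks: RDD[(String, Double)] =
        sc.parallelize(Seq(("a", 1.0), ("b", 1.0))).cache()
      ranks.count() // materialize the cached blocks

      for (_ <- 0 until 3) {
        val next = ranks.mapValues(r => 0.15 + 0.85 * r).cache()
        next.count()      // compute the new RDD while its parent is still cached
        ranks.unpersist() // release the previous iteration's cached blocks
        ranks = next
      }
      ranks.collect().toSeq
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

The order matters in this sketch: the new RDD is materialized before its parent is unpersisted, so the parent does not have to be recomputed.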

