Spark: Container killed by YARN for exceeding memory limits
I'm running Spark on AWS EMR. It's a simple PageRank computation over an 8 GB dataset. I use a cluster of 6 m3.xlarge nodes, each with 16 GB of memory. Here is the configuration:

```
spark.executor.instances  4
spark.executor.cores      8
spark.driver.memory       10473m
spark.executor.memory     9658m
```

I'm new to Spark and have no sense of how memory-hungry it is. Is the cluster too small? Is memory a hard limit, or is a small amount of memory OK and the computation just runs slowly?

Here is the code (in our homework we're not allowed to use GraphX, only plain RDD functions):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object PageRank {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("PageRank")
    val sc = new SparkContext(conf)
    val iterNum = 10
    val file = sc.textFile("hdfs:///input")
    // distinct node ids parsed from the tab-separated edge lines
    val nodes = file.flatMap { line => line.split("\t") }.distinct()
    val contributor = file.map(line => line.split(...
```
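Since the paste got cut off, here is roughly the shape of the rest of the computation, written as a generic plain-RDD PageRank sketch rather than my exact code (the names `links`, `ranks`, and `contribs`, the 0.85 damping factor, and the output path are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PageRankSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PageRankSketch"))
    val iterNum = 10

    // Parse "src<TAB>dst" lines into an adjacency list and cache it,
    // because it is reused on every iteration.
    val links = sc.textFile("hdfs:///input")
      .map { line =>
        val parts = line.split("\t")
        (parts(0), parts(1))
      }
      .distinct()
      .groupByKey()
      .cache()

    // Every page starts with rank 1.0.
    var ranks = links.mapValues(_ => 1.0)

    for (_ <- 1 to iterNum) {
      // Each page splits its current rank evenly among its out-links.
      val contribs = links.join(ranks).values.flatMap {
        case (neighbours, rank) =>
          neighbours.map(dst => (dst, rank / neighbours.size))
      }
      // Re-aggregate contributions and apply the 0.85 damping factor.
      ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
    }

    ranks.saveAsTextFile("hdfs:///output")
    sc.stop()
  }
}
```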
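From what I've read in the Spark-on-YARN docs (so treat the numbers and property names as approximate, not something I've verified), the container YARN allocates per executor has to hold the executor memory plus an overhead, which may be why the settings above relate to the error in the title:

```
# Back-of-envelope for the current settings (approximate, Spark 1.x on YARN):
#   spark.executor.memory               = 9658 MB
#   spark.yarn.executor.memoryOverhead  ~ max(384, 0.10 * 9658) ≈ 966 MB
#   YARN container request per executor ≈ 9658 + 966            ≈ 10624 MB
# If that exceeds the memory YARN may hand out per node
# (yarn.nodemanager.resource.memory-mb), the container gets killed with
# "exceeding memory limits". Raising the overhead explicitly, e.g.
spark.yarn.executor.memoryOverhead  1024
# only helps if executor memory + overhead still fits inside the node's
# YARN memory; otherwise executor memory has to come down.
```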