scala - Spark: how to not use AWS credentials explicitly in a Spark application


In my Spark application, I have AWS credentials passed in via command-line arguments.

spark.sparkcontext.hadoopconfiguration.set("fs.s3.awsaccesskeyid", awsaccesskeyid) spark.sparkcontext.hadoopconfiguration.set("fs.s3.awssecretaccesskey", awssecretaccesskey) spark.sparkcontext.hadoopconfiguration.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.natives3filesystem") 

However, in cluster mode, explicitly passing the credentials between nodes is a huge security issue, since they are passed as plain text.

How can I make the application work with an IAM role, or some other proper approach, so that it doesn't need these two lines of code in the Spark app:

spark.sparkcontext.hadoopconfiguration.set("fs.s3.awsaccesskeyid", awsaccesskeyid) spark.sparkcontext.hadoopconfiguration.set("fs.s3.awssecretaccesskey", awssecretaccesskey) 

You can add the following config to the core-site.xml of your Hadoop conf directory, so the credentials don't have to live in your code base:

<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration>   <property>   <name>fs.s3n.awsaccesskeyid</name>   <value>my_aws_access_key_id_here</value>   </property>   <property>   <name>fs.s3n.awssecretaccesskey</name>   <value>my_aws_secret_access_key_here</value>   </property> </configuration> 

To use the above file, run export HADOOP_CONF_DIR=~/private/.aws/hadoop_conf before launching Spark, or set it in conf/spark-env.sh.
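
With the credentials picked up from core-site.xml, the application no longer sets them itself. A minimal sketch of what the app then looks like, assuming the s3n connector (hadoop-aws) is on the classpath; the bucket and path are placeholders:

import org.apache.spark.sql.SparkSession

object S3ReadExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("S3ReadExample").getOrCreate()
    // No credential lines needed here: the s3n filesystem reads
    // fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey from core-site.xml
    // found via HADOOP_CONF_DIR.
    val lines = spark.read.textFile("s3n://my-bucket/path/to/data")
    println(lines.count())
    spark.stop()
  }
}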

As for the IAM role approach, there is a bug open against Spark 1.6: https://issues.apache.org/jira/browse/SPARK-16363
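Until that is resolved, one possible workaround (a sketch, not something the answer above confirms) is to switch to the s3a connector and point its credential provider at the EC2 instance profile, so the IAM role attached to the instance supplies the credentials; this assumes hadoop-aws and the AWS SDK are on the classpath, and the bucket is a placeholder:

import org.apache.spark.sql.SparkSession

object S3aIamRoleExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("S3aIamRoleExample").getOrCreate()
    // Tell s3a to fetch credentials from the instance profile (IAM role)
    // instead of explicit access keys.
    spark.sparkContext.hadoopConfiguration.set(
      "fs.s3a.aws.credentials.provider",
      "com.amazonaws.auth.InstanceProfileCredentialsProvider")
    val lines = spark.read.textFile("s3a://my-bucket/path/to/data")
    println(lines.count())
    spark.stop()
  }
}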

