amazon s3 - Spark Avro S3 read not working for partitioned data


When I read a specific file, it works:

import com.databricks.spark.avro._

val filepath = "s3n://bucket_name/f1/f2/avro/dt=2016-10-19/hr=19/000000"
val df = spark.read.avro(filepath)

But if I point at the folder to read the date-partitioned data, it fails:

val filepath = "s3n://bucket_name/f1/f2/avro/dt=2016-10-19/"
val df = spark.read.avro(filepath)

I get this error:

exception in thread "main" org.apache.hadoop.fs.s3.s3exception: org.jets3t.service.s3serviceexception: s3 head request failed '/f1%2ff2%2favro%2fdt%3d2016-10-19' - responsecode=403, responsemessage=forbidden
    at org.apache.hadoop.fs.s3native.jets3tnativefilesystemstore.handleserviceexception(jets3tnativefilesystemstore.java:245)
    at org.apache.hadoop.fs.s3native.jets3tnativefilesystemstore.retrievemetadata(jets3tnativefilesystemstore.java:119)
    at sun.reflect.nativemethodaccessorimpl.invoke0(native method)
    at sun.reflect.nativemethodaccessorimpl.invoke(nativemethodaccessorimpl.java:62)
    at sun.reflect.delegatingmethodaccessorimpl.invoke(delegatingmethodaccessorimpl.java:43)
    at java.lang.reflect.method.invoke(method.java:498)
    at org.apache.hadoop.io.retry.retryinvocationhandler.invokemethod(retryinvocationhandler.java:186)
    at org.apache.hadoop.io.retry.retryinvocationhandler.invoke(retryinvocationhandler.java:102)
    at org.apache.hadoop.fs.s3native.$proxy7.retrievemetadata(unknown source)
    at org.apache.hadoop.fs.s3native.natives3filesystem.getfilestatus(natives3filesystem.java:414)
    at org.apache.hadoop.fs.filesystem.exists(filesystem.java:1397)
    at org.apache.spark.sql.execution.datasources.datasource$$anonfun$12.apply(datasource.scala:374)
    at org.apache.spark.sql.execution.datasources.datasource$$anonfun$12.apply(datasource.scala:364)
    at scala.collection.traversablelike$$anonfun$flatmap$1.apply(traversablelike.scala:241)
    at scala.collection.traversablelike$$anonfun$flatmap$1.apply(traversablelike.scala:241)
    at scala.collection.immutable.list.foreach(list.scala:381)
    at scala.collection.traversablelike$class.flatmap(traversablelike.scala:241)
    at scala.collection.immutable.list.flatmap(list.scala:344)
    at org.apache.spark.sql.execution.datasources.datasource.resolverelation(datasource.scala:364)
    at org.apache.spark.sql.dataframereader.load(dataframereader.scala:149)
    at org.apache.spark.sql.dataframereader.load(dataframereader.scala:132)
    at com.databricks.spark.avro.package$avrodataframereader$$anonfun$avro$2.apply(package.scala:34)
    at com.databricks.spark.avro.package$avrodataframereader$$anonfun$avro$2.apply(package.scala:34)
    at basics3avro$.main(basics3avro.scala:55)
    at basics3avro.main(basics3avro.scala)
    at sun.reflect.nativemethodaccessorimpl.invoke0(native method)
    at sun.reflect.nativemethodaccessorimpl.invoke(nativemethodaccessorimpl.java:62)
    at sun.reflect.delegatingmethodaccessorimpl.invoke(delegatingmethodaccessorimpl.java:43)
    at java.lang.reflect.method.invoke(method.java:498)
    at com.intellij.rt.execution.application.appmain.main(appmain.java:147)

Am I missing anything?

What does the newer, maintained s3a client report?
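For anyone wanting to try that suggestion, here is a minimal sketch of the same folder read going through s3a instead of s3n. It is not a confirmed fix for the 403. It assumes the hadoop-aws module (and its matching AWS SDK) is on the classpath alongside spark-avro, that the job runs locally from the IDE, and that the credentials live in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. The object name S3aAvroRead is made up for illustration; the bucket and path are the ones from the question.

```scala
import org.apache.spark.sql.SparkSession
import com.databricks.spark.avro._

// Hypothetical object name, for illustration only.
object S3aAvroRead {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("S3aAvroRead")
      .master("local[*]") // assumption: local run, as in the IDE stack trace
      .getOrCreate()

    // Assumption: credentials are supplied through environment variables.
    // fs.s3a.access.key and fs.s3a.secret.key are the s3a counterparts of
    // the s3n credential settings.
    val hadoopConf = spark.sparkContext.hadoopConfiguration
    hadoopConf.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
    hadoopConf.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

    // Same partition folder as in the question, with the scheme changed to s3a.
    val dirPath = "s3a://bucket_name/f1/f2/avro/dt=2016-10-19/"
    val df = spark.read.avro(dirPath)

    df.printSchema()
    spark.stop()
  }
}
```

If s3a also answers with a 403 on the folder, it is worth checking the bucket permissions: S3 returns 403 rather than 404 for a HEAD on a key that does not exist when the caller is not allowed to list the bucket, and a "folder" such as dt=2016-10-19 is often just a key prefix with no object of its own.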

