Spark 1.4 image for Google Cloud?


With bdutil, the latest version of the Spark tarball I can find is 1.3.1:

gs://spark-dist/spark-1.3.1-bin-hadoop2.6.tgz

There are a few new DataFrame features in Spark 1.4 that I want to use. Is there any chance a Spark 1.4 image will be made available for bdutil, or is there a workaround?
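For context, this is the kind of DataFrame API that is new in 1.4 and motivates the upgrade — a sketch only, with made-up paths and column names:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object DataFrame14Sketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("df-1.4-sketch"))
    val sqlContext = new SQLContext(sc)

    // DataFrameReader (sqlContext.read) is new in Spark 1.4; the hypothetical
    // bucket path below is just for illustration.
    val df = sqlContext.read.json("gs://my-bucket/events.json")

    // DataFrame.drop(columnName) is also new in 1.4.
    val trimmed = df.drop("debugInfo")
    trimmed.show()
  }
}
```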

Update:

Following Angus Davis's suggestion, I downloaded spark-1.4.1-bin-hadoop2.6.tgz and pointed bdutil at it, and the deployment went well. However, I now run into an error when calling sqlContext.parquetFile(). I cannot explain how this exception is possible, since GoogleHadoopFileSystem should be a subclass of org.apache.hadoop.fs.FileSystem. I will continue to investigate this.
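For reference, a minimal sketch of the failing call (the bucket path is hypothetical). The stack trace below shows the exception is thrown while the HiveContext initializes the Hive metastore, before the Parquet data is even touched:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ParquetRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("parquet-repro"))
    val sqlContext = new HiveContext(sc)

    // Throws the ClassCastException below on first metastore access.
    val df = sqlContext.parquetFile("gs://my-bucket/path/to/data.parquet")

    // Note: in 1.4, parquetFile is deprecated in favor of the reader API:
    // val df = sqlContext.read.parquet("gs://my-bucket/path/to/data.parquet")
    df.printSchema()
  }
}
```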

Caused by: java.lang.ClassCastException: com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem cannot be cast to org.apache.hadoop.fs.FileSystem
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2595)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:112)
    at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:144)
    at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
    at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:504)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)

I have asked a separate question about this exception here.

Update:

The error turned out to be a Spark defect; a resolution/workaround is provided in the question linked above.

Thanks!

Haiying

If a local workaround is acceptable, you can copy spark-1.4.1-bin-hadoop2.6.tgz from an Apache mirror into a bucket that you control. You can then edit extensions/spark/spark-env.sh and change SPARK_HADOOP2_TARBALL_URI='&lt;your copy of spark 1.4.1&gt;' (make sure the service account running your VMs has permission to read the tarball).
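A sketch of those steps as a config fragment (the bucket name is hypothetical, and the exact bdutil invocation may differ in your setup):

```shell
# 1. Copy the 1.4.1 tarball into a bucket you control (hypothetical bucket):
gsutil cp spark-1.4.1-bin-hadoop2.6.tgz \
    gs://my-bucket/spark/spark-1.4.1-bin-hadoop2.6.tgz

# 2. In bdutil's extensions/spark/spark-env.sh, point the tarball URI at it:
SPARK_HADOOP2_TARBALL_URI='gs://my-bucket/spark/spark-1.4.1-bin-hadoop2.6.tgz'

# 3. Ensure the VMs' service account can read the object, then deploy
#    with the Spark extension as you normally would.
```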

Note that I haven't done any testing to see whether Spark 1.4.1 works out of the box right now; I'd be interested in hearing about your experience if you decide to give it a go.

