python - Set up Spark using an external virtual machine
I am not a huge computer person like many others on here; I majored in math, with MATLAB as my main computer knowledge. I have got involved with Apache Spark through an excellent edX course offered by Berkeley.
The method they used for setting up Spark was provided in a great step-by-step guide, and involved: downloading Oracle VM VirtualBox with an Ubuntu 32-bit VM, and then, through the use of Vagrant (again, I'm not hugely computer-y, so I'm not 100% sure how this worked or what it is), connecting this to an IPython notebook. This enabled me to have access to Spark in the browser and to code in Python with PySpark, which is what I want to do.
Everything was going well until the second lab exercise, when it became apparent that my Windows laptop has insufficient free memory (just 3 GB, and it is 4 years old): it continually froze and crashed when trying to work with large datasets.
It is apparently not possible to have a VM in a VM, so I have spent most of today looking for alternative ways of setting up Spark, but to no avail; the guides are aimed at people with more computer knowledge than I have.
My (likely naive) idea is to rent an external machine that I can interface with through my Windows laptop as before, but so that the virtual machine operates outside of the memory of my laptop, i.e. in the cloud (using any of Ubuntu, Windows, etc.). Basically, I want to move the Oracle VM VirtualBox to an outside source to rid my computer of the memory burden, and use the IPython notebook as before.
How can I set up a virtual machine to use for the computational side of Spark in an IPython notebook?
Or is there an alternative method that would be simple to follow?
Don't run VMs. Instead:
- Download the latest Spark version. (1.4.1 at the moment.)
- Extract the archive.
- Run bin/pyspark.cmd.
It's not an IPython notebook, but you can run Python code against the local Spark instance.
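As a quick way to confirm the local instance actually works, something like the following could be pasted into the pyspark shell. It is only a sketch: the function name check_local_spark is made up for illustration, and it relies on the `sc` SparkContext that the pyspark shell creates for you at startup.

```python
def check_local_spark(sc, n=1000):
    """Return True if Spark's sum of squares of 0..n-1 matches the closed form.

    `sc` is assumed to be the SparkContext the pyspark shell predefines.
    The function name is hypothetical, not part of Spark.
    """
    # Distribute the numbers 0..n-1, square each one in the workers, then add them up.
    spark_total = sc.parallelize(range(n)).map(lambda x: x * x).sum()
    # Closed form for 0^2 + 1^2 + ... + (n-1)^2.
    expected = (n - 1) * n * (2 * n - 1) // 6
    return spark_total == expected
```

Calling check_local_spark(sc) at the shell prompt should return True if the job ran end to end.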
If you want a beefier instance, do the same on a beefy remote machine. For example, an EC2 m4.2xlarge
is $0.5 per hour for 8 cores and 30 GB of RAM.
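To get a rough feel for what renting such a machine would cost, here is a small sketch of the arithmetic. The default rate is just the $0.5 per hour figure quoted above (prices change), rounding partial hours up to whole billed hours is an assumption, and the function name is made up for illustration.

```python
import math


def estimate_rental_cost(hours_used, hourly_rate=0.5):
    """Estimate the cost in dollars of renting a machine for `hours_used` hours.

    hourly_rate defaults to the $0.5/hour figure quoted above.
    Assumption: a partial hour is billed as a full hour.
    """
    billed_hours = math.ceil(hours_used)  # round partial hours up
    return billed_hours * hourly_rate
```

At that rate, ten hours of lab work would come to about $5, as long as the instance is stopped when you are not using it.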