DevOps – Setup Spark Cluster


Provision 4 boxes

There are a few options to achieve that:

  • Use Docker to simulate it – for testing, I like this option better!
  • Use AWS Spot Instances (much cheaper)
  • Use physical boxes

Install Hadoop and Java

Check out DevOps – Setup Hadoop Cluster to set it up.

Install Spark

Spark All Nodes Configurations

Spark Master Specific Configurations

Open http://spark_master_public_dns:8080 (the Spark Master web UI) in your browser to check whether all Worker nodes are online.
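The same check can be scripted instead of done in the browser. The sketch below assumes the standalone Master web UI also serves its status as JSON under /json/ on the same port, with a "workers" list whose entries carry "host" and "state" fields. The function names are made up for this example, and spark_master_public_dns is the same placeholder as above.

```python
import json
from urllib.request import urlopen


def alive_workers(status):
    """Return the host of every worker the master reports as ALIVE."""
    return [
        w["host"]
        for w in status.get("workers", [])
        if w.get("state") == "ALIVE"
    ]


def fetch_master_status(master_host, port=8080, timeout=5):
    """Fetch the master's status document (assumed to live at /json/)."""
    url = f"http://{master_host}:{port}/json/"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Usage (needs a running master, so it is left commented out):
# status = fetch_master_status("spark_master_public_dns")
# print(alive_workers(status))
```

Keeping the parsing separate from the HTTP call makes it easy to try the logic against a saved copy of the status page before pointing it at a real cluster.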

Install Jupyter and run RDD
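Once Jupyter is running, a small RDD job is a quick way to confirm that the notebook can reach the cluster. This is a minimal sketch, not the exact notebook from this setup: it assumes pyspark is installed in the Jupyter environment and that the master listens on the default standalone port 7077. The app name and function names are placeholders.

```python
def square(x):
    """Function applied to every RDD element on the workers."""
    return x * x


def sum_of_squares_on_cluster(master_url, n=1000):
    """Distribute 1..n as an RDD, square each element, and sum the results."""
    # Imported here so square() stays usable on a machine without Spark.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("rdd-smoke-test").setMaster(master_url)
    sc = SparkContext.getOrCreate(conf)
    try:
        rdd = sc.parallelize(range(1, n + 1), numSlices=8)
        return rdd.map(square).reduce(lambda a, b: a + b)
    finally:
        sc.stop()


# Usage in a notebook cell (needs a running cluster):
# sum_of_squares_on_cluster("spark://spark_master_public_dns:7077")
```

While the job runs, the application should show up in the Master web UI, which confirms the work is going to the cluster rather than running locally.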

