We will create a single-node Spark cluster on our desktop.
Spark is an analytics engine for processing large-scale data stored in a variety of file systems and databases.
Gigahex lets you install and manage multiple Spark and Hadoop sandbox clusters on your desktop for faster development and testing.
- Download and install Docker. For installation instructions, see the official Docker guide.
- Download and install Gigahex. It currently supports macOS only.
- Once you've started the Gigahex app, navigate to clusters and click Add Cluster. Choose the cluster image version and select Spark Standalone as the service.
- Share a directory from your workstation with the container that will be
running the cluster. For example, share/mount the directory
/Users/donald/workspace with the container at
Note: By default the cluster has the username as
- Click Save, then click the Start the cluster button to run the single-node cluster. Wait for the cluster to be up and running.
- Click the terminal icon to get the command that opens an interactive shell in the cluster.
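
If the cluster fails to start, it can help to check the two local prerequisites from the steps above: that Docker is installed and that the directory you want to share exists. The following is a minimal sketch in Python, not part of Gigahex. The function name `check_prerequisites` and the `~/workspace` path are hypothetical and used only for illustration. It only confirms that the `docker` command is on your PATH, not that the Docker daemon is running.

```python
import shutil
from pathlib import Path


def check_prerequisites(shared_dir: str) -> dict:
    """Report whether the docker CLI is on PATH and the directory to share exists.

    Hypothetical helper for illustration; Gigahex does not provide it.
    """
    docker_path = shutil.which("docker")  # None when the docker CLI is not found
    directory = Path(shared_dir).expanduser()  # expand a leading "~" if present
    return {
        "docker_cli": docker_path,
        "docker_found": docker_path is not None,
        "shared_dir": str(directory),
        "shared_dir_exists": directory.is_dir(),
    }


if __name__ == "__main__":
    # "~/workspace" is an assumed example path; replace it with the
    # directory you intend to share with the cluster container.
    for key, value in check_prerequisites("~/workspace").items():
        print(f"{key}: {value}")
```

If `docker_found` is `False`, revisit the Docker installation step. If `shared_dir_exists` is `False`, create the directory or pick another one before configuring the share.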