Welcome to the Tetra Concepts LLC Spark Meetup training!
Install Docker first: http://docs.docker.com/mac/started/
sudo docker pull kevinfaro/tetra_spark
sudo docker run -it --rm kevinfaro/tetra_spark
Now let's check that everything is wired up correctly. Type the following into the Spark shell:
val sentences = sc.textFile("data/sentences.txt.gz")
sentences.count()
You should see it return 19353.
See the docker branch.
See the vagrant branch.
We have two data files to use:
- data/enron_small.json.gz
- data/sentences.txt.gz
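
To get a feel for both files before working with them, a quick peek from the Spark shell might look like the sketch below. It assumes the shell's predefined `sc` is available and that `data/enron_small.json.gz` resolves from the same working directory as `data/sentences.txt.gz` does in the check above. The name `enron` is just an illustrative choice, and the sketch reads the JSON file as plain text lines rather than parsing it.

```scala
// Run inside the Spark shell started by the container, where `sc` already exists.
// Assumption: both relative paths resolve from the shell's working directory.
val sentences = sc.textFile("data/sentences.txt.gz")
val enron = sc.textFile("data/enron_small.json.gz") // hypothetical name, raw text lines only

// textFile decompresses .gz input transparently, so no extra step is needed.
// Print a few raw lines from each file to see what the records look like.
sentences.take(3).foreach(println)
enron.take(1).foreach(println)
```

`take` only pulls a handful of records back to the driver, so this stays cheap even if the files are large. Note that gzip files are not splittable, so each file is likely to load as a single partition.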