Categories: Big DataDockers

Dockers – How to Get Started with Spark on Windows

This article represents tips on how to get started with Apache Spark on Windows using Dockers. Please feel free to comment/suggest if I missed to mention one or more important points. Also, sorry for the typos.

If you are familiar with Dockers, the instructions below would help you get started with Spark in no time.

  • Download the Spark from https://spark.apache.org/downloads.html page. Remember to select a package type with option such as “Pre-built…”. Once the zipped files are downloaded, unzip the files under the location “C:\Users\<Username>”
  • Build Java8 image and start the container. Follow the instructions on this page, http://vitalflux.com/dockers-how-to-get-started-with-java8-dev-environment/.
  • Once the container is started, go to the folder where you would find spark files. Go to bin folder. The path could look like cd /mnt/Users/<Your_Username>/spark-1.6.0-bin-hadoop2.6/bin.
  • Execute the command “./pyspark” and you will get started with following screenshot:

 

Ajitesh Kumar

I have been recently working in the area of Data analytics including Data Science and Machine Learning / Deep Learning. I am also passionate about different technologies including programming languages such as Java/JEE, Javascript, Python, R, Julia, etc, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, big data, etc. For latest updates and blogs, follow us on Twitter. I would love to connect with you on Linkedin. Check out my latest book titled as First Principles Thinking: Building winning products using first principles thinking. Check out my other blog, Revive-n-Thrive.com

Recent Posts

Linear Regression T-test: Formula, Example

Last updated: 7th May, 2024 Linear regression is a popular statistical method used to model…

34 mins ago

Feature Engineering in Machine Learning: Python Examples

Last updated: 3rd May, 2024 Have you ever wondered why some machine learning models perform…

4 days ago

Feature Selection vs Feature Extraction: Machine Learning

Last updated: 2nd May, 2024 The success of machine learning models often depends on the…

5 days ago

Model Selection by Evaluating Bias & Variance: Example

When working on a machine learning project, one of the key challenges faced by data…

5 days ago

Bias-Variance Trade-off in Machine Learning: Examples

Last updated: 1st May, 2024 The bias-variance trade-off is a fundamental concept in machine learning…

6 days ago

Mean Squared Error vs Cross Entropy Loss Function

Last updated: 1st May, 2024 As a data scientist, understanding the nuances of various cost…

6 days ago