This article presents information and scripts that can be used to get started with Cloudera using Docker. Please feel free to comment or suggest improvements if I have missed any important points. Also, apologies for any typos.
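The article covers configuring the Docker machine, getting the Cloudera image, running the Cloudera container, starting the Cloudera service, and a script that automates these steps.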
To run Cloudera in a Docker container, the Docker machine needs the following configuration changes. Open Oracle VM VirtualBox Manager, stop the default machine, and then change its settings as follows:
Increase the number of CPU cores to 2
Increase the memory size to 8 GB
If this is not done, running “cloudera-manager --express” throws the following error:
Memory related error while starting Cloudera manager service
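The same settings can probably be applied from the command line instead of the VirtualBox GUI. The following is only a sketch: it assumes the Docker machine is the VirtualBox VM named "default" and that docker-machine and VBoxManage are on the PATH. The run and resize_docker_vm helpers and the APPLY variable are hypothetical names introduced for this example. By default the commands are only printed; set APPLY=1 to actually run them.

```shell
#!/bin/sh
# Assumption: the Docker machine is the VirtualBox VM named "default".
VM_NAME="${VM_NAME:-default}"

# Hypothetical helper: print the command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Stop the VM, give it 2 CPU cores and 8 GB (8192 MB) of memory, start it again.
resize_docker_vm() {
    run docker-machine stop "$VM_NAME"
    run VBoxManage modifyvm "$VM_NAME" --cpus 2 --memory 8192
    run docker-machine start "$VM_NAME"
}

resize_docker_vm
```

Running it once without APPLY set is a cheap way to review the commands before changing the VM.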
docker pull cloudera/quickstart:latest
FROM cloudera/quickstart:latest
Save the file as cloudera.df and then use the following command to build the image:
docker build -t cloudera -f cloudera.df .
The image is tagged as cloudera.
tar xzf cloudera-quickstart-vm-*-docker.tar.gz
docker import - cloudera/quickstart:latest < cloudera-quickstart-vm-*-docker/*.tar
docker run --privileged=true -ti -d -p 8888:8888 -p 80:80 -p 7180:7180 --name cdh --hostname=quickstart.cloudera -v /c/Users:/mnt/Users cloudera /usr/bin/docker-quickstart
Note that the image is named/tagged cloudera. You can also run “docker images” to find the tag of the Cloudera image and use it in place of “cloudera”. Also note that ports such as 7180 and 8888 are published from the container to the host.
Execute the following command to start the Cloudera service, assuming you started the container with the name “cdh”. The script later in this article can be used to start the “cdh” Cloudera container.
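To see where the published ports end up, a small helper can print one URL per port. This is a rough sketch: the service_urls function is a hypothetical name, and the service labels (Cloudera Manager on 7180, Hue on 8888, the tutorial on 80) are my assumptions about the QuickStart image, not something stated above. It also assumes the Docker machine is named "default" and falls back to 192.168.99.100 if docker-machine is not available.

```shell
#!/bin/sh
# Print one URL per published port for the given host address.
# The service labels are assumptions about the QuickStart image.
service_urls() {
    host="$1"
    echo "Cloudera Manager: http://$host:7180/"
    echo "Hue: http://$host:8888/"
    echo "Tutorial: http://$host:80/"
}

# Assumption: the Docker machine is named "default". If docker-machine is
# missing or fails, fall back to the address used in this article.
host="$(docker-machine ip default 2>/dev/null || echo 192.168.99.100)"
service_urls "$host"
```

Once the container is running, “docker port cdh” should list the actual mappings for a container named “cdh”.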
docker exec -ti cdh /home/cloudera/cloudera-manager --express
With the above command, Cloudera starts as shown in the following figure.
Cloudera starts in a docker container
Open a browser and go to the following URL: http://192.168.99.100:7180/. This opens the login page for Cloudera Manager. Enter cloudera/cloudera as the login/password and you are all set! (192.168.99.100 is typically the address of the default Docker machine; “docker-machine ip default” shows the actual value.)
The following Dockerfile (cloudera.df) and script (runCloudera.sh) can be used to build the image and run the Cloudera container.
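Cloudera Manager may take a while to come up after the start command, so the login page might not respond right away. One way to handle that is a small retry loop that polls the URL before opening the browser. This is a minimal sketch: wait_for is a hypothetical helper, and the retry count and delay in the commented example are arbitrary values, not recommendations.

```shell
#!/bin/sh
# wait_for TRIES DELAY CMD...
# Run CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
wait_for() {
    tries="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$tries" ]; do
        "$@" >/dev/null 2>&1 && return 0
        # Only sleep if another attempt is still to come.
        [ "$i" -lt "$tries" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example (commented out): poll the login page for up to about 5 minutes.
# wait_for 60 5 curl -sf -o /dev/null http://192.168.99.100:7180/ \
#     && echo "Cloudera Manager is up"
```

Taking the probe as a command rather than hard-coding curl keeps the loop reusable for other checks.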
FROM cloudera/quickstart:latest
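#!/bin/sh
# Usage: ./runCloudera.sh <container-name>
if [ $# -eq 0 ]; then
    echo "This script expects a container name argument. Example: ./runCloudera.sh cdh"
    exit 100
fi

# Stop and remove any existing container with the same name (errors are ignored if none exists)
docker stop "$1" >/dev/null 2>&1
docker rm "$1" >/dev/null 2>&1

# Build the Cloudera image if it does not exist
cd_image="cloudera"
cd_df="cloudera.df"
# "docker images" always prints a header line, so fewer than 2 lines means the image is missing
if [ "$(docker images "$cd_image" | wc -l)" -lt 2 ]; then
    echo "Docker image $cd_image does not exist..."
    echo "Building docker image $cd_image"
    if [ -f "$cd_df" ]; then
        docker build -t "$cd_image" -f "$cd_df" .
    else
        echo "Can't find Dockerfile $cd_df in the current location"
        exit 200
    fi
fi

# Start the container in the background, publishing ports 8888, 80 and 7180 and mounting /c/Users
docker run --privileged=true -ti -d -p 8888:8888 -p 80:80 -p 7180:7180 --name "$1" --hostname=quickstart.cloudera -v /c/Users:/mnt/Users "$cd_image" /usr/bin/docker-quickstart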
Open a Docker terminal, place both files in a folder, and execute a command such as “./runCloudera.sh cdh”. This builds the image and starts a container named “cdh”.