Big Data – Team to Hire for Big Data Practice

This article presents thoughts on Big Data team composition and the considerations involved in hiring and building an effective Big Data team. Please feel free to comment or make suggestions if I have missed one or more important points.
A Big Data team needs to cover the following two key areas to be ready to deliver on key Big Data initiatives:
  • Data engineering
  • Data science

Data Engineering Team

You would want to build a team that plays a key role in the following areas:

  • Data processing (Hadoop MapReduce)
  • Data storage (HDFS/HBase)
  • Data coordination (ZooKeeper)
  • Data monitoring/management

The following job roles match the above skills:

  • Hadoop Engineer: This person should be able to take care of aspects such as data processing and data storage.
  • Hadoop/Big Data Admin: This person should be responsible for managing the Hadoop infrastructure.
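
To give a feel for the data processing work a Hadoop Engineer deals with, here is a minimal sketch of the map, shuffle and reduce steps behind a word count. It is plain Python run in a single process, not Hadoop code; the function names and the sample lines are assumptions made purely for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Combine all the counts emitted for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data team", "data engineering team", "data science team"]
    print(word_count(sample))
```

In a real cluster the map and reduce steps would run in parallel on different machines over data stored in HDFS; the sketch only shows the shape of the computation.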

Data Science Team

The following key skills would form part of the job description of a data scientist:

  • Machine learning
  • Mathematics & statistics

To work in the above areas, the data scientist may be required to use one or more of the following tools/libraries:

  • R platform
  • Hive or Pig
  • Java/Python libraries
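
As a small illustration of how the machine learning and statistics skills above come together in day-to-day work, the following sketch fits a straight line to a handful of points using ordinary least squares. It uses only the Python standard library; the function names and the sample numbers are hypothetical, and a real project would more likely rely on R or a dedicated Python library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(model, predict(model, 5))
```

The statistics part is the formula for the slope and intercept; the machine learning part is using the fitted line to predict a value that was not in the data.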

One may use one of the following approaches to build a data science team:

  • Look for employees within the company who have one or both of the above skills, or experience with one or more of the above tools. This may be a little tricky, as you would need to arrange training sessions to get them up to speed with data science topics.
  • Hire from outside (lateral hire) a person who has both of the above skills, or a set of people (two to start with) well versed in machine learning and in mathematics & statistics. This may prove tough and expensive. However, it may be a good idea to hire at least one senior person and provide them with a team of two or three junior members/freshers.
  • Hire a set of freshers (MCA or MSc Maths/Statistics graduates) who have studied machine learning as well as mathematics & statistics in college. This may work out well, as they should need little time to get up and running with topics they have recently studied.

Ajitesh Kumar
