
How Can I Become A Data Scientist?

This article shares my thoughts on how to become a data scientist. Please feel free to comment or suggest additions if I have missed any important points.

The following key aspects of being a data scientist are described later in this article:

  • Key skills of a data scientist
  • Key roles & responsibilities of a data scientist
  • What would it take me to become a data scientist?
  • What would I create as a Data Scientist?

Key Skills of a Data Scientist

  • Mathematics & Statistics Knowledge: A data scientist does a great job when he or she has a strong background in mathematics and statistics. This skill is what makes the data analytics itself possible.
  • Strong Computing Skills: A data scientist needs to be good at data munging, meaning scraping, parsing, and processing data. This is useful in preparing the data for analysis. Data scientists are also good at taking the data and producing consumable data-driven apps or data products.
  • Data Visualization Skills: Once the data has been prepared and analyzed, the data scientist needs to be good at communicating the results to the business teams. For this purpose, he or she needs to use one or more visualization techniques to explain the results of the data analysis.
  • Data Preservation: A data scientist needs to know how to store and manage the data after the analytics and visualization are done.
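To make "data munging" a little more concrete, here is a minimal sketch of the parsing and processing part in Python. The raw lines, the field layout (name; age=...; city=...), and the function names are all made up for illustration; real scraped data would need its own pattern and cleaning rules.

```python
import re
from statistics import mean

# Hypothetical raw input: free-form lines such as might be scraped from a page or a log.
RAW_LINES = [
    "  Alice ; age=34 ; city=Pune ",
    "bob;age=  29;city=delhi",
    "CAROL ; age=n/a ; city=Mumbai",
    "",  # blank line that should be skipped
]

# Assumed layout of one line: <name> ; age=<value> ; city=<name>
LINE_PATTERN = re.compile(
    r"^\s*(?P<name>[A-Za-z]+)\s*;\s*age=\s*(?P<age>[^;]*?)\s*;"
    r"\s*city=\s*(?P<city>[A-Za-z]+)\s*$"
)


def parse_line(line):
    """Parse one raw line into a dict, or return None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    age_text = match.group("age")
    return {
        "name": match.group("name").title(),  # normalize capitalization
        "age": int(age_text) if age_text.isdigit() else None,  # unusable ages become None
        "city": match.group("city").title(),
    }


def munge(lines):
    """Parse every line and drop the ones that cannot be parsed."""
    parsed = (parse_line(line) for line in lines)
    return [record for record in parsed if record is not None]


records = munge(RAW_LINES)
known_ages = [r["age"] for r in records if r["age"] is not None]
average_age = mean(known_ages)
print(records)
print("Average known age:", average_age)
```

The point of the sketch is the order of the steps: parse first, normalize second, and mark missing values explicitly instead of guessing them, so that the later analysis works on clean records.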

 

Key Roles & Responsibilities of a Data Scientist

A data scientist plays several of the following roles, each of which matches an existing IT professional role. This is why it is tricky both to become a data scientist and to hire one: the person may need to be skilled in more than one of these areas. On the other hand, if you already work in one of these roles, it should be easier to get started on the data science journey.

  • Business intelligence (BI) professionals: A BI professional works with data warehouses and visualization dashboards. BI professionals need to get good at consuming data and producing data products. The key is that data changes very rapidly, and they need to accommodate that.
  • Database administrators (DBAs): DBAs would be required to work with different forms of data (primarily unstructured data), not only the data that can be stored in databases.
  • Statisticians: Statisticians would be required to get good at working with large data sets.
  • Visualization experts
  • Machine learning: This one is very close to the data scientist role. A person who is good at machine learning would be required to get good at data munging (data preparation).

Looking at the list above, it may seem that solving a data science problem requires hiring a whole team, and that it may not be feasible for one person to acquire all of these skills.

Following are some of the key activities (responsibilities) that a data scientist performs:

  • Prepare data for statistical analysis: This involves tasks such as gathering, cleaning, restructuring, transforming, combining, merging, verifying, and extracting data. This is where programming skills come in handy.
  • Run the statistical analysis: This is where mathematical and statistical knowledge comes in handy.
  • Interpret and communicate the results: This is where visualization skills come in handy.
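The three activities above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the monthly sales figures, the two sources, and the function names (prepare, analyze, report) are hypothetical, and a plain text bar chart stands in for a real visualization.

```python
from statistics import mean, stdev

# Hypothetical data: monthly sales from two made-up sources, stored as text.
store_sales = {"Jan": "120", "Feb": "135", "Mar": None, "Apr": "160"}
online_sales = {"Jan": "80", "Feb": "95", "Mar": "110", "Apr": "130"}


def prepare(store, online):
    """Step 1: clean and combine. Keep only months present in both sources."""
    combined = {}
    for month, store_value in store.items():
        online_value = online.get(month)
        if store_value is None or online_value is None:
            continue  # drop months with a missing value rather than guessing it
        combined[month] = int(store_value) + int(online_value)
    return combined


def analyze(totals):
    """Step 2: run a (very small) statistical analysis on the combined totals."""
    values = list(totals.values())
    return {
        "mean": mean(values),
        "stdev": stdev(values),
        "best_month": max(totals, key=totals.get),
    }


def report(totals, scale=10):
    """Step 3: communicate the result as a text bar chart, one '#' per `scale` units."""
    return [f"{month} {'#' * (total // scale)} {total}" for month, total in totals.items()]


totals = prepare(store_sales, online_sales)
summary = analyze(totals)
chart = report(totals)
print(summary)
print("\n".join(chart))
```

Each function maps to one responsibility, which also shows why the role spans several professions: the first step is programming, the second is statistics, and the third is communication.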

In addition, to do a great job at the activities above, a data scientist is required to understand the business domain associated with the data (represented by Substantive Expertise in the diagram below). The following diagram represents the key aspects of a data scientist.

 

What would it take for me to become a Data Scientist?

If you are in one of the following roles, read on to understand what may be needed to become a data scientist:

  • Software Engineer/Web Developer: If you are a web developer or a programmer, it may be a fresh start for you. You would need to learn the fundamentals of the following to start the data science journey:
    • Mathematics and statistics
    • Machine learning
    • Data visualization
  • Statistician: If you are a statistician, you may need to learn the different aspects of data munging (scraping, parsing, processing).
  • Business analysts: Business analysts may need to learn one or more data analysis algorithms to be able to do a good job.
  • DBAs: DBAs need to learn how to work with unstructured data.

 

What would I create as a Data Scientist?

Data science is primarily about creating data products that others can use for their own analysis or visualizations. Data products help communicate the results to others. Following are some examples:

  • Data-driven apps such as a spell-checker
  • Interactive visualizations
  • Online databases
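To show why a spell-checker counts as a data-driven app, here is a minimal sketch in Python. It uses a well-known frequency-based idea (try every word one edit away and pick the one seen most often in the data), not a method described in this article. The tiny corpus and the function names are made up for illustration; a real spell-checker would learn from a much larger body of text.

```python
import re
from collections import Counter

# Hypothetical training text; the "data" that drives the app.
CORPUS = "the data scientist cleans the data and the analyst reads the data report"
WORD_COUNTS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string that is one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, or the word itself."""
    word = word.lower()
    if word in WORD_COUNTS:
        return word  # already a known word
    candidates = [c for c in edits1(word) if c in WORD_COUNTS]
    if not candidates:
        return word  # nothing known is close enough; leave the word unchanged
    # Prefer the most frequent candidate; break ties alphabetically for determinism.
    return max(candidates, key=lambda c: (WORD_COUNTS[c], c))


for typo in ["dta", "teh", "scientst", "reprot"]:
    print(typo, "->", correct(typo))
```

Nothing in the code knows English; all of its behaviour comes from the word counts, so swapping in a different corpus changes what it corrects to. That is the sense in which the app is a data product.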
Ajitesh Kumar

