Good familiarity with data science is key to getting on board with Big Data implementations. Almost every software services provider has added Big Data to its service offerings. Most of them assume that a Hadoop team, made up of engineers familiar with the Hadoop technology stack, will be able to implement a Big Data project successfully. However, this is far from reality.
One key to a successful Big Data implementation is “Data Science”. Another is “Data Framework”. Practiced together, the two help a team deliver a successful Big Data implementation.
What is Data Science?
Data Science, simply put, is about understanding the metadata of a data set and the relationships within it. The key to data science is “Ontology”. An ontology defines the concepts found in a set of data and the relationships between them. For example, when you read a paragraph, you pick out the key concepts as a few words or terms and relate them in your mind for better understanding and future reference. This is what ontology deals with.
Another concept is the Resource Description Framework (RDF). RDF helps express the concepts and relationships (the ontology) in the form of triples, and helps develop a taxonomy (hierarchical relationships) over the data.
The challenge for the data scientist is to relate the ontology to business objectives and to advise software engineers so that they can plan development (for example, map-reduce algorithms) appropriately. The data scientist thus has to work with both business analysts and software engineers.
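As a rough illustration of how triples can carry a taxonomy, the sketch below stores hierarchical links as plain (subject, predicate, object) tuples and walks them upward. The concept names and the `ancestors` helper are hypothetical, chosen only for this example; the predicate name is modelled loosely on the `rdfs:subClassOf` property from RDF Schema, and no RDF library is used.

```python
# Hypothetical concepts, used only to illustrate a taxonomy built from triples.
SUBCLASS_OF = "subClassOf"

triples = [
    ("GarmentManufacturer", SUBCLASS_OF, "Manufacturer"),
    ("Manufacturer", SUBCLASS_OF, "Company"),
    ("Company", SUBCLASS_OF, "Organization"),
]


def ancestors(concept, triples):
    """Follow subClassOf links upward and return the broader concepts in order."""
    parent_of = {s: o for s, p, o in triples if p == SUBCLASS_OF}
    chain = []
    seen = {concept}
    while concept in parent_of:
        concept = parent_of[concept]
        if concept in seen:  # stop if the data accidentally contains a cycle
            break
        seen.add(concept)
        chain.append(concept)
    return chain


print(ancestors("GarmentManufacturer", triples))
```

This assumes each concept has at most one parent, which keeps the sketch short; a real taxonomy may allow several.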
Take a look at the following statement:
Scotts garments to set up new units in Karnataka, Maha
Following are some triples that can be derived from the above statement:
Subject: Scotts garments, Predicate: to set up, Object: new units
Subject: Scotts garments, Predicate: to invest in, Object: Karnataka, Maha
Subject: New units, Predicate: to be set up in, Object: Karnataka, Maha
The above information can be useful to many categories of people, such as students, recruitment consultants and investors. However, deriving it from a statement requires an understanding of concepts such as ontology, RDF and data taxonomy.
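To make the triples from the statement above concrete, here is a minimal sketch that stores them in memory and filters them by subject, predicate or object. The `Triple` type and the `match` helper are hypothetical names introduced for this example; a real project would more likely use an RDF store and a query language.

```python
from collections import namedtuple

# Hypothetical in-memory representation of a subject-predicate-object triple.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The three triples derived from the statement in the article.
triples = [
    Triple("Scotts garments", "to set up", "new units"),
    Triple("Scotts garments", "to invest in", "Karnataka, Maha"),
    Triple("New units", "to be set up in", "Karnataka, Maha"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples whose fields equal every filter that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


# An investor might ask: what is known about Scotts garments?
for t in match(triples, subject="Scotts garments"):
    print(t.subject, "|", t.predicate, "|", t.object)

# A recruitment consultant might ask: what relates to Karnataka, Maha?
for t in match(triples, obj="Karnataka, Maha"):
    print(t.subject, "|", t.predicate, "|", t.object)
```

The same small set of triples answers different questions depending on which field is fixed, which is why one statement can serve several kinds of reader.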