Category Archives: Big Data

Learn R – What are Vectors – Code Examples

This article covers high-level concepts of the vector data type in the R programming language, along with code samples. For those new to R, note that it provides a console-based platform for analyzing data and can be seen as a programming language for data scientists. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The key points described later in this article are: What are Vectors?; Vectors – Code Examples. What are Vectors? A vector, in R, can be defined as a collection of things of the same data type. Simply speaking, it …

Posted in Big Data.

Big Data – Top 6 Frameworks Required to Get Started

This article presents the top 6 software frameworks (or tools) for getting started with Big Data POC projects. It may be of interest to those who are beginning with Big Data and want to understand the tools and frameworks required for their POC projects. The article presents only the bare minimum set of frameworks required to get started. I am sure there could be more on this list; however, my objective is to cover only the minimum set. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The following are key functional areas in Big Data …

Posted in Big Data.

Data Science – Commonly Used Plot Parameters in R Programming

This article presents some of the commonly used plot parameters across different plot commands when working with different kinds of plots in R. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The key points described later in this article are: What are some of the common plots (commands) in R?; Commonly Used Plot Parameters. What are some of the common plots (commands) in R? The following are some of the plots (commands) used in R for different purposes. I shall be writing separate blog posts on the use cases where one should use one or more …

Posted in Big Data.

Data Science – Why Learn R?

This article presents thoughts on why it is OK to learn yet another programming language, R, for data analysis. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Some of the key points described later in this article are: Why can’t I use Java/C etc. for data analysis?; Key Aspects of Data Analysis vis-a-vis R Language; Why R fundamentally?; Advantages & Disadvantages of R. Why can’t I use Java/C etc. for data analysis? I have worked a lot with Java, C, PHP, C++ etc. in my career. From whatever I have learned about R so far, …

Posted in Big Data.

How Can I Become A Data Scientist?

This article presents thoughts, primarily, on how to become a data scientist. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The key points on different aspects of being a data scientist, described later in this article, are: Key skills of a data scientist; Key roles & responsibilities of a data scientist; What would it take for me to become a data scientist?; What would I create as a data scientist? Key Skills of a Data Scientist. Mathematics & Statistics Knowledge: A data scientist would do a great job if he/she has a strong mathematics and statistics background. …

Posted in Big Data.

Big Data – How to Get Started with Data Science

This article presents my opinion on what it would take to get started with Data Science. As I started exploring Big Data, one thing became clear: I may not be successful with Big Data unless I have learned and applied Data Science to make sense of Big Data (data with the 3Vs: Volume, Velocity, Variety). This is where I started to find out how to get started with Data Science. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The key points described later in this article are: Data Science is NOT Easy …

Posted in Big Data.

Big Data – Top 8 Use Cases for Beginners

This article presents the top 8 Big Data use cases that beginners could get started with, creating one or more proof-of-concept (POC) projects around them. I compiled the list after digging through various places on the web, videos, webinars etc. The use cases mentioned below are only briefly discussed; each shall be explained later in a separate article. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. The following use cases are described later in this article: Social sentiment analysis; Customer interaction analysis; Pattern matching; Publicly available data analysis; Web pages data analysis; Clinical data …

Posted in Big Data.

How to Start a Big Data Practice

This article presents key aspects of starting up a Big Data practice in your organization. I have recently started working in this area, and this blog is the result of my research. Hope you find it useful. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. Big Data Center of Excellence (COE): It may be a good idea to plan around setting up a Big Data Center of Excellence (COE) whose main objective would be to take a holistic approach towards the following two key aspects of Big Data from different perspectives such as setting up a team, evaluating tools & frameworks, …

Posted in Big Data.

ShriGB – A Semantic Financial Search Engine

ShriGB, as the name goes, is about extracting valuable insights (“Shri” – respect) from large/big data (“GB”). The project aims to leverage semantic web and big data technologies to extract meaningful insights from unstructured financial data lying across the web. The data is mostly present in raw form and is useful to some sections of society, although it can be used by different sections of people for different reasons. Let’s take a look at the following example: “Dabur to set up manufacturing units in Uttaranchal”. The above data can mean some of the following: More jobs are going to be created in the Uttaranchal region; This may lead to a boost in …

Posted in Big Data, Semantic Web.

Big Data & Predictive Modelling

Talk about big data, and the first thing that appears in an engineer’s mind is Hadoop and related technology. The key thing that many developers working on Big Data miss time and again is a sense of reading, understanding and learning the data, and designing algorithms to achieve different objectives such as derivations, predictions etc. One of the key aspects of data science, which is also key to Big Data, is Predictive Modelling. I wanted to do some quick research and develop an understanding of this topic. However, while researching, I found that the topic includes some complex underlying mathematical models which will surely be very hard to …

Posted in Big Data.

Key to Big Data: Data Science & Data Framework

Good familiarity with data science is key to getting on board with Big Data implementations. Almost all software services providers have added another link for Big Data to their services offerings. Most of them assume that a Hadoop team, comprising technical staff familiar with the Hadoop technology stack, will be able to successfully implement a Big Data project. However, this is far from reality. One of the keys to successful Big Data implementation projects is “Data Science”, and another is “Data Framework”. The two, when done jointly, would get a team to a successful Big Data implementation. What is Data Science? Data Science, simply speaking, is understanding meta-data …

Posted in Big Data.

Ok Glass, Show the Best Buy – Can that be the Killer Glassware?

Could this be the killer glassware app for Google Glass? Could it help boost Google Glass adoption among consumers? Well, there have been smartphone applications with which one scans the barcode of a product on the shelf and gets the details about it. But with Google Glass, it would be as easy as the user looking at a product on the shelf and saying, “ok glass, show the best buy”. This would get the user the most appropriate competitive products along with shop details based on various factors, some of which are listed below. Keep on reading… Let’s try and understand what might show up on Google Glass if someone says, …

Posted in Big Data, Google Glass.

Google Glass & Big Data – Boon for Crime Control

A class of bloggers and writers has been writing about Google Glass hurting privacy, which may pose a barrier to widespread acceptance of the device. However, Google Glass shall surely act as a boon to crime control, and sooner rather than later, governments will get on board with accepting the Glass device for police personnel. Google Glass for Capturing Pictures from Crime Spot: Consider one of the out-of-the-box benefits provided by Google Glass, “take a picture”; this may prove to be a boon to police departments across the globe. Imagine police personnel start wearing a cool glass device. They could easily capture …

Posted in Big Data, Google Glass.

Google Glass & Enterprise Adoption

With the Google Glass Mirror API having been published, all sorts of ideas have started floating around the internet. One such idea that I have been wondering about is how enterprises would adopt Google Glass; that is, whether an enterprise would want to buy Google Glass devices for its employees in the same way that some companies currently provide iPads to their employees. There are multiple reasons which may lead an enterprise to adopt Google Glass for a certain class of employees to start with. Let’s take a look at some of the scenarios. 1. Whiteboarding Pictures: As IT organizations have started moving to adoption of Agile SCRUM …

Posted in Big Data, Google Glass.

Big Data is NOT Just about Hadoop Stack Implementation

That is something anyone with decent technical skills and Java experience could do. Big Data has a lot to do with data science. And, to stand out as a Big Data solution provider in the IT marketplace, one needs a team of data scientists who work with technologists to implement the Big Data solutions suggested by them. Thus, the following is how a Big Data team may look: Project/Delivery Manager; Data Scientist; Technical Architect (Hadoop); Technical team including team/tech lead, developers, testers etc.; Build/Configuration Engineer: This may be important owing to typical Big Data cluster configuration requirements and the complexities surrounding them. What is a …

Posted in Big Data.

Google Glass to Revolutionize Big Data

The Google Glass project, once in full swing and with full acceptance by consumers, will turn out to be one of the biggest sources of data, best handled by applying big data technologies. Simply speaking, Big Data is a data set having the following characteristics: Volume, Velocity, Variety, Veracity. That said, Google Glass will add a variety of data in greater volume at much greater velocity. Some of the existing big data technologies that can help a great deal in storing and processing data acquired by Google Glass are the following: Hadoop (HDFS & MapReduce); HBase as a non-relational database for working with data stored in Hadoop; Hive for business analytics; Solr (Lucene) …

Posted in Big Data, Google Glass.