Categories: Distributed Systems

Lessons from Google on Distributed Storage Systems

This article lists the lessons learned by Google's engineering team while implementing Google BigTable, a distributed storage system that manages structured data for more than 60 Google products.

KISS Principle for Simpler Design & Coding

Distributed systems are inherently complex, and their codebases evolve over time, so it is a good idea to keep the design and code simple. This makes maintenance and debugging easier. One way to apply the KISS principle is to break the problem down into smaller pieces, then design and code each piece on its own.

 

YAGNI Principle to Avoid Complexity

When working with distributed systems, it is wise to delay adding new features until it is clear how they will be used. This matches the YAGNI principle: “Always implement things when you actually need them, never when you just foresee that you need them.”

 

Failures are bound to happen. Plan for them.

Large distributed systems are vulnerable to many types of failures, some of which are listed below. Plan explicitly for each of them rather than assuming that any one of them cannot happen.

  • Memory and network corruption
  • Large clock skew
  • Hung machines
  • Extended and asymmetric network partitions
  • Bugs in related systems
  • Overflow of disk quotas
  • Planned and unplanned hardware maintenance
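
As one illustration of planning for the first item in the list (memory and network corruption), a sender can attach a checksum to every message so that the receiver can detect a damaged payload instead of silently trusting it. The following is a minimal sketch in Python, not a description of how BigTable itself does it. The function names `frame` and `unframe` and the 4-byte CRC-32 header layout are assumptions made for this example.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 (4 bytes, big-endian).

    Hypothetical wire format chosen for this sketch only.
    """
    return struct.pack(">I", zlib.crc32(payload)) + payload


def unframe(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if it fails the checksum."""
    if len(message) < 4:
        raise ValueError("message too short to contain a checksum")
    (expected,) = struct.unpack(">I", message[:4])
    payload = message[4:]
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch: payload was corrupted")
    return payload
```

A receiver would call `unframe` on every incoming message and treat a `ValueError` as a signal to discard the message and request it again, rather than processing data that may be corrupt.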

 

Monitoring is the Key

When working with distributed systems, set up proper system-level monitoring to regularly check aspects such as the following:

  • Lock contention
  • Slow writes
  • Hung accesses to one or more tables
  • Clusters
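
A check such as "slow writes" from the list above can be approximated by timing each operation and recording the ones that exceed a threshold. The sketch below is one possible way to do this in Python. The names `timed` and `slow_operations` and the 0.5-second default threshold are assumptions for illustration, and a real deployment would likely export these measurements to a monitoring system instead of keeping them in a list.

```python
import time
from contextlib import contextmanager

# Hypothetical default: anything slower than this is reported.
SLOW_THRESHOLD_SECONDS = 0.5

# In-memory record of (operation name, elapsed seconds) for slow operations.
slow_operations = []


@contextmanager
def timed(operation, threshold=SLOW_THRESHOLD_SECONDS, clock=time.monotonic):
    """Time the enclosed block and record it if it runs longer than threshold.

    The clock is injectable so the behaviour can be tested without sleeping.
    """
    start = clock()
    try:
        yield
    finally:
        elapsed = clock() - start
        if elapsed > threshold:
            slow_operations.append((operation, elapsed))
```

Wrapping a write as `with timed("write to table"): ...` leaves fast writes unrecorded and adds slow ones to `slow_operations`, including writes that fail with an exception, since the measurement happens in the `finally` clause.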

Published by Nidhi Rai
