Designing Data-Intensive Applications

Part of my series of notes on papers. This one is a book, however, so I will only cover the part that is available online (Part 1). I will consider obtaining the rest when I reach the end of that part.

Tags: software design and architecture [Link to article](https://link

Introduction

The Internet was done so well that most people think of it as a natural resource like the Pacific Ocean, rather than something that was man-made. When was the last time a technology with a scale like that was so error-free?

Alan Kay, in interview with Dr Dobb’s Journal (2012)

Basic building blocks:

  • Databases
    • Store data so that it can be found again later.
  • Caches
    • Remember the result of an expensive operation, to speed up reads.
  • Search indexes
    • Allow users to search data by keyword or filter it.
  • Stream processing
    • Send a message to another process, to be handled asynchronously.
  • Batch processing
    • Periodically crunch a large amount of accumulated data.

These are useful abstractions to design with, but each implementation makes different trade-offs.
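
To make the "cache" building block concrete, here is a minimal sketch of a read-through cache with a time-to-live. It is not from the book: the class name `ReadThroughCache`, the `load` callback, and the `ttl_seconds` parameter are all assumptions made for illustration.

```python
import time


class ReadThroughCache:
    """Remember the result of an expensive lookup, keyed by its argument."""

    def __init__(self, load, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load          # the expensive operation (hypothetical callback)
        self._ttl = ttl_seconds    # how long a remembered result stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # fresh entry: skip the expensive call
        value = self._load(key)    # miss or expired: do the work once
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Drop an entry, e.g. after the underlying data has changed.
        self._entries.pop(key, None)
```

The trade-off shows up even in this toy version: a longer TTL means fewer expensive calls but staler results, which is the kind of decision the implementation forces on you.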

Thinking about Data Systems

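Why lump abstractions as different as message queues and databases into the same category? The distinctions between them have become blurred: for example, Redis is a datastore that is also used as a message queue, and Apache Kafka is a message queue with database-like durability guarantees.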

You can build a composite data system out of smaller, general-purpose components, with application code stitching them together. Building one raises questions such as:

  • How do you ensure data remains correct and complete, even with internal errors?
  • How do you provide consistently good performance, even when parts degrade?
  • How do you scale?
  • What’s a good API?
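
As a rough illustration of a composite data system, the sketch below puts one small API in front of three components: a primary store, a cache, and a search index. Everything here is hypothetical (the `ArticleService` name, its methods, and the use of plain dicts as stand-ins for real systems); the point is only that the application code is what keeps the pieces consistent with each other.

```python
from collections import defaultdict


class ArticleService:
    """Hypothetical facade over a store, a cache, and a search index."""

    def __init__(self):
        self._store = {}                 # stands in for the primary database
        self._cache = {}                 # stands in for an in-memory cache
        self._index = defaultdict(set)   # word -> article ids (search index)

    def put(self, article_id, text):
        old = self._store.get(article_id)
        if old is not None:
            self._unindex(article_id, old)   # remove stale index entries
        self._store[article_id] = text
        self._cache.pop(article_id, None)    # invalidate the cached copy
        for word in set(text.lower().split()):
            self._index[word].add(article_id)

    def get(self, article_id):
        # Raises KeyError for unknown ids, like a plain dict lookup.
        if article_id in self._cache:
            return self._cache[article_id]
        text = self._store[article_id]
        self._cache[article_id] = text
        return text

    def search(self, word):
        return sorted(self._index.get(word.lower(), set()))

    def _unindex(self, article_id, text):
        for word in set(text.lower().split()):
            self._index[word].discard(article_id)
```

Callers see only `put`, `get`, and `search`; the guarantee that the cache and index reflect the latest write lives entirely in `put`. In a real system each step could fail independently, which is where the correctness and performance questions above come from.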

Main ideas behind Data-intensive Applications:

  • Reliability
    • The system should keep working correctly in the face of adversity.
    • Tolerate user mistakes or unexpected use.
    • Prevent unauthorized access and abuse.
    • “continuing to work correctly even when things go wrong”
    • A fault is not a failure
      • Fault:
        • One component deviating from its spec.
      • Failure:
        • The whole system stops providing the required service.
    • Deliberately increasing the rate of faults can reduce failures: triggering faults on purpose (as Netflix’s Chaos Monkey does) keeps the fault-tolerance machinery exercised and tested.
  • Scalability
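    • As the system grows (in data volume, traffic, or complexity), there should be reasonable ways of dealing with that growth.
  • Maintainability
    • Over time, many different people should be able to work on the system productively.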
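
The fault/failure distinction under Reliability can be sketched in a few lines. The example below is an illustration under assumptions, not anything from the book: `Replica`, `ReplicatedStore`, and `inject_fault` are hypothetical names. One replica going down is a fault that the store tolerates; only when every replica is down does the store as a whole fail. `inject_fault` deliberately takes a replica down, in the spirit of fault injection.

```python
import random


class ReplicaDown(Exception):
    """Raised by a single replica that is not serving requests (a fault)."""


class Replica:
    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.healthy = True

    def read(self, key):
        if not self.healthy:
            raise ReplicaDown(self.name)
        return self.data[key]


class ReplicatedStore:
    def __init__(self, replicas):
        self.replicas = list(replicas)

    def read(self, key):
        # Try each replica in turn: one faulty replica is tolerated.
        for replica in self.replicas:
            try:
                return replica.read(key)
            except ReplicaDown:
                continue
        # Every replica is down: the system as a whole has failed.
        raise RuntimeError("all replicas down")


def inject_fault(store, rng):
    """Deliberately take one randomly chosen replica down; return its name."""
    victim = rng.choice(store.replicas)
    victim.healthy = False
    return victim.name


store = ReplicatedStore(Replica(n, {"k": "v"}) for n in ("a", "b", "c"))
inject_fault(store, random.Random(0))
value = store.read("k")  # still served by a healthy replica
```

Running the injection regularly against a test or production system is what exercises the fallback path in `read`, so that it is known to work before a real fault occurs.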
