I read this during a long flight. It is an interesting review of the design decisions behind applications that deal with a lot (a very large amount!) of data.
Martin Kleppmann, Designing Data-Intensive Applications, 2017.
Contents:
Part I Foundations of Data Systems
Chapter 1 Reliable, Scalable, and Maintainable Applications
Thinking About Data Systems
Reliability
Scalability
Maintainability
Chapter 2 Data Models and Query Languages
Relational Model Versus Document Model
Query Languages for Data
Graph-Like Data Models
Chapter 3 Storage and Retrieval
Data Structures That Power Your Database
Transaction Processing or Analytics?
Column-Oriented Storage
Chapter 4 Encoding and Evolution
Formats for Encoding Data
Modes of Dataflow
Part II Distributed Data
Chapter 5 Replication
Leaders and Followers
Problems with Replication Lag
Multi-Leader Replication
Leaderless Replication
Chapter 6 Partitioning
Partitioning and Replication
Partitioning of Key-Value Data
Partitioning and Secondary Indexes
Rebalancing Partitions
Request Routing
Chapter 7 Transactions
The Slippery Concept of a Transaction
Weak Isolation Levels
Serializability
Chapter 8 The Trouble with Distributed Systems
Faults and Partial Failures
Unreliable Networks
Unreliable Clocks
Knowledge, Truth, and Lies
Chapter 9 Consistency and Consensus
Consistency Guarantees
Linearizability
Ordering Guarantees
Distributed Transactions and Consensus
Part III Derived Data
Chapter 10 Batch Processing
Batch Processing with Unix Tools
MapReduce and Distributed Filesystems
Beyond MapReduce
Chapter 11 Stream Processing
Transmitting Event Streams
Databases and Streams
Processing Streams
Chapter 12 The Future of Data Systems
Data Integration
Unbundling Databases
Aiming for Correctness
Doing the Right Thing