Database Internals: A Deep Dive into How Distributed Data Systems Work
A**S
Very knowledgeable and detail-oriented but concise at the same time
Strong recommend to anyone from a beginner to an expert
J**H
Worth the effort
This book took me a few years to get through. It is much more low-level than something like DDIA. Having worked on database code for the past couple of years, that context was crucial in helping me understand the book. The book is great because it covers all of the important ideas in databases and talks about the tradeoffs of using various algorithms. Succinct, detailed, and quality.
F**K
Packed with details, great read
Can't believe I forgot to write a review for this one! Partly it's probably because I usually have less to say (or more precisely it's harder for me to be properly articulate) about things I like than I do about the ones I don't. And boy did I like Database Internals! I'll try my best to explain why; the book and the author surely deserve it. As a back-end engineer, my main reason for picking this one up was to better understand the distributed databases that I may end up in (or have already had) contact with. With that in mind, I planned on just skimming the first part of the book, but imagine my surprise when I found myself Googling Bw and LSM trees and going through papers comparing this and that algorithm and their impacts on memory, storage and CPU caches in multicore systems. The geek got suckered in! With my curiosity circuits pleasantly warmed by the first part, I moved on to the second part of the book - the main dish - where a similar scenario unfolded: again I swallowed up whatever was served and ended up digging for more and adding scores of books and papers to my to-read list. All in all, reading Database Internals felt a lot like a trip to the zoo or a local museum: chock full of data structures and algorithms used by modern-day databases (and distributed systems in general), the book will showcase each item with sufficient detail for you to grasp what it's about and then provide you with enough bibliography and reference material to last you a lifetime... or at least a couple of years.
N**A
Summarized Recent Overview of Storage & Distributed Systems
Mastery in systems abstraction comes through a philosophical pivot. While an enthusiastic beginner considers successful "use cases", an experienced traveler - through her implicit awareness of futility against entropy - often only considers failure and just tries her best. As more systems, and more of every system, are being dictated by the twin forces of economics and architectural modernism, a much higher percentage of design and development efforts in software should be dedicated to understanding fundamentals (CPU registers, branch prediction etc.) and essential complexities (multi-node consensus, replication failures etc.). This book is a good start. Database Internals is divided into two parts - the first deals with database storage. Especially good sections put a 9-cell flashlight on how many recent architectures are indeed built to tackle complexity bottom-up; e.g., LSM (log-structured merge) trees nicely complement the "write amplification" of Solid-State Disks. The discussion on the canonical B-tree and its multiple siblings (especially Bw-tree) is very well done. The functional difference between locks and latches would be enlightening even for experienced database practitioners - locks are used to manage transactions, latches to guard the *physical* storage representation. The second half of the book, focusing on distributed systems, is more uneven in quality. It is, however, a great start: an economized discussion of about 50 "Best Papers" on Leader Election, Failure/Crash detection, Replication and how distributed-systems-friendly "consensus protocols" work better than atomic ones like 2-phase commit. In many ways, distributed systems have veered from monarchy (single, immutable leader deciding everything, including the next leader) to a true republic (leader is still almost omnipotent, but is regularly replaced by the constituents).
The comparative analysis of Paxos, ZAB and Raft - with clear sequence diagrams - is very well done. The quality of writing is good, though it could have been helped by more ruthless editing. The area covered is simply too broad, other than the intersection of SSDs and Modern DB architecture, which is very deep and very good. Still, the book easily deserves at least 4 stars for the enthusiasm and for its good attempt to convey distributed systems pedagogy to general practitioners. Pair it with Martin Kleppmann's "Designing Data-Intensive Applications" and Ken Birman's "Guide to Reliable Distributed Systems".
T**S
I've been looking for this book for years
I've been looking for a book that covers these topics for a long time. Even just working with different databases on a day-to-day basis, it's incredibly helpful to understand how the components of each database actually work. Furthermore, the book spans a very wide array of topics and techniques which are incredibly handy for distributed systems. It's really hard to find this much information in a single book. Usually you'd have to know each of the topics you're interested in and buy an entire book on that topic. This book packs a pretty in-depth view of several topics related to database systems into one book without needless fluff. I highly recommend this book not only to people working on distributed data systems, but to anyone working with databases. This is one of the most frequently referenced books I own.
A**L
One of the Best Books out there
This is one of the best texts covering database internals. Databases are used every day, and understanding what happens under the hood is a daunting task. This book takes a pragmatic approach to the topic, starting with the basics and then taking a deeper dive into how the basic data structures and concepts come together. IMHO, this book should appeal to both database developers and engineers who want to understand how databases work. This book is a must-have for engineers who really want to get into database development. Otherwise, this book is still a must-have reference in general. I personally liked the book's attention to detail on what really matters when writing a real database. The concepts are equally applicable to SQL and NoSQL databases.