data warehouse
a massive, high-end library where information is carefully curated, organized, and archived for deep research
While they both store data, they serve fundamentally different masters: databases are for running the business; data warehouses are for analyzing the business.
DBs
- are OLTP (Online Transactional Processing) optimized for read/write operations
- normalized data to minimize redundancy. updating a record happens once in a single place.
- often stores data from a single app or process.
Data Warehouses
- are OLAP (Online Analytical Processing) optimized for read-only complex queries.
- de-normalized data, often in a star or snowflake schema. It prioritizes retrieval speed over storage, so it can group related data together.
- stores data from multiple sources
You can't use a DB for everything because if you try to run a massive analytical report on a prod DB you may lock the tables and consume so much CPU that the app might crash. A warehouse can be a separate playground to crunch numbers for hours without affecting the rest of the system.