Data lakes are analytical data systems just as a data warehouse. But where a data warehouse usually holds relational data, a data-lake can hold data sets that are embedded in files.
This is often used with machine learning, for example image classifications. When you have a learning set, this could be part of the data-lake.
Unstructured data examples
- Vectors
- Images
- Videos
They follow the same principles as a data warehouse and also use data-pipelines with the ETL process.
Reference
Kleppmann, Martin, and Chris Riccomini. Designing Data-Intensive Applications: The Big Ideas behind Reliable, Scalable, and Maintainable Systems. Second Edition, O’Reilly Media, 2026.