Conceptually, a data lake is a flat data store to collect data in its original form, without the need to enforce a predefined schema. Instead, new schemas or views are created “on demand”, providing a far more agile and flexible architecture while enabling new types of analytical insights. AWS provides many of the building blocks required to help organizations implement a data lake.
This session introduces key concepts for a data lake and presents aspects related to its implementation. We will discuss critical success factors, pitfalls to avoid as well as operational aspects such as security, governance, search, indexing and metadata management. We will also provide insight on how AWS enables a data lake architecture.