One of the most interesting articles explaining De-duplication from a DBA’s perspective.
Storage for DBAs: Data deduplication – or “dedupe” – is a technology which falls under the umbrella of data reduction, i.e. reducing the amount of capacity required to store data. In very simple terms it involves looking for repeating patterns and replacing them with a marker: as long as the marker requires less space than the pattern it replaces, you have achieved a reduction in capacity. Deduplication can happen anywhere: on storage, in memory, over networks, even in database design – for example, the standard database star or snowflake schema. However, in this article we’re going to stick to talking about dedupe on storage, because this is where I believe there is a myth that needs debunking: databases are not a great use case for dedupe.
Deduplication Basics: Inline or Post-Process
View original post 1,290 more words