Which process divides the data into chunks in the EMC Avamar deduplication data flow?
A. Compression
B. Hash cache checking
C. SHA-1 Hashing
D. Sticky-byte factoring
Answer: D
Explanation:
Assume the file has been changed, so the hash of the file no longer matches the hash in the file cache. From here, the data is broken into variable-length segments by Avamar's patented sticky-byte factoring process. These segments are then compressed and hashed, and the resulting hash is written to a hash cache file stored locally on the client. This file is also loaded into RAM when a backup is initiated.
http://www.egroup-us.com/2011/09/emc-avamar-how-source-global-dedupe-magic-happens/
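The flow described above (file cache check, segmentation, compression, hashing, hash cache update) can be sketched roughly as follows. This is only an illustration, not Avamar's implementation: `backup_file`, `split_on_newlines`, and the in-memory sets standing in for the two cache files are hypothetical names, the newline splitter is a trivial stand-in for sticky-byte factoring, and the sketch assumes that a segment whose hash is already in the hash cache does not need to be sent again.

```python
import hashlib
import zlib
from typing import Callable, List, Set


def split_on_newlines(data: bytes) -> List[bytes]:
    """Stand-in splitter: cuts after every line ending.

    This is NOT sticky-byte factoring; it is only a simple way to get
    variable-length segments whose boundaries depend on the content.
    """
    return data.splitlines(keepends=True)


def backup_file(
    data: bytes,
    file_cache: Set[str],
    hash_cache: Set[str],
    split: Callable[[bytes], List[bytes]] = split_on_newlines,
) -> List[bytes]:
    """Return the compressed segments that would still have to be sent."""
    file_hash = hashlib.sha1(data).hexdigest()
    if file_hash in file_cache:
        return []  # file unchanged since the last backup: nothing to do

    to_send = []
    for segment in split(data):
        # Order follows the explanation: compress first, then hash.
        compressed = zlib.compress(segment)
        segment_hash = hashlib.sha1(compressed).hexdigest()
        if segment_hash not in hash_cache:  # assumed: known hash = skip
            hash_cache.add(segment_hash)
            to_send.append(compressed)

    file_cache.add(file_hash)
    return to_send


# Hypothetical usage: two backups of a file with one changed line.
file_cache, hash_cache = set(), set()
v1 = b"alpha\nbeta\ngamma\n"
v2 = b"alpha\nBETA\ngamma\n"
print(len(backup_file(v1, file_cache, hash_cache)))  # first backup
print(len(backup_file(v1, file_cache, hash_cache)))  # unchanged file
print(len(backup_file(v2, file_cache, hash_cache)))  # one line changed
```

In this sketch the file cache short-circuits unchanged files, while the hash cache works at segment level, which is why the chunking step (answer D) has to run before hashing (C) and hash cache checking (B).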
Avamar has a rather nifty technology called Sticky Byte Factoring, which allows it to identify the changed information inside a file by breaking the file into variable-length objects. This is much more efficient than fixed-size approaches, because a change made early in a fixed-length sequence affects all subsequent blocks/chunks/objects/whatever in that sequence. That in turn changes all the fingerprints taken after the changed data, so you end up with a lot of new blocks/chunks/objects/whatever even if the data really hasn't changed all that much. Sticky Byte Factoring, on the other hand, can tell exactly what has changed, not just that something has changed.
http://storagezilla.typepad.com/storagezilla/2007/09/the-future-is-v.html
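The difference between fixed-size and variable-length chunking can be seen in a small experiment. The sketch below is a generic content-defined chunker, not Avamar's patented algorithm: the window size, the divisor, the checksum (a plain byte sum), and all function names are assumptions chosen for illustration. A boundary is placed wherever the last `WINDOW` bytes satisfy a condition, so boundaries follow the content instead of absolute offsets.

```python
import hashlib
from typing import List

WINDOW = 8    # bytes inspected when deciding on a boundary (assumed value)
DIVISOR = 32  # a boundary fires about once every DIVISOR bytes (assumed value)


def fixed_chunks(data: bytes, size: int = 16) -> List[bytes]:
    """Cut at fixed offsets: 0, size, 2*size, ..."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data: bytes) -> List[bytes]:
    """Cut wherever the checksum of the last WINDOW bytes hits the target.

    The window sum is recomputed at every position for clarity; a real
    chunker would update a rolling checksum incrementally.
    """
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        if sum(data[end - WINDOW:end]) % DIVISOR == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def shared(old: List[bytes], new: List[bytes]) -> int:
    """Count chunks of `new` whose SHA-1 fingerprint already exists in `old`."""
    known = {hashlib.sha1(c).digest() for c in old}
    return sum(hashlib.sha1(c).digest() in known for c in new)


# Deterministic pseudo-random test data (2048 bytes), then one byte inserted
# at the very beginning to simulate an early change in the file.
data = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(64))
edited = b"X" + data

print("fixed:   ", shared(fixed_chunks(data), fixed_chunks(edited)),
      "of", len(fixed_chunks(edited)), "chunks already known")
print("variable:", shared(variable_chunks(data), variable_chunks(edited)),
      "of", len(variable_chunks(edited)), "chunks already known")
```

Because each boundary decision here depends only on the bytes in the local window, inserting a byte at the front shifts the later boundaries along with the data, so every original chunk after the first reappears unchanged. With fixed offsets, the same one-byte insert shifts the content of every chunk.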