Disclaimer: The individual owning this blog works for Oracle in Germany. The opinions expressed here are his own, are not necessarily reviewed in advance by anyone but the individual author, and neither Oracle nor any other party necessarily agrees with them.
To dedup or not to dedup - that results in a lot of questions
Monday, February 8. 2010
Comments
Has anybody compared the power consumption (if there is a difference at all) of ZFS systems with and without dedup? Put simply: might it be more efficient (in terms of power per storage response time) to add an extra HDD to the zpool instead, forget about "Do I have enough spare IOPS?", and simply take the data directly from the HDD? And yes, it will depend on the stored content (dedup percentage vs. storage capacity, etc.). Thanks.
ZFS computes the checksums anyway. With hash-only dedup the only difference is a lookup in a table; with hash-and-compare you need an additional IOPS, but that isn't much of a problem, as I explain later.
For reads it makes no difference whether the data is deduplicated or not. As deduplicated data is cached only once for potentially many blocks, the cache in storage arrays is used more efficiently, potentially resulting in a lower IOPS count reaching the disks.
The spare-IOPS problem isn't a problem for ZFS hash-only dedup. It is a negligible one for ZFS hash-and-compare: you would do the IO without dedup anyway, you just have a read IO instead of a write IO, and only in the case of a false positive do you have to issue a write IO as well. The probability of that is pretty low, so you can't lose, but you can win a lot. It is a big problem for weak-checksum dedup, as you have to check many dedup candidates.
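To make the IO argument concrete, here is a minimal sketch of the three cases in Python. It is a toy in-memory model, not the ZFS implementation: the class `ToyDedupStore`, its `verify` flag, and the `hash_fn` parameter are hypothetical names invented for this illustration. It only counts how many read and write IOs a block write would cause.

```python
import hashlib


class ToyDedupStore:
    """Toy model of block-level dedup (an illustration, not ZFS code)."""

    def __init__(self, verify=False, hash_fn=None):
        self.verify = verify  # True = hash-and-compare, False = hash-only
        # Default to a strong checksum; pass a weak hash_fn to see collisions.
        self.hash_fn = hash_fn or (lambda data: hashlib.sha256(data).digest())
        self.table = {}   # checksum -> ids of stored blocks with that checksum
        self.blocks = {}  # block id -> [data, reference count]
        self.reads = 0    # read IOs spent comparing candidate blocks
        self.writes = 0   # write IOs for blocks that were really stored
        self.next_id = 0

    def write(self, data):
        key = self.hash_fn(data)
        for block_id in self.table.get(key, []):
            if self.verify:
                # Hash-and-compare: one read IO per candidate that is checked.
                self.reads += 1
                if self.blocks[block_id][0] != data:
                    continue  # false positive, try the next candidate
            # Hash-only trusts the checksum; verify has confirmed the match.
            self.blocks[block_id][1] += 1
            return block_id
        # No (confirmed) duplicate: store the block with a normal write IO.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = [data, 1]
        self.table.setdefault(key, []).append(block_id)
        self.writes += 1
        return block_id
```

With a strong checksum, writing the same block twice costs one write IO in hash-only mode, and one write IO plus one read IO in hash-and-compare mode: the second write is replaced by a read. With a deliberately weak `hash_fn` (for example `lambda d: d[:1]`), every new block has to be compared against all earlier blocks that share its checksum, which is the candidate-checking cost described above.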
Interesting read, and it inspired me to check up on the dedupe features in TSM6.1. It seems to use SHA-1 without a compare step.
However, regarding the performance impact of synchronous dedupe: even if the hash table is in memory, shouldn't every alteration be flushed to persistent storage (disk or NVRAM)? If that is the case, my immediate thought is that synchronous dedupe may come at a significant performance penalty, or at least would require a lot of extra NVRAM and computing power.
Do you mean Andy or lparvirt? I guess lparvirt. So here is what I think: maybe lparvirt overlooked that ZFS is able to leverage SSDs for L2ARC, and therefore the hash table is already in a non-volatile read cache. What speeds up dedupe is the performance of querying the hash table, not of writing it. So having a copy of the hash table in L2ARC should give you a viable performance speedup.
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Germany License