Is Amazon AWS/EBS snapshotting just LVM, or what?

Ken D'Ambrosio ken at jots.org
Thu Sep 28 13:32:40 EDT 2017


I would say it's unlikely to be LVM, because LVM is content-ignorant; it 
snapshots the entire volume, which is inefficient, and when you're 
Amazon, you care a LOT about being efficient.  Instead, I imagine 
they're using some content-aware CoW solution such as ZFS.  But, 
whatever mechanism, I agree with your opinion: I doubt that their 
solution -- almost certainly CoW of some sort -- stands a chance of 
having more than a slight impact.
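
To make the CoW point concrete, here is a toy sketch in Python. It is purely illustrative: the class and method names (CowVolume, snapshot, read_snapshot) are made up for this example, and it is not a claim about how EBS, LVM, or ZFS is actually implemented. It only shows why taking a CoW snapshot costs the same regardless of volume size, and why the real cost shows up later, on the first write to each block.

```python
class CowVolume:
    """Toy block device with copy-on-write snapshots (illustrative only)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # One dict per snapshot: block index -> data as it was at snapshot time.
        self.snapshots = []
        self.copies_made = 0

    def snapshot(self):
        # Nothing is copied here, so the cost does not depend on volume size.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the old block for every snapshot
        # that has not already saved its own copy of it.
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
                self.copies_made += 1
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Unchanged blocks are read straight from the live volume.
        saved = self.snapshots[snap_id]
        return saved.get(index, self.blocks[index])
```

In this model, repeated writes to the same block after a snapshot cost only one extra copy, so the overhead tracks the set of blocks touched, not the size of the volume.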

$.02, YMMV and other assorted disclaimers,

-Ken


On 2017-09-28 13:16, Joshua Judson Rosen wrote:
> I'm working on a project that uses Amazon AWS-provided VPS instances,
> and the other guy on the project is telling me that "snapshotting
> hourly may degrade performance",
> and I'm trying to determine whether that's actually true. My gut feeling
> is that it sounds kind of bogus.
> 
> From the information I've been able to find about how Amazon's stuff
> works (either in terms
> of how it's _implemented_ [for which I'm finding basically no insight]
> or how it's _characterized_
> [in the engineering sense, not the literary sense]...), it really
> sounds a _lot_ like Amazon
> is just using LVM snapshots, e.g. from 
> <https://aws.amazon.com/ebs/faqs/>:
> 
> 	"snapshots can be done in real time while the volume is attached and in use.
> 	 However, snapshots only capture data that has been written to your Amazon EBS volume,
> 	 which might exclude any data that has been locally cached by your application or OS."
> 
> 	"By design, an EBS Snapshot of an entire 16 TB volume should take no longer than the time
> 	 it takes to snapshot an entire 1 TB volume. However, the actual time taken to create
> 	 a snapshot depends on several factors including the amount of data that has changed
> 	 since the last snapshot of the EBS volume."
> 
> ... though I'm not entirely sure how to interpret that last bit about
> "time taken to create a snapshot
> depends on... the amount of data that has changed since the last 
> snapshot";
> the _first half of that statement_ reads as "creating a snapshot is
> constant time",
> which basically screams to me "copy-on-write just like LVM, and
> they're probably implemented
> in terms of LVM".
> 
> Any insight here as to whether my gut is correct on this, or whether
> I'm actually likely
> to notice an impact from hourly snapshots of, say, a 200-GB volume?
> How about a 1-TB volume?
> 
> The only thing I'm seeing from Amazon that seems to _vaguely_ support
> (maybe) the notion
> that `snapshotting too often' would be something to worry about is
> this bit from elsewhere
> in that same FAQ page (under the heading of "performance", whereas the
> others were
> under the heading of "snapshots" and a subheading of "performance
> consistency of my HDD-backed volumes"):
> 
> 	Another factor is taking a snapshot which will decrease expected write performance
> 	down to the baseline rate, until the snapshot completes.
> 
> ... and, taken in the context of the previously-cited notes about
> snapshots being
> `not based on volume-size but maybe influenced by
> changed-since-last-snapshot set size'
> (and in the context of the explanations they give for HDD-backed vs.
> SSD-backed storage),
> I'm basically reading that as:
> 
> 	`if you're using HDD-backed storage then it's because you care about *throughput*
> 	 more than *response time* and are likely to be monitoring throughput,
> 	 and if you're monitoring throughput you may notice a *momentary dip in throughput*
> 	 as the *HDDs* need to seek around to find the volume boundaries and set up the COW records.'
> 
> Even if you don't have any insight into what's actually happening
> under the covers at Amazon,
> does my reading of all of this sound right to you?
> 
> And, perhaps more interestingly, are these same caveats from Amazon
> generally applicable to LVM?
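
On the "amount of data that has changed since the last snapshot" wording: one way to picture it is changed-block tracking, where a snapshot only has to copy the blocks written since the previous one. Below is a toy Python sketch under that assumption. TrackedVolume, its methods, and the 4096-byte block size are all hypothetical choices for illustration, not anything taken from Amazon's or LVM's internals.

```python
class TrackedVolume:
    """Toy volume that remembers which blocks changed since the last snapshot."""

    def __init__(self, num_blocks, block_size=4096):
        self.num_blocks = num_blocks
        self.block_size = block_size  # assumed block size, for illustration
        self.dirty = set()            # indices written since the last snapshot

    def write(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        self.dirty.add(index)

    def snapshot(self):
        """Return how many bytes this snapshot would have to copy."""
        to_copy = len(self.dirty) * self.block_size
        self.dirty.clear()
        return to_copy


# Two volumes of very different sizes, with the same ten blocks written.
small = TrackedVolume(num_blocks=1_000)
big = TrackedVolume(num_blocks=16_000)
for vol in (small, big):
    for i in range(10):
        vol.write(i)

small_bytes = small.snapshot()
big_bytes = big.snapshot()
```

In this model both snapshots copy the same number of bytes, and a second snapshot taken immediately afterwards copies nothing, which matches the FAQ's claim that volume size alone should not matter while the changed set does.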

