Is Amazon AWS/EBS snapshotting just LVM, or what?
Joshua Judson Rosen
rozzin at hackerposse.com
Thu Sep 28 14:21:05 EDT 2017
On 09/28/2017 02:14 PM, mark wrote:
> AWS/EBS is not LVM under the covers, it's more like NFS; and snapshots are more like VMware & how it does snapshots.
I have never used VMware and have no idea how it does anything. Can you provide more insight on what that means?
> The OS cache exclusion refers to read-ahead and write caching going on in RAM.
Yes, I got that. The reason I included that in the citation was actually that I took it
as supporting my "this looks like atomic COW snapshotting" conclusion, because that's
exactly what I'm accustomed to getting through LVM (snapshotting a block device
captures all of the blocks that *have actually been written* at the time of the snapshot).
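
To make concrete what I mean by "atomic COW snapshotting", here is a minimal toy sketch in Python. It is only an illustration of the general copy-on-write idea under my own assumptions; it is not how LVM or EBS is actually implemented, and the ToyVolume class and all of its method names are hypothetical.

```python
class ToyVolume:
    """Toy block device with copy-on-write snapshots (not LVM's or EBS's real design)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks   # data that has actually reached the device
        self.cache = {}                    # writes still sitting in the OS/app cache
        self.snapshots = []                # each snapshot: {block_index: old_data}

    def write(self, index, data):
        """Buffer a write in the cache; nothing reaches the device yet."""
        self.cache[index] = data

    def flush(self):
        """Push cached writes to the device, preserving old blocks for snapshots."""
        for index, data in self.cache.items():
            for snap in self.snapshots:
                # Copy-on-write: save the old block the first time it is overwritten.
                snap.setdefault(index, self.blocks[index])
            self.blocks[index] = data
        self.cache.clear()

    def snapshot(self):
        """Create a snapshot in constant time: no block data is copied here."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def read_snapshot(self, snap_id, index):
        """Read a block as it was on the device when the snapshot was taken."""
        return self.snapshots[snap_id].get(index, self.blocks[index])
```

In this model, creating the snapshot costs the same regardless of volume size, the copying cost is paid later (one copy per block on its first overwrite), and a write that is still in the cache when the snapshot is taken does not show up in the snapshot.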
> On Sep 28, 2017 1:17 PM, "Joshua Judson Rosen" <rozzin at hackerposse.com> wrote:
>
> I'm working on a project that uses Amazon AWS-provided VPS instances,
> and the other guy on the project is telling me that "snapshotting hourly may degrade performance",
> and I'm trying to determine whether that's actually true. My gut feeling is that it sounds kind of bogus.
>
> From the information I've been able to find about how Amazon's stuff works (either in terms
> of how it's _implemented_ [for which I'm finding basically no insight] or how it's _characterized_
> [in the engineering sense, not the literary sense]...), it really sounds a _lot_ like Amazon
> is just using LVM snapshots, e.g. from <https://aws.amazon.com/ebs/faqs/>:
>
> "snapshots can be done in real time while the volume is attached and in use.
> However, snapshots only capture data that has been written to your Amazon EBS volume,
> which might exclude any data that has been locally cached by your application or OS."
>
> "By design, an EBS Snapshot of an entire 16 TB volume should take no longer than the time
> it takes to snapshot an entire 1 TB volume. However, the actual time taken to create
> a snapshot depends on several factors including the amount of data that has changed
> since the last snapshot of the EBS volume."
>
> ... though I'm not entirely sure how to interpret that last bit about "time taken to create a snapshot
> depends on... the amount of data that has changed since the last snapshot";
> the _first half of that statement_ reads as "creating a snapshot is constant time",
> which basically screams to me "copy-on-write just like LVM, and they're probably implemented
> in terms of LVM".
>
> Any insight here as to whether my gut is correct on this, or whether I'm actually likely
> to notice an impact from hourly snapshots of, say, a 200-GB volume? How about a 1-TB volume?
>
> The only thing I'm seeing from Amazon that seems to _vaguely_ support (maybe) the notion
> that `snapshotting too often' would be something to worry about is this bit from elsewhere
> in that same FAQ page (under the heading of "performance" and a subheading of
> "performance consistency of my HDD-backed volumes", whereas the others were
> under the heading of "snapshots"):
>
> Another factor is taking a snapshot which will decrease expected write performance
> down to the baseline rate, until the snapshot completes.
>
> ... and, taken in the context of the previously-cited notes about snapshots being
> `not based on volume-size but maybe influenced by changed-since-last-snapshot set size'
> (and in the context of the explanations they give for HDD-backed vs. SSD-backed storage),
> I'm basically reading that as:
>
> `if you're using HDD-backed storage then it's because you care about *throughput*
> more than *response time* and are likely to be monitoring throughput,
> and if you're monitoring throughput you may notice a *momentary dip in throughput*
> as the *HDDs* need to seek around to find the volume boundaries and set up the COW records.'
>
> Even if you don't have any insight into what's actually happening under the covers at Amazon,
> does my reading of all of this sound right to you?
>
> And, perhaps more interestingly, are these same caveats from Amazon generally applicable to LVM?
>
> --
> Connect with me on GNU social network: <https://status.hackerposse.com/rozzin>
> Not on the network? Ask me for an invitation to the nhcrossing.com social hub
> _______________________________________________
> gnhlug-discuss mailing list
> gnhlug-discuss at mail.gnhlug.org
> http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
>
--
"Don't be afraid to ask (λf.((λx.xx) (λr.f(rr))))."