Is Amazon AWS/EBS snapshotting just LVM, or what?

Tom Buskey tom at buskey.name
Thu Sep 28 13:46:12 EDT 2017


I work with OpenStack.  It manages images in Glance, which sits on top of
its object storage, Swift.

On the POC clouds, you can use LVM as a backend for Glance.  Snapshotting
is *very* slow: 30 minutes to snapshot an 80GB VM that's shut down.
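
As a rough back-of-envelope check on that number (a sketch only; the
function name is made up for illustration and it assumes decimal units,
1 GB = 1000 MB), the average rate works out to about what you'd expect
from a plain full copy of the volume rather than a metadata-only operation:

```python
def effective_throughput_mb_s(size_gb: float, minutes: float) -> float:
    """Average MB/s needed to move size_gb in the given time (1 GB = 1000 MB)."""
    return size_gb * 1000 / (minutes * 60)


# 80 GB in 30 minutes, the figures from the paragraph above
rate = effective_throughput_mb_s(80, 30)
print(f"{rate:.1f} MB/s")  # prints "44.4 MB/s"
```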

You can use other storage backends in OpenStack that are faster: a full
non-LVM Swift, for example.  Ceph and GlusterFS are common choices where
performance matters.  Amazon probably isn't using ZFS, but more likely
something built on their S3 object store.
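
For what it's worth, here is a toy model of the copy-on-write idea
discussed in the quoted thread below.  It is only a sketch: the class and
method names are hypothetical, and it says nothing about how EBS, LVM, or
any OpenStack backend is actually implemented.  It just shows why creating
a snapshot can be constant time while the real cost shows up later, in
proportion to how many blocks change:

```python
class CowVolume:
    """Toy block device with copy-on-write snapshots (hypothetical, illustration only)."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [b""] * num_blocks
        # One dict per snapshot: block index -> contents preserved for that snapshot
        self.snapshots = []

    def snapshot(self) -> int:
        """Create a snapshot in O(1): nothing is copied up front."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data: bytes) -> None:
        """Write a block, preserving the old contents on the first overwrite only."""
        if self.snapshots:
            saved = self.snapshots[-1]
            if index not in saved:
                # This extra copy is the per-write cost of having a snapshot
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int, index: int) -> bytes:
        """Read a block as it was when snapshot snap_id was taken."""
        # The first preserved copy in this or any later snapshot is the right one
        for saved in self.snapshots[snap_id:]:
            if index in saved:
                return saved[index]
        # Never overwritten since the snapshot: the live block is still valid
        return self.blocks[index]


vol = CowVolume(num_blocks=4)
vol.write(0, b"old")
snap = vol.snapshot()      # constant time, regardless of volume size
vol.write(0, b"new")       # first overwrite: old block is preserved
vol.write(0, b"newer")     # second overwrite: nothing more to preserve
assert vol.read_snapshot(snap, 0) == b"old"
assert len(vol.snapshots[snap]) == 1  # only changed blocks take up space
```

In this model the snapshot itself is free, but the first write to each
block afterwards pays for one extra copy, which is the kind of overhead
the "may degrade performance" worry would be about.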




On Thu, Sep 28, 2017 at 1:32 PM, Ken D'Ambrosio <ken at jots.org> wrote:

> I would say it's unlikely to be LVM, because LVM is content-ignorant; it
> snapshots the entire volume, which is inefficient, and when you're
> Amazon, you care a LOT about being efficient.  Instead, I imagine
> they're using some content-aware CoW solution such as ZFS.  But,
> whatever mechanism, I agree with your opinion: I doubt that their
> solution -- almost certainly CoW of some sort -- stands a chance of
> being more than even slightly impactful.
>
> $.02, YMMV and other assorted disclaimers,
>
> -Ken
>
>
> On 2017-09-28 13:16, Joshua Judson Rosen wrote:
> > I'm working on a project that uses Amazon AWS-provided VPS instances,
> > and the other guy on the project is telling me that "snapshotting
> > hourly may degrade performance",
> > and I'm trying to determine whether that's actually true. My gut feeling
> > is that it sounds kind of bogus.
> >
> > From the information I've been able to find about how Amazon's stuff
> > works (either in terms of how it's _implemented_ [for which I'm finding
> > basically no insight] or how it's _characterized_ [in the engineering
> > sense, not the literary sense]...), it really sounds a _lot_ like Amazon
> > is just using LVM snapshots, e.g. from
> > <https://aws.amazon.com/ebs/faqs/>:
> >
> >       "snapshots can be done in real time while the volume is attached
> >        and in use. However, snapshots only capture data that has been
> >        written to your Amazon EBS volume, which might exclude any data
> >        that has been locally cached by your application or OS."
> >
> >       "By design, an EBS Snapshot of an entire 16 TB volume should take
> >        no longer than the time it takes to snapshot an entire 1 TB
> >        volume. However, the actual time taken to create a snapshot
> >        depends on several factors including the amount of data that has
> >        changed since the last snapshot of the EBS volume."
> >
> > ... though I'm not entirely sure how to interpret that last bit about
> > "time taken to create a snapshot
> > depends on... the amount of data that has changed since the last
> > snapshot";
> > the _first half of that statement_ reads as "creating a snapshot is
> > constant time",
> > which basically screams to me "copy-on-write just like LVM, and
> > they're probably implemented
> > in terms of LVM".
> >
> > Any insight here as to whether my gut is correct on this, or whether
> > I'm actually likely
> > to notice an impact from hourly snapshots of, say, a 200-GB volume?
> > How about a 1-TB volume?
> >
> > The only thing I'm seeing from Amazon that seems to _vaguely_ support
> > (maybe) the notion
> > that `snapshotting too often' would be something to worry about is
> > this bit from elsewhere
> > in that same FAQ page (under the heading of "performance", whereas the
> > others were
> > under the heading of "snapshots" and a subheading of "performance
> > consistency of my HDD-backed volumes"):
> >
> >       Another factor is taking a snapshot which will decrease expected
> >       write performance down to the baseline rate, until the snapshot
> >       completes.
> >
> > ... and, taken in the context of the previously-cited notes about
> > snapshots being
> > `not based on volume-size but maybe influenced by
> > changed-since-last-snapshot set size'
> > (and in the context of the explanations they give for HDD-backed vs.
> > SSD-backed storage),
> > I'm basically reading that as:
> >
> >       `if you're using HDD-backed storage then it's because you care
> >        about *throughput* more than *response time* and are likely to
> >        be monitoring throughput, and if you're monitoring throughput
> >        you may notice a *momentary dip in throughput* as the *HDDs*
> >        need to seek around to find the volume boundaries and set up
> >        the COW records.'
> >
> > Even if you don't have any insight into what's actually happening
> > under the covers at Amazon,
> > does my reading of all of this sound right to you?
> >
> > And, perhaps more interestingly, are these same caveats from Amazon
> > generally applicable to LVM?