<div dir="ltr">I work with OpenStack. It manages images in Glance, which sits on top of its object storage, Swift.<div><br></div><div>On the POC clouds, you can use LVM as a backend for Glance. Snapshotting is *very* slow: 30 minutes for a snapshot of an 80 GB VM that's shut down.<div><br></div><div>You can use other storage backends in OpenStack that are faster. A full non-LVM Swift is one. Ceph and GlusterFS are common choices where performance matters. Amazon wouldn't be using ZFS, but probably something built on their S3 object store.</div><div><br></div><div><br></div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Sep 28, 2017 at 1:32 PM, Ken D'Ambrosio <span dir="ltr"><<a href="mailto:ken@jots.org" target="_blank">ken@jots.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would say it's unlikely to be LVM, because LVM is content-ignorant; it<br>
snapshots the entire volume, which is inefficient, and when you're<br>
Amazon, you care a LOT about being efficient. Instead, I imagine<br>
they're using some content-aware CoW solution such as ZFS. But,<br>
whatever mechanism, I agree with your opinion: I doubt that their<br>
solution -- almost certainly CoW of some sort -- stands a chance of<br>
being more than even slightly impactful.<br>
<br>
$.02, YMMV and other assorted disclaimers,<br>
<br>
-Ken<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On 2017-09-28 13:16, Joshua Judson Rosen wrote:<br>
> I'm working on a project that uses Amazon AWS-provided VPS instances,<br>
> and the other guy on the project is telling me that "snapshotting<br>
> hourly may degrade performance",<br>
> and I'm trying to determine whether that's actually true. My gut feeling<br>
> is that it sounds kind of bogus.<br>
><br>
> From the information I've been able to find about how Amazon's stuff<br>
> works (either in terms<br>
> of how it's _implemented_ [for which I'm finding basically no insight]<br>
> or how it's _characterized_<br>
> [in the engineering sense, not the literary sense]...), it really<br>
> sounds a _lot_ like Amazon<br>
> is just using LVM snapshots, e.g. from<br>
> <<a href="https://aws.amazon.com/ebs/faqs/" rel="noreferrer" target="_blank">https://aws.amazon.com/ebs/<wbr>faqs/</a>>:<br>
><br>
> "snapshots can be done in real time while the volume is attached and<br>
> in use.<br>
> However, snapshots only capture data that has been written to your<br>
> Amazon EBS volume,<br>
> which might exclude any data that has been locally cached by your<br>
> application or OS."<br>
><br>
> "By design, an EBS Snapshot of an entire 16 TB volume should take no<br>
> longer than the time<br>
> it takes to snapshot an entire 1 TB volume. However, the actual time<br>
> taken to create<br>
> a snapshot depends on several factors including the amount of data<br>
> that has changed<br>
> since the last snapshot of the EBS volume."<br>
><br>
> ... though I'm not entirely sure how to interpret that last bit about<br>
> "time taken to create a snapshot<br>
> depends on... the amount of data that has changed since the last<br>
> snapshot";<br>
> the _first half of that statement_ reads as "creating a snapshot is<br>
> constant time",<br>
> which basically screams to me "copy-on-write just like LVM, and<br>
> they're probably implemented<br>
> in terms of LVM".<br>
><br>
> Any insight here as to whether my gut is correct on this, or whether<br>
> I'm actually likely<br>
> to notice an impact from hourly snapshots of, say, a 200-GB volume?<br>
> How about a 1-TB volume?<br>
><br>
> The only thing I'm seeing from Amazon that seems to _vaguely_ support<br>
> (maybe) the notion<br>
> that `snapshotting too often' would be something to worry about is<br>
> this bit from elsewhere<br>
> in that same FAQ page (under the heading of "performance", whereas the<br>
> others were<br>
> under the heading of "snapshots" and a subheading of "performance<br>
> consistency of my HDD-backed volumes"):<br>
><br>
> Another factor is taking a snapshot which will decrease expected<br>
> write performance<br>
> down to the baseline rate, until the snapshot completes.<br>
><br>
> ... and, taken in the context of the previously-cited notes about<br>
> snapshots being<br>
> `not based on volume-size but maybe influenced by<br>
> changed-since-last-snapshot set size'<br>
> (and in the context of the explanations they give for HDD-backed vs.<br>
> SSD-backed storage),<br>
> I'm basically reading that as:<br>
><br>
> `if you're using HDD-backed storage then it's because you care about<br>
> *throughput*<br>
> more than *response time* and are likely to be monitoring throughput,<br>
> and if you're monitoring throughput you may notice a *momentary dip<br>
> in throughput*<br>
> as the *HDDs* need to seek around to find the volume boundaries and<br>
> set up the COW records.'<br>
><br>
> Even if you don't have any insight into what's actually happening<br>
> under the covers at Amazon,<br>
> does my reading of all of this sound right to you?<br>
><br>
> And, perhaps more interestingly, are these same caveats from Amazon<br>
> generally applicable to LVM?<br>
</div></div><div class="HOEnZb"><div class="h5">______________________________<wbr>_________________<br>
gnhlug-discuss mailing list<br>
<a href="mailto:gnhlug-discuss@mail.gnhlug.org">gnhlug-discuss@mail.gnhlug.org</a><br>
<a href="http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/" rel="noreferrer" target="_blank">http://mail.gnhlug.org/<wbr>mailman/listinfo/gnhlug-<wbr>discuss/</a><br>
</div></div></blockquote></div><br></div>