<div dir="ltr">I work with OpenStack.  It manages images in Glance which sit above its object storage, Swift.<div><br></div><div>On the POC clouds, you can use LVM as a backend for Glance.  Snapshotting is *very* slow: 30 minutes for a snapshot of an 80GB VM that&#39;s shut down.<div><br></div><div>You can use other storage backends in OpenStack that are faster, such as a full, non-LVM Swift.  Ceph and GlusterFS are common choices where performance matters.  Amazon probably wouldn&#39;t be using ZFS, but rather something built on their S3 object store.</div><div><br></div><div><br></div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Sep 28, 2017 at 1:32 PM, Ken D&#39;Ambrosio <span dir="ltr">&lt;<a href="mailto:ken@jots.org" target="_blank">ken@jots.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would say it&#39;s unlikely to be LVM, because LVM is content-ignorant; it<br>

snapshots the entire volume, which is inefficient, and when you&#39;re<br>

Amazon, you care a LOT about being efficient.  Instead, I imagine<br>

they&#39;re using some content-aware CoW solution such as ZFS.  But,<br>

whatever mechanism, I agree with your opinion: I doubt that their<br>

solution -- almost certainly CoW of some sort -- stands a chance of<br>

being more than even slightly impactful.<br>

<br>

$.02, YMMV and other assorted disclaimers,<br>

<br>

-Ken<br>

<div class="HOEnZb"><div class="h5"><br>

<br>

On 2017-09-28 13:16, Joshua Judson Rosen wrote:<br>

&gt; I&#39;m working on a project that uses Amazon AWS-provided VPS instances,<br>

&gt; and the other guy on the project is telling me that &quot;snapshotting<br>

&gt; hourly may degrade performance&quot;,<br>

&gt; and I&#39;m trying to determine whether that&#39;s actually true. My gut feeling<br>

&gt; is that it sounds kind of bogus.<br>

&gt;<br>

&gt; From the information I&#39;ve been able to find about how Amazon&#39;s stuff<br>

&gt; works (either in terms<br>

&gt; of how it&#39;s _implemented_ [for which I&#39;m finding basically no insight]<br>

&gt; or how it&#39;s _characterized_<br>

&gt; [in the engineering sense, not the literary sense]...), it really<br>

&gt; sounds a _lot_ like Amazon<br>

&gt; is just using LVM snapshots, e.g. from<br>

&gt; &lt;<a href="https://aws.amazon.com/ebs/faqs/" rel="noreferrer" target="_blank">https://aws.amazon.com/ebs/<wbr>faqs/</a>&gt;:<br>

&gt;<br>

&gt;       &quot;snapshots can be done in real time while the volume is attached and<br>

&gt; in use.<br>

&gt;        However, snapshots only capture data that has been written to your<br>

&gt; Amazon EBS volume,<br>

&gt;        which might exclude any data that has been locally cached by your<br>

&gt; application or OS.&quot;<br>

&gt;<br>

&gt;       &quot;By design, an EBS Snapshot of an entire 16 TB volume should take no<br>

&gt; longer than the time<br>

&gt;        it takes to snapshot an entire 1 TB volume. However, the actual time<br>

&gt; taken to create<br>

&gt;        a snapshot depends on several factors including the amount of data<br>

&gt; that has changed<br>

&gt;        since the last snapshot of the EBS volume.&quot;<br>

&gt;<br>

&gt; ... though I&#39;m not entirely sure how to interpret that last bit about<br>

&gt; &quot;time taken to create a snapshot<br>

&gt; depends on... the amount of data that has changed since the last<br>

&gt; snapshot&quot;;<br>

&gt; the _first half of that statement_ reads as &quot;creating a snapshot is<br>

&gt; constant time&quot;,<br>

&gt; which basically screams to me &quot;copy-on-write just like LVM, and<br>

&gt; they&#39;re probably implemented<br>

&gt; in terms of LVM&quot;.<br>

&gt;<br>

&gt; Any insight here as to whether my gut is correct on this, or whether<br>

&gt; I&#39;m actually likely<br>

&gt; to notice an impact from hourly snapshots of, say, a 200-GB volume?<br>

&gt; How about a 1-TB volume?<br>

&gt;<br>

&gt; The only thing I&#39;m seeing from Amazon that seems to _vaguely_ support<br>

&gt; (maybe) the notion<br>

&gt; that `snapshotting too often&#39; would be something to worry about is<br>

&gt; this bit from elsewhere<br>

&gt; in that same FAQ page (under the heading of &quot;performance&quot;, whereas the<br>

&gt; others were<br>

&gt; under the heading of &quot;snapshots&quot; and a subheading of &quot;performance<br>

&gt; consistency of my HDD-backed volumes&quot;):<br>

&gt;<br>

&gt;       Another factor is taking a snapshot which will decrease expected<br>

&gt; write performance<br>

&gt;       down to the baseline rate, until the snapshot completes.<br>

&gt;<br>

&gt; ... and, taken in the context of the previously-cited notes about<br>

&gt; snapshots being<br>

&gt; `not based on volume-size but maybe influenced by<br>

&gt; changed-since-last-snapshot set size&#39;<br>

&gt; (and in the context of the explanations they give for HDD-backed vs.<br>

&gt; SSD-backed storage),<br>

&gt; I&#39;m basically reading that as:<br>

&gt;<br>

&gt;       `if you&#39;re using HDD-backed storage then it&#39;s because you care about<br>

&gt; *throughput*<br>

&gt;        more than *response time* and are likely to be monitoring throughput,<br>

&gt;        and if you&#39;re monitoring throughput you may notice a *momentary dip<br>

&gt; in throughput*<br>

&gt;        as the *HDDs* need to seek around to find the volume boundaries and<br>

&gt; set up the COW records.&#39;<br>

&gt;<br>

&gt; Even if you don&#39;t have any insight into what&#39;s actually happening<br>

&gt; under the covers at Amazon,<br>

&gt; does my reading of all of this sound right to you?<br>

&gt;<br>

&gt; And, perhaps more interestingly, are these same caveats from Amazon<br>

&gt; generally applicable to LVM?<br>

</div></div><div class="HOEnZb"><div class="h5">______________________________<wbr>_________________<br>

gnhlug-discuss mailing list<br>

<a href="mailto:gnhlug-discuss@mail.gnhlug.org">gnhlug-discuss@mail.gnhlug.org</a><br>

<a href="http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/" rel="noreferrer" target="_blank">http://mail.gnhlug.org/<wbr>mailman/listinfo/gnhlug-<wbr>discuss/</a><br>

</div></div></blockquote></div><br></div>
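<div>As a rough illustration of the copy-on-write behaviour discussed in this thread, here is a minimal Python sketch. It is a toy model, not Amazon&#39;s or LVM&#39;s actual implementation; the class <code>CowVolume</code> and its methods are hypothetical names invented for the example. It assumes LVM-style copy-on-first-write: taking a snapshot copies no data, and the extra cost is paid on the first write to each block afterwards.</div>

```python
class CowVolume:
    """Toy block volume with copy-on-first-write snapshots (illustrative only)."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # One dict per snapshot: block index mapped to the preserved old data.
        self.snapshots = []
        # Number of extra block copies caused by snapshots.
        self.copies = 0

    def snapshot(self):
        # Constant time regardless of volume size: only an empty table is created.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the old block in every snapshot
        # that has not already saved its own copy of it.
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
                self.copies += 1
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Unchanged blocks are read straight from the live volume.
        saved = self.snapshots[snap_id]
        return saved.get(index, self.blocks[index])


vol = CowVolume(8)
vol.write(0, b"old")            # no snapshot yet, so no extra copy
copies_before = vol.copies
snap = vol.snapshot()           # instant: nothing is copied here
vol.write(0, b"new")            # first write after the snapshot: one extra copy
vol.write(0, b"newer")          # block already preserved: no further copy
print(copies_before, vol.copies, vol.read_snapshot(snap, 0), vol.blocks[0])
```

<div>In this model the snapshot itself is free, and the write penalty scales with the number of distinct blocks changed after it, which is one plausible reading of the FAQ wording about &quot;the amount of data that has changed since the last snapshot&quot;.</div>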