<p dir="ltr">AWS EBS is not LVM under the covers; it's more like NFS, and its snapshots work more like the way VMware does snapshots. The FAQ's exclusion of data "locally cached by your application or OS" refers to read-ahead and write caching going on in RAM.</p>
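<p dir="ltr">To make the caching point concrete, here is a minimal sketch, assuming a Python application. The helper name <code>write_durably</code> is made up for illustration. It only shows how an application can push its own writes out of RAM and down to the block device before a snapshot is requested; it says nothing about how EBS works internally.</p>

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data and push it out of the user-space and kernel caches.

    Hypothetical helper for illustration; returns the resulting file size.
    """
    with open(path, "wb") as f:
        f.write(data)          # lands in Python's user-space buffer
        f.flush()              # hands the bytes to the kernel page cache
        os.fsync(f.fileno())   # asks the kernel to write them to the device
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "example.dat")
        print(write_durably(p, b"x" * 4096))
```

<p dir="ltr">Anything still sitting in one of those buffers when the snapshot is taken has not been written to the volume yet, which is the case the FAQ is warning about.</p>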
<p dir="ltr">Mark</p>
<div class="gmail_quote">On Sep 28, 2017 1:17 PM, "Joshua Judson Rosen" &lt;<a href="mailto:rozzin@hackerposse.com">rozzin@hackerposse.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I'm working on a project that uses Amazon AWS-provided VPS instances,<br>
and the other guy on the project is telling me that "snapshotting hourly may degrade performance",<br>
and I'm trying to determine whether that's actually true. My gut feeling is that it sounds kind of bogus.<br>
<br>
From the information I've been able to find about how Amazon's stuff works (either in terms<br>
of how it's _implemented_ [for which I'm finding basically no insight] or how it's _characterized_<br>
[in the engineering sense, not the literary sense]...), it really sounds a _lot_ like Amazon<br>
is just using LVM snapshots, e.g. from &lt;<a href="https://aws.amazon.com/ebs/faqs/" rel="noreferrer" target="_blank">https://aws.amazon.com/ebs/<wbr>faqs/</a>&gt;:<br>
<br>
"snapshots can be done in real time while the volume is attached and in use.<br>
However, snapshots only capture data that has been written to your Amazon EBS volume,<br>
which might exclude any data that has been locally cached by your application or OS."<br>
<br>
"By design, an EBS Snapshot of an entire 16 TB volume should take no longer than the time<br>
it takes to snapshot an entire 1 TB volume. However, the actual time taken to create<br>
a snapshot depends on several factors including the amount of data that has changed<br>
since the last snapshot of the EBS volume."<br>
<br>
... though I'm not entirely sure how to interpret that last bit about "time taken to create a snapshot<br>
depends on... the amount of data that has changed since the last snapshot";<br>
the _first half of that statement_ reads as "creating a snapshot is constant time",<br>
which basically screams to me "copy-on-write just like LVM, and they're probably implemented<br>
in terms of LVM".<br>
<br>
Any insight here as to whether my gut is correct on this, or whether I'm actually likely<br>
to notice an impact from hourly snapshots of, say, a 200-GB volume? How about a 1-TB volume?<br>
<br>
The only thing I'm seeing from Amazon that seems to _vaguely_ support (maybe) the notion<br>
that `snapshotting too often' would be something to worry about is this bit from elsewhere<br>
in that same FAQ page (under the heading of "performance" and a subheading of<br>
"performance consistency of my HDD-backed volumes", whereas the others were under the heading of "snapshots"):<br>
<br>
"Another factor is taking a snapshot which will decrease expected write performance<br>
down to the baseline rate, until the snapshot completes."<br>
<br>
... and, taken in the context of the previously-cited notes about snapshots being<br>
`not based on volume size but maybe influenced by changed-since-last-snapshot set size'<br>
(and in the context of the explanations they give for HDD-backed vs. SSD-backed storage),<br>
I'm basically reading that as:<br>
<br>
`if you're using HDD-backed storage then it's because you care about *throughput*<br>
more than *response time* and are likely to be monitoring throughput,<br>
and if you're monitoring throughput you may notice a *momentary dip in throughput*<br>
as the *HDDs* need to seek around to find the volume boundaries and set up the COW records.'<br>
<br>
Even if you don't have any insight into what's actually happening under the covers at Amazon,<br>
does my reading of all of this sound right to you?<br>
<br>
And, perhaps more interestingly, are these same caveats from Amazon generally applicable to LVM?<br>
<br>
</blockquote></div>