GFS and SANs

Jared Watkins jared at watkins.net
Wed Aug 4 22:52:01 EDT 2004


Jeff Macdonald wrote:

>Hi,
>And now for something Linux related. Earlier this year Redhat released
>GFS as GPL'd stuff. I understand that GFS is a distributed file system
>with redundancy and all that. What I don't understand is what is meant
>by SAN. I believe it stands for Storage Area Network. In some
>documentation I've read it seems that a SAN is a box with disks and
>high speed connectors to those disks. In some cases it seems to be a
>collection of machines on a common high speed network that have disks
>that look like a single entity. Can someone help explain what GFS is
>and what is meant by SAN?
>
Depending on who you ask, you are bound to get different answers to
this question. Before I tell you what I think a SAN is, I'll tell you
a bit about where I'm coming from. At my current company I was tasked
with designing, testing, and implementing a mid-sized SAN. As this
was a high-budget item, we needed to get it right the first time. So,
with all the normal delays of dealing with upper management types and
multiple vendors, I spent the better part of two years evaluating
(mostly bake-off style) about a dozen storage vendors and three software
SAN virtualization systems. This included two complete test setups:
all the hardware in the same room, set it up, try to break it, and see
what happens. I'm now about 6 months post-install and managing the
daily maintenance and growth of the SAN.

The simplest definition of a SAN is that you have disk arrays
connected in some sort of network: loops in the 'old' days and
point-to-point fabrics more recently. This network can use copper (old)
or optical cables. Optical is less error prone, and right now runs
at either 1Gb or 2Gb, with 10Gb coming. There are usually switches (or
hubs) where you plug in your storage and any servers that need access
to that storage. The idea is that you make RAID sets and divide them
into SCSI LUNs, which are presented out to the fabric for systems to
use. You also have LUN masking to deal with, so that servers only have
access to the LUNs they 'own' and have permission to access.
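
To make the RAID set / LUN / masking idea a bit more concrete, here is a
rough Python sketch. It is only a toy model: the class and function
names, the sizes, and the WWPN strings are all made up for illustration,
and real arrays and switches do this in firmware with their own
management tools.

```python
# Toy model of carving a RAID set into LUNs and masking them per server.
# All names, sizes, and WWPNs below are hypothetical.

class LunMaskingTable:
    def __init__(self):
        # initiator WWPN -> set of LUN ids that initiator may access
        self._grants = {}

    def grant(self, wwpn, lun_id):
        self._grants.setdefault(wwpn, set()).add(lun_id)

    def revoke(self, wwpn, lun_id):
        self._grants.get(wwpn, set()).discard(lun_id)

    def visible_luns(self, wwpn):
        # What a server would find when it scans the fabric
        return sorted(self._grants.get(wwpn, set()))

    def check_access(self, wwpn, lun_id):
        return lun_id in self._grants.get(wwpn, set())


def carve_luns(raid_set_gb, lun_sizes_gb):
    """Split one RAID set into LUNs; returns {lun_id: size_gb}."""
    if sum(lun_sizes_gb) > raid_set_gb:
        raise ValueError("LUNs exceed RAID set capacity")
    return {lun_id: size for lun_id, size in enumerate(lun_sizes_gb)}


luns = carve_luns(raid_set_gb=1000, lun_sizes_gb=[200, 300, 500])

masking = LunMaskingTable()
masking.grant("10:00:00:00:c9:aa:aa:01", 0)  # made-up mail server HBA
masking.grant("10:00:00:00:c9:bb:bb:02", 1)  # made-up database server HBA
masking.grant("10:00:00:00:c9:bb:bb:02", 2)
```

With this table, the first server sees only LUN 0 when it scans, and a
check_access() call for any other LUN comes back False.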

If GFS were used in a SAN environment, you would assign the same LUN
to multiple machines, and GFS would prevent them from stepping on each
other as they do I/O to the same shared disk.
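
As a very rough illustration of "not stepping on each other", here is a
toy Python sketch of nodes taking a lock on a block before writing to a
shared LUN. This is NOT how GFS is implemented and none of these names
come from GFS; it only shows the general idea of a cluster-wide lock in
front of shared storage.

```python
# Toy sketch of coordinated writes to one shared LUN.
# ToyLockManager and SharedLun are hypothetical names, not GFS APIs.

class ToyLockManager:
    def __init__(self):
        self._owners = {}  # block number -> node currently holding the lock

    def acquire(self, node, block):
        holder = self._owners.get(block)
        if holder is None or holder == node:
            self._owners[block] = node
            return True
        return False  # someone else holds it; caller must wait and retry

    def release(self, node, block):
        if self._owners.get(block) == node:
            del self._owners[block]

    def holder(self, block):
        return self._owners.get(block)


class SharedLun:
    def __init__(self, lock_manager):
        self.blocks = {}
        self.locks = lock_manager

    def write(self, node, block, data):
        # Refuse writes from any node that does not hold the block's lock
        if self.locks.holder(block) != node:
            raise PermissionError(f"{node} does not hold the lock on block {block}")
        self.blocks[block] = data


locks = ToyLockManager()
lun = SharedLun(locks)

first = locks.acquire("node-a", 7)    # node-a gets the lock
lun.write("node-a", 7, b"from a")
second = locks.acquire("node-b", 7)   # refused while node-a holds it
locks.release("node-a", 7)
third = locks.acquire("node-b", 7)    # now node-b can take it
lun.write("node-b", 7, b"from b")
```

Without something playing this role, two machines with an ordinary
local filesystem on the same LUN would each cache and overwrite the
disk as if they were alone.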

It gets more complicated than that, of course, but that's the basic
idea. Most of the mid-range and higher storage arrays have advanced
features like snapshots, cross-cabinet mirroring, and long-distance
replication over either FC (Fibre Channel) or IP. One key difference
between FC and IP networks is that FC is not routable; it is simply a
collection of point-to-point connections.

One common problem you run into when dealing with this stuff is that
each vendor tries to lock you into using only their storage. They all
have features and ways of accomplishing the same functional tasks that
will not interoperate with other hardware. So my company pursued a way
around this with software. The system I just deployed uses a
load-balanced pair of massive Linux boxes that sit in-band, in the
middle between the storage and the servers. From this vantage point
they are able to see and control access to all the storage, and
abstract the access to it. With this setup, storage is storage: the
servers only know what these datamanagers show them. The backend
storage does not need any special (read: expensive) software features;
it only needs to present its disks/arrays out to the management boxes.
That buys you vendor independence and a richer feature set than any
single storage box can offer. Downtime due to disk upgrades/growth is
eliminated completely, and you can build a fully redundant system with
no single point of failure that even includes long-distance,
block-level replication over a network.
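
The in-band abstraction can be sketched in a few lines of Python. This
is only a toy under my own assumptions (the class names and LUN names
are invented, and it ignores writes that arrive during a copy, load
balancing, and failover), but it shows why the servers do not care
which vendor's array is behind a volume.

```python
# Toy sketch of in-band storage virtualization: servers talk to a
# virtual volume, and the backing LUN can be swapped underneath it.
# All class names and LUN names here are hypothetical.

class BackendLun:
    """A raw LUN presented by some vendor's array."""
    def __init__(self, name, num_blocks):
        self.name = name
        self.blocks = [b""] * num_blocks


class VirtualVolume:
    """What the server sees; it never learns the backend's identity."""
    def __init__(self, name, backend):
        self.name = name
        self._backend = backend

    def read(self, block):
        return self._backend.blocks[block]

    def write(self, block, data):
        self._backend.blocks[block] = data

    def migrate(self, new_backend):
        # Copy every block, then switch the mapping. A real product
        # would also have to handle writes arriving during the copy.
        if len(new_backend.blocks) < len(self._backend.blocks):
            raise ValueError("target LUN is too small")
        for i, data in enumerate(self._backend.blocks):
            new_backend.blocks[i] = data
        self._backend = new_backend

    @property
    def backend_name(self):
        return self._backend.name


old_array = BackendLun("vendor-a-lun3", num_blocks=4)
new_array = BackendLun("vendor-b-lun0", num_blocks=8)

vol = VirtualVolume("mail-data", old_array)
vol.write(2, b"hello")
vol.migrate(new_array)       # server keeps using "mail-data" throughout
data_after = vol.read(2)
```

After the migrate() call the volume is backed by the second array, and
the server still reads the same data from the same volume name.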

That's just my take on what a SAN is and should be, but there are
lots of simpler setups that can be called a SAN.

Jared






More information about the gnhlug-discuss mailing list