Lan + DMZ + LargeNumOfFiles = headaches AKA: plz halp and donate ur brain!!
Ben Scott
dragonhawk at gmail.com
Fri Sep 5 11:31:12 EDT 2008
On Thu, Sep 4, 2008 at 6:42 PM, Flaherty, Patrick <pflaherty at wsi.com> wrote:
> I have a cluster of machines producing 20k small files (30kbytes or so)
> inside our lan. After the files are created, they are pushed to a few
> web servers in the DMZ using ftp.
If practical, you may want to experiment with a script that archives
the many small files into one big file on the source host, transfers
that one big file, and then unpacks the archive on the target host.
(Operations on one big file are generally much faster than on many
small ones. It's often better to pay that per-file overhead twice
locally, once on each host, than to pay it across the network, since
local disk and filesystem are much faster than the network.)
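Roughly something like this, say (host name and paths are made up,
and it assumes you can scp to the DMZ box; the same idea works over
FTP, just moving one file instead of thousands):

    # On the source host: bundle the many small files into one archive.
    cd /data/outgoing && tar czf /tmp/batch.tar.gz .

    # Push the single archive to the web server.
    scp /tmp/batch.tar.gz web1:/tmp/batch.tar.gz

    # On the web server: unpack into the document root.
    tar xzf /tmp/batch.tar.gz -C /var/www/files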
> FTP seems to fall down when scaling out to more than a web server or
> two, many retries and transfer failures.
I'll second Andy's suggestion of rsync, which is much better at this
kind of thing.
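Something along these lines, for instance (host name and paths are
placeholders, and this assumes rsync over ssh is allowed through to
the DMZ):

    # Mirror the output directory to one of the DMZ web servers.
    # Only changed files are sent; -z compresses them on the wire.
    rsync -az --delete /data/outgoing/ web1:/var/www/files/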
You could even combine it with the one-big-archive-file idea. Use
rsync to replicate a directory with the file(s) to the other hosts,
and have the other host(s) monitor that directory for files to unpack.
Just a Small Matter of Programming. ;-)
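As a very rough sketch of the target side (names invented, error
handling omitted), the web server could just poll the drop directory
and unpack whatever shows up:

    #!/bin/sh
    # Watch the rsync drop directory for new archives and unpack them.
    DROP=/var/spool/incoming
    DOCROOT=/var/www/files

    while true; do
        for f in "$DROP"/*.tar.gz; do
            [ -e "$f" ] || continue        # no archives waiting
            tar xzf "$f" -C "$DOCROOT" && rm -f "$f"
        done
        sleep 60
    done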
> An ideal solution would be an NFS/CIFS share internal to the lan
> replicated readonly to an NFS/CIFS share in the DMZ.
I'd recommend against that kind of thing. Neither NFS nor CIFS is
"firewall friendly", so if you have a DMZ, you'll want to avoid having
them cross the firewall. (There are things that can be done to make it
work, but they all tend to weaken the firewall more than I'd really
prefer.)
-- Ben