Tag: Purdue
GlusterFS
by Alex on Oct.13, 2009, under Purdue
At work we’ve been evaluating different distributed file systems in our spare time. Currently, we use one large, centralized filer and have had trouble pushing as many input/output operations through it as we’d like. While that’s mostly a backend disk problem, wouldn’t it be great to have a storage system that grew as we added more cluster nodes?
In that hope, we tested some pretty alpha-level pNFS code, some Hadoop, and now GlusterFS. All of these seem to have some faults, but this is what I found for Gluster…
Downloading the RPMs from the main FTP repository and installing them was pretty painless on RHEL5. The documentation is pretty sparse and misleading, but eventually whipping up these config files made it all go:
#glusterfsd.vol
volume posix
  type storage/posix
  option directory /glusterfs
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  subvolumes brick
end-volume
and
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host foobar-0.example.com
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host foobar-2.example.com
  option remote-subvolume brick
end-volume

volume remote3
  type protocol/client
  option transport-type tcp
  option remote-host pfnstest-003.example.com
  option remote-subvolume brick
end-volume

volume remote4
  type protocol/client
  option transport-type tcp
  option remote-host foobar-4.example.com
  option remote-subvolume brick
end-volume

volume remote5
  type protocol/client
  option transport-type tcp
  option remote-host foobar-5.example.com
  option remote-subvolume brick
end-volume

volume remote6
  type protocol/client
  option transport-type tcp
  option remote-host foobar-6.example.com
  option remote-subvolume brick
end-volume

volume replicate1
  type cluster/replicate
  subvolumes remote1 remote2 remote3
end-volume

volume replicate2
  type cluster/replicate
  subvolumes remote4 remote5 remote6
end-volume

volume distribute
  type cluster/distribute
  subvolumes replicate1 replicate2
end-volume

volume writebehind
  type performance/write-behind
  option window-size 4MB
  subvolumes distribute
end-volume

volume cache
  type performance/io-cache
  option cache-size 1024MB
  subvolumes writebehind
end-volume
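Since the client file is mostly the same stanza repeated per host, here is a rough Python sketch of how it could be generated rather than typed by hand. It only covers the protocol/client, cluster/replicate and cluster/distribute volumes (the write-behind and io-cache volumes are left out), and the function names and the node-N.example.com hostnames are made up for illustration; this is not a tool that ships with GlusterFS.

```python
def client_stanza(name, host, subvolume="brick"):
    """Return one protocol/client volume block in the style of the file above."""
    return "\n".join([
        f"volume {name}",
        "  type protocol/client",
        "  option transport-type tcp",
        f"  option remote-host {host}",
        f"  option remote-subvolume {subvolume}",
        "end-volume",
    ])


def cluster_stanza(name, kind, subvolumes):
    """Return a cluster/<kind> volume block over the given subvolume names."""
    return "\n".join([
        f"volume {name}",
        f"  type cluster/{kind}",
        "  subvolumes " + " ".join(subvolumes),
        "end-volume",
    ])


def build_client_volfile(hosts, replica_count=3):
    """Group hosts into replica sets, then distribute across those sets."""
    if not hosts or len(hosts) % replica_count != 0:
        raise ValueError("host count must be a non-zero multiple of replica_count")

    blocks = []
    remotes = []
    for i, host in enumerate(hosts, start=1):
        name = f"remote{i}"
        remotes.append(name)
        blocks.append(client_stanza(name, host))

    replicas = []
    for start in range(0, len(remotes), replica_count):
        name = f"replicate{start // replica_count + 1}"
        replicas.append(name)
        members = remotes[start:start + replica_count]
        blocks.append(cluster_stanza(name, "replicate", members))

    blocks.append(cluster_stanza("distribute", "distribute", replicas))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical hostnames; substitute the real brick servers.
    example_hosts = [f"node-{i}.example.com" for i in range(1, 7)]
    print(build_client_volfile(example_hosts))
```

With six hosts and a replica count of three, this reproduces the same shape as the file above: two three-way replica sets with files distributed across them, so usable capacity is roughly that of two bricks.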
The backing storage for this GlusterFS was an Ext3 file system carved out of LVM and housed on HP SATA disk trays. Mounting that file system and running some simplistic tests, I found that with large files the throughput was close to the maximum network speed, while with small files it was in the 5-10MB/s range. Not bad for an hour or two’s worth of effort.
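For anyone wanting to try a similar large-file versus small-file comparison, a simplistic write test might look like the Python sketch below. This is not the tool used for the numbers above; the function name, file sizes and counts are arbitrary choices for illustration, and the mount point is whatever path you pass on the command line.

```python
import os
import sys
import time

CHUNK = 1024 * 1024  # write in pieces of at most 1 MiB


def write_throughput_mb_s(directory, file_size, file_count):
    """Write file_count files of file_size bytes each and return MB/s written."""
    block = os.urandom(min(CHUNK, file_size))
    paths = []
    start = time.perf_counter()
    for i in range(file_count):
        path = os.path.join(directory, f"bench-{i}.dat")
        paths.append(path)
        with open(path, "wb") as fh:
            remaining = file_size
            while remaining > 0:
                n = min(remaining, len(block))
                fh.write(block[:n])
                remaining -= n
            # Force the data out so the page cache does not flatter the result.
            fh.flush()
            os.fsync(fh.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    # Clean up outside the timed region.
    for path in paths:
        os.remove(path)
    return (file_size * file_count) / elapsed / 1e6


if __name__ == "__main__":
    mount_point = sys.argv[1]  # e.g. the directory where GlusterFS is mounted
    large = write_throughput_mb_s(mount_point, 256 * 1024 * 1024, 4)
    small = write_throughput_mb_s(mount_point, 4 * 1024, 2000)
    print(f"large files: {large:.1f} MB/s")
    print(f"small files: {small:.1f} MB/s")
```

It only measures sequential writes from a single client, so it says nothing about reads, metadata-heavy workloads, or many clients at once.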
Another Install Day
by Alex on Jul.19, 2009, under Purdue
It appears our last cluster was such a hit that once again my department at Purdue is going to do a single-day, multi-hundred-node cluster installation. Last year, the work went very quickly and we had the cluster racked before lunch. This year, we asked for fewer volunteers and added an extra step in the process: installing 10GigE network adapters. Hopefully that will slow things down enough so the VIPs can actually see us working.
As always, the marketing folks put together a short little clip to push the day:
The racking and stacking is pretty much just a matter of manpower. Getting the software onto the systems is done using RHEL’s Anaconda and several scripts. All the magic was described by some of my coworkers in a paper for USENIX’s LISA 2008 conference. A copy can be found here. It’s pretty fun stuff.
OSG Storage Forum – Fermilab
by Alex on Jul.05, 2009, under Purdue
Yeah, I’m a tad behind on updating my blog… So, just this week several of us from work went to Fermilab to participate in the OSG Storage Forum. Mostly, it was a couple of days of talks where interested parties participating in the grid discussed their various storage solutions. The biggest presentations came from the folks involved with the CMS and ATLAS projects, stemming from the work at the LHC. To support physicists around the nation and the globe, there is a whole series of sites dedicated to providing access to the terabytes of data flowing from the LHC, all of it just waiting to be analyzed.
The biggest reason I went was to hear about other people using Hadoop as a replacement for a piece of software called dCache… Oh yeah, and it was held in the most awesome office building in the world:

That’s pretty much the interesting bits of that visit. Though, I thought Fermilab was pretty nifty.
TeraGrid 2009 – Arlington, VA
by Alex on Jul.05, 2009, under Purdue
During June, I traveled to Arlington, VA, for the TeraGrid 2009 conference for work. This is certainly an interesting conference. For those that don’t know, the TeraGrid is “an open scientific discovery infrastructure combining leadership class resources at eleven partner sites to create an integrated, persistent computational resource.” In other words, a fairly big project put together by the NSF to create a national cyberinfrastructure to serve the needs of science.
I was part of the student program, which included 120 other high school, undergraduate, and graduate students from all around the nation. It was certainly interesting listening to the talks given by middle and high school educators about how they are trying to integrate computer simulation and visualization into the classroom.
During the conference, I mostly had two goals: listen to talks and present some of the work being done at Purdue. I had a small poster in the poster session about deploying Hadoop and how Purdue envisions using the Hadoop Distributed File System to support high throughput computing (which, I hear we’re known for doing quite well). Also, we had a small talk in the Education and Outreach Track about how Purdue is using the SC Cluster Challenge to support undergraduate exploration of High Performance Computing. Thankfully, both my little poster and the talk drew a lot of positive attention.
Sadly, the TeraGrid conference seems very focused on the science being done on the grid and not so much on the technologies that have been deployed and developed to make “the grid.” (Not to say the science isn’t important, just most of it is way over my head.) The best parts of the conference were finding other site admins to talk with and the poster session.
And, here’s a picture of the Capitol building from when we ventured away from the hotel on the last day:

Challenge of Clusters
by Alex on Nov.11, 2008, under Purdue
So, I have not posted much this semester, mostly because I’ve been working on either school work or a project for Supercomputing 2008. Throughout the summer, I’ve been attempting to squeeze the Wave Propagation Program onto a SiCortex. Pesky program, though not nearly as strange as some of the other science codes my friends have had to jam onto our SiCortex. (Though, I hear some other teams may or may not have had just as much trouble getting the codes to run on the platforms they were coded for!)
Have you not heard about the Cluster Challenge?! Well, there’s a good write up at HPC Wire. Of course, the MIT team is a little scary, but the rumors are they chose to port their code to something that isn’t a general purpose processor at all. We’ll be eagerly waiting to see how that turns out!
As for our effort, we may be a man short, but I think we’ve got everything covered. Depending on the data sets we are given to run, this will either be a cake-walk or a challenge to the finish. Thankfully, it appears our machine is up to the task. For a lot of the details of our machines, check out this press release at Campus Technology.
If you’re in the Austin area, feel free to hit me up and see how things are progressing. If not, check out this cool game to hold you over: Rack-a-Node.