Thread: Clustering commodity PC hardware - A web log

  1. #1
    Rys
    Rys is offline
    Tiled
    Join Date
    Jul 2003
    Location
    Abbots Langley
    Posts
    1,479
    Thanks
    0
    Thanked
    2 times in 1 post

    Clustering commodity PC hardware - A web log

    Hi all,

    I recently bought a small cluster of x86 machines to rebuild into a fully working, useful system for a personal project. David and I agree that it's primetime material for a thread at HEXUS, so I'm going to republish my cluster 'blog here to spark some discussion about clustering. That includes the easiest way to do it, the interconnect to choose, what hardware can be used, why you'd do it, how much it costs to run, the software you'd run on one and anything else related to its construction.

    So if you're interested in my cluster, or building your own, post away and I'll endeavour to help. I'm not the only one on these forums with experience building and running them, so I hope the thread draws those folks in (you know who you are!).

    What follows are the first few posts from my cluster 'blog over the last week or so, and I'll cross-publish in this thread whenever I update it in the future.
    MOLLY AND POPPY!

  2. #2
    Rys

    January 16, 2005

    Hello!

    If you read my main weblog you'll know that I recently bought a small cluster. I've set up this web log to document my adventures in rebuilding it from scratch and getting it up and running. It's not a high performance cluster by today's standards, but it's competent enough to do useful work and it'll be something cool to put on my CV. I'll start from the beginning (it's still in pieces in my living room) and hopefully, by the end, this 'blog will be a useful reference document for anyone looking to build their own.

    Posted by Rys at January 16, 2005 12:57 PM
    MOLLY AND POPPY!

  3. #3
    Rys

    January 16, 2005

    Hardware and initial setup work

    Cluster Hardware

    The cluster houses around 11GHz of combined clock speed across eight compute nodes, using Intel Pentium III processors. Each node is an IBM Netfinity 5100 server in a 5U rack chassis and has a pair of 667MHz Slot 1 Coppermine P3s, 1GB of PC133 ECC memory, a 9GB SCSI disk and a Myrinet 1000 card for inter-compute node communications. Onboard 100Mbit Ethernet provides connectivity to the management node. The management (head) node is another Netfinity 5100 in the desktop chassis with the same specification as the compute nodes, plus an extra 40GB IDE RAID volume and minus the Myrinet card.
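For anyone wondering where the "around 11GHz" figure comes from, here is a minimal Python sketch of the arithmetic, using only the numbers quoted above. The function and constant names are made up for illustration, and summed clock speed is only a rough headline figure, not a measure of real throughput.

```python
# Figures taken from the hardware description above.
COMPUTE_NODES = 8
CPUS_PER_NODE = 2
CLOCK_MHZ = 667
RAM_GB_PER_NODE = 1


def aggregate_clock_ghz(nodes: int, cpus_per_node: int, clock_mhz: int) -> float:
    """Sum the clock speed of every processor and return it in GHz."""
    return nodes * cpus_per_node * clock_mhz / 1000


def aggregate_ram_gb(nodes: int, ram_gb_per_node: int) -> int:
    """Total memory across the compute nodes, in GB."""
    return nodes * ram_gb_per_node


if __name__ == "__main__":
    ghz = aggregate_clock_ghz(COMPUTE_NODES, CPUS_PER_NODE, CLOCK_MHZ)
    ram = aggregate_ram_gb(COMPUTE_NODES, RAM_GB_PER_NODE)
    print(f"{COMPUTE_NODES * CPUS_PER_NODE} processors, {ghz:.2f} GHz combined, {ram} GB RAM")
```

Sixteen 667MHz processors add up to a little under 10.7GHz, which rounds to the quoted figure.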

    A Cisco C2916M-XL Ethernet switch provides the TCP/IP interconnect and a Myricom M2L-SW16 is the Myrinet interconnect switch.

    Initial Setup

    Initial setup has consisted of reclaiming the configuration on the Cisco switch, following this document to reset the master password so I could reconfigure VLAN1 and the first eight Ethernet ports to match my existing home network setup. The next step is to check the configuration on the M2L-SW16 for obvious weirdness that would cause problems with the operating system loaded onto the compute nodes. I imagine the original switch configuration will be fine, but I'd like to check, even if it's just to make sure the switch is actually alive and working.

    Software-wise, Rocks looks like being my weapon of choice. There's a custom RedHat 7.2 installation on the cluster at the moment. It's unclear how usable it is, but it'll be useful to boot just to see how much of the hardware is working (hopefully 100%). I'm planning on booting it for the first time this week, after sorting out power in my office (it'll require 11 power points for the hardware).

    Posted by Rys at January 16, 2005 05:44 PM
    MOLLY AND POPPY!

  4. #4
    Rys

    January 17, 2005

    Myrinet switch is alive

    I can successfully give the Myrinet switch an IP address using RARP, which allows it to wake up on the Ethernet management port and respond to simple ICMP pings and certain SNMP traffic. The switch listens on port 4003 for SNMP and uses the community string public. Using some freeware SNMP tools I can obtain very limited information (finding an OID list for the switch seems impossible without contacting Myricom), so it looks like the switch is alive. Actual functional testing looks like being "plug it in and see", though. There's also some scope for testing TCP/IP over GM outside of the cluster, if I fancy pulling a couple of cards from two nodes.
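For the curious, here is a rough Python sketch of the kind of SNMP poke described above, written without any third-party libraries. It hand-builds an SNMPv1 GetRequest for the standard MIB-II sysDescr object (1.3.6.1.2.1.1.1.0) and sends it over UDP. Only the port (4003) and the community string (public) come from the post. That the switch speaks SNMPv1 and answers for sysDescr is an assumption, and every function name here is made up for illustration. The reply is returned as raw bytes rather than decoded.

```python
import socket

SYS_DESCR_OID = "1.3.6.1.2.1.1.1.0"  # standard MIB-II sysDescr.0


def tlv(tag: int, payload: bytes) -> bytes:
    """BER tag-length-value with short-form length (payloads under 128 bytes)."""
    if len(payload) >= 128:
        raise ValueError("this sketch only handles short-form lengths")
    return bytes([tag, len(payload)]) + payload


def encode_int(value: int) -> bytes:
    """BER INTEGER for a non-negative value."""
    if value < 0:
        raise ValueError("negative integers are not handled in this sketch")
    return tlv(0x02, value.to_bytes(value.bit_length() // 8 + 1, "big"))


def base128(number: int) -> bytes:
    """Base-128 encoding used for OID arcs (high bit set on all but the last byte)."""
    out = [number & 0x7F]
    number >>= 7
    while number:
        out.append((number & 0x7F) | 0x80)
        number >>= 7
    return bytes(reversed(out))


def encode_oid(dotted: str) -> bytes:
    """BER OBJECT IDENTIFIER from dotted notation."""
    arcs = [int(part) for part in dotted.split(".")]
    body = bytes([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        body += base128(arc)
    return tlv(0x06, body)


def build_get_request(community: str, oid: str, request_id: int) -> bytes:
    """Assemble a complete SNMPv1 GetRequest message."""
    varbind = tlv(0x30, encode_oid(oid) + tlv(0x05, b""))  # OID paired with NULL
    pdu = tlv(
        0xA0,  # GetRequest PDU
        encode_int(request_id) + encode_int(0) + encode_int(0) + tlv(0x30, varbind),
    )
    # Version 0 means SNMPv1.
    return tlv(0x30, encode_int(0) + tlv(0x04, community.encode("ascii")) + pdu)


def snmp_get(host: str, port: int = 4003, community: str = "public",
             oid: str = SYS_DESCR_OID, timeout: float = 2.0) -> bytes:
    """Send the request over UDP and return the raw, undecoded reply."""
    packet = build_get_request(community, oid, request_id=1)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (host, port))
        reply, _ = sock.recvfrom(4096)
    return reply
```

Calling something like snmp_get("192.168.0.50") (a placeholder address, not the switch's real one) either returns a reply or raises a timeout error from the socket module. Any reply at all is a decent sign the management port is awake.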

    It's completely silent when running, compared to the Cisco, although it might get noisier when loaded up with traffic to process.

    Posted by Rys at January 17, 2005 12:26 AM
    MOLLY AND POPPY!

  5. #5
    Rys

    January 18, 2005

    Early cluster pictures

    While Alex was on my PC surfing the web earlier, I decided to not just sit there and twiddle my thumbs, so I brought the cluster upstairs to my office. She didn't give me a hand, so I now feel like death. The individual nodes are very heavy and taking them up two flights of stairs wasn't fun. The result is a few new pics though.

    Find those here.

    Posted by Rys at January 18, 2005 12:14 AM
    MOLLY AND POPPY!

  6. #6
    Rys

    January 20, 2005

    Power provision was purchased

    I grabbed a couple of Belkin Surgemasters for the cluster. Power provision will still be on the "unplug some other stuff first" side of the fence for the time being, until I rejig my office, but I can start the cluster rebuild at the weekend.

    Posted by Rys at January 20, 2005 09:54 PM
    MOLLY AND POPPY!

  7. #7
    Rys

    January 22, 2005

    Management node is set up

    The ethereal, mysterious 'Tim', and Ben, came round earlier to help me do the initial setup and cabling. We booted the management system and the first compute node with their original operating system installs (RedHat 7.3 on the management node, 6.2 on the compute node) to check they'd boot and no hardware was misconfigured. Then we set about following the configuration guide for setting up a front-end server with Rocks. The install went fine and we ended up calling the cluster superchode (thanks, Alex!). There look to be some problems with Ganglia, but given that no compute nodes are connected to the switch, that's understandable.

    We also set the cabling up for power to the two switches, the front-end and four compute nodes (and a monitor we won't use too much until we can SSH in with confidence), ready for attaching nodes to the cluster.

    Tomorrow's task is to bring up as many compute nodes as possible using the basic Rocks install and then possibly experiment with adding Roll packages to the nodes for further functionality. 'Tim' will assist again I think, and I'm sure Ben will pop over too. It's better than doing it solo, although the blinking lights on the Netfinity chassis are mesmerising and give the cluster the appearance of limited intelligence of its own, to keep me company, like some second-rate console panel from the first series of Star Trek.

    More tomorrow as we bring the nodes up. There are some new pictures from today's endeavours in the image gallery. If it's got a cable attached in the picture, it's something we did earlier.

    Posted by Rys at January 22, 2005 09:52 PM
    MOLLY AND POPPY!

  8. #8
    Rys

    January 23, 2005

    Superchode and four baby chodenodes

    Tim and I rebuilt superchode's front-end system this afternoon to correct a misconfiguration with the Ethernet interfaces. The box has three Ethernet adapters and the Rocks installer numbers them as it finds them on a bus probe, so the on-board AMD controller was being brought up first. We wanted to ignore that and just use the Intel 100Mbit cards instead so a quick trip into the BIOS (IBM BIOS implementations really do go the extra ten miles to give you information and system setup) and a reinstall later and superchode's head was alive again.

    After that, attaching the first half of the compute nodes to the cluster was fairly simple. Nodes 0, 1 and 2 attached without issue, although node 1 has a dead stick of memory that'll need replacing to give it the full 1GB. The BIOS disables the DIMM slot, so it's purring along with 768MB for the time being. Node 3 wouldn't boot from the CD drive, so we swapped in a DVD drive to test and it wouldn't boot from that either. An IDE ribbon swap and everything was alive. It also has a slightly sensitive chassis intrusion switch that stops you booting it unless the chassis cover is on properly.

    So an hour or so after that node had kick-started itself across the network, feeding from the front-end, we had 10 active processors, just under 5GB of memory and the cluster alive. The cluster monitoring software, Ganglia, showed us this when it was done.
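Here is a small Python sketch of how those totals add up. It assumes the "10 active processors" count includes the front-end alongside compute nodes 0 to 3, and that every machine except node 1 has its full 1GB. Both are inferences from the posts rather than something stated outright, and the class and variable names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    cpus: int
    ram_mb: int


# Assumed inventory: the front-end plus the four compute nodes attached so far.
# Node 1 runs with 768MB because the BIOS disabled the slot holding the dead DIMM.
ATTACHED = [
    Machine("front-end", 2, 1024),
    Machine("node 0", 2, 1024),
    Machine("node 1", 2, 768),
    Machine("node 2", 2, 1024),
    Machine("node 3", 2, 1024),
]


def totals(machines: list[Machine]) -> tuple[int, int]:
    """Return (total processors, total memory in MB) for the given machines."""
    return sum(m.cpus for m in machines), sum(m.ram_mb for m in machines)


if __name__ == "__main__":
    cpus, ram_mb = totals(ATTACHED)
    print(f"{cpus} processors, {ram_mb / 1024:.2f} GB of memory")
```

Under those assumptions the tally is 10 processors and 4.75GB, which matches "just under 5GB".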

    The next step is to bootstrap the final four nodes and register them into the cluster database. Then the Myrinet interconnect can be brought up and cluster jobs started over that. I might benchmark it before and after, just to see how the Myrinet affects I/O-bound performance.

    So everything's coming along nicely. Rocks makes it relatively simple to get compute nodes added and while it'd be nice to say it took us a massive amount of time and effort to bring up this first half of the cluster, it wasn't too bad and hardware issues kicked our asses more than the software did.

    Apache doesn't stay up for long at the moment, but that's no doubt fixable. I should have the entire cluster connected (at least with Ethernet) sometime this week, time permitting.

    Posted by Rys at January 23, 2005 11:24 PM
    MOLLY AND POPPY!

  9. #9
    Rys

    So the initial stages have gone pretty well, bar the odd bit of hardware in the compute nodes not cooperating, and that was all easily fixed or ignored for the time being. 10 processors connected is a pretty good start, and while that could have been all 18 yesterday, we were running late due to the hardware problems.

    It's definitely been a learning experience. The cool thing is that you don't need a high-speed interconnect like Myrinet to build one; you can do it over Ethernet just as easily. Rocks has support for x86-64 and you can co-cluster that architecture with regular x86 if you're careful about the jobs you run.

    There's loads more to it, since all I've done so far is bring up the hardware. So there's talk on the job scheduler, process accounting, security and a myriad of other things to cover over time. I'll get there though (with your help!).

    Have a look at Rocks, ClusterWorld, Grid Computing and SourceForge's clustering foundry for more information.

    Two 386 PCs on a 10Mbit hub can be a cluster, so don't think it's out of your league in terms of building one for fun, enjoyment and your own education.

    Enjoy!

    Rys
    MOLLY AND POPPY!

  10. #10
    megah0
    Bonnet mounted gunsight
    Join Date
    Jul 2003
    Location
    Birmingham
    Posts
    3,381
    Thanks
    79
    Thanked
    73 times in 49 posts
    Brilliant,

    Going to be keeping a close eye on this thread.

    Great project
    Recycling consultant

  11. #11
    Zak33
    HEXUS.timelord.
    Join Date
    Jul 2003
    Location
    I'm a Jessie
    Posts
    35,176
    Thanks
    3,121
    Thanked
    3,173 times in 1,922 posts
    • Zak33's system
      • Storage:
      • Kingston HyperX SSD, Hitachi 1Tb
      • Graphics card(s):
      • Nvidia 1050
      • PSU:
      • Coolermaster 800w
      • Case:
      • Silverstone Fortress FT01
      • Operating System:
      • Win10
      • Internet:
      • Zen FTC uber speedy
    my mind is now BOGGLED. I have absolutely no idea what most of that is.

    Starting at base point A, I shall ask : Are these computers, individually, similar to a home PC in as much as having a CPU, some ram, a mobo, a vid card, a hdd, and a PSU?

    Quote Originally Posted by Advice Trinity by Knoxville
    "The second you aren't paying attention to the tool you're using, it will take your fingers from you. It does not know sympathy." |
    "If you don't gaffer it, it will gaffer you" | "Belt and braces"

  12. #12
    Senior Member
    Join Date
    Jul 2004
    Location
    London
    Posts
    2,456
    Thanks
    100
    Thanked
    75 times in 51 posts
    • Mblaster's system
      • Motherboard:
      • ASUS PK5 Premium
      • CPU:
      • Intel i5 2500K
      • Memory:
      • 8gb DDR3
      • Storage:
      • Intel X25 SSD + WD 2TB HDD
      • Graphics card(s):
      • Nvidia GeForce GTX 570
      • PSU:
      • Corsair HX520
      • Case:
      • Antec P180
      • Operating System:
      • Windows 7 Professional x64
      • Monitor(s):
      • HP w2207 (22" wide)
      • Internet:
      • Rubbish ADSL
    I have no idea about clustering but this sure looks interesting

    Zak: This should help you there
    I don't mean to sound cold, or cruel, or vicious, but I am so that's the way it comes out.

  13. #13
    TiG
    TiG is offline
    Walk a mile in other peoples shoes...
    Join Date
    Jul 2003
    Location
    Questioning it all
    Posts
    6,213
    Thanks
    45
    Thanked
    48 times in 43 posts
    The only thing I'll say is that while building a cluster may be fun and a learning experience, most of us have little or no use for one, and sadly I don't view UD or SETI as valuable uses of that amount of processing power.

    I was lucky enough to get taught at uni by one of the experts in the country on distributed computing and still remember a stack load of that information. The really interesting thing about a cluster is the ability to use rather different setups of software.

    Using something like Occam to program your own capabilities on the cluster is eminently doable and could be used to tackle some rather interesting maths problems.

    Interested to hear the uses you find for it, Rys.

    TiG
    -- Hexus Meets Rock! --

  14. #14
    Rys

    Quote Originally Posted by Zak33
    my mind is now BOGGLED. I have absolutely no idea what most of that is.

    Starting at base point A, I shall ask : Are these computers, individually, similar to a home PC in as much as having a CPU, some ram, a mobo, a vid card, a hdd, and a PSU?
    Yeah, they're just individual PCs connected together. Each node is an IBM Netfinity 5100 server, which is just a server with a pair of P3s, on-board graphics, memory, mainboard, HDD and the PSU, exactly like any normal PC.

    They're in rack-mount cases, rather than your usual desktop ATX fare, but that's the only difference.

    Software then links them together and lets you run a program across all the PCs connected to the cluster. So my collection of 8 systems in one just appears to any software as a single PC, albeit one with 16 processors and 8GB of memory.

    Clustering came about as a way to migrate away from the massive shared-memory systems peddled by the likes of Sun, SGI and Cray (to name but three) and create supercomputers from basic PCs, all linked together.

    Think of it as something like 8 Shuttles linked together on a basic network switch, just with some special software to make them appear as one PC.

    My cluster is a bit more involved than that, but not by much.
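To give a feel for what "running a program across all the PCs" means, here is a toy Python sketch of the split, compute and combine pattern. It is not how Rocks or any real cluster middleware works: a real job would go through a scheduler or a message-passing library over the network, whereas this uses threads on one machine as stand-ins for the 16 processors. All the names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start: int, stop: int, parts: int) -> list[tuple[int, int]]:
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(stop - start, parts)
    chunks = []
    low = start
    for index in range(parts):
        # The first `extra` chunks take one additional item each.
        high = low + base + (1 if index < extra else 0)
        chunks.append((low, high))
        low = high
    return chunks


def sum_of_squares(chunk: tuple[int, int]) -> int:
    """The piece of work each 'processor' does on its own slice."""
    low, high = chunk
    return sum(n * n for n in range(low, high))


def run_on_cluster(start: int, stop: int, workers: int = 16) -> int:
    """Scatter the chunks to workers, then gather and combine the partial results."""
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_on_cluster(0, 1_000_000))
```

The point is the shape of the job: each worker only needs its own slice of the problem, and only the small partial results travel back to be combined. Tasks that split up this cleanly are the ones that suit a cluster on a modest interconnect.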

    In terms of what I'll do with it, there's a mound of research software I could put to use, including fluid dynamics simulations, protein folding, number crunching, distributed webserving and loads more. Picking a task for it is definitely the hard part!

    Rys
    MOLLY AND POPPY!

  15. #15
    Senior Member
    Join Date
    Jul 2004
    Location
    London
    Posts
    2,456
    Thanks
    100
    Thanked
    75 times in 51 posts
    • Mblaster's system
      • Motherboard:
      • ASUS PK5 Premium
      • CPU:
      • Intel i5 2500K
      • Memory:
      • 8gb DDR3
      • Storage:
      • Intel X25 SSD + WD 2TB HDD
      • Graphics card(s):
      • Nvidia GeForce GTX 570
      • PSU:
      • Corsair HX520
      • Case:
      • Antec P180
      • Operating System:
      • Windows 7 Professional x64
      • Monitor(s):
      • HP w2207 (22" wide)
      • Internet:
      • Rubbish ADSL
    That's another thing I was wondering: what can you do with it?

    *edit*
    I was a bit late with the reply

    How about using it as the webserver for hexus?
    Last edited by Mblaster; 24-01-2005 at 12:34 PM.
    I don't mean to sound cold, or cruel, or vicious, but I am so that's the way it comes out.

  16. #16
    Sublime HEXUS.net
    Join Date
    Jul 2003
    Location
    The Void.. Floating
    Posts
    11,819
    Thanks
    213
    Thanked
    233 times in 160 posts
    • Stoo's system
      • Motherboard:
      • Mac Pro
      • CPU:
      • 2*Xeon 5450 @ 2.8GHz, 12MB Cache
      • Memory:
      • 32GB 1600MHz FBDIMM
      • Storage:
      • ~ 2.5TB + 4TB external array
      • Graphics card(s):
      • ATI Radeon HD 4870
      • Case:
      • Mac Pro
      • Operating System:
      • OS X 10.7
      • Monitor(s):
      • 24" Samsung 244T Black
      • Internet:
      • Zen Max Pro
    Some comparison stats against a standard pc for something like folding etc would be good
    (\__/)
    (='.'=)
    (")_(")
