Thursday, October 28, 2004

What is Clustering

Recent years have sparked a lot of interest in clustering technology. In one form or another, clusters will become a seamless part of our everyday lives and vital to the future of computing.

Clustering is not a new methodology. Clusters have been around for decades! There are myriad reasons why one would “cluster” computers. Some do it because they demand computing power that “ordinary” computers cannot provide, others because their setup demands redundancy at multiple levels. Clusters are aptly named: you put together individual computers to serve a single purpose, and modern clusters are typically designed so that the end user “sees” just one machine.

We can generally define two distinct kinds of clusters: Highly Available Computing (HA) and High Performance Computing (HPC). Each has its own distinct purpose.

Let's talk HA. Highly Available clusters are so named because they are deployed in mission-critical setups. It simply means that at no time must the service be denied to a user. They are run by data centers, financial institutions, media groups, email providers, governments, the military, DNS operators, etc. HA clusters have redundancy built into them. That's the whole purpose. The “machine” typically has multiple hard drives, power supplies and even entire systems, so that should any one item or “point of failure” go down, another machine or device takes its place. Yet at the same time, data must remain secure and available, and the switchover must be seamless to the end user!
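To make the failover idea concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions: the `Node` class, the `serve` function and the node names are made up for this post and do not come from any real HA product. Real HA setups also deal with heartbeats, shared storage and data consistency, which this leaves out.

```python
class Node:
    """A hypothetical cluster member that may be up or down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        # A node that is down cannot answer, so it raises an error.
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


def serve(nodes, request):
    """Try each redundant node in order and fail over on error."""
    for node in nodes:
        try:
            return node.handle(request)
        except ConnectionError:
            continue  # this point of failure went down, try the next node
    raise RuntimeError("all nodes are down")


cluster = [Node("primary"), Node("standby")]
first = serve(cluster, "GET /")    # answered by the primary
cluster[0].healthy = False         # simulate the primary failing
second = serve(cluster, "GET /")   # the standby takes its place
```

The caller only ever talks to `serve`, so it never needs to know which machine answered. That is the "seamless to the end user" part.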

The other methodology is High Performance Computing (HPC). True, HPC is not limited to clustering technology and can also mean massively parallel processors, but for our discussion we will limit ourselves to HPC as it pertains to clusters.

As its name implies, High Performance Computing's focus is crunching numbers. That's its primary purpose. These machines are typically used in laboratory settings; their purpose in life is to crunch numbers, decode DNA, predict the weather, track satellites and search for that obscure baby name site on the Internet. A pretty good example of HPC is the clusters deployed by Google, which it uses to search information quickly.

Donald Becker and Thomas Sterling of NASA had this idea back in the 1990s: why not build a high performance computer out of commodity components? PCs and PC parts had become so cheap that coupling PCs together to build a supercomputer was possible. Enter the age of Commodity Off-the-Shelf based High Performance Computing, which was called “Beowulf”.

Beowulfs are highly scalable, meaning you can keep adding more “nodes” to your machine to increase performance. Beowulf runs on Linux, and its future is bright as a niche market in the ever-growing high performance computing world. Today, Beowulf is an accepted genre in High Performance Computing. (For more information on Beowulfs click here.)
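Here is a rough sketch of what "adding nodes" buys you, again in Python and again only an illustration. Threads stand in for nodes here, which is my simplification; a real Beowulf would spread the chunks over separate machines, typically with a message passing library such as MPI. The function names (`split`, `sum_of_squares`, `cluster_sum_of_squares`) are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, nodes):
    """Cut data into `nodes` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), nodes)
    chunks, start = [], 0
    for i in range(nodes):
        # The first `extra` chunks get one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The work each hypothetical node does on its own share."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(data, nodes):
    """Scatter chunks to workers, then gather and combine the partial results."""
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(sum_of_squares, split(data, nodes)))
    return sum(partials)


numbers = list(range(1, 101))
total_two_nodes = cluster_sum_of_squares(numbers, nodes=2)
total_eight_nodes = cluster_sum_of_squares(numbers, nodes=8)
```

The answer is the same whether you use two nodes or eight; only the size of each node's share changes. The problem has to split cleanly like this for extra nodes to help.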

Microsoft did a similar project for Windows way back when, called Wolfpack. Essentially the purpose was to deploy HPC/HA using Windows. However, most people (well, the sane ones) would prefer to deploy Unix and/or its variants, e.g. Linux, BSD, etc., on an HA/HPC cluster. It's much easier on the latter than the former in our humble opinion, but that's just us.

Anyway, there are myriad ways to skin a cat, as they say. Present-day network technologies allow off-the-shelf deployment of cluster technology, and it's not just Beowulf. Apple Macs utilizing the Rendezvous software solution, Apple Airport (WiFi), Mac OS X and Xgrid software can get you started on a “two” node cluster and can scale from there. The Linux Virtual Server Project (www.linuxvirtualserver.org) can help you build an affordable Highly Available cluster. And then there is something called a Grid (you can also visit Globus), which is the coupling of not only the hardware/network level but the application layer as well to form inter-operable highly available and high performance systems.

Isn't it all exciting and stimulating?

References and Further reading:

IBM DeveloperWorks - bleeding-edge stuff, lots of tutorials on these "edge" technologies. Also can be found here.


For Linux:
The Linux Documentation Project - everything you need to know about Linux.

Linux Virtual Server Project - good primer on clusters and clustering methodologies

Linux ISO Images - site where you can download an ISO image to burn so that you can run Linux now.

Suse Linux - one of the best commercial distributions of Linux, owned by Novell.

The Debian Project - the best Linux distribution on the planet today, in my humble opinion, because of "apt". Just don't get involved with the politics!

Knoppix Project! Linux on a LiveCD! It means you can run the operating system by simply booting the CD or DVD. :D Great, no? Especially when you don't have a hard drive or machine to spare to play with Linux!


The Beowulf:

Beowulf Project - Number one resource on the Internet for Beowulf.

Information about Grids:

Grid

Globus Project - this is driving Grid technology today and will in the future!

Apple Mac:
Apple XGrid - really nice piece of technology. makes life easier.

Microsoft:
Microsoft Wolfpack - article on Wolfpack.

Eclipse Project:
Eclipse Project - one of the best development platforms out there. Could be useful when you start developing Cluster/Grid apps. Very scalable piece of software. It was developed by Big Blue.

Monday, October 04, 2004

Suse 9.1: a Review

Ok. I finally installed that Suse 9.1 DVD from Novell. Yes, that free DVD that everyone seemed to want. :D

I installed Suse 9.1 on an ancient P3 600 on an Intel 820 board with 192MB RAM and a 32MB nVidia RivaTNT. Ancient tech. But I was fed up with Mandrake 10, so I wiped it off the machine and finally installed Suse 9.1 from the DVD.

I've got to tell you, I've used RH, FC1 & 2, Mandrake and the like since 2000, and this was my first crack at Suse. It certainly didn't disappoint.

Installation? Yast? Excellent! It was able to detect every piece of ancient hardware I've got. Well, it's a rare occasion when Linux fails to detect such ancient hardware these days. That was the easy part.

I was able to set up the apps without trouble. I wanted to give the cryptographic filesystem a try, but didn't have enough disk space. The machine only had 8GB, and 4GB of that was already taken by windoze (I wanted to play games, which is why windoze still exists at all on this machine). Package installation was a breeze. Everything was set up... but I did change KDE to Gnome (just a matter of preference).

Booting? Excellent. No problems, and the graphics covering the underlying system's startup are great. :D Cool blue with the Suse logo. GDM did disappoint... it didn't look as cool, but then, that's just for aesthetics' sake so no big problem.

Gnome booted well. All systems enabled. The menus... didn't contain everything I had installed... the apps were hidden under the “suse menu”, which ought to be corrected in future releases. It's hard for a user to dig through three layers of menus for a simple office app. Don't you think? That was my first complaint.

My second complaint was the lack of DVD playback. I was playing DVDs that were original and paid for legally. Surely, I have the right as a consumer to be able to play them on my machine? Suse pointed out that there were legal reasons why xine's DVD capability had been disabled and that I should visit the link that they gave me. Unfortunately, the machine didn't have a live Net connection. There are other ways to skin a cat after all (but that's another story).

But, bloody entertainment industry fools, haven't you guys learned from Apple and company? Tech isn't something to be afraid of, it's a tool. I digress yet again.

As I was saying, I didn't have a chance to run Yast on a live Internet link. But I used it to install gtk from the DVD. Yast is the package installer/systems control panel thingy and, as far as I could tell... runs in text mode (well, I've only had Suse for 2 days). Which isn't so bad, come to think of it. Makes it difficult for a “normal” user to play with. :D You've got to be an experienced/intermediate user to figure out the system or destroy it.

Suse 9.1 is an excellent piece of software. It runs kernel 2.6.4, Gnome 2.4 (which isn't so bad) and a whole lot of software that works... including, if I may add, mono (albeit 0.3).

This is the next best distribution I'd recommend, and the first commercial one. The first is still Debian of course, with its powerful apt! :D