Question about October meeting


Postby stack » Sat Aug 27, 2011 11:04 am

Hello everyone!

There is a good possibility that I am going to be in town on the weekend of October 8th. I do believe that you will be meeting that weekend, so I am planning on showing up. My question is, would anyone like me to present on a topic?

I don't have my personal cluster at the moment. However, if you guys want a demo on building a cluster again, I will make sure it is ready by then. Building giant supercomputer clusters /is/ what I do for a living.
I am doing a presentation this September for the OKC LUGNuts on Kickstart files.
I have done recent presentations on fslint (I help maintain the manual), rsync, htop (short presentation), disk management (moving partitions, resizing partitions, LVM, badblocks, SMART, etc.), MythTV, and a bunch of other really small topics.

I don't have to do a presentation, I just thought I would offer. If anyone has something that they would like to hear about, let me know. If I have any clue how your request works, I am sure I can pull something together.

Take care!

~Stack~
RHCE
Product Keys? Imaginary Property rights? Digital Restriction Management?
These things confuse me and I have no need for them for I run Debian Linux!
stack
 
Posts: 268
Joined: Sat Jul 14, 2007 2:11 pm
Location: Fort Worth, Texas

Re: Question about October meeting

Postby David Miller » Sun Aug 28, 2011 10:12 am

Very cool that you might be in town! I think it would be great if you gave a presentation!

I would like to see a kickstart presentation if it would also (briefly?) cover Debian/Ubuntu alternatives, FAI and what have you. I would also like to see Myth because it's been a long time since I've looked at that project. So I vote for Myth or kickstart.
David Miller
 

Re: Question about October meeting

Postby glevans2 » Sun Aug 28, 2011 2:31 pm

It has been a very long time since I have posted, and even longer since I have seen any of you guys, and I have missed it.

R/L has conspired to keep me away, and I still have a ton of stuff to do, but I am trying to get back.

Not sure I can make the September meeting, but I will try.
Garen L. Evans, II
++It is pitch black. You are likely to be eaten by a grue.++
glevans2
 
Posts: 63
Joined: Mon Mar 05, 2007 1:35 pm
Location: Fort Worth

Re: Question about October meeting

Postby shultzjr » Mon Sep 05, 2011 4:05 pm

hey stack,

what's the minimum number of computers needed for a cluster?
what's the minimum hw for each computer?
what's the sw that gets installed to make the cluster work?
doesn't the software need to be custom written to take advantage of the cluster?
any examples of clustering software like rendering?
wolfpack versus beowolf? spelling?

thanks,
charles.....

sorry about the double posts
shultzjr
 
Posts: 56
Joined: Sat Feb 10, 2007 2:44 pm

Re: Question about October meeting

Postby stack » Tue Sep 06, 2011 7:05 pm

shultzjr wrote:hey stack,

Hey Shultzjr
shultzjr wrote:what's the minimum number of computers needed for a cluster?

Bare minimum: 2. A head node and one worker node.
Most small clusters have 3. A head node and two worker nodes.

You can do parallel development, run parallel code, and do almost all the same types of processing on a single box with dual cores/processors. However, the whole point of a cluster is having multiple computers.

shultzjr wrote:what's the minimum hw for each computer?

This is *absolutely* defined by your requirements.

On one absurd end, the Raspberry Pi http://www.raspberrypi.org/ has just about everything you need to cluster a few of them together and do basic processing. Sure, you could outprocess them with just about any Pentium 3 you dig out of a closet, but you could technically cluster them. I know of at least one college professor who has clustered 50 Arduinos together http://www.arduino.cc/. He led the design team in building a special hardware signal processor that runs off of the Arduino, and then he clustered them to do extremely fast signal processing. It is a highly specialized system, but it does everything he needs faster than most supercomputers (and far faster than anything in his budget).

On the other absurd end, I know one guy who does weather processing. His simulation eats 16GB in a matter of minutes. Most of his systems have 32-64GB of memory and it still isn't enough.

I also know someone who has built a GPGPU cluster. His boxes are very basic systems (dual core 1.8GHz Intel processors, 1GB memory, 16GB USB thumbdrive for the OS) because he has 4 NVIDIA GPUs in each box.

If you are just going to go with a standard distribution, well there are usually a few minimums.

Rocks Cluster http://www.rocksclusters.org requires 30GB of disk and 1GB of memory on all systems. The frontend needs to have 2 network cards.

Bootable Cluster CD http://bccd.net/ isn't as full featured as Rocks; however, it only requires a 2GB disk and 512MB of memory on the frontend. The nodes need 512MB of memory, and no disk is required. The frontend requires 2 network cards.

So it depends on how full featured you want your system to be and what resources you need available.

shultzjr wrote:what's the sw that gets installed to make the cluster work?


Again, this is very much dependent on your software.

I know one guy who has a HUGE amount of work load to do. He has a single program that runs with _billions_ of possible permutations. He made his cluster a BOINC cluster http://boinc.berkeley.edu/.

I know someone who was able to adapt his program into sed and awk. I am not even kidding. He did message passing over SSH keys.

I have built, used, and submitted bug reports and code to the Parallel Processing Shell Script http://code.google.com/p/ppss/. Again, SSH keys and mostly Bash scripting. If you have the work load that can run in this environment, I think it might be among the easiest to learn if you already use a lot of Bash.
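
To give a rough idea of what "SSH keys and scripting" means in practice, here is a minimal sketch in Python (not Bash, and not how PPSS itself is implemented). It only hands out jobs round-robin and builds the ssh command for each one. The host names, file names, and the gzip job are made up for illustration, and it assumes key-based logins to the workers already work.

```python
import shlex
from itertools import cycle


def assign_round_robin(jobs, hosts):
    """Pair each job with a host, cycling through the hosts in order."""
    return list(zip(cycle(hosts), jobs))


def ssh_argv(host, command):
    """Build the argv for running one shell command on a remote host.

    BatchMode=yes makes ssh fail instead of prompting for a password,
    which is the behaviour you want when relying on SSH keys.
    """
    return ["ssh", "-o", "BatchMode=yes", host, command]


if __name__ == "__main__":
    # Hypothetical worker names and input files, for illustration only.
    hosts = ["node1", "node2"]
    files = ["a.log", "b.log", "c.log"]
    jobs = ["gzip -9 " + shlex.quote(name) for name in files]
    for host, command in assign_round_robin(jobs, hosts):
        # Print the command line; a real runner would hand the argv
        # to subprocess.run() and collect the exit codes.
        print(" ".join(shlex.quote(part) for part in ssh_argv(host, command)))
```

This prints the commands instead of running them, so it is safe to try on a single machine. A real tool also has to deal with failed nodes, retries, and collecting output, which is where most of the work goes.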

Most clusters in the Top500 http://top500.org/ seem to use either C or Fortran with MPI http://www.mcs.anl.gov/research/projects/mpi/. MPI itself is only a message-passing standard, so you need an implementation of it, and the MPICH library is usually used http://www.mcs.anl.gov/research/projects/mpich2/. However, if you are going to have multiple systems, then trying to manage that all on your own can be difficult (especially as you scale larger), so most people use a resource manager like Torque http://www.clusterresources.com/products/torque-resource-manager.php. If you have multiple users, you will also need a job scheduler like SGE http://wikis.sun.com/display/GridEngine/Home or Maui http://www.clusterresources.com/products/maui-cluster-scheduler.php. I prefer Maui; however, I know a lot of people who will argue for SGE or for the paid version of Maui called Moab.
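
For a feel of what submitting work to Torque looks like, here is a small Python sketch that writes out a job script. The #PBS directives and $PBS_O_WORKDIR are standard Torque, but the job name, node counts, and the ./hello_mpi program are placeholders, and the launcher may be mpiexec rather than mpirun depending on your MPI install.

```python
def torque_script(job_name, nodes, procs_per_node, walltime, program):
    """Return the text of a Torque/PBS job script that launches an MPI program."""
    total_procs = nodes * procs_per_node
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}:ppn={procs_per_node}",
        f"#PBS -l walltime={walltime}",
        "",
        "# Jobs start in the home directory; go back to where qsub was run.",
        'cd "$PBS_O_WORKDIR"',
        # Assumption: the MPI launcher on this cluster is called mpirun.
        f"mpirun -np {total_procs} {program}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values: 2 nodes, 4 processes each, 10 minute limit.
    print(torque_script("demo", 2, 4, "00:10:00", "./hello_mpi"), end="")
```

You would save the output to a file and hand it to qsub. The resource manager then picks the nodes, and the scheduler decides when the job runs.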

The growing number of GPGPU clusters in the Top500 mostly use C/MPI with CUDA kernels.

shultzjr wrote:doesn't the software need to be custom written to take advantage of the cluster?

99.9999% of the time you will have to modify your code.

Is your program serial? Yes.

You will have to adapt a serial program. This may be easier than you think or WAY harder. Adapting a serial program into OpenMP code can be fairly trivial. Then, depending on your workload, you *might* be fortunate enough to just have to write an MPI wrapper.
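
The "easy" case looks something like the sketch below. It is plain Python rather than OpenMP or MPI, and the sum-of-squares workload is just a stand-in, but the shape of the change is the same: take a serial loop, split its range into independent chunks, compute each chunk separately, and combine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_total(n):
    """The original serial program: one loop over the whole range."""
    return sum(i * i for i in range(n))


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_total(bounds):
    """The work one worker does: the same loop over its own chunk."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_total(n, parts, executor):
    """Farm the chunks out to an executor and add up the partial sums."""
    return sum(executor.map(partial_total, split_range(n, parts)))


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(parallel_total(100_000, 4, pool) == serial_total(100_000))
```

Threads are used here only to keep the example short. In CPython they will not speed up a CPU-bound loop like this one, so on one box you would swap in a process pool, and on a cluster each chunk would go to a different node. The hard cases are the ones where the chunks are not independent of each other.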

Is your program pthread enabled? Yes.
Pthreads are great but if your program was written expressly for pthreads it most likely is not configured for distributed processing across nodes. Multiple processors with multiple cores in a single box? Well, you won't notice a difference because pthreads will just scale as best as it can within the boundaries of your program.

Did you design your program to use multiple processors and multiple cores, and code it in something other than pthreads? Then you might be able to make a few adaptations to the code and not have to worry about it.

shultzjr wrote:any examples of clustering software like rendering?

Blender. The most famous example of 3D rendering around, right? If you have a multi-core system, pass Blender a few options and it will use them (if it didn't auto detect the cores on startup). It has a mode that will allow it to dump the pre-scenes that nodes can pick up and render. There is also a way to pipe the scenes into a queue manager like Torque for processing. Also, there are ways of getting Blender to run in BOINC (check out http://burp.renderfarming.net/ and http://www.renderfarm.fi/ ).
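
The simplest way to picture cluster rendering is splitting an animation by frame range, one range per node. The Python sketch below does only that arithmetic and builds a Blender command line for each range. It is not the Blender mode or the BOINC setups mentioned above, and scene.blend plus the node count are placeholders.

```python
def frame_chunks(first, last, nodes):
    """Split the inclusive frame range first..last into per-node ranges."""
    total = last - first + 1
    base, extra = divmod(total, nodes)
    chunks, start = [], first
    for i in range(nodes):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more nodes than frames, leave the rest idle
        chunks.append((start, start + size - 1))
        start += size
    return chunks


def blender_argv(blend_file, start, end):
    """Command line for a background render of frames start..end.

    -b runs without the UI, -s/-e set the frame range, and -a renders
    the animation. The range options have to come before -a.
    """
    return ["blender", "-b", blend_file, "-s", str(start), "-e", str(end), "-a"]


if __name__ == "__main__":
    # Placeholder scene and a 3-node split of frames 1..250.
    for start, end in frame_chunks(1, 250, 3):
        print(" ".join(blender_argv("scene.blend", start, end)))
```

Each of those command lines could then be run on a different node by hand, over SSH, or as a job in a queue manager.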

It has been a while, but I know one of the art guys in college piped out his GIMP renderings (using the script-fu plugins) to a cluster for various renderings. Not sure how official that was vs how home-made it was, however, it is possible.

Almost all of the big 3D rendering companies use clusters. You would be hard pressed to find a movie studio that _DOESN'T_ use a Linux cluster (Dreamworks, Pixar, Paramount, etc. all doing movies from Toy Story to Spiderwick Chronicles to Lord of the Rings to Legend of the Guardians [which was done by the company Animal Logic on their (at the time) Top500 Linux cluster using a lot of open source software like GIMP and Blender]). Even the REALLY expensive proprietary 3D modeling software products have software installs for Linux cluster rendering. It is just too big of a feature in the industry to /not/ include the ability to do Linux cluster rendering.


shultzjr wrote:wolfpack versus beowolf? spelling?


Wolfpack is /terrible/ and no I am not just saying that because I am a huge Linux nerd who doesn't own a windows box. It really is a terrible terrible product. The only people I know in the industry who use it *have* to use it because their boss's boss's boss's boss made a deal with microsoft. The admins all hate it and are losing their sanity all while driving away their user base. Most of them dual boot their cluster. Linux 99% of the time and microsoft whenever they have to.

I know one guy (honest!) who worked at a college where microsoft paid them a LARGE amount of money for them to buy their cluster. Therefore, they have to claim that it is a microsoft cluster and when they held a spot in the Top500 supercomputers of the world, it was with a benchmark run on the microsoft platform. However, they couldn't get it to function properly at all and they wasted many hours and far more money than they should have trying to get it to work. It benchmarked great and had fantastic hardware but they had issues with the software. They converted it to Linux. The only time it ever ran microsoft was for benchmarks. The *entire* rest of the year was Linux.

Don't even get me started on my experience with microsoft clusters. The *only* thing I will grant microsoft leeway on is that it was 8 years ago I had those issues. (How many of you have blue screened 25 windows PCs all at one time before? I have :D ) I will grant that they are getting better, but they have a LONG way to go to even be considered a *potential* contender against the *nix world.

As for Beowulf, it is just a term for a cluster of inexpensive PCs. Originally, it was the name of a specific cluster but now the term is very vague and generic. It typically refers to a Unix OS, but I have heard the term applied to windows and OS X clusters before as well.




Anyway, hope that clears up a few of your questions. Feel free to ask more. I love talking about this stuff (if you couldn't tell). I tried to be as simple as possible while still providing accurate information and plenty of links for you to read more if you want.

Later.

~Stack~
RHCE
Product Keys? Imaginary Property rights? Digital Restriction Management?
These things confuse me and I have no need for them for I run Debian Linux!
stack
 
Posts: 268
Joined: Sat Jul 14, 2007 2:11 pm
Location: Fort Worth, Texas

Re: Question about October meeting

Postby shultzjr » Fri Sep 09, 2011 5:27 pm

hey,
great response. I've always been interested in clusters, but they tend to be expensive.
I would assume one could build a small one to speed up the drawings in a CAD program for faster production work.
or would that mostly be the graphical subsystem?

later,
charles.....
shultzjr
 
Posts: 56
Joined: Sat Feb 10, 2007 2:44 pm

Re: Question about October meeting

Postby stack » Fri Sep 16, 2011 7:12 am

For a CAD project, you would probably be better off looking into either a better video card or one of the new hybrid chips that AMD/Intel are putting out where you would have multiple processor cores and GPU cores on a single die. I know a few people who export the final renderings off to a cluster, but to actually build/work-with something in CAD it is better to have the processing power locally.


OK, I have confirmed that I *will* be in town that weekend and I am planning on doing a presentation. I don't know if I am going to have my new cluster-in-a-box built by that time so I don't know if I can do the cluster talk (I guess I could do it in a vm, but that isn't nearly as cool :-) ). I have my kickstart presentation almost done. I just need to run through it a few times to make sure I know what I am talking about. So I am leaning towards doing that presentation.

Can't wait to see everyone again!

~Stack~
RHCE
Product Keys? Imaginary Property rights? Digital Restriction Management?
These things confuse me and I have no need for them for I run Debian Linux!
stack
 
Posts: 268
Joined: Sat Jul 14, 2007 2:11 pm
Location: Fort Worth, Texas

Re: Question about October meeting

Postby RedCactus » Fri Sep 16, 2011 1:32 pm

Stack,

Thanks for the very interesting write-up on cluster computing. Just this morning, before visiting the FWLUG site, I was looking at OpenCL on Macs. Turns out OpenCL comes installed on Mac OS X 10.6 and later.

At the September meeting, we talked for a short while about using GPUs to do mostly vector array processing and DSP tasks. Fedora used to support this on the early PlayStation 3s, I believe it was, up until Fedora 9. They still have an outdated web page at https://fedoraproject.org/wiki/PlayStation.

I would be interested in a cluster computing presentation, with discussion about OpenCL, or a kickstart presentation as Fedora, the Linux I use, offers it.

Thanks,
RedCactus
RedCactus
 
Posts: 4
Joined: Fri Sep 16, 2011 12:58 pm

Re: Question about October meeting

Postby stack » Tue Oct 04, 2011 6:57 am

I will *not* have my portable cluster ready. However, I will still be up for doing a talk about kickstart.

See you guys there!
Product Keys? Imaginary Property rights? Digital Restriction Management?
These things confuse me and I have no need for them for I run Debian Linux!
stack
 
Posts: 268
Joined: Sat Jul 14, 2007 2:11 pm
Location: Fort Worth, Texas

Re: Question about October meeting

Postby jonG » Tue Oct 04, 2011 10:26 am

Nothing posted but I assume there's going to be a meeting Saturday at the downtown Fort Worth library around 10:30...?
jonG
 
Posts: 1
Joined: Tue Oct 04, 2011 1:29 am

Re: Question about October meeting

Postby David Miller » Wed Oct 05, 2011 2:06 pm

The library meeting room was taken, so I called our usual VFW Post 8235 and let them know that we'd be meeting there this Saturday.

The meeting will be at 10 am at the VFW. I'll update the website and send out a mass email here shortly.
David Miller
 

