Tuesday, July 28, 2009

Redundancy

redundant: n. see redundant

At the top of the IT world you find equipment that is not only expensive but also must never fail. Let me give two examples from my work.
http://www.fz-juelich.de/jsc/CompServ/graphics/cell_blade_side1_large.jpg
http://www-06.ibm.com/systems/jp/photo/bladecenter/picture/ls42_r.jpg
We have a few VMware clusters. One recent setup has seven blades, each with two quad-core sockets and 32GB of RAM. It occupies a standard rack width and is about 4U high. If you've never worked with VMs, you probably think this means we have the equivalent of 56* systems, each with 4GB of memory, in that little space, which is pretty cool. But we can run over 60 machines on a single blade without overloading it. We can easily fill a Class-C subnet in a quarter-rack -- crazy!

This is part of the recent appeal of virtualization. Raised-floor space is expensive. If you have a smallish server farm, you could relocate it into a closet with the right air handling. There are a lot more arguments for virtualization, but redundancy is a pleasant surprise for most people who actually use the servers. If one of the seven blades** dies, the infrastructure software can automatically boot the server instances on another blade in seconds (VMs boot really fast). If we need to do planned service on a blade, the VMs can be migrated to another blade with a hiccup so quick (1-2 seconds) that you have to be looking for it to notice, since all running programs and memory get copied to the new blade.

The blade center itself (the case that holds the blades) has dual power supplies, which are fed from entirely different power distro boxes and in turn provide power to the blades. The blade center also features eight IO module slots, each of which can house six gigabit ethernet ports, six fiber port pairs or a console interface. We actually have ours connected to four different subnets, with two cables for each coming from different switches.

In terms of space and wiring, we're using part of one rack to replace eight or so, two 30-amp circuits to replace sixty 15-amp circuits, and 8 network cables to replace 60. And in all this, it's like we have given each virtualized server dual power, failover networking and a hot spare system, since all files, including the OS, are stored on a SAN.
http://www-05.ibm.com/lt/storage/disk/i/DS4800.jpg
Another example of redundancy is a SAN system we use for data streams to and from manufacturing that are critical to operations. In addition to the redundant fiber, ethernet and power, each DS4800 controller is actually two controllers, which alternate as primary and backup for each group of drives. We have twelve drawers of fourteen drives (the horizontal container of drives is a drawer), with the drives arranged into RAID5 arrays vertically, so if an entire drawer dies, no data is lost. The hot spare drive or two at the end of each drawer will take up some of the slack, completely rebuilding some arrays while others continue in a slower (degraded) configuration. With the hot spares and the parity inherent in RAID5, 12.2TB of raw disk space yields 8.7TB of usable space in 32 arrays/LUNs. But this level of redundancy allows us to maintain the system on a Monday through Friday, 8-5 basis despite its being critical to manufacturing 24x7.

I once had a drive fail on a Saturday -- this slowed down the array, but no data was lost. Meanwhile, the data from the missing drive was re-created on a hot spare in about 15 minutes, after which the spare was added to the array and the array was back up to full speed. Then about 10 hours later, another drive from the same array (different drawer) failed, and it was replaced by another hot spare.

For those who aren't in this game, a failed drive in a RAID5 array for a critical system will usually result in someone getting called out. For two drives in the same array, multiple people lose sleep and calls are made to techs in other states. All this will be followed by a week or more of post-mortem meetings deciding how to avoid this in the future and quite possibly who is going to lose their job(s). With a SAN's ability to realistically share so many hot spares among multiple arrays, this was a non-event.

As long as two drives in the same array don't die at the same time, we can lose nine drives before data loss in the most pessimistic case, and if the failures are evenly distributed, we could lose 40 drives without losing data.
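
As a rough sanity check on the capacity figures above, here is a minimal Python sketch. It assumes all 168 drives are the same size, that each of the 32 arrays gives up exactly one drive's worth of space to RAID5 parity, and that formatting overhead can be ignored. None of that is stated in the post, and the function name is purely illustrative. Under those assumptions, the raw and usable numbers imply roughly how many drives are held back as hot spares.

```python
# Rough consistency check of the SAN capacity figures quoted in the post.
# Assumptions (not from the post): equal-sized drives, one parity drive's
# worth of space per RAID5 array, no formatting overhead.

DRAWERS = 12
DRIVES_PER_DRAWER = 14
ARRAYS = 32
RAW_TB = 12.2
USABLE_TB = 8.7


def implied_hot_spares(total_drives, arrays, raw_tb, usable_tb):
    """Estimate how many drives are hot spares, given raw vs. usable space."""
    data_drives = total_drives * usable_tb / raw_tb  # drives' worth of data
    parity_drives = arrays                           # RAID5: one per array
    return total_drives - data_drives - parity_drives


total = DRAWERS * DRIVES_PER_DRAWER  # 168 drives
spares = implied_hot_spares(total, ARRAYS, RAW_TB, USABLE_TB)
print(f"{total} drives, roughly {spares:.0f} implied hot spares")
```

The estimate comes out at about 16 spares, which is in line with "a hot spare or two" per drawer across twelve drawers. The TB figures are rounded, so treat the result as approximate.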

All this is great for protecting against equipment failures or other remote issues like switch failures. And because it's so dense, physical security is easier to achieve and monitor for one spot than for dozens scattered around. It does make a tempting target for the malicious, however. Since there are four subnets connected to the blade center with two cables each, someone who understood what was going on could swap the ethernet cables, plugging each into a port for the wrong subnet. You could easily disconnect 200 servers from the network in less than 30 seconds, and it would probably take over an hour to diagnose the problem, quite possibly 4-5 hours if the admin first checks the switches and routers, which would seem the obvious culprits. Even a mundane accident could do it: a maintenance worker in the ceiling above the drop tiles could step on the wrong pipe and send hundreds of gallons of water from the sprinkler system onto that one rack hosting 200 machines.

Last April, Morgan Hill, an affluent community in a valley south of San Jose (and SF) along the 101, which runs the length of California, fell victim to an apparently coordinated team of unknown men who entered four service access areas via manhole and cut eight fiber-optic cables, utterly disconnecting Morgan Hill from the rest of the US communication network. Eight teeny glass fibers the size of a human hair -- how bad could it be?
The city of Morgan Hill and parts of three counties lost 911 service, cellular mobile telephone communications, land-line telephone, DSL internet and private networks, central station fire and burglar alarms, ATMs, credit card terminals, and monitoring of critical utilities. In addition, resources that should not have failed, like the local hospital's internal computer network, proved to be dependent on external resources, leaving the hospital with a "paper system" for the day.

Although the author goes on to blame centralization, the fact that they had to hit four places suggests the network was reasonably redundant. The problem lies, in my opinion, with so many pieces of critical technology having so many physical and logical layers, each dependent on still lower layers, that analysis is nearly impossible. I'm sure a lot of planners figured that even if the telephone land-lines went out, the cell phones would still work. I'll bet the hospital didn't have their own DNS server and that was the piece that broke communications on their local network. The local workstations and servers were still connected, but they couldn't find each other without a DNS server to tell them which unique address went with which system name.
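
To make that dependency concrete, here is a small Python sketch of a name lookup that loses its DNS server. Everything specific in it is hypothetical: the host name, the address, the `broken_dns` stand-in and the locally kept fallback table are illustrations only, not anything the hospital actually ran.

```python
import socket

# Hypothetical static table of the kind an /etc/hosts file provides.
# The name and address are made up for illustration.
LOCAL_HOSTS = {"records.hospital.example": "10.1.2.3"}


def resolve(name, lookup=socket.gethostbyname, local_hosts=LOCAL_HOSTS):
    """Return an IPv4 address for name, or None if nothing can resolve it."""
    try:
        return lookup(name)           # normal path: ask the resolver / DNS
    except OSError:                   # socket.gaierror is a subclass of OSError
        return local_hosts.get(name)  # fallback: locally kept table


def broken_dns(name):
    """Stand-in for a lookup when the DNS server is unreachable."""
    raise socket.gaierror("simulated: DNS server unreachable")


print(resolve("records.hospital.example", lookup=broken_dns))
print(resolve("unknown.hospital.example", lookup=broken_dns))
```

The machine behind the address is still on the wire in both cases. Only the name-to-address step fails, and a system with no local table has nowhere to go once that step is gone.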

If I had to guess, I'd say that between 1 and 2 billion USD was spent on Y2K compliance across the US. After all the studies, it looks like the fallout might have topped 50M, but we didn't know until we did the studies. We all hated the Y2K process because it was tedious and mandated by bureaucrats. But as IT services become more embedded in critical services***, we need to dust off and reuse these skills to dig down to the bare ground when analysing failure scenarios. If you have badge access to the server room (and who doesn't) and the whole city block loses power, will you be able to physically get to the server room to shut down systems? Security will probably be there to let people in the front door, but how about getting to your floor? How about the lobby doors, or the door to the server room? Sure, you have UPS for the critical systems, but it can take 4-5 hours, or sometimes days, to fix or replace a blown transformer for a downtown block.

--=={{}}==--

* - 7 blades * 2 sockets * 4 (quad) cores = 56 cores; 32GB / (2 sockets * 4 cores) = 4GB per core
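
For anyone who wants to check the footnote arithmetic, here is a minimal Python sketch using the figures from the post. The `/24` network address is an arbitrary example standing in for "a Class-C subnet", and 60 VMs per blade is a floor, since the post only says "over 60".

```python
import ipaddress

BLADES = 7
SOCKETS_PER_BLADE = 2
CORES_PER_SOCKET = 4
RAM_GB_PER_BLADE = 32
VMS_PER_BLADE = 60  # the post says "over 60", so treat this as a floor

cores = BLADES * SOCKETS_PER_BLADE * CORES_PER_SOCKET
ram_per_core_gb = RAM_GB_PER_BLADE / (SOCKETS_PER_BLADE * CORES_PER_SOCKET)
vms = BLADES * VMS_PER_BLADE

# Usable host addresses in a Class-C sized (/24) network; the network
# address itself is an arbitrary example.
class_c_hosts = ipaddress.ip_network("192.168.0.0/24").num_addresses - 2

print(cores, ram_per_core_gb, vms, class_c_hosts)
```

At 60 VMs per blade the seven blades hold 420 machines, comfortably more than the 254 usable addresses in a /24, which is why filling a Class-C from a quarter-rack is plausible.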

** - "The Seven Blades" sounds like a Japanese martial arts film, doesn't it?

*** - It's worth mentioning that I live in a little town that gets its water from an impressive well-head which has a dedicated backup generator in case of power loss, and even then the storage will run the town for about 3 days before people at higher elevations go dry.

Monday, July 20, 2009

Flyboy Conceit

Whenever I get the chance, I watch birds making landings. It's not as instructive as watching airplanes landing, but there are a lot more birds landing and I might catch one anytime -- my favorite is watching them out the window during our weekly dept meeting :)
http://www.betterphoto.com/uploads/processed/0013/0305032143261blue_bird_landing.jpg

Now in some ways, it's not a fair comparison. They have far less mass and much more variation in their wing configuration -- they also have to stop with next to zero ground speed.

A notable exception is birds making a water landing. I had the pleasure of watching some Canada geese coming in to land on a pond recently. They had their feet out in front (gear down) and wings shaped so that they looked like an inverted 'w' (full flaps), gliding in on final. One had obviously misjudged the headwinds and had to clean up and even add a little power for a few seconds at about 20' AGL. I watched and thought, I've done that.
http://www.dannybrown.co.uk/Goose%20Landing.jpg
That isn't the first time I've seen a bird have to scramble for their landing. I find it reassuring that these creatures who make dozens of landings every day of their lives still struggle looking for the perfect landing. Now if only I could see my CFI porpoise ...

Another recent insight while observing feathered pilots came from a finch who decided he needed to be looking the other way while perched on the very top branch of a young tree. Birds' feet are made for perching and landing, not for rotating 180° to turn around on a narrow branch. So he flapped a couple of times, climbed almost straight up, waited for the stall (probably 5' or so up), then at the peak he dropped one wing, spun a half circle and dropped in to land on that branch facing the other way. Yes, I was jealous.

First I thought, given his extreme acceleration, it might be a useful trick for a fighter to quickly change direction. Then, while reading about the required maneuvers for a Commercial license, I found out that what that bird did was a very high-power, precision Chandelle. I wasn't taught that one for my Private, but I hope it's part of the curriculum for pilots in the Rockies and near the canyons in the southwest. It looks like a great move to escape getting boxed in by terrain.

Wednesday, July 1, 2009

Flight Sims revisited a.k.a. FSX vs. X-Plane

For flight sim'ers, the argument of Microsoft's Flight Simulator X versus X-Plane is like high-wing vs. low-wing for pilots or vi vs. emacs for Unix geeks. If this were some popular blog with hundreds of readers, the comments section on the previous flight sim post would start out with a few well-meaning corrections, then laundry lists of the advantages of one over the other, eventually descending to acrimonious discussions of the ancestry and sexual proclivities of opposing posters.

Since it's just me, the things I read and the people I talk to, you don't get the encyclopedic lists of features, but I like to think the signal-to-noise ratio is higher and there is much less chance of a Jerry Springer-like devolution.

A more accurate metaphor for FSX vs. X-Plane is CD versus vinyl record. FSX is CDs -- ubiquitous, easy to use and pretty darn good. X-Plane is vinyl records -- better fidelity at the cost of a more complex setup and not nearly as easy to use. To extend the metaphor to nearly its breaking point, the various iPods and other mp3 players would equate to the Google Earth Flight Simulator.

The X-Plane flyers will generally have more money in equipment, like yokes, multiple monitors and rudder controls instead of tube amplifiers and subwoofers. But here's the hidden truth of the metaphor: the people who are really serious audiophiles will have that turntable AND a CD player. Similarly, there is nothing wrong with having FSX and X-Plane installed at the same time. The extra equipment will work with both unless you want to run both programs at the same time*. Getting the other flight sim costs about as much as renting a plane for half an hour.

Assuming your airfoils are correct, X-Plane will automatically figure out how the craft will fly, taking into account angle of attack, air density, and other factors. If you're looking to learn to fly a specific airplane, be it a 747 or the RV-6 you're building in the garage, X-Plane is your baby. Your control is far more granular, and you can even have it display critical values on the screen in mid-flight so you can see the breaking point where the lift drops off rapidly.

FSX keeps the model design and flight characteristics separate -- it's fairly easy to modify a 747 to give it the flight characteristics of a sailplane. On the other hand, if you're working on procedures, FSX gives you a much better interface for practicing the missed approach and talking to ATC, or learning how to fly a WAAS approach using the G1000. The simulated planes don't always respond exactly like a real one will in flight, and the instruments don't update as smoothly as in X-Plane, but you can actually use the glass-cockpit interface to enter a flight plan and fly it while talking to ATC. If you're looking at getting your Instrument rating or higher and are a good VFR pilot, FSX will probably give you more.

FSX is also often dismissed as being all about the pretty graphics, but we are visual creatures. I'll admit animated water fountains on the Las Vegas Strip are simply fluff; however, the terminal buildings at Burlington airport (BTV) are an important visual cue for telling which side has the commercial terminal and FBO and which side has the ANG and all those F-16s and guys with assault rifles who aren't keen on trespassers. In most cases, FSX aircraft are more detailed and polished, down to the little switches, knobs and levers in the cockpit, which actually do something and are legibly labeled.

To be fair, though, I took screen shots of the cockpit of the most basic, common craft I could think of, the Cessna 172, in both programs running at 1280x1024. As expected, FSX had the better exterior scenery, but its cockpit interior had bigger instruments that were not nearly as sharp as those in X-Plane. It turns out that a standard radio in FSX runs 235x90 pixels, while X-Plane manages it in 172x56.

If you want to get really serious about using an accurate G1000, GNS 430 or 530, Reality XP offers packages that promise to integrate the official training simulators from Garmin with either X-Plane or FSX. Since the Garmin software simulators are free, there are other, free solutions, but they are kludgey and get complicated, like running the Garmin simulator on another computer.

If you've got a spare PC or display, there is a big advantage to running it on another screen. The displays just aren't big enough yet to see the whole panel and the outside at once. The graphics cards are getting closer all the time, but the screens themselves just aren't getting big and dense enough to let a sim player really see the detail needed for operating a GPS without fudging the size up quite a bit. I would guess you're looking at a 36-inch screen (which they make now) and something like 4000x2250 resolution (which is still a ways off).

The GNS 430W has a 3"w x 2"h display with 240x128 pixels. The 530 is 4"w x 3"h (exactly twice the area) with 320x240 pixels. I mentioned that there are free simulators from Garmin for the 530 and 430. I checked, and the "screen sizes" in the simulations match those pixel dimensions exactly.
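
To put those display figures on a common footing, here is a minimal Python sketch that turns them into pixels per inch. It uses only the sizes and pixel counts quoted in this post; the function names are just illustrative, and the 36-inch screen is the guess from the previous paragraph, not a real product.

```python
import math


def ppi(width_px, height_px, width_in, height_in):
    """Return (horizontal, vertical) pixels per inch for a display."""
    return width_px / width_in, height_px / height_in


def diagonal_ppi(width_px, height_px, diagonal_in):
    """Pixels per inch along the diagonal, as screens are usually quoted."""
    return math.hypot(width_px, height_px) / diagonal_in


gns_430 = ppi(240, 128, 3, 2)
gns_530 = ppi(320, 240, 4, 3)
big_screen = diagonal_ppi(4000, 2250, 36)

print(f"GNS 430: {gns_430[0]:.0f} x {gns_430[1]:.0f} ppi")
print(f"GNS 530: {gns_530[0]:.0f} x {gns_530[1]:.0f} ppi")
print(f'36" at 4000x2250: {big_screen:.0f} ppi')
```

With these numbers the two Garmin units come out at roughly 64 to 80 pixels per inch, while the guessed-at 36-inch screen would need roughly 127.
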
Here are "life-sized**" captures of the 530 and 430:


Here's the 530 reduced to fit the FSX cockpit:
http://www.valknot.net/fs_revisit_pics/gns_530w-for-fsx.jpg
And even smaller for X-Plane:
http://www.valknot.net/fs_revisit_pics/gns_530w-for-fx.jpg
[For those who are curious, here are the links to see the 530's pasted into the cockpit captures for FSX and X-Plane.]

--=={{}}==--

* - Now I'm going to have to try that. Since X-Plane starts in a window instead of full-screen, I'll start off with my C 172 on the ground at my home airport and see if I can take off simultaneously in both programs, and how far and how fast they drift apart.

** - Actual size depends on the dpi of your screen, but the displays in the screen captures have the same number of dots as the actual instruments do in real life. I suspect the proportions of the 530's frame and buttons were fudged a little bit in the simulator to make it easier to hit the right buttons and turn the right knobs, since the 530 and the 430 are exactly the same width for the whole unit.
