Wednesday, December 9, 2009

I Ran Into Calculus One Morning

I've heard the statement "You'll never use more than 10th grade math in your real life" or a similar statement often. For the most part it's true, but there are exceptions to just about every rule. This is my story of one of those exceptions. If you're not a geek, you probably want to stop reading now.

In my last job, one other guy and I had a task to harvest security data from approximately 500 servers, weed out some initial garbage data, and create one report for each server containing relevant information each month. As this was not our only responsibility, we had to approach the problem with a programmatic solution, or in other words we had to write a program to harvest the data for us. We did this, and the first time it ran, it took just under 7 days to finish, running non-stop. As this was a month-end problem, starting it on the first of the month and waiting a week to get started on the analysis portion of the task was unacceptable, so we had to find another way to do this.

We were talking in one of our managers' offices and just having a brainstorming session when it came upon me that we could emulate clustering if we did it the right way. Clustering, simply put, is using multiple computers at the same time to do work faster. Before, we had been accomplishing the harvesting serially, but now we had the power of parallel processing. There were some unique challenges as we were only allowed to use DOS batch scripting on these servers, but we did get it to work. We also noticed that each session used so little processor power (both clock time and memory) that we were able to run multiple sessions of the program on one server, so we didn't have to have tons of resources tied up while this was running either. Incidentally, the way we solved the clustering problem also introduced collision protection as a freebie, so that meant that no two sessions would be doing the same work or get hung up trying to do the same work. We started off 10 sessions of the program the first time we tried it and voilà, the task was done overnight.

This was a good solution to our problem, but we wanted to get some more information on what was going on, and the next month we had added logging that checked not only when each server began and ended, but also when the whole process began and ended. We bumped up to 20 sessions on 2 servers and on our second run were done in about 4 hours. This meant that we could actually get started on the other work we had to do on the same day that we ran our harvesting program if we wanted to, but since we wanted to see just how fast we could make it, the next month we pushed it up a bit more.

On our third attempt, we ran 40 sessions on two servers, and the task was completed in about 2 1/2 hours. On first look this might seem just about right, as we doubled the sessions and it finished in about half of the time, but we were wondering exactly what was going on. The last time we doubled the sessions, from 10 to 20, we had over a 60% gain in efficiency, but this time we only had a 37.5% gain. The 60% came from adding 10 sessions and the 37.5% from adding 20, so if you divide the numbers out, each additional session added a 6% efficiency jump on our second run, but only a 1.875% jump on the third. How did we get both 6% and 1.875% results by doing the same thing? For a bit we were stymied.

We decided to test what was going on in our fourth run. So that we'd have more data to analyze, we ramped up to 80 sessions on 8 servers and the task completed in 2 hours. That's only 1/2 an hour faster, or 20%, spread out across 40 additional sessions, so each one added only 0.5% efficiency. We went down from 6% to 1.875% then to 0.5%? As Joe (my scripting partner) and I were talking about this over a coffee, I had an epiphany and realized that I had seen something like this once in a Calculus class I'd taken way back in 1994, so I looked it up and sure enough, we'd managed to bump into a Calculus principle without realizing it.
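For anyone who wants to check the division, here is a minimal Python sketch of the arithmetic above. The function name is made up for illustration, and only the two steps with stated run times are recomputed; the jump from 10 to 20 sessions is left out because the 10-session run is only described as finishing overnight.

```python
def gain_per_added_session(hours_before, hours_after, sessions_added):
    """Return (total gain in %, gain in % per added session).

    Hypothetical helper: it mirrors the post's arithmetic, dividing the
    percentage of time saved by the number of sessions added.
    """
    total = (hours_before - hours_after) / hours_before * 100
    return total, total / sessions_added


# Run times and session counts as stated in the post.
runs = [
    ("20 -> 40 sessions", 4.0, 2.5, 20),
    ("40 -> 80 sessions", 2.5, 2.0, 40),
]

for label, before, after, added in runs:
    total, per_session = gain_per_added_session(before, after, added)
    print(f"{label}: {total:.1f}% total, {per_session:.3f}% per added session")
```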

The particular bit of Calculus that was hindering us was something called "The Limit of a Function" which looks at a function containing a variable and sees what happens to the function as the variable's value changes. What we had was a finite amount of work that we were dividing between a variable amount of help. It doesn't matter if I count the work as 1 task or as 500 servers, the principle still applies, but our function could be looked at as either 1/x or 500/x and over the four months x moved from 1 to eventually 80. The notation x=1-->80 is similar to how it looks in Calculus. Let's look at the problem with small numbers first and see if we can see how this was working.
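Written out in symbols, this is roughly the idea. The names W (the fixed amount of work) and T(x) (the run time with x sessions) are my own shorthand for this sketch, and it assumes the work splits perfectly evenly, which real servers never quite manage.

```latex
T(x) = \frac{W}{x}, \qquad \lim_{x \to \infty} \frac{W}{x} = 0,
\qquad
T(x-1) - T(x) = \frac{W}{x-1} - \frac{W}{x} = \frac{W}{x(x-1)}
```

The last expression is the time saved by adding session number x, and it shrinks roughly like 1/x², which is why each doubling bought us less than the one before.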

If we go from 1 to 2 sessions, we should finish in 50% of the time or if you subtract that from 100%, which was how much of the time it was at 1 session, we have a 50% time savings. When we go from 2 to 3 though, we should finish in 33.3% of the time, which when we subtract that from 50%, we get a 16.7% time savings. From 3 to 4 we get an 8.3% savings and from 4 to 5 we are down to 5%. I've created a chart to show you just how fast this number drops off from there. By the time you're at 10 sessions, that session only saved about 1% and at 20 it only saved 1/4%. The real-life variations in our timing came from the data on each server being a different size, and what we were collecting being a different size too, so those variables would explain any other discrepancies between our actual times and this function.
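The percentages above can be reproduced with a few lines of Python. This is only a sketch of the idealized 1/x model, where the work divides perfectly; the function name is hypothetical.

```python
from fractions import Fraction


def marginal_saving(x):
    """Share of the one-session run time saved by adding session number x.

    Idealized model: with x sessions the job takes 1/x of the original
    time, so the x-th session saves 1/(x-1) - 1/x.
    """
    return Fraction(1, x - 1) - Fraction(1, x)


for x in (2, 3, 4, 5, 10, 20):
    pct = float(marginal_saving(x)) * 100
    print(f"session {x:>2}: saves {pct:.2f}% of the one-session time")
```

Exact fractions are used so that rounding only happens when the number is printed.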

Another thing we noticed was that about 1% of our servers were taking up to 2 hours each to complete. This was due to them being hard to access as they were in South America and on slower lines, but since they had to be done for the whole task to be complete, we always had to wait for them. Since we sorted alphabetically and these particular servers' names started with either a B or C, we only had to get through the beginning of the C's to get them started in our initial launch, which was about 25 sessions. If we started fewer sessions than that, we had to wait for some of the sessions to go on to their next servers before these would get picked up, but even if we waited, the difference was negligible.

These five or so servers would take a couple hours no matter what you did, so after this we focused on finishing most of the work efficiently. We had a queue of servers to get to, and as long as there was a server in the queue, each session would stay alive, but when it was empty, each session would terminate when it completed the server it was on. We could push up our first termination by increasing sessions, but we couldn't actually finish the whole task faster by doing it. Since it wasn't really to our benefit to finish part of the task faster and not all of the task faster, what we did was try to bring these two numbers together by reducing sessions so that the queue was emptying at about the same time as the slow servers were finishing, thus balancing optimal efficiency with use of resources. What we determined was that somewhere around 30 sessions was our sweet spot, where the task was evenly spread across as many sessions as we could get and all the sessions were terminating as near to the end of the task as we could get.
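To make the queue behavior concrete, here is a toy Python simulation. It is not our actual batch script, and the durations are invented purely for illustration: 500 servers at 7 minutes each, except five slow ones at 120 minutes placed among the first 25 in the queue. Every name in it is hypothetical.

```python
import heapq

# Invented workload: 500 servers, 7 minutes each, except five slow
# servers (120 minutes) sitting early in the alphabetical queue.
SLOW_POSITIONS = {4, 9, 14, 19, 24}
DURATIONS = [120.0 if i in SLOW_POSITIONS else 7.0 for i in range(500)]


def simulate(durations, sessions):
    """Sessions pull servers from a shared queue, in order.

    Returns (first_exit, finish) in minutes: when the first working
    session terminates, and when the whole task is done.
    """
    # Each heap entry is the time at which one session becomes free.
    # Sessions beyond the number of servers would have nothing to do.
    free_at = [0.0] * min(sessions, len(durations))
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest free session takes the next server
        heapq.heappush(free_at, start + d)  # and is busy until it completes it
    return min(free_at), max(free_at)


for n in (10, 20, 30, 40, 80, 100):
    first_exit, finish = simulate(DURATIONS, n)
    print(f"{n:>3} sessions: first exit at {first_exit:.0f} min, all done at {finish:.0f} min")
```

With these made-up numbers, adding sessions keeps pulling the first exit earlier, but the total finish time can never drop below the 120 minutes the slowest server needs, which is the same wall we kept hitting.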

So what we determined was that if we added more than 30 sessions to our cluster, the overall result wasn't significant. 100 sessions couldn't really do all the work any faster than 30 could, even though that just seems illogical, but the operative word is all. What was the point of trying to squeak out a fraction of a percentage point of efficiency by adding more sessions? What was humorous about this situation was trying to explain this concept to our managers who never took an advanced math class in their life, and believed with utmost faith that they would never need to use more than 10th grade math in their real life. Fortunately some of the managers just trusted us, and the speed jump from 7 days to a couple hours was enough for them, even if they couldn't figure out just why we could never seem to finish the task in less than 2 hours.

Saturday, October 24, 2009

Just a thought

When an actress says the name Johnny in a movie, why does it invariably turn out creepy?

Sunday, October 11, 2009

Opinions or Stories

I love to write. I've always known I was meant to write, though only recently have I been producing anything of good quality. I've written in the past too, but mainly exposition, not anything narrative, or if I did write any narrative, I never completed it. OK, whatever you might say, and I hear you, but I want to make a point that writing exposition and writing narrative are worlds apart.

I've written more opinion this year than narrative by a lot. So far this year I've completed 5 short stories and started maybe 10 more. Still, I've written about 60 blog entries (not all have posted yet) in the same time with much less work and in just a mere fraction of the time. What I have discovered is that writing opinion is easy. You don't have to have much of a plan of what you're going to write or where you're going, and on top of it, when you feel like stopping, you're done.

Narrative is a different beast. You have to begin with the end in mind. Know where you're going. Keep an eye on the story and make sure that you are being consistent. There is much more work that goes into writing a story. You have re-writes and editing to take care of. And even after your first and second drafts you realize that you missed an important part of the story, so you fit it in, and then you have to do one more proofread and edit again.

I'm not trying to make expository authors seem like they are taking the easy way out, but I have a sneaking suspicion that while many narrative writers could easily pick up exposition, not many exposition writers could as easily slip into narrative writing. Maybe I'm wrong, but I don't think so.

Thursday, October 8, 2009

Writer's Block

I get writer's block at the strangest times. I just finished a long blog entry and a short story yesterday and with the feeling of accomplishment of both of those tasks, I do not feel like writing much of anything. This isn't because I don't want to write, but because it feels like my creativity is sapped from my last project, or projects in this case. It's one of my greatest frustrations as a writer.

Oh well, maybe it won't last too long. I'd sure like to write a few more stories before the end of this year. Strangely enough, I think I got a title for one of my stories while trying to research the title of another of my stories.

Monday, February 2, 2009

Sharing

I've been considering the whole concept of sharing recently. Everybody considers sharing a selfless act, but in my introspection I'm beginning to think differently. I'm thinking about a very fine line between giving and sharing, so you could say I'm not quantifying all of sharing, just some of it, and in the case I'm speaking of, it could very well be subconscious, and the person may not really be aware of it. Let's dissect sharing.

Let's say that someone shares food with another person. In this simple case the person sharing the food could potentially benefit by means of eventual reciprocation and may be subconsciously conditioned to do so guessing that they will eventually be rewarded for it. My guess is that the person sharing food is likely to do so most when they have more than they need or want, and the likelihood of someone sharing goes down as the excess of what is to be shared goes down. If there is no excess, or even a deficit, sharing becomes just that much less likely.

Maybe there are two kinds of sharing. One where you have more than you need, and you are subconsciously giving away what you don't need for a chance to fulfill a future need, and one where you consciously transcend your normal instinct of self-preservation and you share to your own detriment. The second type seems more noble.

Tuesday, January 6, 2009

Barack Obama

I'm going to make a statement in this post that someone is sure to take the wrong way. This being the case, I'm going to add a disclaimer to this post, not because I'm worried about being offensive, but because I don't want to be misunderstood. I believe that the points I'm going to make are valid and well thought out, but I know that there are people who just won't understand what I'm going to say. I'm going to make some comments that could be taken as racist, but if you actually read them, you will see I'm being anything but.

On January 20th, Barack Obama will become the 44th President of the United States. Mr. Obama will no doubt begin his administration as one of the most popular Presidents of all time. However, I'm thinking more about Mr. Obama's eventual legacy; how will we look back on Mr. Obama's presidency ten, twenty or even fifty years from now? If the media coverage during his campaign, and President-elect interim could be considered an accurate prognostication of how he will be remembered, it is simply as the first black American President.

I've been considering this a lot over the last few weeks, and I've come to the conclusion that I feel sorry for Barack Obama. For some reason that I'm not really capable of understanding, the man's skin color seems to be more important than the impact he will have on the country. It might not matter in twenty years what the man does while in office at all.

Great American Presidents are remembered for all sorts of things. Abraham Lincoln is remembered primarily for his Emancipation Proclamation. John F. Kennedy is remembered both as a liberal icon and for his philandering ways. George Washington, John Adams and Thomas Jefferson are all remembered as founding fathers of this nation because of their varied dedication and sacrifice for this great nation. Franklin Roosevelt is remembered both for the social programs that he helped institute to help end the Great Depression and for his role in helping bring WWII to a close. More recently, Ronald Reagan is remembered as the great communicator, and Bill Clinton is remembered both for his economic policy and for the sex scandal that surrounded his administration. Theodore Roosevelt, Dwight Eisenhower, James Madison, William McKinley, Woodrow Wilson; I could go on and list what these Presidents have been remembered for, and while I'll agree that the fact the United States would elect a black man as their president says a lot about the state of the country and the progress its people have made, it doesn't say anything about the man. Being black is just something he is, nothing more.

Will Barack Obama's presidency be one of economic growth? Will he lend the auto companies what they need to survive and grow? Will he pull America out of a war that practically no one wants any more? Will we be better off in four or eight years, or much worse? It's possible that the man will be an inept leader not capable of leading a nation out of a wet paper bag. Only time will tell how good or poor of a President Barack Obama will be. However when it's all said and done, will the nation in its own stupidity simply judge the man not on his actions, but instead on the amount of melanin in his skin? Is it possible that somehow the fact that he's America's first black president will be pushed through some PC spin that automatically equates black with good? How sad is that. He might be our greatest President ever. He might be the worst. Will he really be remembered not as good or bad, but instead as black?

There are other aspects of making too much of a deal of the man's skin color too. A friend of mine recently voiced the thought that we just might have a president who you can't disagree with or criticize without it becoming a racial issue. I'm not even talking about speaking offensively or in a bigoted way against the man himself. Is it possible that someone will really think that if I make a statement that I disagree with Mr. Obama's economic policy that that becomes a racial issue just because our President is black? I know that sounds extreme, but there are people who I have met and had conversations with who are so extremely simple and ignorant, that this is all that they will see. Just because I'm white, if I disagree with something Barack Obama does, someone will think I disagree with him because he's black.

Another thing I've heard about that is racist is also at stake in this presidency. If Mr. Obama turns out to be either a poor, or even just a mediocre President, there are some people who will think "I knew we shouldn't have elected a black man to the Presidency". If this happens, it's possible that a mediocre presidency could be spun into a good one because of the historic importance of Mr. Obama as the first black President. Barack Obama is already being touted as a representative of the black race by the media. Why can't Barack Obama just be an American and a man, and why don't we just judge him as a man?

Many Americans think that President Bush is an idiot right now, but I've never heard that it was because he was white. I don't remember anyone thinking that the Berlin wall came down because Ronald Reagan was white. America didn't rise to Kennedy's challenge to reach for the moon simply because of his complexion. Our founding fathers didn't fight against the British because they were white. For Obama though, America's opinions of the man are already being shaped because he's black. This was demonstrated to me while listening to a morning radio host ask a caller during the primaries last year whether she was going to vote for Hillary Clinton or Barack Obama. The caller stated that she hadn't figured out whether being black or a woman was more important to her, and that she was going to vote for the candidate who she could identify with more based on which characteristic she identified more in herself. Simply stated, she couldn't decide between skin pigmentation or genitals. I assume that if Hillary had darker skin, or if Barack Obama had lighter skin, her decision would be easier. How sick is that? Why can't you state that Barack Obama more closely matches my political views, or even just say I think Hillary would do a better job? Incidentally I would be writing a post about Hillary Clinton today if she had won the Presidency because there would be many of the same issues surrounding her Presidency revolving around her gender.

How did we get so skewed that we can't just accept Barack Obama just as our President? Why do we have to modify it by adding an adjective and make him our black President? Either he will do a good job, or he won't, and being black won't have anything to do with it. I hope that he turns out to be a good President, one that America can be proud of based on his actions and decisions. I hope that nothing untoward happens to either him or his family. I don't agree with everything he stands for and I didn't vote for him, but I still think he's a good man, and I hope he does a good job for America.