The Cloud Conversation No One Is Happy Having

Don’t kid yourself, it’s the money

We live in a wondrous age of technological sophistication. Even in this age of wonders, the fact that you can do a thing does not necessitate that you should. Nowhere is this thinking more appropriate than with cloud computing. Cloud computing is an excellent idea, born of a convergence of outstanding technologies, but that does not mean everyone should rush out and put themselves in the cloud, no matter how many vendors tell you it is a good idea. I am astounded by the number of vendors who talk about the inevitability of our move to cloud computing: it is unstoppable and inescapable, they say, so you had better embrace it.

The operative word here is vendors. They are selling something. They want you to buy it. So they are certainly going to tell you about every amazing asset and attribute while giving you as little as possible of the information you need to make an informed decision. Now let’s be real about one thing. No one but IT people likes to talk about IT. In most cases, only IT people have any true understanding of how their technology works. Even then, it is their experience, the thoroughness of their training, the size of the mistakes they have made during their careers and their ability to focus on the issues at hand that truly prepare them for the complexity inherent in their job, both the human issues and the technological ones.

So couple a vendor’s need to sell you IT with the overall complexity of IT and you have the perfect storm that is “Cloud Computing”: a technology that promises you will not have to keep IT people in house, confusing you with their jargon and with expensive toys you do not understand and they are unable to do without. Cloud computing will move your IT needs to a remote location that is backed up, redundant and staffed by the best IT people on Earth, in a site that has power and back-up power, is surrounded by fifty feet of solid bedrock, and is cooled to an ideal temperature of 58 degrees by being three thousand feet underground. Nothing short of a nuclear device will even affect this site, because a Service Level Agreement (SLA) says so. So give us your money, and your data, and we will take care of the rest. [1][13]

Technology simplifies life, doesn’t it?

It is amazing to me how often I hear about the technology that is available for the cloud today, and how many different flavors of cloud computing you can have: Infrastructure-as-a-Service, Platform-as-a-Service and Software-as-a-Service. I am waiting for Ice-cream-as-a-Service; then I will know the cloud is truly ready. What strikes me as strange is how often people claim this is a technology we should be putting in place to support our businesses, our lifestyles and our way of doing work in the future. The people supporting this technology are always fond of talking about how robust it is and how nothing can possibly go wrong, so it would be okay to place our most important data into the cloud, now.

Contrary to what is commonly believed, it IS possible for a technology to have existed for a long time and still not be mature. Longevity should not be mistaken for maturity. The Internet has existed for quite a long time now and is still evolving, looking almost nothing like it did as little as fifteen years ago. I would call the Internet a perfect example of a long-lived but still maturing infrastructure technology. The fact that there are highly available clustered servers, clustered storage subsystems and redundant networking infrastructure did not stop what I call the Great Amazon Cloudburst from occurring. Surely no one can say there was insufficient access to great technology and highly skilled technical staff at Amazon’s Elastic Compute Cloud service centers. Yet the functionality of their clients’ websites was lost for tens of thousands of people for at least four days. All of the support and technology was available and yet did nothing to mitigate the loss of service. Read their explanation of the event for yourself and see if you can understand what the problem was. I am betting that if you are not an IT person, you will not have a clue. Reading it will not make you any happier. [2][7]

In the common fashion of this nation, the event is brushed under the rug and everyone returns to business as usual: the promotion of a service that is still too complex and growing more complex every day. The ever-increasing complexity of the systems in question has begun to plague all of our networks, no matter where they are. Our phone systems have already begun to show signs of being overwhelmed by the complexity of the systems required to control them. Upgrade servers, remote access servers and game servers are all plagued by the increasing complexity of legitimate networks. This does not even take into account illegal, illegitimate or poorly configured networks, such as botnets, which affect users all over the globe. [3][4]

Security isn’t a problem anymore for the Cloud

Pundits and supporters of the cloud say it’s secure, even though we see news every day about the latest internet security breach at one company or another. Can you say the PlayStation Network and its estimated 70 million lost IDs? These breaches grow larger, and more data is gathered with each assault. I read a dozen IT trade magazines and see new security breaches that rarely make the evening news. I suspect that many more companies have security issues but do not voice them, giving people a false sense of security. There is recent news indicating the PlayStation Network attack may have originated from an Amazon cloud server environment. [12]

The truth of the matter is that companies ARE loath to let you know how often their systems are penetrated by any number of methods, including inside attackers, social engineering, system failures and intrusion by outside attackers. If you knew, you might be less willing to put your personal information out there so easily. You might not be willing to bank online, or shop online, or do all of the things our society has convinced you that you can no longer live without because they create profit for vendors, banks and finance agencies. This is about money, make no mistake. If you want to see information regarding security breaches, you can look in The Register, a UK tech trade publication. Symantec mentions in its own literature the increasing need for security software, a market potentially worth billions in the coming decade. [5][6]

The technology vendors want to tell you they are increasing reliability by adding virtual servers, virtual workstations and hypervisors, which allow them to restore services more easily after they are lost, or to scale them for companies that are growing and need ever-increasing performance. But the real reason this technology is being created and promoted is to send work, and the systems required to do it, overseas, reducing the need for internal IT infrastructure. No, I cannot substantiate this. It is my impression of the industry and of how outsourcing has continued to dominate the landscape. How I arrived at that is another debate, but work with that premise and consider the following.

Outsourcing Considerations

We’re back to money again. Outsourcing for the win!

Corporations are outsourcing their services in record numbers. Human resources departments, finance services, manufacturing inventory, company records, databases and now IT services are all being moved into outsourcing models to reduce costs for companies everywhere. But a question begs to be asked. If, after outsourcing these services, we also store the company’s data in the cloud, what we have said is that in the event of a catastrophic emergency, even with a well-provisioned, well-equipped, highly trained service provider, a company will have limited to no access to its records, its databases, its human resource information, its healthcare records, its finance information or any of its IT infrastructure, including its virtual workstations, virtual domain services and virtual telecom systems.

And I know, you are all thinking this could never happen. But remember the Northeast Blackout of 2003, which left a large portion of the Eastern seaboard without electrical power and affected 45 million people in at least 8 US states. During this time, cellular, cable and telephone services were disrupted, and the internet services of Advance Publications went offline, affecting three online news services for days. The nation was reduced to using amateur radios to pass emergency communications. [8][9]

The issue I worry about most is how the nation will perform when hundreds or thousands of corporations are sharing the same series of servers and lose their infrastructure in a foreign land that is hit by a quake or tsunami or monsoon or any of a number of other catastrophic events outside of human control. What will your corporation do when it has placed more than fifty percent of its manpower and resources in virtual form and cannot access them for a day, a week or a month? Can your business survive when all of its vital support services are out of reach? You will lose the ability to make even a simple phone call if your virtual network includes your digital telecom and voice mail services. Virtual domain services? Good luck connecting to your email, voicemail, VPN, SharePoint, file servers or clustered data services, even those you may have kept locally.

And yes, some of you are saying there are failover technologies in place that allow redundant services to pick up the slack in an emergency. Amazon said that too. In hindsight, they were unhappy that their failover technology did not operate as expected.

Did you test that? Prove it.

This is the kicker. Once you start aggregating clients into your environment, you come across a curious dynamic: how do you test your environment for failover to be sure it actually works? Anyone who has worked in an IT environment remembers how difficult it is to test failover for your domain servers or your remote email servers when it is just YOUR company you have to deal with, spread across two or three time zones, and you want to see whether your email services will fail over to that redundant server cluster in Singapore you are paying a prince’s wages for. There, you are only affecting a few thousand people’s ability to work whenever they want. What happens when you are a cloud provider with five million different people scattered across the world and you want to test your failover services? Someone, somewhere WILL be inconvenienced. But it will need to be done, because if Amazon had run this test, they would have known something was incomplete in their process and could have pre-empted the problem that took them four days to correct.

This is not about painting a picture of worst-case scenarios. But someone should ask the question: what DO we do in the event of a major failure, when tens of thousands of companies have moved to this sort of infrastructure service and are unable to simply walk away because too much of their company is invested in that provider? Companies promise they will always maintain a certain level of performance, but history has shown that as companies grow, their ability to maintain the agreed-upon level of performance suffers. Dare I mention AT&T, Comcast, AOL, Time Warner and Enron, just to pick a bunch out of a hat?

What we are really saying is that, ultimately, we are prepared to risk our entire livelihood on the development of cloud technology, and we are doing our damnedest to get everyone possible to sign on with every cloud provider we can find, whether public or private. This means, in fact, that we are aggregating our businesses and their infrastructure into collective pots of provided services and depending on those services to be completely bulletproof, resistant to external hacking attacks and to penetration by unlawful persons. We are expecting it all to be perfectly configured, with thousands of companies sharing IPv4 and now the new IPv6 services, sharing switches, firewalls, servers, clustered resources and virtual environments, and of course to be done by the company that offers the lowest bid and claims to have the best-trained people in the industry. [10][11]

Single Point of Failure. You.

It seems a tall order to put the entire infrastructure of a nation into consolidated single points of failure without addressing what Plan B is supposed to look like, just in case Plan A should happen to go offline for a month or two. Plan A, of course, is the perfectly actualized, completely failure-proof infrastructure that we already have, which never fails when we need it most and isn’t staffed by overworked, under-rested, hypercaffeinated gearheads.

I am not a Luddite. I have worked in IT for thirty years at all levels. I have a healthy respect for technology and its seemingly supernatural ability to wait until it has the most people it can possibly have dependent on it and then, unerringly, to fail when you need it most. With that kind of perverse nature, do we really want our places of industry to be completely dependent on something that could simultaneously have us all conducting business on Post-its in our paperless offices? Can we find a happy medium that will not have the nation clutching its collective tuchas while we all wait for the same five or six cloud providers to figure out what went wrong today and how long America will be sitting in neutral until it can be fixed?

To quote SpongeBob SquarePants: Good luck with that.

Look through the references, read the articles, and if you still disagree, please feel free to comment right here. Don’t bring me some vendor talking about how wonderful the cloud is. Bring me a real reason we cannot live without the cloud, and tell me what we need to be doing to mitigate the things I have mentioned in this article and that are covered in the articles listed in the references. We need to address this sooner rather than later. I am sure vendors will read this, dismiss everything I say and tell you to do the same. When the next cloud outage affects one million people rather than the twenty thousand this outage did at Amazon’s EC2 center, remember who told you it was an inevitability. Remember, too, that I am not trying to sell you anything. That vendor can’t look you in the eye and say that. He has a vested interest in making sure you bite.


3 responses to “The Cloud Conversation No One Is Happy Having”

  1. This is indeed a thought-provoking article. I for one (a software developer) can see benefit in cloud computing. Ebonstorm has, however, raised a lot of very valid concerns, and I believe that today we have the technology for this “thing” called cloud computing to work. But we need more convincing evidence to reassure us, and I would caution consumers to tread lightly in this arena. The all-eggs-in-one-basket expression comes to mind…


