Saturday, November 15, 2008

Pay Forward Digital Assets Using Cloud Computing

It takes 2 months (60 DAYS) to move 1 terabyte of data over a T1.

Don't have that much data? It takes about a month (30 DAYS) to move 500 gigabytes. Only have 100 gigabytes? You should be able to pull it back in about a week (6 DAYS). These figures are for an uncompressed data stream, but as we all know the share of multimedia files (which are already at maximum compression) is growing rapidly, so your mileage will vary. The point being, if you are hoping that compression or de-duplication can fix it, well, that is a discussion for a later entry.
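For anyone who wants to check the math, here is a minimal sketch of the arithmetic. It assumes the nominal T1 rate of 1.544 Mbit/s, decimal units (1 GB = 10**9 bytes), and a fully utilized link with no protocol overhead; the function name is just for illustration.

```python
# Rough transfer-time arithmetic for a T1 line.
# Assumptions: nominal 1.544 Mbit/s, 1 GB = 10**9 bytes, no protocol
# overhead and no competing traffic, so real transfers will be slower.

T1_BITS_PER_SECOND = 1_544_000
SECONDS_PER_DAY = 86_400


def transfer_days(gigabytes, bits_per_second=T1_BITS_PER_SECOND):
    """Return the number of days needed to move `gigabytes` over the link."""
    total_bits = gigabytes * 10**9 * 8
    return total_bits / bits_per_second / SECONDS_PER_DAY


for size_gb in (1000, 500, 100):
    print(f"{size_gb:>5} GB: {transfer_days(size_gb):.1f} days")
```

Swap in your own link speed to see where you stand; the shape of the answer does not change much until the pipe gets a lot fatter.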

In any case, pretty much every Small to Medium Business (SMB) has eclipsed the 100 gigabyte mark in aggregate and should be mindful of this simple math. Unlike all of the dedicated resources that we can swap out to go faster or store more, bandwidth remains a shared resource and, critically, is only as fast as its slowest link. Someday we will all have unlimited fibre to the touchscreens on our refrigerators... someday.

In the meantime, SMBs have two specific needs: the first is online backup to an authoritative READ ONLY HISTORICAL repository to serve as a backstop for Disaster Recovery (and e-discovery), and the second is an ACTIVE replica of the most RECENT version for Business Continuity. The phrase "kill two birds with one stone" is a good one for the opportunity presented by cloud computing.

Most of us can agree on the need for an authoritative offsite repository to make sure that our data is safe no matter what happens on the ground. The reasons are easy to tick off, starting with physical threats and ending with deliberate malicious intent. More often than not, however, this "Disaster Recovery" (DR) solution is passive and idles in the background collecting data over time until you pull it back, usually when a problem occurs.

But what if that repository were dynamic? What if it satisfied not only the natural role of DR but could also "push" the critical data you might need for "Business Continuity" (BC) to one (or more) alternate platforms in waiting? That is what the "pay forward" in the title refers to: maintaining an offsite or online parallel working environment that is time shifted, or lagging the primary production environment, by a reasonable delta. There are a lot of solutions that achieve this goal using proprietary ecosystems and dedicated bandwidth but those can be expensive - and that is one of the biggest reasons why you might want to factor cloud computing into your solution.

Dramatically simplifying this concept of time shifting (which will be revisited in greater detail in later posts), we simply talk about real time production and a "latest snapshot" from delta changes. Let's assume that your production data is constantly in motion during its "day shift" from, say, 9 to 5, and you maintain a local near Continuous Data Protection (CDP) disk-to-disk capability with snapshots every 15 minutes to an hour. At 5:00 PM, most everybody powers off their systems (to conserve energy) and goes home.

This, however, is your data's "night shift" and the opportune time to post through the latest delta changes for the entire day to your DR repository in the clouds which, in this hyper-simplified explanation, will continue until around 1:00 AM.
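How much can the night shift actually carry? Here is a back-of-the-envelope sketch, assuming the same T1 line from the top of this post (1.544 Mbit/s nominal), an 8 hour window from 5:00 PM to 1:00 AM, decimal gigabytes, and no protocol overhead. The function name is illustrative only.

```python
# Capacity of a nightly transfer window over a T1.
# Assumptions: nominal 1.544 Mbit/s, 1 GB = 10**9 bytes, no overhead.

T1_BITS_PER_SECOND = 1_544_000


def window_capacity_gb(hours, bits_per_second=T1_BITS_PER_SECOND):
    """Return how many gigabytes fit through the link in `hours`."""
    bytes_per_second = bits_per_second / 8
    return bytes_per_second * hours * 3600 / 10**9


night_shift_hours = 8  # 5:00 PM to 1:00 AM
print(f"{window_capacity_gb(night_shift_hours):.2f} GB per night")
```

That works out to roughly five and a half gigabytes a night under these assumptions, which is why sending only the day's delta changes matters so much.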

After which your data's "graveyard shift" begins and the previous day's delta changes are relayed again to your terrestrial offsite or cloud computing online BC failover facilities. Critically, this data transmission is sequenced so that the most important data sets go first.
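The three shifts can be sketched as a simple lookup by time of day. The 9 to 5 and 1:00 AM boundaries come from the simplified schedule above; ending the graveyard shift at 9:00 AM is an assumption, as is the function name.

```python
from datetime import time


def data_shift(t):
    """Map a time of day to the data 'shift' in the simplified schedule."""
    if time(9, 0) <= t < time(17, 0):
        return "day"        # production plus local near-CDP snapshots
    if t >= time(17, 0) or t < time(1, 0):
        return "night"      # post the day's deltas to the cloud DR repository
    return "graveyard"      # relay deltas to BC failover (assumed to end at 9:00 AM)


for sample in (time(10, 30), time(18, 0), time(0, 30), time(3, 0)):
    print(sample.strftime("%H:%M"), data_shift(sample))
```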

Of course the order is open to interpretation; i.e., you might go offsite first and then online, depending on your specific needs. In addition, the solution you choose should be able to transmit on multiple threads simultaneously and give you the option to pick and choose subsets of your data for retransmission, much as we do with WarLock DR+.
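To make the sequencing idea concrete, here is a minimal sketch of priority ordering, multiple threads, and subset selection. This is not how WarLock DR+ is implemented; the data set names, priorities, and the `transmit` placeholder are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data sets as (name, priority); a lower number is more important.
DATA_SETS = [
    ("file-shares", 3),
    ("accounting-db", 1),
    ("email-store", 2),
    ("archive", 4),
]


def transmit(name):
    """Placeholder for the real transfer; returns a status string."""
    return f"sent {name}"


def relay(data_sets, only=None, threads=2):
    """Send data sets in priority order, optionally limited to a subset."""
    selected = [d for d in data_sets if only is None or d[0] in only]
    ordered = [name for name, _ in sorted(selected, key=lambda d: d[1])]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Work is queued in priority order and results come back in that
        # same order, although individual transfers may finish out of order.
        return list(pool.map(transmit, ordered))


print(relay(DATA_SETS))
print(relay(DATA_SETS, only={"email-store", "archive"}))
```

The second call shows the pick-and-choose case: only the named subset is retransmitted, still most important first.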

In summary, the cloud is correctly isolated from your organization's terrestrial exposure and well suited to functioning in the dual role of authoritative repository and dynamic hub for paying forward delta changes in anticipation of failover.

Wednesday, September 17, 2008

Building the Data Highway to the Clouds

Moving data between terrestrial and cloud computers is simply an age old challenge with a new twist. Once upon a time information technology architects spent a LOT of time on data in motion so they could guarantee response times and availability. A good example pitted DEC against IBM where the former would transmit characters and the latter would buffer entire screens to maintain an "illusion" of responsiveness.

Cloud computing presents a similar set of challenges because some of the luxuries we've grown accustomed to won't always be in play. For instance, as LAN speeds have risen to the gigabit level, we rarely ponder the physical aspects of moving a file. When we browse the web, our pages are popping but we give no thought to the web services infrastructure behind them. Some of these little considered aspects will now move back to the front of the class as we try to leverage the power of cloud computers.

Starting from the basic premise that building an effective cloud computing environment for any organization will happen over time, we can anticipate a LOT of back and forth movement of data between the ground and grids built by Amazon and Google. Of the current solutions, Amazon's is closest to a standard data center, with their Elastic Compute Cloud (EC2) delivering gigahertz of compute power and their Elastic Block Storage (EBS) promising terabytes of data storage for substantially less than hosted solutions.

Taking advantage of this new frontier will mean that we need to pay attention to the arcane tools of compression, encryption, delta differencing, versioning, and especially time shifting data movement because simple copies between hosts will take too long. Specifically, the LANs on the ground and in the grid will be used to quickly transmit files to a data pump that will manage the trip between the earth and the sky, or vice versa. This data pump will securely hold files for use on an as needed basis when new cloud computers or terrestrial virtual machines are deployed.

In other words the logistics of traveling the data highway will lead to data abstraction from the operating system images so systems can be built and torn down as needed. Last but not least, this new frontier will put a premium on lightweight, and license free, deployments of specialized compute kernels that will request their data files "at birth".

This promises to be an exciting time and we look forward to meeting the challenge.

Tuesday, August 26, 2008

Elastic Block Storage - Nuts & Bolts vs 3Rs

Honestly, could the screws to mount a hard drive be any smaller?

That single question reminds most IT professionals of hours on end going nuts trying to bolt and cable hard drives into servers and RAID arrays. Our reward was the blinking lights; at least until we ran out of disk space.

Enter the good old 3Rs: Reading, wRiting, and aRithmetic. Instead of a screwdriver, you can provision volumes with your keyboard and Amazon's Elastic Block Storage (EBS). The following example shuffles the 3Rs a bit to make a point.

Step 1 - aRithmetic
Add up the space to back up all of your systems at all of your sites:
This example uses 800GB so you can keep a year of snapshots.

Step 2 - wRiting
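Type in this command (the tool also expects an availability zone; us-east-1a is only an example):
$ ec2-create-volume --size 800 --availability-zone us-east-1a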

Step 3 - Reading
You will see:
VOLUME vol-4d826724 858993459200 creating 2008-02-14T00:00:00+0000

Then a few moments after that:
VOLUME vol-4d826724 858993459200 available 2008-02-14T00:00:00+0000
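
If you script this, you will want to watch for the status to flip from creating to available. Here is a minimal sketch of reading the status line, assuming exactly the five fields shown in this post; the real tool may print additional columns, so treat the field positions (and the function name) as assumptions.

```python
def parse_volume_line(line):
    """Split a VOLUME status line into named fields (five-field layout assumed)."""
    tag, volume_id, size_bytes, status, created = line.split()
    if tag != "VOLUME":
        raise ValueError(f"not a VOLUME line: {line!r}")
    return {
        "volume_id": volume_id,
        "size_bytes": int(size_bytes),
        "status": status,
        "created": created,
    }


sample = "VOLUME vol-4d826724 858993459200 available 2008-02-14T00:00:00+0000"
info = parse_volume_line(sample)
print(info["volume_id"], "ready" if info["status"] == "available" else "not ready")
```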

And that is the 3Rs method to provision highly available data storage on a virtual server in a completely private account that is up to 10 times more reliable than a typical hard disk.
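
The aRithmetic in Step 1 can be sketched in a few lines. The per-site figures and the snapshot headroom multiplier below are made up so that the total lands on the 800GB used in this example; the conversion assumes the tool reports size in bytes with 1 GB counted as 2**30 bytes, which is what the sample output above is consistent with.

```python
# Hypothetical per-site backup footprints in GB (illustrative numbers only).
SITE_BACKUP_GB = {"headquarters": 120, "branch-office": 60, "warehouse": 20}

# Assumed multiplier to leave room for a year of snapshots.
SNAPSHOT_HEADROOM = 4

current_gb = sum(SITE_BACKUP_GB.values())
volume_gb = current_gb * SNAPSHOT_HEADROOM
volume_bytes = volume_gb * 2**30  # matches the byte count in the VOLUME line

print(volume_gb)
print(volume_bytes)
```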

To delete the volume:
$ ec2-delete-volume vol-4d826724

Of course there's more to it than just the volume, but the steps covered here are the equivalent of buying, shipping, unpacking, inserting, rack mounting, powering, cooling, and securing a bunch of disk drives. The last command is the equivalent of undoing it all.

For WarLock Software and our resellers the process is so easy and risk free that we can always include a free turn-key cloud computing installation with our DR+ product.

Looks like the 3Rs are going to make a comeback. Big time.

Thursday, August 21, 2008

Elastic Block Storage is Perfect for Online Backup

With today's delivery of Elastic Block Storage (EBS) in the Elastic Compute Cloud (EC2), Amazon removes the last possible objection to transforming cloud computing into the ideal resource for online backup. Now that persistent storage with 10 times the reliability of standard disk is available for 10 cents a month per gigabyte, it makes perfect sense to start moving backup data into the cloud. This is particularly true for Small to Medium Businesses (SMBs) and Managed Service Providers (MSPs).
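
To put the 10 cents a month per gigabyte in perspective, here is a quick sketch of the storage bill at a few sizes. It covers the storage charge only and assumes the price quoted above; any other charges on the account are deliberately left out.

```python
PRICE_CENTS_PER_GB_MONTH = 10  # storage price quoted above


def monthly_storage_cost(gigabytes):
    """Return the monthly storage charge in dollars (storage only)."""
    return gigabytes * PRICE_CENTS_PER_GB_MONTH / 100


for size_gb in (100, 500, 800):
    monthly = monthly_storage_cost(size_gb)
    print(f"{size_gb} GB: ${monthly:.2f} per month, ${monthly * 12:.2f} per year")
```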

Beyond the technical merits however (which will be addressed in later blogs) the MOST IMPORTANT thing about the EC2/EBS online backup model is DATA OWNERSHIP. When you set up an account with Amazon Web Services (AWS) you have complete control and there is NO THIRD PARTY GATEKEEPER between you and your data. You, and/or your trusted service provider, can do what you want when you want.

Which brings us to the second most important thing: NO LONG TERM CONTRACTS that lock in a mandatory expense that must be satisfied to avoid penalties. As your organization's needs change you can expand, shrink, or stop altogether with zero consequences. Imagine leasing your car and being able to turn it in on a whim... and then get it, or a different one, back two months later, all the while paying only for the miles you drive.

Finally, even as this new capability dramatically reduces the upfront cost and ongoing risk for customers and vendors alike, it makes sense to leverage technology to keep the pay as you go costs to a minimum. For online backup, WarLock DR+ offers its Data Relay function, which supports near Continuous Data Protection (CDP) for terrestrial physical systems and virtual machines all day long with integrated store and forward after business hours into EC2/EBS.

So be sure to check out Amazon's new Elastic Block Storage (EBS) along with WarLock DR+ for the best value in Tier 1 online backup.

Monday, August 18, 2008

SMB MSPs Recover Margin & Account Control

Managed Services Providers (MSPs) offering online backup for Disaster Recovery and Business Continuity to their Small & Medium Business (SMB) customers now have an alternative to the prevailing practice of reselling services from industry heavyweights.

With the arrival of Elastic Block Storage (EBS) in the Amazon Elastic Compute Cloud (EC2), the up front cost for the data center assets to deliver an offsite storage capability has been reduced to zero. This means that MSPs in the space will be able to offer their customers equivalent data integrity while reclaiming 100% of margin and account control. This last issue looms large for many MSPs because they have seen their influence erode over time as their customers default to communicating directly with the provider. The theory is that the MSP "owns" the customer but the reality is that they are often reduced to bystander status.

There are two major costs for such an "indirect" relationship with the customer. The first is obvious: the upstream provider typically consumes the lion's share of the margin. The second is less apparent and goes by many names. For this discussion, we will use the term "opportunity cost" because every customer conversation not held is a lost opportunity to understand what their objectives are, not just for data protection but for their business as a whole.

This is of critical importance as system and storage virtualization shift the preferred data replication methodologies from block to file and accelerate the ability and the desire of SMB customers to do lots more than just back up data. Tools like WarLock DR+ offer SMBs the chance to proactively aggregate and distribute data files to improve not just survivability but also their operational efficiency. Many of these conversations are now being held for the first time and MSPs should strive to be in them all.

Sure, EC2 and EBS fulfill the promise of hardware as a service (HaaS) but, as with all new technologies, they require active participation and delivery. Fortunately the rewards of delivering the message are crystal clear and once again profitable.

Wednesday, August 13, 2008

Independence from Single Vendor Ecosystems

A lot of things have changed in the technology marketplace. One constant however has been the strategic value of locking customers into an ecosystem because they come to a point where they can't afford to switch. This may finally be changing with cloud computing, open source, and virtualization on the rise. Basically, people are starting to use so many different tools from so many different sources as they venture out on to the web that they are growing comfortable with not having a complete ecosystem from a single vendor.

As the reader may know, offsite and online data replication (DR) is a big focus here at WarLock Software and we believe our DR+ solution is an excellent tool to gain ecosystem independence in three ways:
  1. WarLock Software's DR+ solution runs on anything and backs up everything. This strikes a blow to the ecosystem argument that data must be moved into a proprietary storage device.

  2. Real opportunities for improving operations flow from flexible replication that can read from and write to many disparate sources including, in our case, Amazon's Elastic Compute Cloud (EC2) with persistent storage.

  3. A lot of work is being done with a combination of tools on the desktop, the web, and the intranet, and replicating these pockets of information is beyond a typical storage ecosystem.

Finally, VMware and Hyper-V are bursting single vendor ecosystems apart at the seams because they are so cost effective and flexible, especially for business continuity and disaster recovery. We're hoping all of these factors together may finally make it "affordable" to move away from vendor ecosystem lock in.

Wednesday, July 16, 2008

Powered Replication Makes the Difference

Data replication (DR) requires compute power at each endpoint for greatest efficiency. This is especially true with Cloud Computing as we move files back and forth between terrestrial systems and the grid. Web 2.0 push technologies are ok for a few files but bog down quickly under load.

There are four (4) primary reasons you need powered replication: compression, encryption, delta change calculations, and snapshot (versioning) rotation.

The basic cycle of compress at source, transmit, decompress at target is critical for reducing bandwidth. Similarly, the encrypt, transmit, decrypt cycle ensures in-flight security for sensitive data and maximum utility at both ends. Both require compute power at the endpoints.
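
Here is a minimal sketch of the compress, transmit, decompress cycle using Python's standard zlib module. The `send` and `receive` names are illustrative and nothing actually crosses a network; random bytes stand in for media files that are already compressed. An encryption step would wrap the compressed bytes in the same way, but it needs a third-party library, so it is left out of this sketch.

```python
import os
import zlib


def send(payload):
    """Compress at the source; returns the bytes that would cross the wire."""
    return zlib.compress(payload, 9)


def receive(wire_bytes):
    """Decompress at the target to recover the original bytes."""
    return zlib.decompress(wire_bytes)


text = b"quarterly report draft\n" * 1000
noise = os.urandom(len(text))  # stands in for already-compressed media

for label, payload in (("text", text), ("media-like", noise)):
    wire = send(payload)
    assert receive(wire) == payload  # the round trip must be lossless
    print(label, len(payload), "bytes ->", len(wire), "bytes on the wire")
```

Repetitive business documents shrink dramatically, while the media-like payload barely changes, so the CPU spent at each endpoint pays off only on the compressible share of your data.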

Delta change calculations are the best way to shorten the backup window at the source by sending the least possible amount of data which is then combined with the unchanged portion of prior versions of those files at the target. Delta change, or commonality factoring, is the key to efficiently keeping pace with changing data.
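
One simple way to picture a delta change calculation is fixed-size block comparison: hash each block, send only the blocks whose hashes differ, and rebuild at the target from the prior version. This is a sketch of that general technique, not the method any particular product uses; the tiny block size and the function names are for illustration only, and a real source would keep the prior version's hashes rather than the prior data itself.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def blocks(data, size=BLOCK_SIZE):
    """Split bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(block):
    return hashlib.sha256(block).hexdigest()


def make_delta(old, new):
    """Source side: collect only the blocks of `new` that differ from `old`."""
    old_hashes = [digest(b) for b in blocks(old)]
    delta = {}
    for i, block in enumerate(blocks(new)):
        if i >= len(old_hashes) or digest(block) != old_hashes[i]:
            delta[i] = block
    return delta, len(new)


def apply_delta(old, delta, new_length):
    """Target side: combine unchanged blocks of the prior version with the delta."""
    old_blocks = blocks(old)
    count = -(-new_length // BLOCK_SIZE)  # ceiling division
    rebuilt = b"".join(
        delta[i] if i in delta else old_blocks[i] for i in range(count)
    )
    return rebuilt[:new_length]


old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCCDD"
delta, length = make_delta(old, new)
print(sorted(delta))  # indexes of the blocks that had to be sent
assert apply_delta(old, delta, length) == new
```

Fixed-size blocks handle in-place changes and appends well but not insertions that shift everything after them, which is one reason production tools use more elaborate schemes.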

Finally, the preservation of point in time snapshots definitely requires compute power at the target endpoint. Before combining the previous version of files with the latest changes, a powered replication solution will rotate the snapshots and create synthetic full restores.
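
Snapshot rotation and synthetic fulls can be sketched with plain dictionaries standing in for file trees. The retention count, file names, and function name are assumptions for illustration; a real target would work with files on disk rather than in memory.

```python
from collections import deque

MAX_SNAPSHOTS = 3  # assumed retention count


def apply_changes(previous_full, changed, deleted=()):
    """Build a synthetic full: the prior full plus the latest changes."""
    full = dict(previous_full)
    full.update(changed)
    for path in deleted:
        full.pop(path, None)
    return full


# A bounded deque rotates for us: appending to a full deque drops the oldest.
snapshots = deque(maxlen=MAX_SNAPSHOTS)
snapshots.append({"a.doc": "v1", "b.xls": "v1"})  # initial full backup

nightly_changes = [
    ({"a.doc": "v2"}, ()),
    ({"c.ppt": "v1"}, ("b.xls",)),
    ({"a.doc": "v3"}, ()),
]
for changed, deleted in nightly_changes:
    snapshots.append(apply_changes(snapshots[-1], changed, deleted))

print(len(snapshots))  # 3
print(snapshots[-1])   # {'a.doc': 'v3', 'c.ppt': 'v1'}
```

Every retained snapshot is a complete point in time picture, so a restore never has to replay a chain of incrementals.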

Ideally the horsepower to perform all of these tasks should not be stolen from the systems being protected but provided by a one-to-many solution that introduces minimal load.

These four (4) methods (compression, encryption, delta change replication, and snapshot rotation for full synthetic restore of terrestrial, virtual, and cloud computers) are all seamlessly delivered by the WarLock Software DR+ solution.