IT Economics

Dedicated to investigating the financial value of information technology.

Category: Capital Expenses

Improve Performance AND Reduce Costs

How to Reduce the Costs of Performance

Large-scale applications often require high-performing storage devices to deliver data to the software system and the end user in a timely manner. Where sub-second response time is mandated, the demand often exceeds what the storage devices can deliver. Vendors therefore created a technique that attains the required speed. Unfortunately for the customer, the method comes at a very high cost. Let me explain their solution to the problem and a new approach that will save you a lot of money.

Computer disk storage speeds and read/write performance have improved a great deal almost every year since 1975. Despite these improvements, spinning disks still pose a bottleneck in the IT process. To meet the demand for sub-second performance, the storage vendors recommend a technique they call "wide striping," a nebulous term meant to confuse executives and hide the ugly costs associated with it. Here is what it is in plain English.

Wide striping is a configuration in which you use only a portion of the disks, perhaps 10% to 30%. The data are spread over multiple disks that are mostly empty. This way the read/write heads have less geography to cover and can retrieve or write information much faster. The technique does achieve the performance objectives necessary for the application. But stop and think about the costs.

You buy the disks; your vendor's software license fee is based on the total size of the array; you pay for hardware maintenance on the whole thing, and software maintenance on the total size as well. To make it worse, you now consume much more floor space, and you must provide power for the entire array and air conditioning to cool it down. Thus you pay 100% of the storage capital and operating expenses while using only a small piece of it, 10% to 30%. And, because it is an important application, we duplicate the array in the data center and replicate it to a remote site, where we duplicate it again. What a waste of money, and we have been doing it for years!
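The arithmetic above is easy to sketch. Here is a minimal back-of-the-envelope calculation; the array size and annual cost are hypothetical placeholders, not vendor quotes, and only the 10%-to-30% utilization range comes from the discussion above:

```python
# Back-of-the-envelope cost of wide striping: you pay for the whole
# array but use only a fraction of it. All dollar figures are
# hypothetical placeholders for illustration.

def effective_cost_per_usable_tb(total_tb, utilization, annual_cost):
    """Annual cost per terabyte actually used."""
    usable_tb = total_tb * utilization
    return annual_cost / usable_tb

# A 500 TB array costing $200,000/year all-in (capital amortization,
# software licensing, maintenance, power, cooling, floor space).
full_use = effective_cost_per_usable_tb(500, 1.00, 200_000)
wide_striped = effective_cost_per_usable_tb(500, 0.20, 200_000)

print(f"Fully utilized:      ${full_use:,.0f} per usable TB per year")
print(f"Wide striped at 20%: ${wide_striped:,.0f} per usable TB per year")
```

At 20% utilization, every usable terabyte costs five times what it would on a fully utilized array, which is the hidden premium wide striping imposes.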

Fortunately, new technologies have been developed that can save us a lot of money. Known as flash and solid-state disk (SSD), they are similar to technologies with which you are quite familiar: they reside in your digital camera, tablet, and/or cell phone, and they act like a USB drive. In fact, one systems engineer I spoke to recently had calculated that it would take 900 spinning disks to deliver the performance of an enterprise-class flash storage unit. These technologies seem to cost more per unit of capacity, but do not forget that you get to use almost all of it, compared to the wide-striping alternative. You can save a lot of money by reclaiming those wide-striped, very fast, low-utilization disk arrays. Here's the plan.

Whatever your next planned storage addition, don't do it. Instead, use those project or upgrade funds to buy a flash unit. Replace the wide-striped arrays with the flash unit and use those arrays for the planned project. They can now be filled up to 90%, so the return on assets (ROA) and return on investment (ROI) of those acquisitions are much improved. The flash unit will handle the high-performance application and, in cases I am familiar with, will deliver 4 to 5 times the performance of the striped array. Implementing this acquisition strategy will reduce both capital and operating expenses.
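The capacity effect of the redeployment is worth seeing in numbers. A minimal sketch, assuming a hypothetical 500 TB array; the before/after utilization figures come from the plan above:

```python
# Effect of redeploying a wide-striped array as general-purpose
# storage once flash takes over the performance workload.
# The array size is a hypothetical placeholder.

array_raw_tb = 500
before_util = 0.20   # wide striped for performance
after_util = 0.90    # redeployed for the planned project

usable_before = array_raw_tb * before_util   # TB in productive use now
usable_after = array_raw_tb * after_util     # TB in productive use after

gain = usable_after / usable_before
print(f"Usable capacity: {usable_before:.0f} TB -> {usable_after:.0f} TB "
      f"({gain:.1f}x), from an asset you already own")
```

The same already-purchased hardware yields 4.5 times the usable capacity, which is exactly why the ROA of the original acquisition improves without spending another dollar on it.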

Be wary: there are more than a dozen flash and SSD vendors on the market, and not all flash arrays are created equal. Test them out and ask for references where they have been in production for a year or more. The technical challenges of these new technologies are quite complex.

For a technical explanation, see Hu Yoshida's blog and the white paper listed in it, which explain the challenges and solutions quite well from a technical perspective.





What’s in a protocol, what is a protocol, and why care?

Replication of data across data centers is an accepted "best practice" to protect files from damage, natural disasters, and other risks. Storage vendors understand this quite well and over the years have developed proprietary languages, called protocols, whereby their storage arrays can talk to each other and replicate the data between them. Buyers have come to expect this and take it for granted that these protocols will function properly, and they do. Note that each vendor has its own protocol, specific to its storage array.

Each vendor's protocol may produce the desired result: properly replicated data. As buyers, we rarely question the vendors about the efficiency of their protocols. Replication requires telecommunication/network links, and these are expensive, especially where replication requires long-distance satellite links. Occasionally vendors are put to the test in a head-to-head proof of concept (PoC), and I was fortunate to meet a systems engineer who had recently concluded one.

This PoC involved replicating data from a data center in Europe to one in North America for a large financial institution. The buyer had prepared a set of benchmarks specifying the amount of data and the time period within which replication had to take place. Given these parameters, two vendors were put to the test.

In order to meet the objectives, Vendor A required 16 satellite links to replicate the data in the time allowed. Vendor B was able to meet the data volume and time objective with only three (3) [This is not a typo.] satellite links. Imagine the savings, or the expense, of 13 additional satellite links! This reduction in telecommunication (operating) expenses more than outweighs any difference in up-front capital expenses for the storage arrays. What is most impressive is that the buyer took the time, effort, and expense to conduct a PoC on a topic rarely studied: protocol efficiency. And for that effort they will enjoy many months of reduced telecommunication expenses.
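The PoC result reduces to simple arithmetic. In this sketch only the link counts (16 versus 3) come from the PoC; the per-link monthly cost is a hypothetical placeholder, since satellite link pricing varies widely:

```python
# Operating-expense impact of protocol efficiency, per the PoC.
# Link counts are from the PoC; the per-link cost is hypothetical.

links_vendor_a = 16
links_vendor_b = 3
monthly_cost_per_link = 25_000  # hypothetical placeholder, USD

links_saved = links_vendor_a - links_vendor_b
monthly_savings = links_saved * monthly_cost_per_link
print(f"{links_saved} fewer links -> ${monthly_savings:,} saved per month, "
      f"${monthly_savings * 12:,} per year")
```

Whatever the actual link price, the efficient protocol needs less than a fifth of the bandwidth, and that ratio, not the placeholder dollar figure, is the durable finding of the PoC.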

How many zombies are in your data center?

What do I mean by zombies in your data center? Zombies are the living-dead servers, both virtual and physical, that you have in your computer room. These servers are no longer in production; however, they have not been decommissioned or removed.

In the case of physical servers, they are occupying rack space, may be consuming power, and are generating heat that must be removed from the data center. In addition, the network connections and storage capacity allocated to these now-deceased applications have become unused resources.

In the case of virtual servers, you might argue that they are not consuming resources; however, they occupy at least a minimal amount of processor power, and storage and/or network connections might still be allocated to them. Thus other resources are negatively impacted by the existence of the zombies.

From the financial perspective, what we have here is money being spent for no reason at all. Our electric bill is inflated because the zombies consume power and air-conditioning resources, negatively impacting operating expenses. Since the zombies tie up data center assets, we incur opportunity costs on the network connections and storage capacity. We may even buy additional switch connections and/or storage under the impression that insufficient assets are in place, because the zombies are consuming these scarce resources. Thus capital expenses are negatively impacted by the zombies as well.

Housecleaning may not be a high-priority action item in the data center. Over the years the zombies increase in number, we become more and more inefficient, and the funds to maintain them continue to grow. Recent studies indicate that only 35% of the processing power in the data center is actually in productive use, yet we continue to pay the operating expenses for the remaining 65%. Given the magnitude of our zombie problem, the time has come to purge the data center of these monsters and reduce our costs.
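To put the 35% figure in budget terms, here is a minimal sketch; the annual operating budget is a hypothetical placeholder, and only the 35% productive-use figure comes from the studies cited above:

```python
# If only 35% of data center processing power is productive (per the
# studies cited), the rest of the operating budget funds zombies.
# The budget figure is a hypothetical placeholder.

annual_opex = 1_000_000        # hypothetical data center operating budget
productive_fraction = 0.35     # from the cited studies

wasted = annual_opex * (1 - productive_fraction)
print(f"Spend funding idle or zombie capacity: ${wasted:,.0f} per year")
```

On these assumptions, nearly two thirds of every operating dollar goes to capacity doing no productive work, which is the financial case for making housecleaning a priority.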

© 2017 IT Economics
