We Live in a Multi-Cloud World: Here's Why

It's almost 2019 and there's still a lot of chatter, specifically from hardware vendors, that 'We're moving to a multi-cloud world. This is highly erroneous. When you hear someone say things like that, what they mean is 'we're catching up to the rest of the world and trying to sell a product in the space.'

Multi-cloud is a reality, and it's here today. Companies are deploying applications on-premises, using traditional IT stacks, automated stacks, IaaS, and private-cloud infrastructure. They are simultaneously using more than one public cloud resources. If you truly believe that your company, or company x is not operating in a multi-cloud fashion start asking the lines of business. The odds are you'll be surprised.

Most of the world has moved past the public-cloud vs. private-cloud debate. We realized that there are far more use-cases for hybrid clouds than the original asinine idea of 'cloud-bursting' which I ranted about for years (http://www.definethecloud.net/the-reality-of-cloud-bursting/Â and https://www.networkcomputing.com/cloud-infrastructure/hybrid-clouds-burst-bubble/2082848167.) After the arguing, vendor nay saying, and general moronics slowed down we started to see that specific applications made more sense in specific environments, for specific customers. Imagine that, we came full-circle to the only answer that ever applies in technology: it depends.

There are many factors that would come into play when deciding where to deploy or buildÂ an application (specifically which public or private resource, and which deployment model (IaaS, PaaS, SaaS, etc.) The following is not intended to be an exhaustive list:

Application maturity and deployment model
Data requirements (type, structure, locality, latency, etc.)
Security requirements and organizational security maturity. Note: in general, public cloud is no more, or less secure than private. Security is always the responsibility of the teams developing and supporting the application and can be effectively achieved regardless of infrastructure location.
Scale (general size, elasticity requirements, etc.)
Licensing requirements/restrictions
Hard support restrictions. Some examples include: requires bare-metal deployment, Fibre Channel storage, specific hardware in the form of an appliance, magic elves who shit rainbows.
Cost, both how much will it cost on any given environment and what type of costs are most beneficial to your business (capital vs. operational expenses, etc.)
Governance, compliance, regulatory concerns.

Lastly, don't discount peoples technology religions. ThereÂ is typically more than one way to skin a cat, so it's not often worth it to fight an uphill battle against an entrenched opinion. Personally when I'm working with my customers if I start to sense a 'religious' stance to a technology or vendor I assess whether a more palatable option can fit the same need. Only when the answer is no do I push the issue. I believe I've discussed that in this post: http://www.definethecloud.net/why-cisco-ucs-is-my-a-game-server-architecture/.

The benefits of multi-cloud modelsÂ are wide, and varied, and like anything else, they come with drawbacks. The primary benefit I focus on is the ability to put the application in focus. With traditional on-premises architectures we are forced to define, design, and deploy our application stack based on infrastructure constraints. This is never beneficial to the success of our applications, or our ability to deploy and change them rapidly.

When we move to a multi-cloud world we can start by defining the app we need, draw from that it's requirements, and finally use those requirements to decide which infrastructure options are most suited to them. Sure I can purchase/build CRM, or expense software, deploy them in my data center or my cloud provider's but I can also simply use the applications as a service. In a multi-cloud world I have all of those options available after defining the business requirements of the application.

Here'sÂ two additional benefits that have made multi-cloud today's reality. I'm sure there are others so help me out in the comments:

Cloud portability:

Even if you only intend to use one public cloud resource, and only in an IaaS model building for portability can save you pain in the long run. Let history teach you this lesson. You built your existing apps assuming you'd always run apps from your own infrastructure. Now you're struggling with the cost and complexity of moving them to cloud, might history repeat itself? Remember that cost models and features change with time, this means it may be attractive down the road to switch from cloud a to cloud b. If you neglect to design for this up-front, the pain will be exponentially greater down the road.

Note: This doesn't mean you need to go all wild-west with this shit. You can select a small set of public and private services such as IaaS and add them to a well-defined service-catalogue. It's really not that far off from what we've traditionally done within large on-premises IT organizations for years.

Picking the right tool for the job:

Like competitors in any industry, public clouds attempt to differentiate from one another to win your business. This differentiation comes in many forms: cost, complexity, unique feature-set, security, platform integration, openness, etc. The requirements for an individual app, within any unique company,Â will place more emphasis on one or more of these. In a multi-cloud deployment those requirements can be used to decide the right cloud, public or private, to use. Simply saying 'We're moving everything to cloud x' is placing you right back into the same situation where your infrastructure dictates your applications.

As I stated early on, multi-cloud doesn't come without it's challenges. One of the more challenging parts is that the tools to alleviate these challenges are, for the most part, in their infancy. The three most common challenges are: holistic visibility (cost, performance, security, compliance, etc.), administrative manageability, and policy/intent consistency especially as it pertains to security.

Visibility:

We've always had visibility challenges when operating IT environments. Almost no one can tell you exactly how many applications they have, or hell, even define what an 'application' is to them. Is it the front-end? Is it the three tiers of the web-app? What about the dependencies, is Active-Directory an app or a service? Oh shit, what's the difference between an app or a service? Because this is already a challenge within the data center walls, and across on-premises infrastructure, it gets exacerbated as we move to a multi-cloud model. More tools are emerging in this space, but be wary as most promise far more than they deliver. Remember not to set your expectations higher than needed. For example if you can find a tool that simply shows you all your apps across the multi-cloud deployment from one portal, you're probably better off than you were before.

Manageability:

Every cloud is unique in how it operates, is managed, and how applications are written to it. For the most part they all use their own proprietary APIs to deploy applications, provide their own dashboards and visibility tools, etc. This means that each additional cloud you use will add some additional overhead and complexity, typically in an exponential fashion. The solution is to be selective in which private and public resources you use, and add to that only when the business and technical benefits outweigh the costs.

Tools exist to assist in this multi-cloud management category, but none that are simply amazing. Without ranting too much on sepcfic options, the typical issues you'll see with these tools are they oversimplify things dumbing down the underlying infrastructure and negating the advantages underneath, they require far too much customization and software development upkeep, and they lack critical features or vendor support that would be needed.

Intent Consistency:

Policy, or intent can be described as SLAs, user-experience, up-time, security, compliance and risk requirements. These are all things we're familiar with supporting and caring for on our existing infrastructure. As we expand into multi-cloud we find the tools for intent enforcement are all very disparate, even if the end result is the same. I draw an analogy to my woodworking. When joining two pieces of wood there are several joint options to choose from. The type of joint will narrow the selection down, but typically leave more than one viable option to get the desired result. Depending on the joint I'm working on, I must know the available options, and pick my preference of the ones that will work for that application.

Each publicÂ or private infrastructureÂ generally provides the tools to acheive an equivelant level of intent enforcement (joints), but they each offer different tools for the job (joinery options.) This means that if you stretch an application or its components across clouds, or move it from one to the other, you'll be stuck defining it's intent multiple times.

This categoryÂ offers the most hope, in that an overarching industry architecture is being adopted to solve it. This is known as intent driven architecture, which I've described in a three part series starting here: http://www.definethecloud.net/intent-driven-architectures-wtf-is-intent/. The quick and dirty description is that 'Intent Driven' is analogous to the park button appearing in many cars. I push park, and the car is responsible for deciding if the space is parallel, pull-through, or pull-in, then deciding the required maneuvers to park me. With intent driven deployments I say park the app with my compliance accounted for, and the system is responsible for the specifics of the parking space (infrastructure). Many vendors are working towards products in this category, and many can work in very heterogeneous environments. While it's still in it's infancy it has the most potential today. The beauty of intent driven methodologies is that while alleviating policy inconsistency they also help with manageability and visibility.

Overall, multi-cloud is here, and it should be. There are of course companies that deploy holistically on-premises, or holistically in one chosen public cloud, but in today's world these are more corner case than the norm, especially with more established companies.

For another perspective check out this excellent blog article I was pointed to by Dmitri Kalintsev (@dkalintsev)Â https://bravenewgeek.com/multi-cloud-is-a-trap/. I very much agree with much, if not all of what he has to say. His article is focused primarily on running an individual app, or service across multiple clouds, where I'm positioning different cloud options for different workloads.

Your Technology Sunk Cost is KILLING you

I recently bought a Nest Hello to replace my perfectly good, near new, Ring Video Doorbell. The experience got me thinking about sunk cost in IT and how significantly it strangles the business and costs companies ridiculous amounts of money.

When I first saw the Nest Hello, I had no interest. I had recently purchased and installed my Ring. I was happy with it, and the Amazon Alexa integration was great. I had no need to change. A few weeks later I decided to replace my home security system because it's a cable provider system and like everything from a cable provider it's a shit service at caviar pricing because 'Hey, you have no choice you sad F'er.' That's the beauty of the monopoly our government happily built and sustainsÂ for them. I chose to go with a system from Nest, because I already have two of their thermostats, several of their smoke detectors, and a couple of their indoor cameras. I ordered the security system components I needed, and a few cameras to compliment it, then I looked back into the Nest Hello.

The Nest Hello is a much better camera, and more feature rich device. More importantly it will integrate seamlessly with my new security system, and existing devices, eliminating yet another single use app on my phone (the Ring app.) The counter argument for purchasing the device was my sunk cost. I'd spent money on the Ring, and I'd also spent time and hassle installing it. The Nest might require me to get back in the attic and change out the transformer for my doorbell as well as wire in a new line conditioner. Not things I enjoy doing. The sunk cost nearly stopped my purchase. Why throw away a good device I just installed, to get a feature or two and a better picture.

I then stepped back and looked at it from a different point of view. What's my business case? What's the outcome I'm purchasing this technology to achieve? The answer is a little bit of security, but a lot of piece of mind for my home.Â I live alone, and I travel a lot. While I'm gone I need to manage packages, service people, and my pets. I also need to do this quickly and easily. This means that seamless integration is a top priority for me, and video quality, etc.Â is another big concern. Nest's Hello camera feature set far better for my use case, especially when adding their IQ cameras. Lastly for video recording and monitoring service, I would now only need one provider, and one manageable bill rather than oneÂ for Nest and oneÂ for Ring. From that perspective the answer became clear: the cost I sunk wasn't providing any value based on my use-cases, therefore it was irrelevant. It was actually irrelvant in the first place, but we'll get back to that.

I went ahead and bought the Nest Hello. Next came another sunk cost problem. My house is covered in Amazon Alexa devices which integrate quite well with Ring. I have no fewer than 8 Alexa enabled devices around the home, garage, etc. Nest is a Google product, so it's best integration is with Google Home. Do I replace my beloved Amazon devices with Google Home to get the best integration?

First a rant: The fact that I should even have to consider this is ludicrous, and shows that both products are run by shit heads that won't even feign the semblance of looking out for their customers interests. Because they have competing products they forcibly degrade any integration between the systems rather than integrating and differentiating on product quality rather than engineered lock-in. I despise this, it's bad business, and completely unnecessary. I'd guess it actually stalls potential sales of both because people want to 'sit back and see how it plays out' before investing in one or the other.

I have a lot of sunk financial cost in my Alexa devices. There's also some cost in time setting them up and integrating them with my other home-automation tools. That in mind I went back to the outcome I'm trying to achieve. My Alexa/Ring integration allowed me to see who was at the front door, and talk to them. My Alexa/Hello integration will only let me view the video. What's my use-case? I use the integration to see the door, and decide if I should walk to the front doorÂ to answer. If it's a package delivery, I can grab it later. If it needs a signature, I'll see them waiting. If it's something else, I walk to the door for a conversation. Basically I only use the integration to view the video and decide if I should go to the door or not. This means that Alexa/Hello integration, while not ideal, meets my needs perfectly. I easily chose to keep Alexa which provides the side benefit of not providing the evil behemoth that is Google any more access to my life than I already have. Last thing I need is my Gmail recommending male potency remedies after the Google device in my bedroom listens in on a night with my girlfriend. I'm picturing Microsoft Clippy here for some reason.

I'm muchÂ more comfortable withÂ Amazon listening in and craftily adding some books on love making for dummies to my Kindle recommendations while using price discrimination to charge me more for marital aid purchases because they know I need them.

Ok, enough TMI, back to the point. Your technology sunk cost is killing you, mmkay? When making technology decisions for your company you should ignore sunk costs. Your rational brain knows this, but you don't do it.

â€œRational thinking dictates that we should ignore sunk costs when making a decision. The goal of a decision is to alter the course of the future. And since sunk costs cannot be changed, you should avoid taking those costs into account when deciding how to proceed.â€ https://blog.fastfedora.com/2011/01/the-sunk-cost-dilemma.html

You have sunk cost in hardware, software, people-hours, consulting, and everywhere else under the sun. If you're like most these sunk costs hinder every decision you make. â€œI just refreshed my network, I can't buy new equipment.â€ â€œMy servers are only two years old, I won't swap them out.â€ I have an enterprise ELA with them, I should use their version. These are all bad reasons to make a decision. The cost is already spent, it's gone, it can't be changed, but future costs, and capabilities can. Maybe:

That sparkly $400,000 SDNÂ rip and replaceÂ will plug far more cohesively into the VP of Applications ongoing DevOps project allowing them to launch features faster resulting inÂ millions of dollars in potential profit to the company over the next 24 months.
The new servers increase compute density lowering your overall footprint and saving you on power, cooling, management, and licensing over time starting a quarter or two down the road.
Maybe that feature that's included for free with your ELA will end up costing you thousands inÂ unforeseen integration challenges while only solving 10% of your existing problem.

This issue becomes insanely more relevant as you try and modernize for more agile IT delivery. Regardless of the buzzword you're shooting towards, DevOps, Cloud, UnicornRainbowDeliverySystems, the shift will be difficult. It will be exponentially more difficult if you anchor it with the sunk cost of every bad decision ever made in your environment.

â€œOf course your tool sounds great, and we need something exactly like it, but we already have so many tools, I can't justify another one.â€ I've heard that verbatim from a customer, and it's batâ€”shitâ€”freakingâ€”crazy. If your other tools suck, get rid of them, don't let those bad decisions negate you from purchasing something that does what you need. Maybe it's your vetting process, or um, eh, that thing you see when you look in the mirror that needs changing. That's like saying 'My wife needs a car to get to work, but I already have these two project cars I can't get running, I can't justify buying her a commuter car.'

Most of our data centers are built using the same methodology Dr. Frankenstein used to reanimate the dead. He grabbed a cart and a wheelbarrow and set off for his local graveyard. He dug up graves grabbing the things he needed, a torso, a couple ofÂ legs, a head, etc. and carted them back to his lab. Once safely back at the lab he happily stitched them together and applied power.

Data centers have been built buying the piece needed at the time from the favored vendor of the moment. A smattering of HP here, a dash of Cisco there, some EMC, a touch of NetApp, oh this Arista thing is shinyâ€¦ Then up through the software stack, a teaspoon of Oracle makes the profits go down, the profits go downâ€¦ some SalesForce, some VMware, and on, and on. We've stitched these things together with Ethernet and applied power.

Now you want to 'DevOps that', or 'cloudify the thing'? Really, are you sure you REALLY want to do that? Fine go ahead, I won't call you crazy, I'll just thinkâ€¦ never mind, yes I will call you crazyâ€¦ crazy. DevOps, Cloud, etc. are all like virtualization before them, if you put them on a shit foundation, you get shit results.

Now don't get me wrong. You can protect your sunk costs, sweat your assets, and still achieve buzzword greatness. It's possible. The question is should you, and would it actually save you money? The answer is no, and 'hell no.' The cost of additional tools, customization, integration and lost time will quickly, and exponentially, outweigh any perceived 'investment protection' savings, except in the most extreme of corner-cases.

I'm not promoting throwing the baby out with the bathwater, or rip-and-replace every step of the way. I amÂ recommending you consider those options. Look at the big picture and ignore sunk-cost as much as you can.

Maybe you replace $500,000 in hardware and software you bought last year with $750,000 worth of new-fangled shit today, and $250,000 in services to build and launch it. Crap, you wasted the sunk $500K and sunk $1 millionÂ more! How do you explain that? Maybe you'll be explaining it as the cost of moving your company fromÂ 4 software releases per year to 1 software reease per week. Maybe that release schedule is what just allowed your Dev team to 'dark test'Â then rollingÂ release the next killer feature on your customer platform.Â Maybe customer attrition is down 50% while the cost of customer acquisition is 30% of what it was a year ago. Maybe you'll be explaining the tough calls it takes to be the hero.

Intent Driven Architecture Part III: Policy Assurance

Here I am finally getting around to the third part of my blog on Intent Driven Architectures, but hey, what's a year between friends. If you missed or forgot parts I and II the links are below:

Intent Driven Architectures: WTF is Intent

Intent Driven Architectures Part II: Policy Analytics

Intent Driven Data Center: A Brief Overview Video

Now on to part III and a discussion of how assurance systems finalize the architecture.

What gap does assurance fill?

'Intent' and 'Policy' can be used interchangeably for the purposes of this discussion. Intent is what I want to do, policy is a description of that intent. The tougher question is what intent intent assurance is. Using the network as an example, let's assume you have a proper intent driven system that can automatically translate a business level intent into infrastructure level configuration.

An intent like deploying a financial application beholden to PCI compliance will boil down into a myriad of config level objects: connectivity, security, quality, etc. At the lowest level this will translate to things like Access Control lists (ACLs), VLANs, firewall (FW) rules, and Quality of Service (QoS) settings. The diagram below shows this mapping.

Note: In anÂ intent driven system the high level business intent is automatically translated down into the low-level constructs based on pre-defined rules and resource pools. Basically, the mapping below should happen automatically.

The translation below is one of the biggest challenges in traditional architectures. In those architectures the entire process is manual and human driven. Automating this process through intent creates an exponential speed increase while reducing risk and providing the ability to apply tighter security. That being said it doesn't get us all the way there. We still need to deploy this intent. Still within the networking example the intent driven system should have a network capable of deploying this policy automatically, but how do you know it can accept these changes, and what they will effect?

In steps assuranceâ€¦

The purpose of an assurance system is to guarantee that the proposed changes (policy modifications based on intent) can be consumed by the infrastructure. Let's take one small example to get an idea of how important this is. This example will sound technical, but the technical bits are irrelevant. We'll call this example F'ing TCAM.

F'ing TCAM:

TCAM (ternary content addressable memory) is the piece of hardware that stores Access Control Entries (ACEs).
TCAM is very expensive, therefore you have a finite amount in any given switch.
These are how ACLs get enforced at 'line-rate' (as fast as the wire).
ACLs can be/are used along with other tools to enforce things like PCI compliance.
An individual DC switch can theoretically be out of TCAM space, therefore unable to enforce a new policy.
Troubleshooting and verifying that across al the switches in a data center is hard.

That's only one example of verification that needs to happen before a new intent can be pushed out. Things like VLAN and route availability, hardware/bandwidth utilization, etc. are also important. In the traditional world two terrible choices are available: verify everything manually per device, or 'spray and pray' (push the configuration and hope.)

This is where the assurance engine fits in. An assurance engine verifies the ability of the infrastructure to consume new policy before that policy is pushed out. This allows the policy to be modified if necessary prior to changes on the system, and reduces troubleshooting required after a change.

Advanced assurance systems will take this one step further. They perform step 1 as outlined above, which verifies that the change can be made. Step 2 will verify if the change should be made. What I mean by this is that step 2 will check compliance, IT policy, and other guidelines to ensure that the change will not violate them. Many times a change will be possible, even though it will violate some other policy, step 2 ensures that administrators are aware of this before a change is made.

This combination of features is crucial for the infrastructure agility required by modern business. It also greatly reduces the risk of change allowing maintenance windows to be reduced greatly or eliminated. Assurance is a critical piece of achieving true intent driven architectures.