Essential Cloud Governance Disciplines and Best Practices

Create a long-term cloud governance strategy that reduces risk while increasing efficiency.

Good cloud governance, when properly implemented, enables an organization to fully realize the business benefits of cloud computing while managing the risks associated with this new operational paradigm. 

What exactly is Cloud Governance?

Cloud governance has several simplified definitions. Some people define it as a set of rules that must be followed. Others define it as controls to manage access, budgets, and cloud compliance, or as a method of developing rules, monitoring, and adjusting as needed to achieve business goals.

Consider the following definitions:

"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction," according to the NIST.

Governance is defined as "the act or process of governing or supervising the control and direction of something (such as a country or organization)."

Here is how I summarize cloud governance based on the above definitions and my experience at Sitech:

Cloud governance is the overall process and system by which an organization oversees the control and direction (e.g., manages the use) of cloud services and resources from cloud service providers.

Cloud governance is frequently used interchangeably with corporate policies, standards, and procedures (PSPs) pertaining to cloud computing operations. In some cases, combining concepts can help organizations early in their cloud journey simplify the concept of cloud governance.

It is critical for large organizations in regulated industries to separate the concepts of cloud governance and PSPs. This is due to the fact that cloud governance encompasses the systems and cloud platform technology-specific implementations that prevent, identify and correct deviations from the PSP-defined requirements. The interpretation of controls and their proper implementation method is critical for regulated organizations operating in the cloud. This is the only way for an organization to ensure that the spirit of the PSP requirements, which are critical for managing operational and security risks, is met.

To effectively manage cloud computing risks and reap its benefits, good cloud governance should be founded on cloud-native thinking.

What Are the Advantages of Cloud Governance?

Effective cloud governance, based on a well-defined cloud governance framework, enables your organization to fully realize the benefits of the cloud while holistically managing costs as well as operational and security risks. Because the cloud removes the physical constraints of infrastructure capacity, capability, configuration, and speed from application teams, the need for governance, i.e. oversight and direction, becomes even more important.

Cloud computing solutions provide on-demand computing and data storage capabilities with nearly unlimited capacity. They are available globally, can cost significantly less than company-owned infrastructure, and are billed only as resources are consumed. With a single line of code, executed automatically by an unattended process anywhere in the world, all of the resources for an entire data center can be deleted and rebuilt repeatedly.

This is a new paradigm that necessitates a cultural shift. Traditional approaches to IT infrastructure and compliance are ineffective when applied to the cloud, which differs from data center architecture as much as real-time computing differs from batch computing. The cloud operates in real time. To manage the speed and scale that cloud computing provides, cloud governance must include processes for defining and operating new and tailored policies, standards, and procedures via automated systems.

Putting in Place a Cloud Governance Framework

Several cloud governance frameworks are available. In my opinion, the five pillars of the AWS Well-Architected Framework represent the most "cloud-centric" disciplines. They are listed in the following order:

  • Security
  • Cost Optimization
  • Operational Excellence
  • Reliability
  • Performance Efficiency

These disciplines and best practices should be prioritized according to your organization's business objectives, risk profile and tolerance, and cloud maturity. All of the framework elements should ideally be incorporated to some extent from the start of your cloud governance program. In the following section, I'll go over how to use key components of this framework's best practices to create a comprehensive cloud governance strategy.

It takes years of training and practice for an athlete to hone the skills and abilities required to compete at an elite level. In the same way, it takes time to develop and refine all of the standards and automation capabilities required to become proficient with cloud governance best practices. In other words, it is preferable to crawl, walk, and then run. The composition of your cloud governance foundation and the capabilities required for success during the 'crawl' stage are determined by the needs of your organization. It is critical to balance business objectives for using the cloud, DevOps and cloud-native maturity, and operational priorities (reliability, security, feature deployment speed, cost, and so on) while maintaining system Confidentiality, Integrity, and Availability.

Using Best Practices to Develop a Cloud Governance Strategy

Best practices for cloud governance start with the Cloud Service Provider's (CSP) shared responsibility model, which defines your organization's responsibilities for protecting your resources for each service you use.

It is important to consider how cloud governance best practices are implemented. If your organization uses ten different patterns to achieve the same operational result, your organization's ability to automate that best practice will be limited. Standardization of well-defined cloud infrastructure, configuration patterns, and controls is required to automate your cloud governance program's best practices in order to keep up with the cloud's speed and scale.

Security

Don't assume that all of a CSP's services have consistent security features. The customer and the CSP share responsibility for preventing unauthorized events such as data sharing, internet access to the resource, and access to the resource by other tenants. If the customer chooses to use a particular service, it is responsible for configuring that service to meet its requirements, as defined by the service-specific customer responsibilities.

For newly released services, CSPs may not provide fine-grained access control, logging of administrative activity or data access, or even encryption of data at rest. Each service is typically built by an agile product team that launches a series of Minimum Viable Products (MVPs) to serve the needs of a target customer persona and then matures each product with additional features over time, so a group of services may or may not share a common set of capabilities. This means that your organization may be unable to fulfill its fundamental responsibilities, i.e. meet minimum requirements or standards for safely configuring and utilizing all CSP services.

To begin, compare the cloud platform framework components for Identity and Access Management, networking, workload separation and isolation, and configuration options to your organization's standards. Then, evaluate each service that could be deployed against that requirements baseline to determine which configuration parameters and controls are required to protect your application workloads in relation to your data classification categories. These configurations and controls should be monitored and maintained automatically using compliance tools.
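To make the idea of automated monitoring concrete, here is a minimal sketch of a baseline check. It assumes a hypothetical inventory of resources, each tagged with a data classification and a dictionary of settings. The setting names and classification labels are illustrative only and do not correspond to any particular CSP or compliance tool.

```python
from typing import Any

# Hypothetical baseline: settings each resource must have, keyed by data classification.
REQUIRED_SETTINGS = {
    "internal": {"encrypted_at_rest": True},
    "confidential": {
        "encrypted_at_rest": True,
        "public_access": False,
        "access_logging": True,
    },
}


def find_violations(resource: dict[str, Any]) -> list[str]:
    """Return a message for every setting that deviates from the baseline."""
    required = REQUIRED_SETTINGS.get(resource["classification"], {})
    settings = resource.get("settings", {})
    return [
        f"{resource['name']}: {key} is {settings.get(key)!r}, expected {expected!r}"
        for key, expected in required.items()
        if settings.get(key) != expected
    ]


# Illustrative inventory; in practice this would be collected from the CSP's APIs.
inventory = [
    {
        "name": "reports-bucket",
        "classification": "confidential",
        "settings": {"encrypted_at_rest": True, "public_access": True, "access_logging": True},
    },
    {
        "name": "wiki-db",
        "classification": "internal",
        "settings": {"encrypted_at_rest": True},
    },
]

for resource in inventory:
    for message in find_violations(resource):
        print(message)
```

A real compliance tool would run a check like this continuously and either alert on or automatically correct each deviation.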

Cost (Optimization)

Begin by implementing basic cost management controls and tools on day one, and optimize later in the cloud governance maturation journey. Unit costs for cloud compute and storage can be lower, and you pay only for what you use. However, because physical constraints on infrastructure capacity and availability do not exist in the cloud, many organizations have been taken aback by massive cloud bills caused by resource sprawl.

Resource sprawl includes, but is not limited to:

  • Development and testing environments that run continuously
  • Large-scale evaluation and testing infrastructure that is not deleted after use
  • Indefinite backup and replication of obsolete and unused data
  • Snapshots of virtual machines, databases, and other resources
  • Over-provisioned resources

The availability of virtually instant full snapshots and unlimited capacity, combined with the cloud's scale, means that in the absence of automated custodial tools, actual cloud resource consumption is likely to be significantly higher than anticipated. At the end of the day, without proper cloud cost optimization, you miss out on one of the key benefits that the cloud offers in the first place.
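As a rough illustration of what an automated custodial tool does, the sketch below flags two of the sprawl patterns described above: snapshots older than a retention limit and idle development or test resources. The 90-day limit, the field names, and the sample inventory are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

MAX_SNAPSHOT_AGE = timedelta(days=90)  # assumed retention policy


def cleanup_candidates(resources, now):
    """Return names of resources a custodial job might flag for review or deletion."""
    flagged = []
    for r in resources:
        expired_snapshot = r["kind"] == "snapshot" and now - r["created"] > MAX_SNAPSHOT_AGE
        idle_test_env = r["tags"].get("env") in {"dev", "test"} and not r["in_use"]
        if expired_snapshot or idle_test_env:
            flagged.append(r["name"])
    return flagged


# Hypothetical inventory; a real job would read this from the CSP's APIs.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
resources = [
    {"name": "db-snap-2023-01", "kind": "snapshot", "in_use": False,
     "created": datetime(2023, 1, 15, tzinfo=timezone.utc), "tags": {"env": "prod"}},
    {"name": "load-test-cluster", "kind": "vm", "in_use": False,
     "created": datetime(2024, 5, 20, tzinfo=timezone.utc), "tags": {"env": "test"}},
    {"name": "web-prod-1", "kind": "vm", "in_use": True,
     "created": datetime(2024, 1, 1, tzinfo=timezone.utc), "tags": {"env": "prod"}},
]

print(cleanup_candidates(resources, now))
```

Consistent tagging is what makes a rule like this possible, which is one reason to standardize tags early in a governance program.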

Operational Excellence

The operational excellence best practice recommended at the start of a cloud governance program is to perform all cloud infrastructure operations as code in all environments. Infrastructure as code enables cloud-native thinking and the consistent, accurate, and compliant creation of infrastructure resources over and over again.
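The core idea behind infrastructure as code can be sketched in a few lines: resources are declared as data, and a tool compares that declaration with what is actually deployed to work out the changes needed. The example below is a simplified illustration of that comparison, not the API of any real infrastructure-as-code tool, and the resource names and settings are hypothetical.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Compare declared resources with deployed ones and list the changes needed."""
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "update": sorted(
            name for name in desired.keys() & actual.keys()
            if desired[name] != actual[name]
        ),
        "delete": sorted(actual.keys() - desired.keys()),
    }


# Hypothetical declared state, as it might be kept in version control.
desired = {
    "app-bucket": {"encrypted": True, "region": "eu-west-1"},
    "app-queue": {"retention_days": 7},
}
# Hypothetical deployed state, including a manual change and a leftover resource.
actual = {
    "app-bucket": {"encrypted": False, "region": "eu-west-1"},
    "old-test-vm": {"size": "large"},
}

print(plan(desired, actual))
# Once the environment matches the declaration, nothing remains to change.
print(plan(desired, desired))
```

Because the declaration is the source of truth, running the same code again produces the same environment, and manual drift shows up as a pending change.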

Understanding the characteristics and volumes of workloads is critical, as is ensuring that service quotas (rate limits) and network topology are sufficiently configured to accommodate them. The workload characteristics must account not only for the workload itself but also for the additional usage and request rates generated by automated monitoring and other capabilities that operate alongside it.
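A quick worked example shows why the tooling overhead matters. The sketch below assumes a quota expressed in requests per second and a 20% safety margin. Both the numbers and the function names are illustrative.

```python
def required_rate(workload_rps: float, overhead_rps: float, headroom: float = 0.2) -> float:
    """Peak request rate the quota must cover, including tooling overhead and a safety margin."""
    return (workload_rps + overhead_rps) * (1 + headroom)


def quota_is_sufficient(quota_rps: float, workload_rps: float,
                        overhead_rps: float, headroom: float = 0.2) -> bool:
    """True if the service quota covers the workload, its tooling, and the margin."""
    return quota_rps >= required_rate(workload_rps, overhead_rps, headroom)


# Hypothetical figures: an 800 req/s workload plus 150 req/s of monitoring calls.
print(quota_is_sufficient(1000, workload_rps=800, overhead_rps=0))    # overhead ignored
print(quota_is_sufficient(1000, workload_rps=800, overhead_rps=150))  # overhead included
```

With these figures, a quota of 1,000 requests per second looks adequate until the monitoring traffic is counted, at which point the required rate rises to roughly 1,140.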

Performance (Efficiency)

In terms of performance, instead of managing the provisioning and scaling of individual compute instances, consider standardizing on Platform-as-a-Service (PaaS, also known as "managed services") for application workloads.

The use of managed services shifts more responsibilities to the cloud platform provider, at the cost of access to, control over, and visibility into the underlying compute and storage systems. Your evaluation of the managed service should indicate whether there is enough configuration control and insight to meet your organization's security and operational requirements.

Reliability

The workload architecture should be designed not only to prevent failure scenarios but also to automatically detect and mitigate failures or changes in workload demand.

Cloud services' data redundancy, fault detection, and automatic scaling capabilities range from none to global. Managed cloud services typically include built-in within-region data redundancy and automatic capacity scaling, and some even replicate data across regions and scale workload processing capacity globally. The Service Level Agreement (SLA) terms for the services used should be consistent with the workload's resiliency requirements.
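One way to check SLA terms against a resiliency requirement is to estimate the combined availability of the services a workload depends on. The sketch below uses the standard formulas for independent components: availabilities multiply when every dependency must be up, and redundancy reduces the chance that all replicas fail together. The SLA figures are hypothetical, and real failures are not always independent, so treat the result as an upper-bound estimate.

```python
import math


def serial_availability(slas):
    """Availability when every dependency must be up (independent failures assumed)."""
    return math.prod(slas)


def redundant_availability(sla, copies):
    """Availability when any one of `copies` independent replicas is enough."""
    return 1 - (1 - sla) ** copies


# Hypothetical SLAs for three services that one workload depends on.
dependencies = [0.999, 0.9995, 0.9999]
target = 0.999  # assumed workload requirement

combined = serial_availability(dependencies)
print(f"combined: {combined:.4%}, meets target: {combined >= target}")
print(f"two replicas at 99%: {redundant_availability(0.99, 2):.4%}")
```

In this example the combined figure is roughly 99.84%, which falls short of a 99.9% target even though each individual SLA meets or exceeds it.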

For workloads operating on groups of individual compute and storage resources (Infrastructure-as-a-Service), the resiliency factors identified above must be implemented explicitly to support the workload's reliability requirements.

Cloud Governance Strategy

Using the above insights to develop a cloud governance strategy is critical for minimizing risk and optimizing organizational efficiency. Your approach to each best practice will most likely evolve as your cloud governance program matures to meet the changing needs of your business. Nonetheless, a solid framework will provide long-term benefits to your business and customers.
