Written by Shivani Govind, Product Engineer
Contributor Anirudh Sareen, Product Manager
Design and Operate Your Cloud System Using the Best Architectural Practices
The Well-Architected Framework helps architects understand and build high-quality infrastructure solutions that are secure, high-performing, reliable, and efficient, with guidelines suited to their applications.
With the help of a Well-Architected Framework, you can quickly identify problem areas, get actionable guidance to improve the workloads that matter most to your organization, and achieve standardization and consistency.
You have to keep track of the changes and updates that occur in and around your cloud portfolio to ensure you always follow best practices.
Well-Architected Audit (WAA), a module of CloudEnsure, offers a multi-cloud platform that helps you understand your system effectively and analyze whether it delivers as per business expectations. WAA combines general design principles with specific best practices and guidance in key areas known as the 5 pillars of the Well-Architected Audit.
5 pillars of the Well-Architected Audit in detail
Let’s take a closer look at the 5 pillars of the Well-Architected Audit and run through a checklist of questions you should be able to answer:
1. Operational excellence:
Operational excellence is the ability to run and observe the system to deliver business value and to deliberately improve the supporting processes.
- How are you evolving your workload while minimizing the impact of changes?
- What best practices for cloud operations are you using?
2. Security:
The security pillar is the ability to protect information, systems, and assets while delivering business value, using risk evaluation to improve the overall security of your cloud.
- How are you encrypting and protecting your data at rest?
- How are you encrypting and protecting your data in transit?
3. Reliability:
The reliability pillar is the ability to meet business and customer demand without interruption and to recover quickly from failures.
- How are you managing AWS limits for your account?
- How are you planning your network topology on AWS?
- Do you have an escalation path to deal with technical issues?
4. Performance efficiency:
The performance efficiency pillar is the ability to use computing resources efficiently.
- How do you select the appropriate instance type for your system?
- How do you ensure that you continue to have the most appropriate instance type as new instance types and features are introduced?
- How do you monitor your instances post launch to ensure they are performing as expected?
5. Cost optimization:
The cost optimization pillar is the ability to avoid or eliminate unneeded cost and suboptimal resources.
- How do you make sure your capacity matches but does not substantially exceed what you need?
- How are you optimizing your usage of AWS services?
The checks in CloudEnsure are categorized by these pillars so that they are easy to understand and act on.
What is a Check?
A check is a rule that verifies your workload follows industry best practices and avoids misconfigurations. CloudEnsure has numerous rules. For example, in AWS, there are a few rules to follow when creating a resource or implementing an architecture on the cloud:
- Termination protection should be enabled on EC2 instances
- S3 buckets should not be publicly accessible
- Disk encryption should be enabled on virtual machines
- Email alerts should be enabled for the SQL threat detection service
These are a few of the instructions to follow to ensure smooth provisioning; we call each such instruction or rule a check. To evaluate a check, CloudEnsure calls a cloud service API and inspects the value returned.
Example:
For any particular EC2 instance, Termination Protection is reported as either “enabled” or “disabled”.
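As a rough illustration of how such a check could read the returned value, the sketch below inspects a response shaped like the one the EC2 DescribeInstanceAttribute API returns for the disableApiTermination attribute. The function name and the sample responses are assumptions made for illustration; this is not CloudEnsure's actual implementation.

```python
def termination_protection_status(response: dict) -> str:
    """Return "enabled" or "disabled" from a DescribeInstanceAttribute-style response.

    Assumed response shape (for Attribute="disableApiTermination"):
    {"InstanceId": "...", "DisableApiTermination": {"Value": True}}
    """
    protected = response.get("DisableApiTermination", {}).get("Value", False)
    return "enabled" if protected else "disabled"


# Hypothetical sample responses; the instance IDs are made up.
protected_instance = {
    "InstanceId": "i-0123456789abcdef0",
    "DisableApiTermination": {"Value": True},
}
unprotected_instance = {
    "InstanceId": "i-0fedcba9876543210",
    "DisableApiTermination": {"Value": False},
}

print(termination_protection_status(protected_instance))    # enabled
print(termination_protection_status(unprotected_instance))  # disabled
```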
The large number of checks performed through API calls are grouped by the service names defined by the cloud provider.
Each check is executed, and its status is recorded based on the result. CloudEnsure has 4 status values (Pass, Fail, N/A, Cannot be determined):
- Pass Status: The check satisfies all the conditions for the rule
- Fail Status: The check fails to satisfy the condition and is counted as an identified issue
- N/A (Not Available): No resource has been created for this service
- Cannot be determined: Unable to reach or establish a connection with the service APIs
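One way to picture how a check result maps onto these four statuses is the sketch below. The `Status` enum, the `run_check` function, and the use of `ConnectionError` to signal an unreachable API are all assumptions for illustration, not CloudEnsure's internal logic.

```python
from enum import Enum
from typing import Callable, Iterable


class Status(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    NOT_AVAILABLE = "N/A"
    CANNOT_BE_DETERMINED = "Cannot be determined"


def run_check(list_resources: Callable[[], Iterable[dict]],
              rule: Callable[[dict], bool]) -> Status:
    """Evaluate a rule against every resource and map the outcome to a status."""
    try:
        resources = list(list_resources())
    except ConnectionError:
        # Assumption: an unreachable service API surfaces as ConnectionError.
        return Status.CANNOT_BE_DETERMINED
    if not resources:
        # No resource exists for this service, so there is nothing to evaluate.
        return Status.NOT_AVAILABLE
    return Status.PASS if all(rule(r) for r in resources) else Status.FAIL


def unreachable_api():
    raise ConnectionError("service API unreachable")


def is_private(bucket: dict) -> bool:
    return not bucket["public"]


print(run_check(lambda: [{"public": False}], is_private).value)                    # Pass
print(run_check(lambda: [{"public": False}, {"public": True}], is_private).value)  # Fail
print(run_check(lambda: [], is_private).value)                                     # N/A
print(run_check(unreachable_api, is_private).value)                                # Cannot be determined
```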
Checks that are identified as issues are categorized by severity level (Catastrophic, Critical, Moderate, Low).
Issues are counted for each pillar, and summing these counts gives the total number of issues for a particular status, which is shown at the top of the module.
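The per-pillar counting and the summed total can be sketched as follows. The list of issues is made-up sample data, and the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical failed checks as (pillar, severity) pairs.
issues = [
    ("Security", "Critical"),
    ("Security", "Catastrophic"),
    ("Reliability", "Moderate"),
    ("Cost optimization", "Low"),
    ("Security", "Critical"),
]

# Count issues per pillar and per severity level.
issues_per_pillar = Counter(pillar for pillar, _ in issues)
issues_per_severity = Counter(severity for _, severity in issues)

# Summing the per-pillar counts gives the overall total.
total_issues = sum(issues_per_pillar.values())

print(issues_per_pillar["Security"])    # 3
print(issues_per_severity["Critical"])  # 2
print(total_issues)                     # 5
```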
Features of Well-Architected Audit:
1. Rerun
Users with a premium account can re-run the audit at any time, and CloudEnsure runs the audit automatically once a day. Even for a free account, the audit runs once every 3 days.
You can see the status and the time of the last re-run for a specific account.
2. Download WAA report
You can generate reports for easy analysis. The WAA report comes in two types, and as the names suggest, they are self-explanatory:
- Summary report – an overall summarized report of WAA
- Detailed report – a list of failed checks with their descriptions, resource names, and resource IDs
You can download the detailed report in either PDF or Excel format, whichever best suits your business needs.
3. Mute/Unmute Functionality
Using this functionality, you can make an informed decision to ignore or hide checks of any service within any pillar that are currently not essential or do not require immediate attention. Muted checks remain available for review at a later point in time.
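The idea that a muted check is hidden from the active list but kept for later review can be sketched as below. The `CheckResult` fields, the helper names, and the pillar assigned to each sample check are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    service: str
    pillar: str
    status: str
    muted: bool = False


def active_issues(results):
    """Failed checks that have not been muted."""
    return [r for r in results if r.status == "Fail" and not r.muted]


def muted_checks(results):
    """Muted checks, kept so they can be reviewed or unmuted later."""
    return [r for r in results if r.muted]


# Hypothetical results; the pillar assignments are illustrative.
results = [
    CheckResult("Termination protection enabled", "EC2", "Reliability", "Fail"),
    CheckResult("Bucket not publicly accessible", "S3", "Security", "Fail"),
]

results[0].muted = True  # mute the EC2 check

print([r.name for r in active_issues(results)])  # ['Bucket not publicly accessible']
print([r.name for r in muted_checks(results)])   # ['Termination protection enabled']
```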
4. Fix Now
To resolve an issue, the user does not need to log in to the portal and fix it manually; instead, they can select the check and click “Fix Now”, and the check is resolved. This feature is called One-Click Remediation.
If the user prefers to fix the issue manually, there are two sets of steps to follow:
- CLI steps
- Console steps
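As an example of what a manual, scripted fix might look like for the termination protection check, the sketch below enables the setting through a boto3-style EC2 client. The helper name and the instance ID are hypothetical, and the steps CloudEnsure itself documents or runs for “Fix Now” are not described here; `modify_instance_attribute` with `DisableApiTermination` is the boto3 call assumed.

```python
def enable_termination_protection(ec2_client, instance_id: str) -> None:
    """Turn on termination protection for one EC2 instance.

    `ec2_client` is assumed to be a boto3 EC2 client (or any object exposing
    the same modify_instance_attribute method).
    """
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        DisableApiTermination={"Value": True},
    )


# Usage (requires boto3 and configured AWS credentials; the ID is made up):
#   import boto3
#   enable_termination_protection(boto3.client("ec2"), "i-0123456789abcdef0")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before pointing it at a real account.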
5. Send Reports by Email
You can email reports to the required users, or schedule a report to be sent as and when required. The report can be sent in either PDF or Excel format to a user group or to an individual registered user.
Process involved in Well-Architected Review
The Well-Architected review is an activity that audits your cloud architecture and reports any vulnerabilities present. The main aim of the audit is to understand the client's business needs, the core of their infrastructure, and their preparedness for load variability and security.
A Well-Architected review shows you the best architectural practices for designing solutions on cloud infrastructure. The review process follows the procedures and expertise of industry experts and cloud service providers, combining hundreds of cloud infrastructure checks in one single place in CloudEnsure.
Once the review process completes, your organization receives a detailed workload review and a dashboard where you can check the results across the organization. You can then use the detailed findings to fix issues and improve your infrastructure against industry benchmarks, making it more secure, high-performing, and cost-efficient.
Benefits of Well-Architected Audit
- Get the best solution to the vulnerabilities present in your cloud infrastructure, along with a prioritized, standard procedure for remediation.
- Stay up to date with new events that occur in the cloud infrastructure, e.g., a sudden spike in cost.
- Have a detailed discussion with the organization's senior management and IT stakeholders on how the updated infrastructure can benefit you.
- Address areas of concern, security, and compliance risk.
Conclusion:
Adopting a Well-Architected Framework is an essential part of ensuring smooth cloud governance across an organization's infrastructure. It not only defines best practices for designing and operating safe and cost-effective cloud services across five pillars but also keeps you optimized by providing actionable items. Organizations often fall short when they rely on a manual process to adopt this framework. This is where CloudEnsure comes in as a warrior and helps automate many of these processes, specifically in the Well-Architected Audit. The WAA, within a single page, gives you complete visibility into your cloud account and a better understanding of your cloud infrastructure without any effort. It continuously monitors and governs your architecture with actionable recommendations to shape your infrastructure into a secure, optimized, cost-efficient, reliable, and operationally efficient system.