DevOps Domains

DevOps is where Software Engineering and Operations meet, and as you can imagine, it encompasses a lot of very important domains.

The primary domains are listed below:

  1. Planning & Collaboration
  2. Source Code & Configuration Management
  3. CI/CD
  4. Change Management & Governance
  5. Monitoring & Observability
  6. Site Reliability
  7. Security & Compliance (DevSecOps)
  8. Cloud & Infrastructure Management
  9. Experimentation & Innovation (DevEx & AI/ML Ops)

Each of these domains is fundamental to the three underlying goals of most companies and organizations:

  1. Delivering the highest quality product to the end-user.
  2. Continuously improving operational efficiencies and reducing infrastructure cost.
  3. Staying out of the News.

Though DevOps Engineer is a very common job title within the industry, and the role is absolutely needed for implementing the tools, pipelines, and processes, not everyone who “does DevOps” holds a “DevOps” title.

In my experience, the responsibilities and tasks that fall into these nine DevOps Domains are treated as a “shared responsibility”. This is why understanding each DevOps Domain (its objectives, value, common tasks, and emerging trends) is crucial for anyone on a Product Engineering team who wants to be the best they can be.

The DevOps Planning & Collaboration Domain overlaps heavily with the traditional responsibilities of other roles within Product Engineering.

The following tasks all fall into this overlapping bucket:

  • Agile Ceremonies (Standup, Planning Meetings, Retros)
  • Requirements Analysis, Breakdown, & Prioritization
  • Backlog & Iteration Management
  • Stakeholder Alignment

However, for what I consider more uniquely “DevOps Planning”, think: planning for what is being planned.

This includes determining the storage, compute, and networking infrastructure needs based on product requirements and projected usage, so that estimated infrastructure cost can be projected alongside estimated revenue and development resource cost. It also ensures that the infrastructure is available when the development team needs it, avoiding operational bottlenecks.
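
To make this kind of planning concrete, here is a minimal sketch of a cost projection driven by projected usage. Everything in it is an assumption for illustration: the function name `project_monthly_cost`, the parameters, the 30% headroom, and the unit prices are hypothetical, and networking cost is left out to keep it short.

```python
import math


def project_monthly_cost(
    projected_users: int,
    requests_per_user_per_day: float,
    instance_capacity_per_day: float,
    instance_cost_per_month: float,
    storage_gb_per_user: float,
    storage_cost_per_gb_month: float,
    headroom: float = 0.3,
) -> dict:
    """Estimate instance count and monthly cost from projected usage."""
    daily_requests = projected_users * requests_per_user_per_day
    # Provision above the projection so traffic spikes don't become a bottleneck.
    instances = math.ceil(daily_requests * (1 + headroom) / instance_capacity_per_day)
    compute_cost = instances * instance_cost_per_month
    storage_cost = projected_users * storage_gb_per_user * storage_cost_per_gb_month
    return {
        "instances": instances,
        "compute_cost": compute_cost,
        "storage_cost": storage_cost,
        "total_cost": compute_cost + storage_cost,
    }


# Hypothetical inputs; real numbers come from product requirements and vendor pricing.
estimate = project_monthly_cost(
    projected_users=50_000,
    requests_per_user_per_day=40,
    instance_capacity_per_day=500_000,
    instance_cost_per_month=70.0,
    storage_gb_per_user=0.05,
    storage_cost_per_gb_month=0.02,
)
```

With these made-up inputs the estimate comes to six instances and roughly 470 per month, in whatever currency the unit prices use. That figure can then sit next to the revenue and development cost projections.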

DevOps Planning also includes Code Quality & Development Workflow planning. This entails things like defining the code-quality gates for Code Coverage %, Acceptable Code Analysis Score Thresholds, Peer Review Policy, Acceptable Severity for Security Scan Findings, etc. This planning allows the Development Team to continually deliver with quality baked in, which reduces the risk of additional development effort in the future to address vulnerabilities or avoidable bugs, and it promotes quality as a requirement instead of a phase in the process.
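
As a rough illustration of such gates, the sketch below checks a build report against a set of planned thresholds. The names (`QualityGates`, `BuildReport`, `evaluate`), the severity scale, and the threshold values are all hypothetical; real values would come out of the team's own planning.

```python
from dataclasses import dataclass

# Severity levels ordered from least to most severe (illustrative scale).
SEVERITY_ORDER = ["none", "low", "medium", "high", "critical"]


@dataclass(frozen=True)
class QualityGates:
    """Thresholds agreed on during planning (example values only)."""
    min_coverage_pct: float = 80.0
    min_analysis_score: float = 7.5
    min_approvals: int = 1
    max_allowed_severity: str = "low"


@dataclass(frozen=True)
class BuildReport:
    """Results collected by the pipeline for a single change."""
    coverage_pct: float
    analysis_score: float
    approvals: int
    worst_finding_severity: str


def evaluate(report: BuildReport, gates: QualityGates) -> list[str]:
    """Return the names of failed gates; an empty list means the change passes."""
    failures = []
    if report.coverage_pct < gates.min_coverage_pct:
        failures.append("coverage")
    if report.analysis_score < gates.min_analysis_score:
        failures.append("analysis_score")
    if report.approvals < gates.min_approvals:
        failures.append("peer_review")
    worst = SEVERITY_ORDER.index(report.worst_finding_severity)
    allowed = SEVERITY_ORDER.index(gates.max_allowed_severity)
    if worst > allowed:
        failures.append("security_findings")
    return failures
```

A pipeline step could call `evaluate` and fail the build whenever the returned list is non-empty, which is one way to make quality a requirement rather than a later phase.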

The DevOps goal for Source Code & Configuration Management is that all technical components be defined as code: infrastructure, application, database, application configuration settings, data operations, and even the CI/CD pipeline itself. The value of this is that everything is version-controlled, *almost* anything is reproducible, and the approach lends itself well to automation.

Another goal of DevOps within the Source Code & Configuration Management domain is Security. This is primarily on two fronts:

  1. Compliance and Auditability – When everything is code and version controlled, everything is traceable, making it easier to obtain industry and/or contractual compliance.
  2. Credential & Secret Management – Applying access control to sensitive data and storing it encrypted in dedicated secrets management software are some of the first steps to ensuring sensitive data is never leaked (a callback to one of the main goals: Staying out of the News).

Emerging Trends & Best Practices:

  • GitOps: A growing best practice is GitOps, where Git is not just for app code but the single source of truth for infrastructure and configuration as well. All changes (code or infra) are done via Git commits and pull requests, and automated agents apply those changes to environments. This brings Kubernetes and cloud infrastructure management into the same workflow as code, with a full audit trail and the ability to roll back by reverting a Git commit.
  • Trunk-Based Development: Many high-performing teams favor trunk-based development for continuous integration – keeping a single main branch that’s always stable, and using feature flags for work in progress. This reduces long-lived branches and complex merges, enabling frequent integrations (often multiple times a day). It’s complemented by automated tests to ensure stability.
  • Shifting Config to Code: There is a move towards treating everything as code. Not only infrastructure, but also things like pipeline definitions, compliance policies, and even documentation are managed in source control. This ensures traceability and the ability to automate changes.
  • Policy as Code: In configuration management, teams are adopting policy as code to enforce standards. For example, using tools like Open Policy Agent (OPA) to automatically review infrastructure code for policy violations (e.g., disallowing open security groups in Terraform code) before it’s merged. This governance-as-code approach is becoming a best practice in regulated environments.
  • Continuous Integration of Config: Merging configuration changes triggers test deployments in ephemeral environments. Best practice is to test infrastructure changes similarly to application code changes. For instance, if a developer updates an Ansible playbook, a CI pipeline might instantiate a test VM, apply the playbook, and run tests to verify the configuration works as expected. This catches errors early in config changes.
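
The Policy as Code practice in the list above can be sketched in a few lines. OPA policies are written in Rego, so the Python below is only a stand-in that shows the shape of such a check. The input layout (`name`, `ingress`, `cidr`, `port`) is a simplified, hypothetical structure and not Terraform's actual plan format.

```python
OPEN_CIDR = "0.0.0.0/0"


def find_open_ingress(security_groups, allowed_public_ports=(443,)):
    """Return (group name, port) pairs that expose a non-allowed port to the world."""
    violations = []
    for group in security_groups:
        for rule in group.get("ingress", []):
            if rule["cidr"] == OPEN_CIDR and rule["port"] not in allowed_public_ports:
                violations.append((group["name"], rule["port"]))
    return violations


# Hypothetical, simplified input; a real check would read the planned changes.
groups = [
    {"name": "web", "ingress": [{"cidr": "0.0.0.0/0", "port": 443}]},
    {"name": "db", "ingress": [
        {"cidr": "0.0.0.0/0", "port": 5432},
        {"cidr": "10.0.0.0/16", "port": 5432},
    ]},
]

violations = find_open_ingress(groups)
```

Here only the `db` group's world-open port 5432 is flagged. Run as a CI step, a non-empty result would block the merge until the rule is tightened.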

Some consider CI/CD the backbone of DevOps. For those unfamiliar with the term, CI/CD is a combination of two very widely used practices: Continuous Integration and Continuous Delivery (or Deployment).

The main goals of CI/CD are:

  1. Automate Code Integration & Validation to detect and address bugs earlier in the Development Cycle. (CI)
  2. Deliver Software faster and more reliably. (CD)
  3. Zero Downtime Deployments. (CD)
  4. Improve Software Quality and Stability. (CD)
  5. Enhance Security & Compliance. (CD)

Notice I’ve categorized these goals as either “CI” or “CD”. I think of these two as “Build” and “Deliver”, where branch strategies, testing, code analysis scans, and quality gates are part of the definition of “Build”, and environments, approvals, and deployment strategies are part of the definition of “Deliver”.

TODO – IMAGE of a CI/CD Pipeline

Together, these goals provide the value of enhanced delivery speed, quality, and security. They also enable Progressive Delivery through the use of feature flags, canary releases, and A/B testing, which reduces deployment risk and provides quicker and safer Build-Measure-Learn loops.
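
One common way to implement the percentage-based side of Progressive Delivery is to hash each user into a stable bucket and compare it with the rollout percentage. The sketch below assumes that approach. The function name `in_rollout` and the flag name are hypothetical, and a real feature-flag service would add targeting rules and a kill switch on top.

```python
import hashlib


def in_rollout(flag_name: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < percent


# Example: decide whether a user sees a hypothetical "new-checkout" feature at 10%.
show_new_checkout = in_rollout("new-checkout", "user-42", 10)
```

Because the bucket depends only on the flag and the user, a user who is in the rollout at 10% stays in it when the percentage is raised. That keeps the experience consistent as a canary widens.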

Emerging Trends & Best Practices:

  • Everything as Code: Treating pipelines as code (e.g., GitHub Actions YAML, Jenkinsfile) for versioned and reproducible workflows.
  • GitOps for CI/CD: Use Git as the single source of truth, automating deployments based on Git repository changes.
  • AI-Driven Pipelines: Using AI/ML in CI/CD to optimize test execution, detect flaky tests, and predict deployment failures.

In contrast to traditional Change Management processes, which usually involve a heavy amount of time-consuming manual approvals, DevOps Change Management focuses on lightweight, automated governance. This approach doesn’t aim to cut corners: it still requires that all changes are tracked, reviewed, and auditable, just without the bottlenecks.

The DevOps Change Management approach builds on other key principles discussed in the sections above, such as “all components as code”, “all code version controlled”, and “automated testing and quality gates”. Combined, these can serve as automated change approvals. In short, if every change triggers CI and then CD, and all quality gates pass, then there’s no need for someone to review each deployment, provided there’s faith in the quality gates.

For Production (and sometimes Staging) deployments or changes, the DevOps approach still uses Change Tracking, usually through Jira or another Product Management Software. All changes are recorded in a Change Request that is associated with its Backlog item(s), and through them with the Code Commits for each item, for end-to-end traceability. Oftentimes the Backlog item has associated Acceptance Criteria and Test Artifacts that serve as proof the changes were validated, should they ever be audited.

This also allows for Incident and Change Correlation, which is covered more in-depth in the SRE section below, but in short, Change Management practices can help correlate changes to incidents so that checks can be added to prevent similar incidents in the future (continual improvement).

Emerging Trends & Best Practices:

  • Continuous Compliance: Instead of periodic audits, teams aim for continuous compliance. This means the system is always in a compliant state by design. Automated checks make sure every change meets compliance requirements, and evidence is collected in real-time. For example, if an industry regulation requires test evidence for each change, the pipeline attaches test reports to each change record automatically. This trend reduces the scramble of preparing for audits, since you’re always audit-ready.
  • Blameless Governance: Borrowing from the SRE culture of blameless post-mortems, governance is shifting to not punish or gatekeep developers but to enable them. Best practice is to treat governance failures (e.g., a change that caused an incident) as a learning opportunity to improve the automated rules or to add a missing test, rather than reverting to heavy-handed manual processes. This encourages devs to engage positively with the change process rather than trying to circumvent it.
  • Change as Code: As with other areas, treating change processes as code is a best practice. Teams store their change process definitions (workflows, pipeline configurations, policy rules) in version control. This allows reviewing changes to governance the same way as code changes. For instance, if you want to modify the approvals needed for a deployment pipeline, you do so via a code change in the pipeline config, which can be reviewed and tested.
  • Decentralized Approval with Guardrails: Enterprises adopting DevOps at scale often move from a single centralized Change Advisory Board (CAB) to decentralized approval. Each team can approve its own changes as long as they stay within guardrails (like change types deemed standard). Automation classifies changes – say, a routine application deployment vs a risky infrastructure change – and auto-approves the routine ones. Only exceptional cases go to a central authority. This dramatically increases deployment frequency while still managing risk.
  • Metrics-Driven Change Management: Teams start to measure the effectiveness of their change management process with metrics like Change Lead Time (how long from code commit to production), Change Failure Rate (what percentage of changes cause incidents), etc. These come from the DevOps Research and Assessment (DORA) metrics. A best practice is to use these metrics to continually improve: for example, if the change failure rate is high, invest in better testing (governance by improving quality). If lead time is high without justification, see which approvals or steps are bottlenecks and automate or streamline them. In essence, governance itself becomes data-informed in the DevOps model.
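
The two metrics named in the last item above are simple to compute once each change is recorded with its commit time, deployment time, and whether it caused an incident. The sketch below assumes that record shape. The `Change` class and function names are hypothetical, and using the median for lead time is a choice made here, not something the metric definition requires.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class Change:
    committed_at: datetime
    deployed_at: datetime
    caused_incident: bool


def change_lead_time(changes: list[Change]) -> timedelta:
    """Median time from code commit to production deployment."""
    seconds = [(c.deployed_at - c.committed_at).total_seconds() for c in changes]
    return timedelta(seconds=median(seconds))


def change_failure_rate(changes: list[Change]) -> float:
    """Percentage of changes that caused an incident."""
    if not changes:
        return 0.0
    failed = sum(1 for c in changes if c.caused_incident)
    return 100.0 * failed / len(changes)


# Made-up sample data for illustration only.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Change(start, start + timedelta(hours=1), False),
    Change(start, start + timedelta(hours=2), False),
    Change(start, start + timedelta(hours=4), True),
    Change(start, start + timedelta(hours=30), False),
]
lead_time = change_lead_time(sample)
failure_rate = change_failure_rate(sample)
```

For this sample the median lead time is three hours and the failure rate is 25%. The median keeps the single 30-hour outlier from dominating the lead-time figure.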

If CI/CD is the backbone of DevOps, Monitoring and Observability is the brains. In short, monitoring and observability are about ensuring the health and performance of applications and infrastructure after they’re deployed.

Monitoring traditionally refers to collecting metrics and logs on predefined aspects of the system (CPU, memory, error rates, etc.) and setting up alerts for when things go wrong.

Observability is a broader concept – the ability to ask questions about system behavior and dig into issues even if they weren’t explicitly anticipated. An observable system produces rich telemetry (logs, metrics, traces) that allows engineers to understand why something is wrong, not just that something is wrong.

DevOps Monitoring & Observability goals are:

  • Identify and address gaps in System Performance
  • Capacity and Performance Monitoring: Continuously observing usage patterns to predict scaling needs.
  • Tracking & Monitoring System Performance Metrics

TODO – Visual of monitoring integrations

Emerging Trends & Best Practices:

  • Observability-Driven Development: Just as tests are written for new features, teams now consider how they will observe a feature in production as they build it. Best practice is to add telemetry (logs, metrics, traces) for new code paths during development. For example, if you add a new payment service, you also add metrics for payment attempts/successes and a log line for each payment transaction. This “instrument as you develop” approach ensures new features come with the visibility needed to monitor them.
  • Three Pillars of Observability: It’s now standard to ensure you have the three pillars: Logs, Metrics, and Traces. Mature teams augment this with event data or synthetic monitoring (like scheduled test transactions) – sometimes called the fourth pillar. A best practice is correlating these signals. For instance, use trace IDs in log entries so you can jump from a log line to a full trace of that request. This unified approach dramatically speeds up troubleshooting.
  • AIOps and Intelligent Alerting: AI is increasingly applied to monitoring data. AIOps platforms analyze logs and metrics to identify patterns or predict incidents (e.g., using machine learning to forecast when resource exhaustion might occur based on trends). They can also reduce noise by grouping related alerts. For example, instead of 100 alerts for 100 servers all reporting high CPU (from a single cause), an AI-driven tool might group them into one incident. The trend is to leverage machine learning to assist DevOps teams in catching issues proactively and pinpointing root causes faster.
  • Shift-Left Observability (Observability in Testing): Observability tools are not just for production. Teams are using them in pre-prod environments to debug test failures and even to ensure tests themselves cover the right scenarios. For instance, during a load test, the same dashboards and traces used in production can reveal how the system behaved under load. This helps correlate test results with system internals. It’s a best practice to treat staging like prod in terms of monitoring – so you have confidence before release, and you practice using your observability tooling.
  • Customer Experience Monitoring: Beyond technical metrics, there’s a push to monitor user experience more directly. This includes Real User Monitoring (RUM) – capturing performance data from actual user browsers/devices – and synthetic user journeys (transaction monitoring). By observing things like page load times from the user’s perspective or the success rate of a full purchase flow, DevOps teams get a more holistic view of service quality. This trend ties in with the DevOps focus on customer value: not just “is the server up?” but “are users able to do what they need, quickly and successfully?”. Combining these insights with traditional monitoring rounds out observability.
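
The suggestion in the list above to put trace IDs in log entries can be sketched with Python's standard `logging` and `contextvars` modules. This is a minimal sketch, not a tracing implementation: in practice the trace ID would usually come from a tracing library, and the names `current_trace_id`, `TraceIdFilter`, `build_logger`, and `handle_payment` are hypothetical.

```python
import contextvars
import logging
import sys
import uuid

# Holds the trace ID of the request currently being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = current_trace_id.get()
        return True


def build_logger(stream=sys.stderr) -> logging.Logger:
    """Create a logger whose every line carries the active trace ID."""
    logger = logging.getLogger("payments")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def handle_payment(logger: logging.Logger, amount_cents: int, trace_id=None) -> str:
    """Log a payment attempt and its outcome under a single trace ID."""
    trace_id = trace_id or uuid.uuid4().hex
    token = current_trace_id.set(trace_id)
    try:
        logger.info("payment attempt amount_cents=%d", amount_cents)
        logger.info("payment succeeded")
    finally:
        current_trace_id.reset(token)  # don't leak the ID into the next request
    return trace_id
```

Every line written while a payment is being handled carries the same `trace_id=...` field, so a log search on that value returns the whole request and can be matched to the corresponding trace.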

  • Handling incidents and outages, performing root cause analysis to prevent recurrence.
