IT Operations (ITOps) teams are tasked with managing complex systems and processes, and the right automation tool can significantly affect how effectively they meet that challenge.

But how do you choose between two products that can do everything?

Workflow-based automation tools are, in effect, Turing-complete programming environments that let both developers and non-developers program, trigger, and run IT and business processes. This means they can perform any computation. Their inherent value, then, is not just repackaging a scripting language. Rather, these tools offer a more accessible and advanced alternative to writing automation scripts, one that shortens time-to-value, lets developers code only when needed, and enables non-developer audiences to write automation logic.

An evaluation of a workflow-based automation tool for your ITOps workflows should be based on the following:

  1. Quality and range of prepackaged content and integrations
  2. Flexibility and extensiveness of the automation engine
  3. Value-add features

Prepackaged content and integrations

By design, workflow-based automation tools lower time-to-value by providing pre-packaged, pre-configured, and pre-validated content that can help organizations solve common ITOps use cases. 

This out-of-the-box content is built on top of the product’s feature set and is the best indicator of how quickly organizations can automate their business processes. The range of available out-of-the-box content is straightforward to determine: are there ready-made workflows that can help me automate my use cases?

For ITOps, these generally fall into one of the following categories (a sketch of one such workflow follows the list):

  • Employee and device lifecycle management - employee onboarding and offboarding, device assignment, role changes, system updates, and tool rollouts.
  • Alert and incident management - investigation and correlation, root cause analysis, automated notifications.
  • IT infrastructure management - infrastructure updates, CMDB updates, vulnerability management, automated troubleshooting, new service monitoring, setting up data pipelines.
  • Helpdesk and ticketing - AI agent support, automated ticket management, automated self-service requests.
  • Compliance - report generation, security policy enforcement, data management, and audit logging.
  • Performance dashboards - analysis of multiple data sources for infrastructure performance, website performance, financial reporting, and more.
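
To make this concrete, here is a minimal sketch of what a prepackaged employee-offboarding workflow might look like, expressed as declarative steps with a trivial runner. The step names, parameters, and structure are hypothetical; real products define workflows in their own schema and execute them in a managed engine.

```python
# Sketch of a declarative offboarding workflow (hypothetical schema).
# Real products define steps in their own schema and run them in a managed engine.

OFFBOARDING_WORKFLOW = [
    {"action": "disable_account",  "params": {"system": "okta"}},
    {"action": "revoke_licenses",  "params": {"suite": "m365"}},
    {"action": "reassign_tickets", "params": {"queue": "it-helpdesk"}},
    {"action": "collect_device",   "params": {"notify": "asset-management"}},
]

def run_workflow(steps, context):
    """Execute each step in order; a real engine adds retries, audit logs, etc."""
    for step in steps:
        print(f"[{context['employee']}] {step['action']} -> {step['params']}")

run_workflow(OFFBOARDING_WORKFLOW, {"employee": "jdoe"})
```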

However, tools typically differ in the quality of their out-of-the-box content, which is determined by the following:

  • Pre-configuration - ingested data is normalized and adheres to a schema, and actions call the right connectors and API endpoints.
  • Documentation - pre-packaged workflows have accompanying comments both as high-level descriptions and inside the workflow designer canvas. The logic flow from ingest to action is explained and each action and configuration within the workflow is documented.
  • Pre-validation - workflows are tested by the vendor under multiple circumstances, such as using different types of data, running at scale, and orchestrating different business applications.
  • Monitoring - workflows have built-in monitoring capabilities that report on execution times, errors, and generate logs.
  • Version control - changes and customizations to a workflow are maintained in a version control system that supports reverting, reviewing, and merging across multiple instances.
  • Testing data and environments - workflows can be run with real data in a test environment to see how they would behave in production.
  • Data connectivity - the solution can ingest data both synchronously and asynchronously to trigger workflows or feed new data to in-flight workflows, for example by using WebSockets or subscribing to Kafka topics.
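
The sketch below makes the data-connectivity point concrete: it subscribes to a Kafka topic and triggers a workflow per message. It assumes the kafka-python client, a broker at localhost:9092, and a made-up topic name; the trigger_workflow function stands in for whatever trigger mechanism the tool actually exposes.

```python
# Sketch: asynchronously trigger workflows from a Kafka topic.
# Assumes the kafka-python package and a broker at localhost:9092.
import json
from kafka import KafkaConsumer

def trigger_workflow(event: dict) -> None:
    # Placeholder for the tool's actual trigger mechanism.
    print(f"Triggering workflow for alert {event.get('id')}")

consumer = KafkaConsumer(
    "itops-alerts",                       # hypothetical topic name
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:                  # blocks, consuming events as they arrive
    trigger_workflow(message.value)
```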

Content can also be created by third parties, typically the technology providers themselves, such as ServiceNow for ITSM or Splunk for observability. These workflows can be made available via marketplaces or catalogs, where administrators choose the vendor and technology they want to integrate with and download fill-in-the-blanks connectors, playbooks, and actions.

Prepackaged integrations

Integrations can be evaluated on both quantity and quality. Quantity is straightforward: the more out-of-the-box integrations a tool has, the more likely it can support your entire tech stack. Quality depends on how much adjustment and manual configuration is required to connect with each third-party tool.

Generally speaking, integrations are achieved via APIs, which means that comprehensive automation tools support the following (a sketch of one such action follows the list):

  • Multiple actions per integrated tool - they must contain the most common (if not all) actions used by organizations in the third-party tool. For example, an integration with service management tools must include ticket creation, ticket updates, ticket assignment, ticket status queries, adding comments, and so on. Each action maps to different request parameters and queries.
  • Authentication and authorization - API access is granted via tokens, such as OAuth tokens. Workflow automation tools can provide a no-code field where administrators input the authentication token, configured once at the integration level rather than for each action.
  • Payload awareness - the workflow tool can identify the payload requirements, such as ticket ID, severity status, escalation status, and the like.
  • Version control - handling large numbers of API-based scripts requires version control as scripts are updated or APIs change. Ingested API data must also be normalized across tools to a shared schema such as the Common Event Format (CEF).
  • Validation - integrations can be tested using synthetic data to simulate how the workflow will behave in production without needing to use real-world data.
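
For a sense of what a single integration action boils down to, here is a sketch of a create-ticket call against a hypothetical service-management REST API, using a token configured once at the integration level. The URL, payload fields, and response shape are assumptions; real connectors wrap this in a no-code form.

```python
# Sketch: one "create ticket" action in an API-based integration.
# The endpoint, payload fields, and token handling are hypothetical.
import requests

INTEGRATION_TOKEN = "..."  # configured once per integration, not per action

def create_ticket(summary: str, severity: str) -> str:
    response = requests.post(
        "https://itsm.example.com/api/v1/tickets",           # hypothetical endpoint
        headers={"Authorization": f"Bearer {INTEGRATION_TOKEN}"},
        json={"summary": summary, "severity": severity},     # payload-aware fields
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["ticket_id"]

ticket_id = create_ticket("Disk usage above 90% on prod-db-01", "high")
print(f"Created ticket {ticket_id}")
```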

Automation engine

This refers to a tool’s ability to define automation logic, which can be workflow-based, scripting-based, or built in. Workflow-based automation defines step-by-step activities in a no-code/low-code environment, with playbooks triggering whenever their initial conditions are met. Playbook engines are complex features that must support advanced logic such as nested playbooks and calls to other workflows. Playbooks must also be validated in test environments with representative sample data to ensure there are no unintended loops or other errors in the workflow.
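
The sketch below illustrates the kind of advanced logic a playbook engine must support, in this case nested playbooks, where one playbook step invokes another. The playbook structure is hypothetical; real engines express this in their own designer or schema.

```python
# Sketch: nested playbooks - one playbook step invokes another (hypothetical schema).
PLAYBOOKS = {
    "enrich_alert": [
        {"action": "lookup_cmdb"},
        {"action": "query_threat_intel"},
    ],
    "handle_alert": [
        {"action": "deduplicate"},
        {"call": "enrich_alert"},         # nested playbook call
        {"action": "notify_oncall"},
    ],
}

def run(playbook_name: str, depth: int = 0) -> None:
    for step in PLAYBOOKS[playbook_name]:
        if "call" in step:
            run(step["call"], depth + 1)  # recurse into the nested playbook
        else:
            print("  " * depth + step["action"])

run("handle_alert")
```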

Scripting-based automation is typically based on programming languages such as Python and requires an adequate coding environment with validation capabilities, which can be delivered via a built-in IDE with debugging functions, schema validation, and similar supporting features.
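
As an example of the validation support such a scripting environment should provide, the snippet below checks a payload against a JSON Schema using the jsonschema package; the schema itself is a made-up example.

```python
# Sketch: validating a script's input against a JSON Schema (made-up schema).
from jsonschema import validate, ValidationError

TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "severity": {"enum": ["low", "medium", "high"]},
    },
    "required": ["id", "severity"],
}

try:
    validate(instance={"id": "INC-1042", "severity": "high"}, schema=TICKET_SCHEMA)
    print("payload is valid")
except ValidationError as err:
    print(f"payload rejected: {err.message}")
```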

Tools can also support built-in automation processes that are not exposed to customers, such as automatically normalizing data to a predefined product-specific schema, deduplicating fields, removing null values, and extracting payloads. Tools can also automatically and asynchronously update data whenever sources change, such as updates to local data sources like CMDBs.
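
To illustrate the kind of housekeeping these built-in processes perform, here is a sketch that maps raw event fields onto a shared schema and drops null values; the field mapping is an assumption for illustration.

```python
# Sketch: normalizing a raw event to a product-specific schema (hypothetical mapping).
FIELD_MAP = {"src_ip": "source_ip", "dst_ip": "destination_ip", "msg": "message"}

def normalize(raw_event: dict) -> dict:
    normalized = {}
    for key, value in raw_event.items():
        if value is None:                            # drop null values
            continue
        normalized[FIELD_MAP.get(key, key)] = value  # rename to the shared schema
    return normalized

print(normalize({"src_ip": "10.0.0.5", "dst_ip": None, "msg": "login failed"}))
# -> {'source_ip': '10.0.0.5', 'message': 'login failed'}
```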

The automation engine must also be evaluated based on multiple requirements, which can be categorized into the following:

  1. Developer-friendliness 
  2. Non-developer-friendliness
  3. Deployment and scalability
  4. Maintenance and management

Developer-friendliness

Developers have different requirements for an automation platform than their non-developer counterparts. Their processes are usually more complex and must fit into their technology stack. As such, tools that cater to developer audiences must offer capabilities such as the following (a sketch of one such integration follows the list):

  • Integrating with infrastructure-as-code (IaC) tools such as Terraform and Ansible.
  • Supporting declarative APIs and GitOps frameworks such as ArgoCD, Flux, and Weave.
  • Integrating with CI/CD tools such as Jenkins and GitLab.
  • Integrating with version control systems such as Git.
  • Automating and scripting via languages like Python, Perl, and PowerShell.
  • Supporting configuration file formats such as YAML and JSON.
  • Manipulating data with JavaScript.
  • Providing a native integrated development environment with built-in autocomplete, multi-line editing, debugging, and linting.
  • Importing external JavaScript libraries.
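
As one example of how these capabilities fit a developer's stack, the sketch below triggers a workflow from a CI/CD job by calling the automation tool's REST API. The endpoint, payload, and AUTOMATION_TOKEN variable are hypothetical; CI_COMMIT_SHA is the kind of variable a CI system such as GitLab sets.

```python
# Sketch: triggering a deployment workflow from a CI/CD job (hypothetical API).
import os
import requests

response = requests.post(
    "https://automation.example.com/api/workflows/deploy-checks/run",  # hypothetical
    headers={"Authorization": f"Bearer {os.environ['AUTOMATION_TOKEN']}"},
    json={
        "commit": os.environ.get("CI_COMMIT_SHA", "unknown"),  # set by the CI system
        "environment": "staging",
    },
    timeout=30,
)
response.raise_for_status()
print(f"Workflow run started: {response.json().get('run_id')}")
```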

Non-developer-friendliness

Also referred to as citizen developers, these are non-technical staff who can use the platform to build their own automation logic or verify that the automation conforms to the business process. To do so, solutions must provide:

  • Drag-and-drop workflows
  • AI-based automation builders, which convert natural language into playbooks
  • Guided workflows, which recommend next steps within a workflow

Deployment and scalability

Tools can be deployed in multiple form factors, such as:

  • VM-based virtual appliance - the solution is provided as a virtual machine image that can run either in on-premises environments or in the cloud.
  • Container form factor - typically a Docker image running either in on-premises environments or in the cloud.
  • Software-based - customers receive an installation file to install and run on their preferred operating system and hardware.
  • Public cloud image - the tool can be purchased from a public cloud provider’s marketplace and run in the respective cloud environment.
  • Software as a service (SaaS) - the vendor hosts the solution on the customer’s behalf and provides a web-based interface for users to interact with the application.

Scaling can take two forms: scaling up and scaling out. Scaling up refers to adding more compute resources to a single instance so it can handle a larger volume of events, data, or concurrent workflows. Scaling out refers to adding multiple parallel instances so the solution can grow horizontally.

Some tools may offer additional services such as proprietary databases, which means that the database component also needs to be considered in the deployment model. A SaaS-based solution abstracts this complexity from the end-user, but any self-hosted option must consider implications such as storage services, data architecture, retention policies, and hot/cold storage. 

Maintenance and management

Automation tools can only realize their intended value if they are easy to manage. If you automate business processes but then have to spend resources managing the automation tool, you are only offloading manual work to a different activity. Look for the following capabilities:

  • Dev, Staging, and Production environments - tools must be able to support multiple types of environments for testing and development purposes. These ensure that only production-ready workflows are fully deployed.
  • Debugging and validation - tools can support features that highlight misconfigurations, circular logic, broken dependencies, incorrect credentials, and the like.
  • Monitoring - tools can monitor automation activity both inside a workflow and at a high level across playbooks. Inside a workflow, a tool can monitor status codes for API requests, execution times, and so on (see the sketch after this list).
  • Multitenancy and role-based access controls - where multiple business units across the organization need to use the automation tool, multitenancy features can isolate data and compute across separate instances. RBAC can ensure that users only view and access workflows according to their permissions.
  • Identity provider (IdP) integrations - the tool can import users and access policies from identity providers and directory services such as Okta and Active Directory, including SSO and MFA.
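
To make the monitoring point concrete, here is a sketch of the kind of per-step instrumentation a tool might maintain internally: a decorator that records execution time and success or failure for each workflow step. The implementation is illustrative only.

```python
# Sketch: per-step monitoring - execution time and outcome (illustrative only).
import functools
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("workflow.monitor")

def monitored(step):
    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = step(*args, **kwargs)
            status = "ok"
            return result
        except Exception:
            status = "error"
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            log.info("step=%s status=%s duration_ms=%.1f",
                     step.__name__, status, elapsed_ms)
    return wrapper

@monitored
def update_cmdb():
    time.sleep(0.1)  # stand-in for real work

update_cmdb()
```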

Value-add features

These are solution capabilities that are neither out-of-the-box content nor automation features. They include non-functional requirements such as the vendor’s ability to support large-scale deployments, for example:

  • Creating customer-facing front-ends - tools can generate forms or other types of applications that provide a UI for consumers. Advanced solutions can generate web pages for these front-ends and host them within the tool’s deployment.
  • SOC 2 Type II compliance - demonstrated ability to securely manage data.
  • Enterprise-grade support - support services such as ticketing, phone- and chat-based support, technical documentation, and community forums.
  • Professional services - tool providers can offer engineering services for development and deployment assistance, such as creating high-availability configurations or integrating with legacy or proprietary applications.
  • Encryption for data in transit - tools that pull sensitive information such as customer data into a workflow can employ transit encryption to protect against man-in-the-middle (MITM) or sniffing attacks.
  • Embedding interfaces into your existing ITSM tools - workflow inputs can be natively integrated into other ITSM tools to ingest data directly.
  • AI self-hosting - embedding AI models in the tool’s deployment to run models such as LLMs locally rather than calling external APIs.

Wrap up

In summary, when selecting a workflow-based automation tool for your ITOps, focus on the following key factors:

  • Prepackaged content & integrations: Look for ready-made workflows and extensive integrations.
  • Automation engine flexibility: Ensure it supports both developers and non-developers with no-code/low-code options and advanced logic.
  • Scalability & deployment: Consider deployment models (SaaS, container, on-premise) and scalability.
  • Value-add features: Evaluate additional features like AI self-hosting, enterprise support, and data security.

These elements will help you choose a tool that meets your organization's automation needs efficiently.