Running tests with the CLI

export const meta = { title: 'Running tests with the CLI', description: 'Run Canary workflows from your terminal, locally or in CI/CD, with real-time results.', tags: ['guide', 'cli', 'testing', 'ci-cd'], };

The Canary CLI lets you trigger test runs from your terminal and stream results in real time. Use it locally during development or wire it into CI/CD to gate deployments.

You can also use the CLI for sandbox authoring and management workflows, including building templates, uploading artifacts, and managing template actions. See the Sandbox reference for sandbox-specific commands and options.

01 Prerequisites

At least one published flow in your organization.
An API key or an interactive login session.
If you start from a sandbox: access to that sandbox, so you can open the Connect locally entry point and launch the CLI from the sandbox UI.

If you haven't built flows yet, start with Building your first smoke suite.

02 Install the CLI

Install the Canary CLI globally with your preferred package manager:

bash

# npm
npm install -g @canaryai/cli

# or with bun
bun add -g @canaryai/cli

Use the same CLI for both test execution and sandbox workflows from your terminal, including service-level sandbox debugging commands.

Verify the install:

bash

canary version

03 Authenticate

You have two options: interactive login (for local development) or an API key (for CI/CD and automation).

The same authentication methods work for test commands and sandbox management commands.

If you are working in a sandbox, open the sandbox and use Connect locally to open the connect drawer. The drawer points you to the local CLI flow so you can authenticate, connect your local machine, and continue with richer agent workflows from your terminal.

The CLI also includes built-in guidance for Claude and Cowork MCP usage. After you install and authenticate, you may see CLI guidance that points you to AI-assisted workflows for exploring your app, building workflows, running tests, and investigating failures.

Option A: Interactive login

bash

canary login

This opens your browser for a one-time device code flow. Once approved, the CLI stores a long-lived token at ~/.config/canary-cli/auth.json.

If your account belongs to multiple organizations, you'll be prompted to pick one. You can also pass --org <name> to skip the prompt.
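
In a script shared by several people, it can help to make the organization optional. The following is a minimal sketch, assuming a variable named MY_CANARY_ORG that you define yourself (it is hypothetical and not something the CLI reads) and using only the --org flag described above:

```shell
# Log in, skipping the organization prompt when MY_CANARY_ORG is set.
# MY_CANARY_ORG is a hypothetical variable for this sketch, not a CLI setting.
canary_login() {
  if [ -n "${MY_CANARY_ORG:-}" ]; then
    canary login --org "$MY_CANARY_ORG"
  else
    canary login
  fi
}
```

Calling canary_login with MY_CANARY_ORG unset falls back to the interactive prompt.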

A recent CLI fix also improves organization selection during sign-in when your account has memberships tied to deleted organizations.

Option B: API key

Create an API key in Settings > API Keys (requires admin access), then pass it to the CLI:

bash

canary test --remote --token cnry_your_api_key

Or set it as an environment variable so you don't have to pass it every time:

bash

export CANARY_API_TOKEN=cnry_your_api_key

See API Keys for details on creating and managing keys.
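
In automation, a missing key is easier to diagnose when the script fails before any test run starts. A minimal sketch of such a preflight check, assuming the job authenticates only through CANARY_API_TOKEN (the function name is hypothetical; a machine that relies on canary login would not need it):

```shell
# Fail fast when no API key is exported in the environment.
require_canary_token() {
  if [ -z "${CANARY_API_TOKEN:-}" ]; then
    echo "CANARY_API_TOKEN is not set" >&2
    return 1
  fi
  return 0
}
```

For example, `require_canary_token && canary test --remote` skips the run entirely when the variable is empty.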

04 Run your tests

Trigger a test run across all published workflows:

bash

canary test --remote

The CLI will:

1. Start a remote test run
2. Stream real-time results to your terminal
3. Print pass/fail status for each workflow as it finishes
4. Exit with code 0 if all workflows passed, or 1 if any failed

Example output:

Starting remote workflow tests...

  ✓ Sign in flow
  ✓ Create project
  ✗ Checkout flow
    Error: Expected "Order confirmed" but page showed "500 Internal Server"
  ✓ Search and filter

──────────────────────────────────────────────────
FAILED: 1 of 4 workflows failed (75% pass rate)

Saved credentials now warm up before workflows begin during test suite runs. This helps authenticated sessions start in a ready-to-use state and reduces login friction at the start of automated runs.

AI-assisted guidance in the CLI

The CLI now surfaces built-in guidance for Claude and Cowork MCP usage while you work. Use that guidance when you want help from an AI agent without leaving your terminal workflow.

You may encounter this guidance when you:

Explore an app before you build flows
Build new workflows from a sandbox or test environment
Run tests and decide what to execute next
Investigate failed runs and gather the right context for follow-up work

Use the guidance to speed up common tasks:

| Task | How the guidance helps |
| --- | --- |
| App exploration | Points you toward MCP-assisted ways to inspect the app and understand key paths before recording or refining workflows |
| Workflow building | Suggests how to hand off workflow creation or iteration to Claude or Cowork MCP workflows |
| Running tests | Helps you move from a manual CLI run to an AI-assisted loop for selecting, rerunning, or expanding coverage |
| Failure investigation | Helps you gather context from failed runs and continue troubleshooting with an AI agent |

If you also use the CLI for sandboxes, you can run template and environment workflows from the same terminal session. For example, the CLI now supports sandbox action list, get, create, and delete commands for template authoring. See the Sandbox reference for the full command set.

You can also run a command inside a specific sandbox service instead of only the host VM. Use this when you need to inspect a container, run service-specific checks, or debug how your app behaves inside the sandbox.

bash

canary sandbox run-command <instance-id> <command...> --service <name>

For example, run tests inside a web service container:

bash

canary sandbox run-command sbx_123 npm test --service web
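
When the same check has to run in several services, the command above can be wrapped in a loop. This is a sketch only: the helper name and the service names in the usage line are placeholders, and it assumes --service may follow the command as in the example above.

```shell
# Run one command in each listed sandbox service, stopping at the first failure.
# Usage: run_in_services <instance-id> "<space-separated services>" <command...>
run_in_services() {
  instance_id=$1
  services=$2
  shift 2
  # $services is left unquoted on purpose so it splits on spaces.
  for service in $services; do
    echo "== $service =="
    canary sandbox run-command "$instance_id" "$@" --service "$service" || return $?
  done
}
```

For example, `run_in_services sbx_123 "web worker" npm test` would run npm test in the web service and then in a hypothetical worker service.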

Long-running sandbox builds are also more resilient. If a build takes longer than the gateway timeout, the CLI continues by polling diagnostics and reporting progress instead of failing immediately.

When uploading sandbox artifacts, use template-level artifacts for files shared across a template, and use custom file artifacts when you need an explicit destination path inside the sandbox. Refer to the Sandbox reference for current artifact upload options and examples.

05 Filter which tests run

You don't always want to run everything. Use tags and name patterns to run a subset.

By tag

Tags are assigned to flows in the UI. Filter by tag to run a specific category:

bash

canary test --remote --tag smoke

By name pattern

Match workflows by name (case-insensitive):

bash

canary test --remote --name-pattern "checkout"

Combine filters

When both are provided, workflows must match both criteria:

bash

canary test --remote --tag smoke --name-pattern "auth"
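
If a script receives the tag and name pattern as optional inputs, the flags can be added only when a value is present. A minimal sketch, with a hypothetical helper name, that uses only the --tag and --name-pattern flags shown above:

```shell
# Run remote tests, adding each filter flag only when a value is supplied.
# Usage: canary_test_filtered [tag] [name-pattern]
canary_test_filtered() {
  tag=${1:-}
  pattern=${2:-}
  # Rebuild the argument list, starting from the unfiltered command.
  set -- test --remote
  if [ -n "$tag" ]; then
    set -- "$@" --tag "$tag"
  fi
  if [ -n "$pattern" ]; then
    set -- "$@" --name-pattern "$pattern"
  fi
  canary "$@"
}
```

Calling `canary_test_filtered smoke auth` is equivalent to the combined command above, and calling it with no arguments runs every published workflow.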

06 Verbose mode

Add --verbose (or -v) to see every SSE event as it streams, including suite metadata and timing:

bash

canary test --remote --tag smoke --verbose
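
Verbose output can be long, so you may want to keep a copy on disk while it still streams to the terminal. The sketch below is one way to do that in a POSIX shell. The helper name and default log file name are hypothetical, and the status file is only a workaround for the fact that a pipeline reports the exit code of its last command.

```shell
# Stream verbose output to the terminal and a log file, keeping the CLI's exit code.
# Usage: canary_verbose_logged [log-file]
canary_verbose_logged() {
  log_file=${1:-canary-verbose.log}
  status_file=$(mktemp)
  {
    rc=0
    canary test --remote --tag smoke --verbose 2>&1 || rc=$?
    # The pipeline would report tee's status, so pass the CLI's status through a file.
    echo "$rc" >"$status_file"
  } | tee "$log_file"
  status=$(cat "$status_file")
  rm -f "$status_file"
  return "$status"
}
```

The function returns whatever canary test returned, so it can still be used as a pipeline gate.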

07 Exit codes

| Code | Meaning |
| --- | --- |
| `0` | All workflows passed |
| `1` | One or more workflows failed |

This makes the CLI a natural fit for CI/CD pipeline gates: if a workflow fails, the pipeline step fails.
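
Outside a CI system that checks exit codes for you, a script can branch on the code itself. A minimal sketch, assuming the smoke tag from the earlier examples and a hypothetical helper name:

```shell
# Run the smoke suite and report whether later steps should continue.
gate_on_canary() {
  if canary test --remote --tag smoke; then
    echo "Canary passed: continuing"
    return 0
  else
    status=$?
    echo "Canary failed with exit code $status: stopping" >&2
    return "$status"
  fi
}
```

A deploy script could then run `gate_on_canary || exit 1` before promoting a build.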

08 Run in GitHub Actions

Add this workflow to .github/workflows/canary-tests.yml:

yaml

name: Canary Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  canary:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install Canary CLI
        run: npm install -g @canaryai/cli

      - name: Run smoke tests
        env:
          CANARY_API_TOKEN: ${{ secrets.CANARY_API_TOKEN }}
        run: canary test --remote --tag smoke

Setup steps

1. Create an API key in your Canary organization.
2. In your GitHub repo, go to Settings > Secrets and variables > Actions.
3. Add a secret named CANARY_API_TOKEN with the key value.
4. Commit the workflow file above.

Tips

Use the --tag flag to run only smoke tests in CI rather than the full suite. Save the full suite for nightly or staging deploys.
Pin to a specific CLI version if you want reproducible builds: npm install -g @canaryai/cli@0.1.7
Add the step after your deploy so you're testing the freshly deployed version.
Use --verbose in CI for better debugging when a failure occurs.

For more CI platforms (GitLab CI, CircleCI, Jenkins), see the CI/CD Integration guide.

09 Environment variables

| Variable | Description | Default |
| --- | --- | --- |
| `CANARY_API_TOKEN` | API key or login token | (none) |
| `CANARY_API_URL` | API endpoint | `https://api.trycanary.ai` |
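
When a run reaches an unexpected endpoint, it can help to print the value your shell would hand to the CLI. The helper below is a hypothetical sketch that mirrors the default from the table. It assumes the CLI falls back to that default when the variable is unset or empty, and it does not query the CLI itself.

```shell
# Print CANARY_API_URL if set, otherwise the documented default endpoint.
canary_effective_api_url() {
  echo "${CANARY_API_URL:-https://api.trycanary.ai}"
}
```

To override the endpoint for a single command without exporting it, prefix the command with the assignment, for example `CANARY_API_URL=https://canary.example.internal canary test --remote` (the host shown is a placeholder).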

10 Troubleshooting

"No API token found"

Either set CANARY_API_TOKEN or run canary login first.

"Failed to start tests: 401"

Your token is invalid, expired, or revoked. Create a new API key or re-run canary login.

"No workflows found matching the filter criteria"

Check that your filters match at least one published workflow. Tags are exact matches; name patterns are case-insensitive substrings.

Tests pass in the UI but fail in CI

Confirm your flows target the correct environment (staging vs production).
Check that credentials are configured for the environment being tested.
Saved credentials now warm up before workflows begin, so re-run the suite if an older session started before this improvement.
Use --verbose to see detailed event data.

Sandbox command runs in the wrong environment

If you expected to run inside a specific container or service, add --service <name> to canary sandbox run-command. This targets that sandbox service directly instead of the host VM. See the Sandbox reference for service names and related sandbox commands.

Sandbox command returns a clearer CLI error

Sandbox actions and sandbox CLI operations now return clearer, more consistent errors for invalid input, disabled actions, missing instances, and timeouts.

If a sandbox command fails, read the CLI message first, then verify:

The instance ID is correct
The action is enabled for that template or environment
The service name passed to --service exists
The sandbox is still running and reachable

11 Next steps

Open a sandbox and use Connect locally when you want to move from the sandbox UI into a richer local CLI workflow.
Building your first smoke suite -- pick the right flows to test
API Keys -- manage keys for your team
CI/CD Integration -- examples for GitLab, CircleCI, and Jenkins
Sandbox reference -- manage sandbox templates, artifacts, builds, actions, service-scoped CLI commands, and agent-friendly sandbox workflows
Test Runs -- understand test run lifecycle and auto-verification
