The fastest way to use Operator is to describe a skill in chat. The manager walks you through building it, testing variants, and setting up whatever the skill needs.

Your first skill in five steps

Describe your skill

Open the Operator console and tell the manager what you want. Include what the skill should do and, if it is not obvious, how you will measure success.
Build a skill that triages our PagerDuty alerts using runbooks and past incident data.
Learn which alerts need a human and which can auto-resolve.
Measure by pages per week to on-call.
Test 5 variants and show me the results.
The manager asks one or two clarifying questions if needed, then creates the instance, writes the skill (a SKILL.md plus any scripts or references it needs), and installs it. It walks you through what it built and what comes next. Skills follow the open Agent Skills format.

Add secrets

Each instance is its own computer, but it needs credentials to access external services. If the skill needs API keys, open Environment and add them. Any API key you store here and grant to the instance becomes available to the agent.
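| Service        | Secret                             |
|----------------|------------------------------------|
| GitHub         | GITHUB_TOKEN                       |
| Salesforce     | SALESFORCE_TOKEN                   |
| Email (Resend) | RESEND_API_KEY                     |
| Database       | DATABASE_URL                       |
| PagerDuty      | PAGERDUTY_API_KEY                  |
| Any API        | Whatever key that service requires |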
Each secret can be global (available to every instance) or instance-specific (granted only where needed).
Return to chat after saving:
I added SALESFORCE_TOKEN. Grant it to the lead scoring skill and continue setup.
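This page does not say how a granted secret reaches the skill's code. As a minimal sketch, assuming granted secrets are exposed to the instance as environment variables under the names shown in the table (an assumption, not confirmed Operator behavior), a skill script could check for them before calling any external service. The helper `missing_secrets` and the `REQUIRED_SECRETS` list are hypothetical.

```python
import os

# Assumption: granted secrets appear as environment variables with the
# same names used on the Environment page (e.g. PAGERDUTY_API_KEY).
REQUIRED_SECRETS = ["PAGERDUTY_API_KEY"]


def missing_secrets(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets(REQUIRED_SECRETS)
    if missing:
        print("Missing secrets:", ", ".join(missing))
    else:
        print("All required secrets are present.")
```

Failing early with the list of missing names makes it easier to tell the manager exactly which secret still needs to be granted.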

Connect a channel (optional)

If the agent should reach you directly in Telegram or Discord, open the instance page and connect a channel.
Telegram needs a bot token from @BotFather and your numeric Telegram user ID.
Telegram is set up. Have the agent confirm it can reach me.
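If the agent cannot reach you, it can help to check the bot token and user ID outside Operator. The sketch below calls the public Telegram Bot API `sendMessage` method directly. It is independent of how Operator connects the channel, and the function names are hypothetical. For a private chat, `chat_id` is your numeric user ID, and you need to have started a conversation with the bot first.

```python
import json
import urllib.parse
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_send_message_request(bot_token, user_id, text):
    """Build (url, body) for the Telegram Bot API sendMessage method."""
    url = f"{TELEGRAM_API}/bot{bot_token}/sendMessage"
    # int() rejects a non-numeric ID (such as a @username) early.
    params = {"chat_id": int(user_id), "text": text}
    body = urllib.parse.urlencode(params).encode()
    return url, body


def send_test_message(bot_token, user_id, text="Channel check"):
    """POST the message and return Telegram's decoded JSON response."""
    url, body = build_send_message_request(bot_token, user_id, text)
    with urllib.request.urlopen(url, data=body, timeout=10) as resp:
        return json.load(resp)
```

A response with `"ok": true` indicates that the token and user ID are valid as a pair.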

Watch the test round

The manager walks you through testing. It explains what variants it wants to create, spins them up, sends test inputs, and reports results.
You will see a comparison table:
| Variant | Approach                  | Score  |
|---------|---------------------------|--------|
| v1      | Weighted feature scoring   | 72.1%  |
| v2      | Chain-of-thought analysis  | 68.4%  |
| v3      | Recency-biased heuristic   | 76.8%  |
| v4      | Ensemble (v1 + v3)         | 79.2%  |
| v5      | Random baseline            | 54.3%  |
From here you can:
Run another generation. The top variant is good but I think we can do better on edge cases.
Deploy the winner to production. Clean up the variant instances.
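To compare variants yourself before choosing between those two prompts, the table can be ranked with a few lines of code. This is a sketch that assumes you have copied the comparison table out of chat as pipe-delimited text; `parse_results` and `rank` are hypothetical helpers, not part of Operator.

```python
TABLE = """\
| Variant | Approach                  | Score  |
|---------|---------------------------|--------|
| v1      | Weighted feature scoring   | 72.1%  |
| v2      | Chain-of-thought analysis  | 68.4%  |
| v3      | Recency-biased heuristic   | 76.8%  |
| v4      | Ensemble (v1 + v3)         | 79.2%  |
| v5      | Random baseline            | 54.3%  |
"""


def parse_results(table):
    """Turn the pipe-delimited table into a list of row dicts."""
    rows = []
    for line in table.strip().splitlines()[2:]:  # skip header and separator
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        variant, approach, score = cells
        rows.append({
            "variant": variant,
            "approach": approach,
            "score": float(score.rstrip("%")),
        })
    return rows


def rank(rows):
    """Sort rows from highest to lowest score."""
    return sorted(rows, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    ranked = rank(parse_results(TABLE))
    best, runner_up = ranked[0], ranked[1]
    lead = round(best["score"] - runner_up["score"], 1)
    print(f"Top variant: {best['variant']} ({best['approach']})")
    print(f"Lead over {runner_up['variant']}: {lead} points")
```

A narrow lead over the runner-up is one signal that another generation may be worth running before you deploy.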

Deploy or iterate

When you are happy with the score, tell the manager to deploy. It installs the winning skill on your production instance and cleans up variants.
If the skill tracks outcomes that resolve over time (lead conversion, alert escalation), the manager can set up an automation to score periodically and trigger improvement rounds when new data comes in.
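The logic behind such an automation is not documented here. As a rough illustration only, the sketch below shows one way a periodic check could decide whether to request an improvement round for the alert triage example. The `Outcome` record, the `min_outcomes` and `tolerance` parameters, and both functions are hypothetical and do not describe Operator's implementation.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """One resolved alert (hypothetical record shape)."""
    predicted_auto_resolve: bool  # what the skill decided
    actually_needed_human: bool   # what turned out to be true


def score(outcomes):
    """Fraction of outcomes where the skill's decision was right."""
    if not outcomes:
        return None
    # The decision is right when "auto-resolve" and "needed a human" disagree.
    correct = sum(
        o.predicted_auto_resolve != o.actually_needed_human for o in outcomes
    )
    return correct / len(outcomes)


def should_trigger_round(outcomes, deployed_score, min_outcomes=20, tolerance=0.02):
    """Request a new round only with enough data and a real drop in score."""
    if len(outcomes) < min_outcomes:
        return False
    return score(outcomes) < deployed_score - tolerance
```

Requiring a minimum number of resolved outcomes keeps a handful of unlucky alerts from triggering a round on noise alone.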

Next docs

Skills

The build, test, improve lifecycle in detail.

Instances

Variant instances, production agents, and statuses.

Environment

Secret storage and access control.

API

Drive everything programmatically.