Add a “test” skill to improve agentic workflows. #355
theengineear wants to merge 1 commit into main
Use conversation context to determine the appropriate test scope:

- **If user asks to run a test by name** (e.g., "run the 'click events' test"):
  Use filtering with the `--test-name` flag (CLI) or `?x-test-name=<pattern>`
Stuff like this is particularly helpful… otherwise agents default to grep-ing output and missing critical information because they are trying to limit the text they ingest from running things. It also helps hoomans follow along on what the agent is up to in a less cryptic way.

- **If no specific context**: Run all tests

Use judgment to pick the most targeted test scope for the situation.
Of note: when writing a skill, I find it’s important to provide information without being overbearing. I have seen poor results when skills are written too rigidly. In this case, phrases like “use your judgement” or “choose based on context” are nice ways to ensure that the agent understands what’s possible but still decides for itself what to do.
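For anyone following along, here is a rough sketch of what the “run a test by name” case could translate to. Only the `--test-name` flag and the `x-test-name` query parameter come from the skill text; the helper names, the base URL, and the way the flag value is passed (as a separate argument) are assumptions made for illustration, not the actual “x-test” API.

```typescript
// Hypothetical helper: build a browser URL that filters tests by name.
// Only the `x-test-name` query parameter comes from the skill text;
// the base URL passed in is whatever the local test page happens to be.
function buildTestUrl(baseUrl: string, testName?: string): string {
  const url = new URL(baseUrl);
  if (testName) {
    // searchParams handles the encoding of spaces and special characters.
    url.searchParams.set('x-test-name', testName);
  }
  return url.toString();
}

// Hypothetical helper: build CLI arguments for the same filter.
// Passing the value as a separate argument is an assumption; the real
// CLI may expect a different form.
function buildCliArgs(testName?: string): string[] {
  return testName ? ['--test-name', testName] : [];
}

// With a name, both helpers add the filter; without one, they add nothing,
// which corresponds to the "run all tests" case.
const filteredUrl = buildTestUrl('http://localhost:8080/test/', 'click events');
const filteredArgs = buildCliArgs('click events');
const allTestsArgs = buildCliArgs();
```

Keeping the “no name” case as a plain fall-through mirrors the skill’s wording: the filter is an option the agent can reach for, not a required step.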
@@ -0,0 +1,181 @@
---
name: test
Just to make sure we’re operating with the same context: so-called skills are auto-discovered based on the name / description here. So the goal is to provide enough flavor text here to allow agents to connect the dots during a conversation / session.
You can invoke these manually via commands, but that’s typically not what you will do. The idea is to write these in such a way that the agent just does what you would expect based on natural language prompts.
@klebba, any opposition to leaning more into Claude-based development? I have found that adding some skills (without overdoing it) is valuable. This one is just to get Claude to better understand what I mean when I ask it to run tests, etc. It also improves how the agent iterates when working on a feature / fix.
I have used this skill on other projects leveraging “x-test” and it has improved the efficiency of AI collaborations for me. At some point, we could consider documenting a skill like this in “x-test”, but there is some “x-element”-specific stuff in here now, and it feels premature to over-engineer something as simple as a skill.

With this, agents should be able to better interpret conversational requests to test and better execute iterative workflows which involve both testing via the CLI and via a browser through tools like Chrome DevTools MCP.
3efd6e8 to 5e6497a