Member 1 · Execution side of the loop · AI Testing Master · MK QA MASTER
mk-qa-master is an MCP server that drives web (pytest / Jest / Cypress / Go), mobile (Maestro on iOS + Android, incl. BlueStacks), and API tests (anything your pytest / Jest / Cypress / Go test suite already hits) — writes the next round from a URL or a live screen, and acts as your data-driven QA advisor every single run.
mk-qa-master sits between your AI client and your test framework. It's not the framework, the LLM, a CI runner, a source analyzer, or a SaaS UI.
Ranked by how often you actually use them.
Switch runners with a single QA_RUNNER env var: pytest / Jest / Cypress / Go for web, maestro for iOS Simulator, Android Emulator, real devices, and BlueStacks. Auto-retry, JUnit XML, screenshots, Playwright trace.zip / Maestro recordings — out of the box.
analyze_url probes the DOM; analyze_screen dumps the live mobile hierarchy. Both surface form / cta / nav / tab-bar modules with real selectors, then generate_test emits runnable pytest or Maestro YAML — not # TODO placeholders.
Every run archives a snapshot and writes a new optimization-plan.md. Flaky vs. broken vs. slow-regression — ranked by evidence, not by gut. Same loop works for web and mobile.
Every run feeds the optimizer; the optimizer points at the weakest link; the next run attacks it first. Without this loop, AI is just a faster monkey tester.
analyze_url / analyze_screen → generate_test / auto_generate_tests → run_tests / run_failed → get_test_report / get_failure_details → get_optimization_plan
A DOM-only analyzer produces 'empty field should error', which is monkey testing in a new wrapper. We layer real QA knowledge on top.
qa-knowledge.md in your project root: business rules, historical bugs, standard assertion copy, user journeys, technical constraints. Run init_qa_knowledge to scaffold one.
Feed a business_context slice into generate_test; it gets printed as a # Business context: block inside the test, so reviewers see why without leaving the file.
After each run, the advisor reads history/ and telemetry, then writes a ranked action list. Three perspectives:
Per-test outcome strings like PFPFP feed a flake score. Cross-reference error signatures: three consecutive fails with the same signature → marked broken (a real bug, not flake).
Tool telemetry surfaces top tools, error rate, repeated args, and common A→B chains. Tells you where to ship a meta-tool or cache.
Did the test that generate_test wrote show up in the next run? Did the modules that analyze_url detected get matching test files? Adoption rate and coverage gap are both tracked.
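
The suite-quality perspective above can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the page does not define how the flake score is computed, so the formula here (share of failing runs) is an assumption, and the function names `flake_score` and `classify` are hypothetical. Only the "three consecutive fails with the same signature means broken" rule comes from the text.

```python
from typing import Optional, Sequence


def flake_score(outcomes: str) -> float:
    """Share of failing runs in an outcome string such as 'PFPFP'.

    Assumed formula: the page does not say how the score is computed.
    """
    if not outcomes:
        return 0.0
    return outcomes.count("F") / len(outcomes)


def classify(outcomes: str, signatures: Sequence[Optional[str]]) -> str:
    """Label a test from its run history.

    signatures[i] is the error signature of run i, or None if that run passed.
    """
    last_three = list(signatures[-3:])
    same_failure = (
        outcomes.endswith("FFF")          # three consecutive fails...
        and len(last_three) == 3
        and last_three[0] is not None
        and len(set(last_three)) == 1     # ...with one shared signature
    )
    if same_failure:
        return "broken"                   # a real bug, not flake
    if "P" in outcomes and "F" in outcomes:
        return "flaky"                    # mixed passes and fails
    return "failing" if "F" in outcomes else "stable"
```

Under this assumed formula, the `PFPFP` history gives a score of 0.4 and is labelled flaky, while a history ending in three timeouts with an identical signature is labelled broken.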
Switch via the QA_RUNNER env var. Seven frameworks, one MCP surface — web on four, mobile on Maestro, API on Schemathesis (OpenAPI / Swagger, since v0.6.0) or Newman (Postman collections, since v0.6.1). Pre-existing API tests in pytest + httpx / Jest + supertest / Cypress cy.request() / Go httptest still ride their respective runners — no migration. Pact provider verification on the v0.7.0 roadmap.
- pytest-playwright
- jest
- cypress
- go test
- maestro (iOS + Android + BlueStacks)
- schemathesis (OpenAPI / Swagger)
- newman (Postman collections)
Grouped by role. Each group is one layer in the analyze → generate → run → report → advise loop. The README's prompting cookbook has natural-language phrasings; you rarely name a tool yourself.
- get_runner_info: Which runner is active, plus all available. Call this first so the AI picks the right test template (Playwright .py vs Maestro .yaml).
- list_tests: Enumerate every collectable test under the active runner: pytest --collect-only, jest --listTests, cypress glob, go test -list, maestro YAML walk.
- analyze_url (web): Probe a live URL for form / nav / dialog / cta modules + selectors + API endpoints the page hits + layout-overflow warnings + candidate TCs.
- analyze_screen (mobile): Dump the maestro hierarchy into form / cta / tab_bar modules + candidate TCs, noise-filtered (status bar + asset names stripped).
- generate_test: Test skeleton; with module from analyze_url/analyze_screen, a *runnable* Playwright .py or Maestro .yaml with concrete selectors, not # TODO stubs.
- auto_generate_tests: One-shot analyze_url → generate_test per module. Hand it a URL, get a tests/ folder back.
- codegen: Launch Playwright codegen interactively (web) / hint to maestro studio (mobile). Good for baseline happy-path recording.
- init_qa_knowledge: Scaffold qa-knowledge.md in the project root: business rules / past bugs / standard assertions / user journeys / technical constraints.
- get_qa_context: Read qa-knowledge.md (built-in ISTQB fallback). Feed a slice into generate_test.business_context for domain-aware tests.
- run_tests: Execute under the active runner; writes report.json + JUnit XML, snapshots into history/, auto-refreshes optimization-plan.md. Optional filter.
- run_failed: Re-run only the last failures: pytest --lf, jest --onlyFailures, cypress/go reverse-lookup, maestro nodeid → .yaml. Way faster than re-running the suite.
- get_test_report: Summary of passed / failed / skipped / flaky_in_run / duration. Cheap; use it between actions instead of re-running.
- get_failure_details: Per-failure message + screenshot + Playwright trace.zip + video paths + parsed step sequence. The "why did it fail" tool.
- generate_html_report: Render the latest run as one self-contained HTML file: base64 screenshots, trend sparkline, collapsed Passed, expanded Failed cards. Slack-able.
- get_test_history: Last N archived run summaries: flake / duration regression / pass-rate trend. Pair with get_optimization_plan for action items.
- get_optimization_plan: Three-lens prioritized plan: suite quality (flake / broken / slow_regression) + MCP usability (top tools, repeat args, error rate) + AI effectiveness (generate_test adoption, coverage gaps). Writes optimization-plan.md every run.
One sentence to the AI client; the tools chain automatically.
"Test https://your-site/login — analyze the page, write tests for every module, run them, then tell me what to fix."
analyze_url → generate_test (×N modules) → run_tests → get_failure_details → get_optimization_plan
"I just added three new feature pages — auto-generate tests for everything the analyzer finds and run them."
auto_generate_tests(url=...) → run_tests → get_test_report → get_optimization_plan
"What's wrong with my test suite this week — give me a ranked plan, not gut feel."
get_test_history(limit=30) → get_optimization_plan(history_limit=30, telemetry_limit=2000)
"Test the barcode button on my mobile app on the iOS Simulator and tell me if it's flaky."
analyze_screen(app_id='com.example.app', launch_app=true) → generate_test(module=<cta>) → run_tests → get_optimization_plan
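
Chains like the ones above are the kind of A→B sequences the telemetry lens reports. A minimal sketch of how such pairs could be counted, assuming the telemetry is available as an ordered list of tool names; the real telemetry format is not shown on this page, and `common_chains` is a hypothetical helper:

```python
from collections import Counter
from typing import List, Tuple


def common_chains(calls: List[str], top: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    """Count consecutive (A, B) tool-call pairs and return the most frequent ones.

    `calls` is assumed to be one session's tool names in call order.
    """
    pairs = Counter(zip(calls, calls[1:]))  # each adjacent pair is one A -> B hop
    return pairs.most_common(top)


calls = ["analyze_url", "generate_test", "run_tests", "analyze_url", "generate_test"]
top_chain = common_chains(calls, top=1)
```

A pair that dominates the count is a candidate for a single meta-tool, which is the role auto_generate_tests plays for analyze_url → generate_test.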
Same shape as spec-master's plan — markdown, ready to paste into Slack / JIRA / a sprint planning doc. Auto-written after every run.
# Optimization Plan — 2026-05-12T14:03:40
_Based on 6 archived runs._
## Prioritized Actions
### 1. 🔴 HIGH — flaky
- **Target**: `tests/test_login.py::test_invalid_credentials`
- **Evidence**: flake_score=0.4, outcomes=PFPFP, rerun_count=1
- **Suggestion**: Add an explicit wait (wait_for_response / locator wait)
- **auto_action_hint**: `get_failure_details(test_id="test_invalid_credentials")`
### 2. 🟡 MEDIUM — coverage_gap
- **Target**: `register_form` (module detected on /register)
- **Evidence**: analyze_url found this module; no matching test_*.py in repo
- **Suggestion**: `generate_test(description="...", filename="test_register_form.py")`
### 3. 🟡 MEDIUM — slow_regression
- **Target**: `tests/test_checkout.py::test_full_flow`
- **Evidence**: median duration 1.8× baseline across last 6 runs
- **Suggestion**: profile network waits; pin fixture data; consider parallel mark
## MCP usability
- Top tool: `run_tests` (38%) · `analyze_url` (22%) · `get_failure_details` (14%)
- Common chain: `analyze_url → generate_test` (17 occurrences)
- Error rate: 2.3% (1 timeout in analyze_url against slow staging)
## AI effectiveness
- generate_test adoption: 9 / 11 generated tests appeared in the next run (82%)
- coverage gap: 1 module from analyze_url has no matching test file (`register_form`)
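
The two "AI effectiveness" figures in the sample plan can be reproduced with simple set arithmetic. This is a sketch under stated assumptions, not the advisor's actual logic: `adoption_rate` and `coverage_gaps` are hypothetical names, and the `test_<module>.py` naming convention is inferred from the `test_register_form.py` suggestion in the plan.

```python
from pathlib import Path
from typing import Iterable, List, Set


def adoption_rate(generated: Set[str], next_run: Set[str]) -> float:
    """Fraction of generated test ids that were collected in the next run."""
    if not generated:
        return 0.0
    return len(generated & next_run) / len(generated)


def coverage_gaps(modules: Iterable[str], test_files: Iterable[str]) -> List[str]:
    """Modules with no file named test_<module>.py (assumed naming convention)."""
    names = {Path(f).name for f in test_files}
    return [m for m in modules if f"test_{m}.py" not in names]


# 11 generated tests, 9 of which show up in the following run.
generated = {f"test_{i}" for i in range(11)}
next_run = {f"test_{i}" for i in range(9)}
rate_percent = round(adoption_rate(generated, next_run) * 100)

gaps = coverage_gaps(
    ["login_form", "register_form"],
    ["tests/test_login_form.py"],
)
```

With 9 of 11 generated tests present in the next run, the rounded rate is 82, and `register_form` is the only module reported as a gap.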
# Test Report — pytest-playwright
- total: 23
- passed: 19
- failed: 3
- flaky_in_run: 1 ← auto-retry rescued
- skipped: 0
- duration: 31.4s
## Failures
1. `tests/test_login.py::test_invalid_credentials`
   - message: `AssertionError: expected error text not visible`
   - screenshot: `test-results/.../test-failed-1.png`
   - trace: `test-results/.../trace.zip`
   - video: `test-results/.../video.webm`
2. `tests/test_coupon.py::test_idempotency`
   - message: `Timeout waiting for /api/coupon (5000ms)`
   - last step: `Page.waitForResponse('/api/coupon')`
Restart your client, then talk to the AI like you always do.
{
  "mcpServers": {
    "mk-qa-master": {
      "command": "uvx",
      "args": ["mk-qa-master"],
      "env": {
        "QA_RUNNER": "pytest",
        "QA_PROJECT_ROOT": "/path/to/your/project"
      }
    }
  }
}
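
The env block above selects the runner through QA_RUNNER. The sketch below shows one way a server might resolve and validate that value. It is illustrative only: `resolve_runner`, the `KNOWN_RUNNERS` table, and the `pytest` default are assumptions, and only `pytest` and `maestro` appear on this page as literal QA_RUNNER values.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical name -> target table. Only "pytest" and "maestro" are confirmed
# as QA_RUNNER values on this page; the other keys are illustrative guesses.
KNOWN_RUNNERS = {
    "pytest": "web",
    "jest": "web",
    "cypress": "web",
    "go": "web",
    "maestro": "mobile",
    "schemathesis": "api",
    "newman": "api",
}


def resolve_runner(
    env: Optional[Mapping[str, str]] = None,
    default: str = "pytest",  # assumed default, not stated in the source
) -> Tuple[str, str]:
    """Return (runner_name, target) from QA_RUNNER, failing fast on unknown names."""
    env = os.environ if env is None else env
    name = env.get("QA_RUNNER", default).strip().lower()
    if name not in KNOWN_RUNNERS:
        known = ", ".join(sorted(KNOWN_RUNNERS))
        raise ValueError(f"Unknown QA_RUNNER {name!r}; expected one of: {known}")
    return name, KNOWN_RUNNERS[name]
```

Failing fast on an unrecognized name keeps a typo in the client config from silently falling back to a different runner.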