thumbgate

One 👎 becomes a hard rule the agent cannot bypass. Captures thumbs-down feedback, distills it into PreToolUse Pre-Action Checks, enforced across every future Claude Code session.

11 skills

agent-architect-kit

# Agent Architect Kit ↔ ThumbGate Bridge

**Status:** shipped 2026-04-21. Test-backed. CLI-automatable. Safe to recommend publicly.

## What this is

[agent-architect-kit](https://github.com/ultrathink-art/agent-architect-kit) by @ultrathink-art is a CLAUDE.md + agent-role template kit that persists learnings as per-role markdown files (`agents/state/memory/<role>.md`). They solve the setup layer. ThumbGate solves the runtime-enforcement layer: PreToolUse hooks that actually block known-bad tool calls instead of hoping the agent reads its memory file.

The two stacks are complementary. A team running architect-kit graduates to ThumbGate when their markdown memory grows past the ~80-line ceiling their own `memory-directive.md` enforces, or when they want their `operations` meta-agent to do more than edit instructions and actually prevent the failing action. This bridge turns that graduation into a one-shot command.

## The bridge

### What it does

`scripts/integrations/architect-kit-memory-bridge.js` walks an architect-kit memory directory, parses each per-role `.md` file, and emits one ThumbGate feedback entry per qualifying line:

| architect-kit section | → ThumbGate signal | notes |
|---|---|---|
| `## Mistakes` | `down` with `whatWentWrong` | every entry becomes a thumbs-down lesson |
| `## Learnings` | `up` with `whatWorked` | every entry becomes a thumbs-up memory |
| `## Stakeholder Feedback` | `up`/`down` depending on keywords | "rejected", "broken", "wrong", "hate", etc. flip negative |
| `## Session Log` | skipped | too granular to be useful in a searchable lesson DB |

Every ingested entry is tagged `architect-kit`, `role:<name>`, and the source section, so imports are auditable and rollbackable.

### How to run it

**Dry-run first** (no writes, prints classification):

```bash
npm run integrations:architect-kit:import -- \
  --dir=/path/to/agents/state/memory \
  --dry-run --json
```

**Real import** (writes to ThumbGate feedback log):

```bash
npm run integrations:architect-kit:import -- \
  --dir=/path/to/agents/state/memory
```

**Single role** (e.g. import only `coder.md`):

```bash
npm run integrations:architect-kit:import -- \
  --dir=/path/to/agents/state/memory \
  --role=coder
```

### What you get after import

```bash
npm run feedback:stats   # see new entries grouped by tag
npm run feedback:rules   # regenerate prevention rules from the imported mistakes
```

The imported mistakes now feed the same pipeline as native ThumbGate feedback: lesson DB indexing, Thompson Sampling rollups, prevention-rule generation, and PreToolUse hook injection. The architect-kit `operations` agent's "edit the instructions" loop is now backed by hooks that can actually refuse a tool call.

## Test proof

```bash
npm run test:architect-kit-memory-bridge
# 16 tests, 0 fail
```

Fixtures live in `tests/fixtures/architect-kit-memory/` and mirror the exact format from architect-kit's `memory-directive.md`. The test-suite parity guard (`tests/test-suite-parity.test.js`) pins this test into the `npm test` chain, so accidentally dropping the test now fails CI.

## Positioning talking points (for outreach + docs)

- **CLAUDE.md is the control plane.** That's architect-kit's thesis and it's right. ThumbGate doesn't replace CLAUDE.md; it adds a runtime gate that enforces whatever CLAUDE.md says.
- **Template memory → queryable memory.** Per-role markdown files stop scaling past ~80 lines (architect-kit documents this limit themselves). ThumbGate's SQLite+FTS5+LanceDB store is what you migrate to when your memory deserves search.
- **Operations becomes teeth, not talk.** Their `operations` meta-agent edits instructions. ThumbGate lets operations also register prevention rules that PreToolUse hooks honor: the same loop, but the agent literally cannot skip it.

## What NOT to do

- **Do not re-import the same memory twice.** Architect-kit memory has no stable entry IDs; repeated imports duplicate lessons. Prune the source file (their own directive says to) before re-running, or run dry-run + diff first.
- **Do not import the `Session Log` section.** The bridge skips it deliberately. Session log entries are ephemeral task receipts, not lessons.
- **Do not assume operations-role edits propagate automatically.** The architect-kit operations agent edits markdown instruction files; ThumbGate's prevention rules are regenerated by `npm run feedback:rules`. Wire that into your Ralph Loop (or theirs) if you want continuous re-derivation.

## Files added/touched by this integration

- `scripts/integrations/architect-kit-memory-bridge.js`: parser + importer + CLI
- `tests/architect-kit-memory-bridge.test.js`: 16 unit tests
- `tests/fixtures/architect-kit-memory/{coder,qa}.md`: fixtures matching their format
- `package.json`: new `test:architect-kit-memory-bridge` and `integrations:architect-kit:import` scripts, chained into `npm test`
- `skills/agent-architect-kit/SKILL.md`: this doc

## When to reach back out to @ultrathink-art

If they ship stable entry IDs in their memory format, upgrade the bridge to do incremental imports (skip already-seen entries). Until then, treat this as a one-shot migration tool, not a sync daemon.
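The section-to-signal mapping from the bridge table can be sketched in a few lines. This is an illustrative sketch only, not the code in `architect-kit-memory-bridge.js`: the function name `classifyMemory`, the assumption that each entry is a markdown bullet, and the four-word keyword list (the source lists these followed by "etc.") are all assumptions made for the example.

```javascript
// Hypothetical sketch of the section -> signal mapping; not the real bridge.
// Assumption: each memory entry is a "- " or "* " bullet under a "## " heading.
const NEGATIVE_KEYWORDS = ["rejected", "broken", "wrong", "hate"]; // partial list

function classifyMemory(markdown, role) {
  const entries = [];
  let section = null;
  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();
    const heading = line.match(/^##\s+(.+)$/);
    if (heading) {
      section = heading[1].trim();
      continue;
    }
    const bullet = line.match(/^[-*]\s+(.+)$/);
    if (!bullet || !section) continue;
    const text = bullet[1];
    // Tag every entry so an import can be audited and rolled back.
    const tags = ["architect-kit", `role:${role}`, section];
    if (section === "Mistakes") {
      entries.push({ signal: "down", whatWentWrong: text, tags });
    } else if (section === "Learnings") {
      entries.push({ signal: "up", whatWorked: text, tags });
    } else if (section === "Stakeholder Feedback") {
      const lower = text.toLowerCase();
      const negative = NEGATIVE_KEYWORDS.some((word) => lower.includes(word));
      entries.push(
        negative
          ? { signal: "down", whatWentWrong: text, tags }
          : { signal: "up", whatWorked: text, tags }
      );
    }
    // "Session Log" and any unrecognized section are skipped.
  }
  return entries;
}
```

Because entries carry no stable ID, running a function like this twice over the same file yields the same entries twice, which is the duplication risk the "What NOT to do" list warns about.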

Agent Memory

Recall past mistakes and capture feedback so your agent stops repeating errors. Works locally via MCP server — no API key needed.

# Agent Memory

Give your agent persistent memory across sessions. Before starting any task, recall what went wrong last time. After completing work, capture whether it succeeded or failed. Prevention rules are auto-generated from repeated mistakes.

## Setup

Add the MCP server (one-time):

```bash
claude mcp add thumbgate -- npx -y thumbgate serve
```

No API key needed. All data stays local.

## When to Use

- Starting a new task or session: recall past context first
- After completing work that succeeded or failed: capture feedback
- When the agent keeps making the same mistake: check prevention rules

## Workflow

### Step 1: Recall past context (do this FIRST on every task)

Call the `recall` MCP tool with a description of your current task. The tool returns:

- Past feedback relevant to this task (vector similarity search)
- Active prevention rules (auto-generated from repeated failures)
- Recent feedback summary with approval rate

Read the prevention rules carefully. These are patterns that failed before; follow them.

### Step 2: Do your work

Complete the task as normal. Keep track of what you did and whether it worked.

### Step 3: Capture feedback

Call the `capture_feedback` MCP tool:

**If succeeded:**

- signal: `up`
- context: What worked and why
- tags: Category labels

**If failed:**

- signal: `down`
- context: What you were trying to do
- whatWentWrong: Specific failure description
- whatToChange: How to avoid this next time
- tags: Category labels

Vague feedback like "it failed" will be rejected. Be specific.

### Step 4: Check improvement (optional)

Call the `feedback_stats` MCP tool to see approval rate, top failure domains, and whether the agent is trending better or worse.

## Available MCP Tools

| Tool | What it does |
|------|-------------|
| `recall` | Search past feedback and prevention rules for current task |
| `capture_feedback` | Record what worked or failed with structured context |
| `prevention_rules` | View auto-generated rules from repeated mistakes |
| `feedback_stats` | Approval rate, trend analysis, top failure domains |
| `feedback_summary` | Human-readable summary of recent signals |

## MCP Profiles

| Profile | Tools | Use case |
|---------|-------|----------|
| `essential` | 5 core tools | Default: start here |
| `commerce` | 6 tools + commerce_recall | Agentic commerce agents |
| `default` | 12 tools | Full pipeline including DPO export |

Set profile: `THUMBGATE_MCP_PROFILE=essential npx thumbgate serve`

## How Prevention Rules Work

1. Agent makes mistake A → you capture `down` feedback
2. Agent makes mistake A again → you capture `down` feedback again
3. System detects pattern → auto-generates prevention rule: "NEVER do A"
4. Next session → `recall` returns the rule → agent follows it

This is the core value. The agent doesn't learn, but it reads the rules and follows them.

## Links

- [GitHub](https://github.com/IgorGanapolsky/thumbgate)
- [npm](https://www.npmjs.com/package/thumbgate)
- [MCP Registry](https://registry.modelcontextprotocol.io)
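The four-step loop under "How Prevention Rules Work" amounts to counting repeated `down` signals and promoting those that cross a threshold. The sketch below is a simplified illustration under stated assumptions: it groups failures by exact (case-insensitive) `whatWentWrong` text, whereas the real pattern detection is not documented here and may use similarity search. The function name `promoteRules` and the rule object shape are hypothetical; the threshold of 2 mirrors the "default: 2 occurrences" stated elsewhere on this page.

```javascript
// Hypothetical sketch of repeat-failure promotion; not ThumbGate's implementation.
// Assumption: two failures are "the same pattern" when their whatWentWrong text
// matches after trimming and lowercasing.
function promoteRules(feedbackLog, threshold = 2) {
  const counts = new Map();
  for (const entry of feedbackLog) {
    if (entry.signal !== "down" || !entry.whatWentWrong) continue;
    const key = entry.whatWentWrong.trim().toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const rules = [];
  for (const [pattern, occurrences] of counts) {
    if (occurrences >= threshold) {
      rules.push({ rule: `NEVER ${pattern}`, occurrences });
    }
  }
  return rules;
}

// Example: the second identical failure is what triggers the rule.
const exampleRules = promoteRules([
  { signal: "down", whatWentWrong: "force push to main" },
  { signal: "up", whatWorked: "ran tests before commit" },
  { signal: "down", whatWentWrong: "Force push to main" },
  { signal: "down", whatWentWrong: "skip the migration dry run" },
]);
```

A single failure never produces a rule in this sketch, which matches the described behavior that only repeated mistakes are promoted.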

bluesky-engagement

# Bluesky Engagement

Automation for Bluesky reply tracking. The monitor reads; a human (or a queue consumer with explicit approval) writes.

## Why this exists

Bluesky is a Zernio-connected channel for publishing, but Zernio does not expose an inbound/comments API. Engagement has to run directly against the open AT Protocol. This skill owns that path so future sessions don't re-ask for credentials or re-derive the PDS routing.

## Credentials

Both are set in `.env` at repo root (git-ignored):

| Var | Source | Notes |
|---|---|---|
| `BLUESKY_HANDLE` | bsky profile URL | e.g. `iganapolsky.bsky.social` |
| `BLUESKY_APP_PASSWORD` | <https://bsky.app/settings/app-passwords> | Scoped, revocable. Never use the account login password. |

**Rotation**: revoke the old app password at the settings URL above, generate a new one named `thumbgate-replies`, and replace the value on the `BLUESKY_APP_PASSWORD=` line of `.env`. No other files need updating.

## Architecture

```
launchd (com.thumbgate.bluesky-reply-monitor)
  └─ every 900s ─> node scripts/social-reply-monitor-bluesky.js
       ├─ com.atproto.server.createSession (bsky.social)
       │    └─ reads didDoc to find user's real PDS
       ├─ app.bsky.notification.listNotifications (user's PDS)
       ├─ filter reason in {reply, mention, quote}
       ├─ dedupe against .thumbgate/reply-monitor-state.json
       ├─ generateReply() (shared with Reddit/X monitor)
       └─ append to .thumbgate/reply-drafts.jsonl
```

**Federation note**: authenticated AT Protocol calls must hit the user's own PDS (`session.didDoc.service[].serviceEndpoint`), not `bsky.social`. Hitting `bsky.social` returns `502 UpstreamFailure`. This was the first bug.

**Transient failures**: Bluesky's appview returns 502 during incidents. The monitor detects 5xx / UpstreamFailure / ECONNRESET and exits 0 so launchd doesn't mark the agent failed.

## File layout

| Path | Purpose | Tracked? |
|---|---|---|
| `scripts/social-reply-monitor-bluesky.js` | The monitor itself | yes |
| `~/Library/LaunchAgents/com.thumbgate.bluesky-reply-monitor.plist` | 15-min schedule | no (user-scope) |
| `.thumbgate/reply-monitor-state.json` | Dedupe state (also used by Reddit/X monitor) | no |
| `.thumbgate/reply-drafts.jsonl` | Human-review queue | no |
| `.thumbgate/bluesky-monitor-stdout.log` | launchd stdout | no |
| `.thumbgate/bluesky-monitor-stderr.log` | launchd stderr (transient 502 warnings land here) | no |

## Commands

```bash
# One-off dry run (no state mutation, logs what would be queued)
node scripts/social-reply-monitor-bluesky.js --dry-run

# One-off real run (appends to .thumbgate/reply-drafts.jsonl)
node scripts/social-reply-monitor-bluesky.js

# Reload the launchd agent after editing the plist
launchctl unload ~/Library/LaunchAgents/com.thumbgate.bluesky-reply-monitor.plist
launchctl load ~/Library/LaunchAgents/com.thumbgate.bluesky-reply-monitor.plist

# Check the agent is alive
launchctl list com.thumbgate.bluesky-reply-monitor

# Tail the logs
tail -f .thumbgate/bluesky-monitor-stderr.log

# Inspect queued drafts
cat .thumbgate/reply-drafts.jsonl | jq -r 'select(.platform=="bluesky") | "\(.createdAt) @\(.notification.authorHandle): \(.incomingText[0:80])"'
```

## Draft-queue consumption

The monitor NEVER auto-posts. Drafts sit in `.thumbgate/reply-drafts.jsonl` with `autoPost: false`. Human workflow:

1. Review the queue.
2. For each draft you want to send, open the notification URI in Bluesky and reply manually (or later, build a `send-reply-queue.js` consumer that hits `com.atproto.repo.createRecord` with `app.bsky.feed.post`, keeping the `root`/`parent` CID+URI pair the monitor already stored).

Auto-sending is deliberately deferred until there's a UI approval step. This is how we stay off the bot-slop/banned list.

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `Bluesky auth failed (status=401)` | app password revoked or rotated elsewhere | regenerate per Rotation steps above |
| `listNotifications failed on bsky.social: 502` | hitting the wrong host | the script auto-routes to PDS; if you see this, the didDoc parsing broke: inspect `session.didDoc.service` |
| `listNotifications failed on <user-pds>: 502 UpstreamFailure` | Bluesky appview incident | nothing to do; soft-fails, next tick retries |
| stderr log grows but no drafts appear | all notifications already in state file or all reasons are `like`/`follow` (we only handle reply/mention/quote) | inspect state file: `jq '.repliedTo.bluesky' .thumbgate/reply-monitor-state.json` |

## CI wiring (Ralph Loop)

As of 2026-04-21 this monitor also runs hourly in GitHub Actions via `.github/workflows/ralph-loop.yml` → `scripts/ralph-loop.js` → step id `reply-monitor-bluesky`. The step is gated on `requiredEnvAll: ['BLUESKY_HANDLE', 'BLUESKY_APP_PASSWORD']` (GitHub repo secrets) and writes the same draft file the local launchd agent does. Never auto-posts in either path.

## Voice guardrail (2026-04-21)

First autonomous-posting attempt (`scripts/bluesky-send-replies.js`, removed 2026-04-21) was reverted after the CEO thumbs-downed the AI-pitch tone on a live reply. All 6 live replies were deleted via `scripts/bluesky-delete-replies.js` (calls `com.atproto.repo.deleteRecord`). The lesson, captured as memory `mem_1776790570289_oc2z6g`: do not write replies that open with "Exactly"/"Right —", do not name-drop ThumbGate features inside conversational replies, keep replies to 1–2 sentences in the voice of a human peer. Until this is enforced by a gate rule, the monitor stays draft-only and a human sends the actual reply.

## Related

- `scripts/social-reply-monitor.js`: Reddit, X, LinkedIn monitor. Shares `generateReply()` and the draft file.
- `scripts/bluesky-list-actionable.js`: one-shot dump of un-replied notifications for human triage.
- `scripts/bluesky-delete-replies.js`: one-shot rollback; reads `postedUri` entries out of `.thumbgate/reply-monitor-state.json` and calls `com.atproto.repo.deleteRecord`.
- `scripts/social-analytics/publishers/zernio.js`: publishes Bluesky *posts* via Zernio. Separate concern.
- `CLAUDE.md` → "Social stack: Zernio canonical": explains why publishing uses Zernio but engagement doesn't (Zernio Inbox is dashboard-only, no public API as of 2026-04-21).
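Two steps from the Architecture diagram, PDS routing and the reason filter plus dedupe, are easy to get wrong, so a minimal sketch may help. This is not the code in `social-reply-monitor-bluesky.js`: the function names are hypothetical, and passing already-seen notification URIs as a plain array is an assumption about how the state file is consumed. The `AtprotoPersonalDataServer` service type is the one AT Protocol DID documents use for a PDS entry.

```javascript
// Hypothetical helpers; a sketch of the routing and filtering steps only.
const ACTIONABLE_REASONS = new Set(["reply", "mention", "quote"]);

// Pick the user's own PDS from the session's DID document.
// Returns null when none is found, so the caller can fail loudly
// instead of falling back to bsky.social (which the doc says returns 502).
function resolvePdsEndpoint(session) {
  const services = (session.didDoc && session.didDoc.service) || [];
  const pds = services.find((s) => s.type === "AtprotoPersonalDataServer");
  return pds ? pds.serviceEndpoint : null;
}

// Keep only reply/mention/quote notifications that have not been seen before.
// Assumption: seenUris is an array of notification URIs taken from the state file.
function selectNewNotifications(notifications, seenUris) {
  const seen = new Set(seenUris);
  return notifications.filter(
    (n) => ACTIONABLE_REASONS.has(n.reason) && !seen.has(n.uri)
  );
}
```

Returning `null` for a missing PDS entry corresponds to the "didDoc parsing broke" row in the Troubleshooting table: the failure surfaces at the routing step, not as a confusing 502 from the wrong host.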

solve-architecture-autonomy

Automated skill to handle architecture, autonomy, crisis, debug, deployment, error, execution, external-assessment, feedback, inefficiency, negative, railway, revenue, roi, simplification, user-frustration patterns efficiently.

# SOLVE-ARCHITECTURE-AUTONOMY Capability

## Problem

I provided a plan and research instead of deploying immediately.

## Automated Diagnosis

Repeated execution failure in this domain.

## Usage

The agent should call the `handle_architecture` tool when tasks involve `architecture, autonomy, crisis, debug, deployment, error, execution, external-assessment, feedback, inefficiency, negative, railway, revenue, roi, simplification, user-frustration`.

thumbgate-feedback

# ThumbGate Feedback Skill

When user provides feedback, execute:

```bash
# negative
node .claude/scripts/feedback/capture-feedback.js \
  --feedback=down \
  --context="<what failed>" \
  --what-went-wrong="<specific failure>" \
  --what-to-change="<prevention action>" \
  --tags="<domain>,regression"

# positive
node .claude/scripts/feedback/capture-feedback.js \
  --feedback=up \
  --context="<what succeeded>" \
  --what-worked="<repeatable pattern>" \
  --tags="<domain>,fix"
```

If the user only says `thumbs up`, `thumbs down`, `that worked`, or `that failed`, log the signal and ask one follow-up question before claiming it became reusable memory.

At session start, run:

```bash
npm run feedback:summary
npm run feedback:rules
```

thumbgate

# ThumbGate

Pre-action gates that stop AI coding agents from repeating known mistakes.

## Quick Start

```bash
npx thumbgate init
```

This installs the MCP server and wires it into your agent's tool configuration. No API keys required for the free tier.

Or install globally:

```bash
npm install -g thumbgate
thumbgate init
```

### MCP Configuration

Add to your agent's MCP config (e.g., `claude_desktop_config.json` or `.cursor/mcp.json`):

```json
{
  "mcpServers": {
    "thumbgate": {
      "command": "npx",
      "args": ["-y", "thumbgate"]
    }
  }
}
```

## How It Works

### Feedback Capture

When an agent action succeeds or fails, capture structured feedback:

- **Thumbs up**: Records what worked, tags it, and stores it as a reusable pattern.
- **Thumbs down**: Records the failure context, what went wrong, and what to change. Repeated failures auto-promote into prevention rules.

### Prevention Rules

After a failure pattern repeats (default: 2 occurrences), ThumbGate auto-generates a prevention rule. These rules are injected into agent context before every tool call, blocking the known-bad pattern before it executes.

### Pre-Action Gates

Gates intercept tool calls via the PreToolUse hook. Each gate checks the proposed action against:

1. Prevention rules generated from past failures
2. Thompson Sampling confidence scores (adaptive sensitivity)
3. LanceDB vector similarity to known-bad patterns

If a match is found, the gate blocks execution and surfaces the prior failure context.

### Context Packs

Bounded retrieval of relevant feedback history for the current task. The agent gets exactly the lessons that matter, not the entire history.

## MCP Tools Provided

| Tool | Description |
|------|-------------|
| `capture_feedback` | Record thumbs-up/down with structured context |
| `search_lessons` | Query the lesson DB by keyword or semantic similarity |
| `get_prevention_rules` | Retrieve active prevention rules for the current context |
| `session_primer` | Get session handoff context (last task, next step, blockers) |
| `export_dpo` | Export feedback pairs in DPO training format |

## Tiers

| | Free | Pro | Team |
|---|---|---|---|
| Feedback capture | 3/day | Unlimited | Unlimited |
| Lesson search | 5/day | Unlimited | Unlimited |
| Active gates | 5 | Unlimited | Unlimited |
| Dashboard | - | Yes | Yes |
| DPO export | - | Yes | Yes |
| Seats | 1 | 1 | Per-seat |
| Price | $0 | $19/mo | $49/seat/mo |

Start a 7-day free trial of Pro: <https://thumbgate-production.up.railway.app/go/pro?utm_source=skill>

## Compatibility

ThumbGate works with any MCP-compatible agent:

- Cursor
- Codex
- Gemini CLI
- Amp
- OpenCode
- Any agent supporting the Model Context Protocol

## Links

- NPM: [thumbgate](https://www.npmjs.com/package/thumbgate)
- Repository: [IgorGanapolsky/ThumbGate](https://github.com/IgorGanapolsky/ThumbGate)
- Dashboard: <https://thumbgate-production.up.railway.app/dashboard>
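To make the Pre-Action Gates description concrete, here is a minimal sketch of the first of the three checks only: matching a proposed tool call against prevention rules. It is an assumption-laden illustration, not ThumbGate's gate engine. The rule shape (`pattern`, `priorFailure`), the substring match, and the name `checkGate` are hypothetical, and the Thompson Sampling and vector-similarity checks are omitted.

```javascript
// Hypothetical sketch of a prevention-rule gate check; not ThumbGate's code.
// Assumption: a rule is { pattern, priorFailure } and matches by substring.
function checkGate(toolCall, rules) {
  // Flatten the tool name and its input into one searchable string.
  const haystack = `${toolCall.name} ${JSON.stringify(toolCall.input)}`.toLowerCase();
  const hit = rules.find((rule) => haystack.includes(rule.pattern.toLowerCase()));
  if (hit) {
    // Block and surface the earlier failure so the agent sees why.
    return { allowed: false, rule: hit.pattern, priorFailure: hit.priorFailure };
  }
  return { allowed: true };
}

const sampleRules = [
  { pattern: "git push --force", priorFailure: "Overwrote a teammate's commits on main" },
];
const blocked = checkGate(
  { name: "Bash", input: { command: "git push --force origin main" } },
  sampleRules
);
const passed = checkGate({ name: "Bash", input: { command: "npm test" } }, sampleRules);
```

Returning the prior failure alongside the block decision reflects the stated behavior that a gate "surfaces the prior failure context" when it refuses a call.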

capture-feedback

Capture structured thumbs up/down feedback with context, history-aware lesson distillation, tags, and optional rubric scores after completing a task.

# Capture Feedback Record structured feedback after completing a task or encountering an issue. ## When to use - After completing a coding task (positive or negative outcome) - When a tool call produces unexpected results - After a test failure or deployment issue - When the user explicitly wants to record feedback ## How it works Use the `capture_feedback` MCP tool with: - **signal** — `"thumbs_up"` or `"thumbs_down"` - **context** — Description of what happened and why when the user already said it clearly - **tags** — Array of relevant tags for categorization (e.g., `["test-failure", "refactor"]`) - **chatHistory** — Up to 8 prior recorded entries plus the failed tool call when the thumbs-down signal is vague and the lesson must be distilled from recent context - **relatedFeedbackId** — Use when the user adds clarifying detail later and it should refine the existing feedback event - **rubric_scores** — Optional object with structured quality scores ## Example ``` Capture feedback: thumbs_down for the failed database migration. Context: Migration script dropped the wrong index, causing query timeouts. Tags: database, migration, production-incident ``` ## Vague signal recovery If the user only says `thumbs_down`, `wrong`, `correct`, or `this failed`, do not stop there. Call `capture_feedback` with: - the signal - any minimal context the user already gave - `chatHistory` containing up to 8 prior recorded entries from the current correction thread - the failed tool call or command when available - `relatedFeedbackId` if the user is clarifying an already-open 60-second follow-up session That lets ThumbGate propose `whatWentWrong`, `whatToChange`, and a candidate rule automatically. Feedback feeds into the prevention rule promotion pipeline. Repeated failures with the same pattern are automatically promoted into enforceable prevention rules.
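The vague-signal recovery steps can be sketched as a small helper that assembles the arguments for `capture_feedback`. The field names (`signal`, `context`, `tags`, `chatHistory`, `relatedFeedbackId`) come from the list above; the helper name `buildCaptureArgs`, the shape of individual history entries, and appending the failed tool call as the last history item are assumptions made for illustration.

```javascript
// Hypothetical helper; a sketch of assembling capture_feedback arguments.
const MAX_HISTORY_ENTRIES = 8; // "up to 8 prior recorded entries"

function buildCaptureArgs({
  signal,
  context = "",
  tags = [],
  chatHistory = [],
  failedToolCall = null,
  relatedFeedbackId = null,
}) {
  const args = { signal, tags };
  if (context) args.context = context;
  // Keep only the most recent entries, then add the failed call if known.
  const history = chatHistory.slice(-MAX_HISTORY_ENTRIES);
  if (failedToolCall) history.push(failedToolCall);
  if (history.length > 0) args.chatHistory = history;
  if (relatedFeedbackId) args.relatedFeedbackId = relatedFeedbackId;
  return args;
}

// Example: the user only said "thumbs_down"; ten earlier entries are on record.
const tenEntries = Array.from({ length: 10 }, (_, i) => ({ text: `entry ${i}` }));
const vagueArgs = buildCaptureArgs({
  signal: "thumbs_down",
  chatHistory: tenEntries,
  failedToolCall: { text: "npm publish" },
});
```

With ten entries on record, only the last eight are kept and the failed call is appended, so the tool receives bounded context and not the whole thread.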

prevention-rules

Generate and review prevention rules auto-promoted from repeated failure patterns.

# Prevention Rules Manage prevention rules that are auto-generated from repeated failure patterns. ## When to use - Reviewing current active prevention rules for the project - Checking if a specific action is blocked by a prevention rule - Understanding why an action was blocked - Generating new prevention rules from observed patterns ## How it works Use the `prevention_rules` MCP tool to: 1. **List rules** — View all active prevention rules with their match patterns and corrective actions. 2. **Check rules** — Test if a specific action matches any prevention rule before execution. 3. **Review rule history** — See which feedback events led to a rule's promotion. ## Example ``` Check prevention rules for "npm publish without running tests" to see if this action is blocked. ``` Prevention rules are auto-promoted when the same failure pattern appears multiple times in captured feedback. Each rule includes the original failure context and a corrective action.

programmatic-agent-runs

Govern Cursor SDK local, cloud, self-hosted, and subagent coding runs before they create branches or PRs.

# Programmatic Agent Runs Use this skill before launching a coding agent through the Cursor SDK, Cursor cloud agents, self-hosted workers, or subagents. ## When to use - A task starts from code, CI, a backend service, a kanban automation, or any unattended workflow - The run can edit files, push a branch, create a PR, publish a package, deploy, or call shell tools - The run delegates work to subagents with different prompts, models, or scopes ## Launch checklist 1. Recall lessons with `search_lessons` for the repo, branch, files, and intended action. 2. Check active gates with `prevention_rules` before enabling writes or `autoCreatePR`. 3. Use the narrowest runtime: local for developer-in-the-loop work, cloud for isolated async work, self-hosted for private network/data requirements. 4. Give each subagent a bounded file or responsibility scope. 5. Persist run metadata: runtime, repo URL, starting ref, run id, agent id, branch, PR URL, and linked ThumbGate evidence. 6. Require verification output before merge, publish, deploy, or customer-facing claims. ## Cursor SDK guardrails - Attach ThumbGate MCP through the run environment so the agent can retrieve lessons and gates. - Prefer a clean starting ref and a disposable branch for cloud VM runs. - Do not set `autoCreatePR` for destructive, billing, release, or production tasks unless the gate check is clean. - If a cloud run finishes while the developer is offline, inspect the run transcript and PR diff before treating the result as accepted. - Capture `thumbs_down` feedback when a run violates scope, skips proof, repeats a known mistake, or opens a noisy PR. ## Output For each governed run, report: - Runtime: local, cloud, or self-hosted - Scope: files, modules, or workflow the agent may touch - Gates checked and result - Verification evidence - Branch or PR URL - Any feedback captured for future prevention

recall-context

Recall relevant past failures, prevention rules, and context packs before starting a coding task.

# Recall Context Retrieve relevant historical context before beginning work on a coding task. ## When to use - Starting a new coding task or feature - Before making changes to code that has failed before - Resuming work from a previous session ## How it works Use the `recall` MCP tool to retrieve: 1. **Prevention rules** — Rules auto-promoted from repeated failure patterns that apply to the current task. 2. **Past failures** — Specific failure events with context, tags, and corrective actions. 3. **Context packs** — Bundled project context from previous sessions. ## Example ``` Use the recall MCP tool to check for known issues with the authentication module before refactoring. ``` The tool returns structured context that helps avoid repeating past mistakes and surfaces corrective actions from promoted lessons.

search-lessons

Search promoted lessons for corrective actions, lifecycle state, linked rules, and linked gates.

# Search Lessons Search the promoted lessons database for corrective actions and guidance. ## When to use - Looking for corrective actions for a specific failure pattern - Checking if a known lesson applies to the current task - Reviewing lifecycle state of lessons (active, archived, superseded) - Finding linked prevention rules and gates for a topic ## How it works Use the `search_lessons` MCP tool with a query string. The tool searches across: 1. **Lesson descriptions** — What happened and why 2. **Corrective actions** — Specific steps to prevent recurrence 3. **Linked rules** — Prevention rules generated from the lesson 4. **Linked gates** — Pre-action gates that enforce the lesson 5. **Lifecycle state** — Whether the lesson is active, archived, or superseded ## Example ``` Search lessons for "force push" to find corrective actions and prevention rules related to force pushing. ``` Results include the full lesson context, any linked enforcement rules, and the current lifecycle state.