Daily Show HN

Show HN for June 16, 2026

40 items
68

Metiq: a real time 3D globe for 100 public datasets #

metiq.space
18 comments · 2:43 PM · View on HN
The concept for metiq.space came after playing Global Magnates with friends and realizing how fragmented live global data is. Ships, aircraft, satellites, ports, weather, hazards, infrastructure, cyber, and public datasets all exist, but they usually live in separate tools and maps.

The goal was to build one interactive 3D globe where live public data could be visualized by latitude, longitude, and altitude. Surface data stays on the globe, while aircraft, satellites, and other above surface things can be represented in actual 3D space instead of being flattened onto a map.
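
As a rough illustration of placing a point by latitude, longitude, and altitude (this is not Metiq's code), the conversion to 3D coordinates might look like the sketch below. It assumes a perfectly spherical Earth with a mean radius of 6371 km; the function name `to_cartesian` is made up for this example, and a real globe would more likely use an ellipsoid model.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def to_cartesian(lat_deg: float, lon_deg: float, alt_km: float = 0.0) -> tuple[float, float, float]:
    """Convert latitude/longitude/altitude to x, y, z in km from Earth's center."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    r = EARTH_RADIUS_KM + alt_km  # altitude pushes the point outward from the surface
    x = r * math.cos(lat) * math.cos(lon)
    y = r * math.cos(lat) * math.sin(lon)
    z = r * math.sin(lat)
    return (x, y, z)


# Same ground point, two altitudes: one on the surface, one 400 km above it.
ship = to_cartesian(0.0, 0.0)
satellite = to_cartesian(0.0, 0.0, 400.0)
```

Surface data keeps `alt_km` at zero, while aircraft and satellites get a larger radius and therefore sit above the globe instead of on it.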

The result is an interactive globe that covers Earth, air, sea, space, cyber, defense, infrastructure, and politics, and the list keeps growing.

Most of the development right now is going into data filtering and deduplication.

47

VoiceDraw – Talk system design out loud, the diagrams draw themselves #

voicedraw.com
19 comments · 7:13 PM · View on HN
I was frustrated by having to draw system design diagrams by hand when discussing them with my teammates or in an interview. So I thought, "Wouldn't it be great if someone drew it for me while I think out loud?"

That is when I came up with VoiceDraw. You can just think out loud or discuss your system architecture with a friend or interviewer, and the diagrams are drawn automatically, with your reasoning, open questions, and tradeoffs written neatly to the side.

Demo Video: https://youtu.be/36PgHKSuccE

42

Spin Lab #

srijanshukla.com
26 comments · 12:16 PM · View on HN
It's been tough to explain how spin works in table tennis, this visualizer/teacher does a great job at that. Guided tour of how spin looks/feels/works in the game of table tennis.

Thanks Fable, we miss you

35

Pen and paper resource development game #

jameshylands.co.uk
4 comments · 5:49 PM · View on HN
I've been working for a while on trying to curate a game that has the emergence of procedurally generated computer games but that can be played with only pen and paper. Here I present the best version I've been able to come up with that is simple and emergent. I've really enjoyed being able to engage with this sort of game while not feeling like my brain is rotting. I reckon my numeracy improves while playing it.
15

BlitzGraph – Supabase for graphs, built for LLM agents #

blitzgraph.com
8 comments · 3:41 PM · View on HN
Hello HN. After becoming allergic to SQL, I opened 120+ issues in Dgraph, Typedb and surrealdb looking for the perfect graph DB. None of them was built for agents, nor were they the perfect fit for what we wanted to achieve: fully ditching the SQL legacy to properly model reality. So we decided to build BlitzGraph.

In BlitzGraph, records (units) can belong to multiple types (kinds) and evolve through time. Polymorphic relations are also first class, and multiple kinds can play the same role. This design helps to escape the old table paradigm and track entities throughout their lifecycle without awkward self-joins that connect an entity to itself under different IDs in other tables.

An example:

   { "$id": "amazn", "$kinds": ["Company", "Prospect"], deal: ... }    // Day 1
   { "$id": "amazn", "$kinds": ["Company", "Customer"], contract: .. }   // Day 7
   { "$id": "amazn", "$kinds": ["Company", "Churned"], churnCause: "..." }, ... // Day 86
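
To make the "one unit, several kinds, changing over time" idea concrete, here is a small Python sketch. It is not BlitzGraph's API or query language; the `Unit` class, `set_kinds`, and `kinds_on` are hypothetical names invented for this illustration, and it assumes changes are recorded in day order.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A record that can hold several kinds at once and change them over time."""
    id: str
    kinds: set[str] = field(default_factory=set)
    history: list[tuple[int, frozenset[str]]] = field(default_factory=list)

    def set_kinds(self, day: int, kinds: set[str]) -> None:
        # The id never changes; only the kinds do, and each change is logged.
        self.kinds = set(kinds)
        self.history.append((day, frozenset(kinds)))

    def kinds_on(self, day: int) -> frozenset[str]:
        # Return the most recent kinds recorded on or before the given day.
        current: frozenset[str] = frozenset()
        for recorded_day, kinds in self.history:
            if recorded_day <= day:
                current = kinds
        return current


amazn = Unit("amazn")
amazn.set_kinds(1, {"Company", "Prospect"})
amazn.set_kinds(7, {"Company", "Customer"})
amazn.set_kinds(86, {"Company", "Churned"})
```

Asking `amazn.kinds_on(10)` returns the Customer-era kinds under the same id, which is the lifecycle tracking that would otherwise need joins across separate prospect, customer, and churn tables.
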
What makes BlitzGraph different:

- GraphQL-like nested queries and mutations (https://blitzgraph.com/docs)
- Polymorphic records and relations
- Bidirectional O(1) relations
- Referential integrity with native cardinality validations
- JSON query/mutation language designed so AI agents can build them programmatically
- Batched queries/mutations without N+1 issues
- Built-in frontend engine for quick dashboards and MVPs
- Native full text search, file storage, computed fields, ephemeral subspaces, unit history...

Honest comparisons:

- vs typedb: amazing db, but not ideal for app development. On the other hand, we loved and brought over their inference ideas and how mutations execute smartly instead of line by line.
- vs surrealdb: several core differences; a key one is that we run validations and transformations in topological order, and our edges are first-class citizens.
- vs dgraph: their cool features like post-commit hooks were attached to the GraphQL layer; in BG they are foundational.
- vs neo4j: if you've tried it, you know.
- vs supabase/pg: BG is slower for flat queries but faster in nested ones. But mainly, with BG you get rid of the tables paradigm and jump into the graph world while still being able to build apps.

Not ready:

- While BlitzGraph is already an excellent memory backend for AI agents, we still need to finish the semantic search engine.
- The query planner is not optimized.
- Cloud frontends have no native auth engine yet.

Beta is live, please break things!

- Public playground: https://blitzgraph.com/#playground
- MCP: https://blitzgraph.com/mcp

12

Memento – Self-hosted agentic search and LLM wiki over your email #

11 comments · 4:36 PM · View on HN
Our email inboxes carry multiple decades of messages (100K-500K). This is a good proxy for all the important things that happened in your life, the projects you have done and the people that you have connected with. With the chronological view of messages in the inbox, these details remain hidden. What if we could turn this archive into a personal wiki that you can search and curate? That is Memento.

For the information architecture of such a wiki, Memento takes an opinionated view of creating four high level dimensions - People (like a CRM view of your contacts), Projects (life events that are bounded by some start and end dates), Concepts (evergreen topics) and Newsletters.

Inboxes group messages by sender address, so the same person shows up many times across work, personal, and alias addresses. In the People dimension, Memento resolves all of that into one canonical person using deterministic algorithms and brings the people who are related to that person through graph algorithms. With just these two, you get an already populated CRM from your life history. No LLMs involved. From here, you can further enrich each person's wiki page by adding additional personal notes and create a cohesive narrative using LLM. The same applies to other dimensions as well.
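
The post does not say which deterministic algorithm Memento uses, so the following is only one plausible sketch: a union-find pass that merges addresses sharing the same normalized display name. The function names and the sample senders are hypothetical, and real identity resolution would need more signals than a name match.

```python
from collections import defaultdict


def find(parent: dict[str, str], x: str) -> str:
    # Follow parent links to the root, shortening the path along the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def resolve_people(senders: list[tuple[str, str]]) -> list[set[str]]:
    """Group email addresses that share a normalized display name."""
    parent: dict[str, str] = {}
    first_address_for_name: dict[str, str] = {}
    for name, address in senders:
        address = address.lower()
        parent.setdefault(address, address)
        key = " ".join(name.lower().split())  # case- and whitespace-insensitive name
        if key in first_address_for_name:
            # Same name seen before: union the two addresses.
            root_a = find(parent, first_address_for_name[key])
            root_b = find(parent, address)
            parent[root_b] = root_a
        else:
            first_address_for_name[key] = address
    groups: dict[str, set[str]] = defaultdict(set)
    for address in parent:
        groups[find(parent, address)].add(address)
    return list(groups.values())


people = resolve_people([
    ("Dana Kim", "dana@work.example"),
    ("dana  kim", "dana.kim@home.example"),
    ("Bob Ray", "bob@work.example"),
])
```

Because the pass uses no randomness and no model calls, the same archive always produces the same groups, which matches the "no LLMs involved" property described above.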

How can we query this wiki? This is where Memento uses an agentic search over this curated dataset. Your emails are stored in a SQLite DB, indexed using FTS and vector embeddings, and kept up to date using another open-source project called Msgvault [1]. Memento extends this DB with its own tables to store the output of various algorithms: canonical people discovery, graph algorithms to find clusters of connections, etc. For the agentic search to be effective, Memento exposes the underlying FTS, vector and graph data in a structured way to the agent as tools. The agent can further refine the search with additional tools like get-message-details, message-cluster etc. What we discovered is that the resulting search is much more powerful than the typical ‘Ask Gmail’ search exposed by Google.

Every factual claim exposed by Memento can be traced back to the real email it came from. Any additional notes added to the wiki are incorporated in the next generation. So Memento becomes richer and more personal to you over time.

This worked really well for our personal email archive and we were pleasantly surprised by the things that Memento was able to uncover. In order to demonstrate this capability at scale without exposing our private info, we connected Memento to the public Enron dataset that contains hundreds of mailboxes. With the SQLite store, Go backend and Next.js UI, Memento handles this 5 GB dataset with ease. Now you can query this archive and run agentic searches to re-discover the Enron scandal yourself - you can see the demo set here [2].

The app is a single binary that serves on localhost and treats your archive as read-only, and you can point it at any OpenAI-API-compatible LLM, whether that's a local model or a cloud one. It's open source, so you can inspect how your data is handled.

You can try it out today without connecting your archive - use the hosted demo [2] or download the GitHub release and run it with a synthetic local archive `./memento app --demo`

[1] https://www.msgvault.io

[2] Hosted demo (Enron data): https://memento-demo.latentsignal.org/home

Demo video: https://www.youtube.com/watch?v=Ms1KeAYCN2A

Project home: https://latentsignal.org/projects/memento

GitHub: https://github.com/latentsignal-org/memento

We are George and Ann, creators of Memento.

9

git-lrc – Free, Micro AI Code Reviews That Run on Git Commit #

github.com
0 comments · 2:04 PM · View on HN
Hi HN, I'm the author of git-lrc, would appreciate some feedback from the community

Last year my team started using AI coding tools more heavily, and we found ourselves generating tons of code, but spending less time looking at the stuff that's been generated.

We felt like we were losing a bit of grip/understanding of what we were building. Regressions occasionally slipped through. Sometimes changes made it all the way to production only to be reverted later.

We tried several AI code review tools, but most operate at PR time. That felt too late. I wanted review to happen while the implementation was still fresh in the developer's mind at a team level (soft enforcement). I also wanted to emphasize responsibility for keeping prod stable with each individual engineer.

So I built git-lrc.

When you commit, git-lrc opens a review UI with your diff. It summarizes what changed, points out things that deserve a second look, and lets you quickly jump through the important parts of the change.

Over time, git-lrc has grown to check for around 100 common risk patterns across 10 categories, including security, reliability, performance, maintainability, etc.

Note that this is far from a formal review. It's a quick 60 seconds spent looking at your own work before it gets recorded in git.

It also generates a short "summary deck" that highlights the main changes, with special emphasis on potential risks. With git-lrc you can quickly sanity-check what you're about to ship and obtain greater confidence in what's been generated.

In my mind it is less of an AI reviewer and more of a habit for AI-assisted development: a small pause to make sure we understand and stand behind the code we're shipping.

Developers can review the change, vouch for it, or consciously skip the review. Those decisions get recorded in git history, creating a trail of how code was reviewed before it shipped.

It'd be great if you could take a look, give it a try in your projects or teams and let me know what you think.

Happy to take feedback from the HN community and improve it over time!

GitHub: https://github.com/HexmosTech/git-lrc

8

Ctx, save tokens by loading only the relevant tools #

github.com
2 comments · 6:14 PM · View on HN
Hi HN!

Token cost has become a major concern for all of us. I tried a few (awesome) tools such as rtk, caveman, and the recent (hilarious but effective) ponytail. What they usually do is in-line token reduction, i.e., they try to compress requests and responses as much as possible.

But then it hit me (and I’m sure others had similar ideas) - just like we have routers that pick the right model, why not have something that will also narrow down the amount of available tools, skills and mcps based on repo/context?

People usually accumulate skills, agents, MCP servers, harnesses, prompts, repo instructions, and local scripts. I’m not saying we are all hoarders, but we sort of are. When did you remove a skill recently? After a while, the model has way too many options to choose from.

ctx tries to fix that by selecting context before the session gets bloated. So no, it doesn't clean up your messy garage, but it gives you magic glasses that let you focus only on the tools you need.

It does this by watching the repo and task, walking a graph of available tooling, and recommending a small top-scored bundle of skills, agents, MCP servers, and harnesses.
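
The "small top-scored bundle" step can be pictured with the simplified sketch below. It swaps the graph walk for plain tag overlap, so it is a stand-in for the idea and not how ctx actually scores things; the `recommend` function and the catalog entries are invented for this example.

```python
def recommend(task_tags: set[str], catalog: dict[str, set[str]], top_k: int = 3) -> list[str]:
    """Rank tools by tag overlap with the task and keep the top_k with a nonzero score."""
    scored = []
    for tool, tags in catalog.items():
        score = len(task_tags & tags)
        if score > 0:
            scored.append((score, tool))
    # Highest score first; ties broken alphabetically so results are repeatable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tool for _, tool in scored[:top_k]]


# Hypothetical catalog: tool name -> tags describing what it is good for.
catalog = {
    "pytest-skill": {"python", "testing"},
    "postgres-mcp": {"sql", "database"},
    "docker-harness": {"docker", "deploy"},
    "lint-agent": {"python", "lint"},
}
bundle = recommend({"python", "testing", "lint"}, catalog, top_k=2)
```

Everything that scores zero for the current task never enters the context at all, which is the upstream saving the post is describing.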

How does it know? To make sure results are not hallucinated, and repeatable, I curated a list of 91k+ skills, 467 agents, 10.7k MCP servers, 207 harnesses, and built a graph to help ctx make decisions on what to recommend. While I used AI to generate it of course, I curated it and revised it to make sure the data is up to date.

So how is this different from rtk, caveman, ponytail, and similar token-saving tools?

As mentioned above those tools mostly reduce tokens after something is already being used.

rtk compresses command output.

caveman-style tools make the assistant respond with fewer words.

ponytail, is, well, awesome, but again it focuses more on reducing code (YAGNI)

ctx is upstream. It tries to avoid loading irrelevant skills, agents, MCPs, and harnesses into context at all.

So it is not really a replacement. It should work side by side with them!

Use ctx to choose the right tools. Use rtk to reduce terminal-output noise. Use terse-output tools if you want shorter responses.

The goal is simple: save tokens without forcing the user to manually test and compare thousands of possible skills, agents, MCP servers, and harnesses.

Repo: https://github.com/stevesolun/ctx

6

Privacy-First Voice Age Verification #

agewarden.ai
7 comments · 1:04 PM · View on HN
I miss the old web. As a kid I could type in "a/s/l" in AOL messenger and chat with someone my own age, without worrying about the dangers that lurk on the web today.

After seeing what happened to Omegle, a question stuck: is there a simple way to do age verification that both keeps people safe and doesn't contribute to a surveillance state?

After a year of hard work, that question resulted in AGEWARDEN. Each part of the service puts people first. No tracking, nothing stored (it's more difficult these days to NOT collect data :smh:).

Please give it a try if you have a moment https://agewarden.ai/demo. Feedback is very much welcomed.

GG

6

Numax - a portable runtime for distributed apps #

github.com
1 comment · 10:25 PM · View on HN
Hi, over the past few months I've been working on this project: Numax is a small Rust runtime that does three things: it runs WebAssembly modules in a sandbox, has a built-in local key-value store, and syncs everything across nodes with CRDTs and gossip. Basically, you write a wasm module, run it on two machines, and they converge (I hope!). It's a decentralized system... I hope someone finds it interesting! There's a whitepaper I've put a lot into, and I think the code isn't bad either! I believe there's still room in this world for software that's fun and well made, and while building Numax I had a great time! I love Numax and I love software. Thanks to anyone who'll spend a bit of their time even just to open the repo and take a look!
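
For readers new to CRDTs, the convergence claim can be seen in a grow-only counter (G-Counter), one of the simplest CRDTs. This Python sketch is a generic textbook example, not code from Numax, and the post does not say which CRDT types Numax implements.

```python
def increment(counter: dict[str, int], node: str, amount: int = 1) -> dict[str, int]:
    """Return a copy of the counter with this node's slot increased."""
    updated = dict(counter)
    updated[node] = updated.get(node, 0) + amount
    return updated


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Merge two replicas by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def value(counter: dict[str, int]) -> int:
    """The counter's value is the sum of every node's slot."""
    return sum(counter.values())


# Two machines update independently, then exchange state in either order.
node_a = increment(increment({}, "a"), "a")  # node "a" counted 2
node_b = increment({}, "b", 3)               # node "b" counted 3
merged_ab = merge(node_a, node_b)
merged_ba = merge(node_b, node_a)
```

The merge is commutative and idempotent, so replicas reach the same state regardless of the order or number of times gossip delivers an update.
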
5

Dev-friendly native OTel: only OSS stateful, on-the-wire Observability #

1 comment · 7:00 PM · View on HN
Hi HN,

We’re the team at MyDecisive.ai, and today we’re giving developers a peek at Octant — point-and-click control and visibility for your OpenTelemetry.

You've likely felt the pain of the "observability tax," especially if you manage K8S clusters. The modern standard is to instrument everything with OpenTelemetry, but piping all those rich OTLP logs, metrics, and traces straight to a SaaS vendor (Datadog, Splunk, Honeycomb) gets expensive fast. You end up paying massive ingestion and storage costs for noisy, low-value data just so it's searchable when something breaks. With Octant you get up and running on OTel in minutes.

We built Octant to flip this model. Instead of blindly shipping all telemetry off-cluster, Octant configures and helps to manage OTEL clusters. It gives you a visual interface for managing K8s objects, but importantly, it acts as an OTLP gateway that filters data at the source before it leaves your VPC.

Because it natively speaks OpenTelemetry, you can point your existing OTel SDKs or collectors right at it without touching your application code. Here is what it does under the hood:

- OTel-Native Trace & Log Sampling: It makes it easy to ingest OTLP traffic and inspects logs and traces on the wire. By waiting for the full context of a trace before determining what to keep, it delivers on the promise of braiding, retaining 100% of the actionable signals (like errors and high-latency spans) but dropping the junk before it hits your SaaS bill.

- In-Flight Stateful Alerting: Instead of waiting for data to be batched, shipped, and indexed by an external provider to trigger an alert, Octant can process the telemetry streams in-flight. This shrinks the detection gap and reduces the need for SaaS vendors in the first place.

- On-the-Wire PII Redaction: It can detect and strip sensitive information from your logs and traces in real-time before they are transmitted over the internet, removing "post-ingestion" clean-up costs and compliance risks.

- K8s Context Injection: Because it's deeply integrated with your cluster, it maps your OTel streams directly to your K8s resources (Deployments, Pods, CRDs) in a unified UI.
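
The "wait for the full trace, then decide" behavior in the first bullet is commonly called tail-based sampling. The Python sketch below shows that general technique on in-memory data. It is not Octant's implementation (which is written in Go), and the `Span` fields and the 500 ms threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str
    name: str
    duration_ms: float
    is_error: bool = False


def tail_sample(spans: list[Span], latency_threshold_ms: float = 500.0) -> list[Span]:
    """Keep whole traces that contain an error or a slow span; drop the rest."""
    traces: dict[str, list[Span]] = {}
    for span in spans:
        traces.setdefault(span.trace_id, []).append(span)
    kept: list[Span] = []
    for trace_spans in traces.values():
        interesting = any(
            s.is_error or s.duration_ms >= latency_threshold_ms for s in trace_spans
        )
        if interesting:
            kept.extend(trace_spans)  # keep every span so the trace stays complete
    return kept


spans = [
    Span("t1", "GET /health", 3.0),
    Span("t2", "GET /cart", 40.0),
    Span("t2", "db.query", 12.0, is_error=True),
    Span("t3", "POST /checkout", 900.0),
]
kept_ids = sorted({s.trace_id for s in tail_sample(spans)})
```

The healthy `t1` trace is dropped, while `t2` (an error) and `t3` (slow) are kept in full, including their uneventful spans, because the decision is made per trace and not per span.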

The API is built in Go (github.com/mydecisive/octant) and the whole stack can be deployed directly into your cluster via our Helm charts.

We’d love for you to spin it up on a dev cluster and tear it apart. We just recently merged a PR from our very first community contributor, which was a huge milestone for us! We want to keep that momentum going. If you're interested in hacking on K8s observability and autonomy, OpenTelemetry pipelines, or Go/React, we’ve tagged a few 'good first issues' and would be thrilled to welcome you to the project.

GitHub: https://github.com/MyDecisive/octant

Website: https://www.mydecisive.ai/

I'll be hanging out in the thread today and am happy to answer any questions or dig into the architecture!

5

The Dictionary Game (Fictionary/Balderdash) as a Daily Puzzle #

dictionarypuzzle.com
1 comment · 1:56 PM · View on HN
I love the dictionary game: someone takes an unfamiliar word from the dictionary and writes down the definition. Then the other players come up with their own, fake, definition, and guess which one is real.

I built a Wordle-style daily game out of this concept. You get shown a word and four definitions and it's up to you to figure out which one is real. There are five new words to try every day.

The real definitions are from Wiktionary, the fake ones are made up by Claude (and sometimes me).

Let me know what you think, and how this could be improved!

5

Microlearning apps with a TikTok-style feed to beat doomscrolling #

usescroll.app
3 comments · 5:06 PM · View on HN
I wanted to kick my doomscrolling habit, so I built a microlearning app that uses a TikTok-style algorithm, same addictive feed mechanics, but you actually learn something.

I started with a general version, Scroll: Daily Microlearning (microlearning.usescroll.app), but quickly realised it works better when focused on a single topic. So I split it into:

Scroll: Personal Finance (https://finance.usescroll.app)

Scroll: Learn AI (https://ai.usescroll.app)

Scroll: Daily Microlearning (https://microlearning.usescroll.app)

4

The Oct Programming Language for scientific programming #

github.com
0 comments · 10:34 PM · View on HN
Apologies in advance if my writing is unclear.

First, why make another programming language? This came out of my general frustration with working with MATLAB and Python during grad school as well as at work. I'm a mechanical engineer; I write Python scripts for my own use semi-regularly. My issue with Python is that while writing it is easy, debugging Python code I wrote is hard once the code gets big. Sharing my Python code with colleagues is very difficult as most of them are less technical on coding than I am, so asking them to set up a full Python env just to run a script is out of the question.

Then, the Two-Language Problem for scientific computing: write the prototype code in Python/MATLAB and rewrite the performance sensitive code in C++/Rust when needed. The problem is that C++ and Rust are not very easy to learn and converting from a dynamically typed language like Python to a strict, statically typed language like Rust is not easy, especially for people who are not software engineers.

The name Oct is a reference to GNU Octave and started as my concept of what Octave should have been, and the name stuck. About 75k lines of Oct code are in the repo across experiments, libraries, and examples. Pretty much all of it is written by Claude/Codex, my role is only to prevent drift/hallucinations.

Features, in no particular order:

- Function first, statically typed, both interpreted and compiled: Oct code compiles to a Go binary via MIR codegen, runs on anything that Go runs on at Go speed, and inherits Go's absurd compile speed, which means that a JIT is mostly unnecessary.
- The entire Go ecosystem is available for Oct if you write a wrapper around it: Oct's `IO.Xlsx` library is `excelize`, Oct's builtin plot is `gonum/plot`, Oct's benchmark profiling is pprof, Oct's C interop is CGo.
- Boring syntax, easy to learn. If you know Rust/Go/C#/Swift, learning Oct would take a few hours at most. Vice versa, if you learn Oct, then you are halfway to knowing how to write Rust already.
- Octest, xUnit.NET style testing framework with `[Fact]/[Theory]` and various Asserts.
- Foundational SI units built into the language and enforced by the typechecker: you can't add Int<m> and Int<kg> together. Units also propagate.

```oct
fn StiffnessForce(K: Matrix<Float<kg/s^2>>, u: Vector<Float<m>>) -> Vector<Float<kgm/s^2>> {
    return K @ u // Matrix<Float<kg/s^2>> @ Vector<Float<m>> → Vector<Float<kgm/s^2>>
}
```

- You can't ignore errors. Oct is exception free and `null`/`nil`-free; all errors must be handled explicitly: `?` for propagation, `!` for unwrap, or fallible `match`.
- Arrays and vector/matrix are separate but related concepts, with vectors/matrices explicitly defined as Rank 1 and 2 tensors, and Einstein notation for tensors is built into the language.

```oct
let c = a[i, k] * b[k, j]
```

- Octomata, Oct's own built in control system runtime, using explicit finite state machines and utility scoring as primitives for control systems.

```oct
package Main

flow DoorControl(openCmd: Bool, closeCmd: Bool, blocked: Bool) -> String {
    state Closed {
        when {
            case openCmd and blocked == false -> goto Opening
            else -> return "closed"
        }
    }

    state Opening {
        when {
            case blocked -> goto Closed
            case closeCmd -> goto Closing
            else -> return "opening"
        }
    }

    state Closing { return "closing" }
}
```

Oct is very much a work in progress, but enough jank and bugs have been fixed that hopefully it isn't embarrassing to show off now, although rough edges are still expected. So, I would love some feedback.
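
For anyone unfamiliar with Einstein notation, the expression `a[i, k] * b[k, j]` conventionally means "sum over the repeated index k", which for two matrices is ordinary matrix multiplication. The plain-Python sketch below spells that out. That Oct follows this standard convention is an assumption based on the snippet, and `contract` is a name made up for the example.

```python
def contract(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Compute c[i][j] = sum over k of a[i][k] * b[k][j]."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    # i and j are free indices (they survive); k is repeated, so it is summed away.
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


c = contract([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
```

Here `c` comes out as `[[19.0, 22.0], [43.0, 50.0]]`, the same result as the usual row-by-column product.
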
4

Morning Stack finds real job openings, tweaks resume and cover letter #

morningstack.app
2 comments · 1:01 PM · View on HN
Job hunting sucks. That grind of digging through LinkedIn and Indeed, sifting through endless spammy emails from headhunters who try to get the biggest possible audience, clicking "Quick Apply" to silent response, and trying to figure out which of all of these jobs are even real. My wife hit all of this when she started job hunting a few months ago, so I built her a tool to take the worst part of it off her plate. It turned into this. Oh, and she got an offer letter from a major company in just shy of two and a half weeks. Now, she loves her job.

Morning Stack runs overnight. While my users sleep it works through the big job boards (and some niche ones), and then it tries to disprove every listing before it trusts one.

Is the job even real? - It opens the posting in a real browser (Playwright) and checks. There's no API scraping and we don't log in to anything. Then, we resolve back to the company's own ATS page so you can apply directly to them. If it can't find it, the job is dropped.

Does it match what you asked for? - An LLM crosschecks the job description against your profile (Resume, Career Story, and Desired Roles, Comp Band, Geography and Benefits). The more specific you are, the fewer jobs you'll ultimately receive. That's kind of the point.

For the few surviving contenders, it tweaks your actual resume to the job description and drafts a cover letter for each. A separate verifier re-reads those against your actual resume, stripping any facts it can't support. We avoid AI-BS with a 3-strike rule: if it can't support a claim after three tries, it drops the job and moves on instead of shipping something fabricated.
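
One way to read the 3-strike rule is as a bounded retry loop around a draft step and a verify step. The sketch below is an interpretation of that description, not Morning Stack's code; `build_package`, `flaky_draft`, and the sample resume facts are hypothetical, and a simple set lookup stands in for the LLM verifier.

```python
from typing import Callable, Optional

MAX_STRIKES = 3


def build_package(
    draft: Callable[[int], list[str]],
    is_supported: Callable[[str], bool],
) -> Optional[list[str]]:
    """Return verified claims, or None if three attempts still contain unsupported ones."""
    for attempt in range(1, MAX_STRIKES + 1):
        claims = draft(attempt)
        if all(is_supported(claim) for claim in claims):
            return claims
    return None  # three strikes: drop the job instead of shipping a fabrication


resume_facts = {"5 years of Python", "led a team of 4"}


def flaky_draft(attempt: int) -> list[str]:
    # Hypothetical generator: invents a claim on the first try, then behaves.
    if attempt == 1:
        return ["5 years of Python", "PhD in physics"]
    return ["5 years of Python", "led a team of 4"]


package = build_package(flaky_draft, lambda claim: claim in resume_facts)
dropped = build_package(lambda attempt: ["PhD in physics"], lambda claim: claim in resume_facts)
```

The first call succeeds on its second attempt, and the second call returns `None` after three failed attempts, so nothing unsupported ever reaches the finished package.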

By 7AM, you've got a small stack (up to 3) of real openings with finished packages.

- You see a little bit of good intel on each (company and job info, and any outliers like high comp or unlimited PTO).
- You review them to determine if you're interested.
- You polish and submit yourself on the company's most direct site.

That's it. It's your reputation; you control it. Here are a few things I deliberately did not build this to do:

It never logs in to any of your actual accounts and it never auto-applies on your behalf. This is a really terrible trend in my opinion. I don't think it's working and I don't think people trust it. "Spray-and-pray" isn't the right fit for people who care about the outcome of their career and you will always be the one who submits under your own name.

Opening for a small beta cohort this week. There's a waitlist. The link is above.

Happy to get into the architecture or the ToS reasoning in the comments.

2

Silver Music Notifier – get notified about your artists' new releases #

github.com
0 comments · 5:09 PM · View on HN
I always seem to miss new albums from music artists I like, and I haven't been happy with any existing services (they either cost $ or only work with Spotify, which doesn't have all the artists I listen to). So I built a simple tool to pull data from MusicBrainz for whatever artists you want.

It runs entirely locally (sqlite db) and can be used as a CLI (also works well with agents) or with a local web UI. It can optionally notify you with emails, or you can just check it periodically. I've already found a bunch of releases I missed from my favorite artists.
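
The "check periodically and only report what is new" part can be sketched with a local SQLite table keyed by release ID. This is an illustration of the general pattern, not the tool's actual schema; the table layout and `record_new_releases` are assumptions, and the short IDs stand in for real MusicBrainz identifiers (which are UUIDs).

```python
import sqlite3


def record_new_releases(conn: sqlite3.Connection, fetched: list[tuple[str, str, str]]) -> list[str]:
    """Store fetched (release_id, artist, title) rows and return titles not seen before."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS releases ("
        "release_id TEXT PRIMARY KEY, artist TEXT NOT NULL, title TEXT NOT NULL)"
    )
    new_titles = []
    for release_id, artist, title in fetched:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO releases (release_id, artist, title) VALUES (?, ?, ?)",
            (release_id, artist, title),
        )
        if cursor.rowcount == 1:  # a row was inserted, so this release is new
            new_titles.append(title)
    conn.commit()
    return new_titles


conn = sqlite3.connect(":memory:")  # a real tool would use a file on disk
first_run = record_new_releases(conn, [("r1", "Artist A", "First Album")])
second_run = record_new_releases(conn, [
    ("r1", "Artist A", "First Album"),
    ("r2", "Artist A", "Second Album"),
])
```

On the second run only "Second Album" is reported, since the primary key makes re-fetching an already known release a no-op.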

2

Codyssey – control system programming game (open beta) #

codyssey.andersource.dev
0 comments · 1:05 PM · View on HN
Hi HN! Codyssey is a small game where you write simple control system logic in Python to solve minigames.

I've been working on it for over two years; its first iteration was as a workshop for beginning programmers with just basic Python knowledge. I wanted something that would take abstract code and make it visual, interactive and fun. Now, after many iterations, I'm launching it as a small web game.

To try the beta:

1. Create an account (to try a demo without signup, go here[0]).

2. "Start tutorial"

3. "Buy" (top right) -> "Join Open Beta"

I'd love to hear your feedback!

Behind the scenes, Brython[1] is used to run the code. I'd have preferred pyodide or micropython (and might get back to doing it), but both ended up having technical issues and Brython was easier to integrate.

Cheers, and hope you enjoy!

[0] https://codyssey.andersource.dev/demo/demo.html

[1] https://brython.info/

1

A policy gate that runs before your AI coding agent's tool calls #

sigmashake.com
0 comments · 5:12 PM · View on HN
As a Security Engineer with over 10 years in industry, I kept running into the same problem with coding agents: instructions are not guarantees.

I put guidance in `CLAUDE.md`, `AGENTS.md`, memory files, MCP descriptions, and tool documentation. I explicitly told the agent things like:

* Use the code graph for architecture questions instead of grepping the repository.
* Do not use deprecated APIs or Unsafe code.
* Prefer specific tools for specific tasks.

The agent would still ignore those instructions surprisingly often. It would grep the entire repo, use deprecated APIs, or choose a slower tool even when a better one was available.

That made me realize prompts and rules solve different problems.

A prompt is a probabilistic influence on model behavior. A rule is an enforcement mechanism.

So I built SSG (SigmaShake Governance), which sits between the agent and its tools. Instead of asking the model to remember a policy, SSG evaluates every tool call before it executes.

For example, this rule redirects architecture-related repository searches away from recursive grep and toward a code graph:

```text
rule route-codebase-grep-to-graph {
  enable true
  priority 80
  severity warning
  CATEGORY tool-routing
  FORCE search
  IF tool EQUALS "Grep"
  MESSAGE "Architecture, relationship, and dependency questions are routed to the code-graph tool."
  SUBSTITUTE "graphify query \"<what you were searching for>\""
}
```

When the agent attempts a grep for an architecture question, the call is redirected. If it attempts to write deprecated code, the write can be blocked before the content reaches disk and the replacement API can be suggested.
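
Conceptually, a gate like this is a function that sees each tool call before it runs and returns allow, redirect, or block. The Python sketch below shows that shape only. It is not SSG's engine or rule format; `Rule`, `Decision`, and `evaluate` are hypothetical names, and matching on the tool name alone is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    name: str
    tool: str                         # tool name the rule applies to
    action: str                       # "redirect" or "block"
    message: str
    substitute: Optional[str] = None  # replacement command for redirects


@dataclass
class Decision:
    action: str                       # "allow", "redirect", or "block"
    message: str = ""
    substitute: Optional[str] = None


def evaluate(tool: str, rules: list[Rule]) -> Decision:
    """Check a tool call against the rules before it runs; the first match wins."""
    for rule in rules:
        if rule.tool == tool:
            return Decision(rule.action, rule.message, rule.substitute)
    return Decision("allow")  # no rule matched, so the call goes through


rules = [
    Rule(
        name="route-codebase-grep-to-graph",
        tool="Grep",
        action="redirect",
        message="Architecture questions are routed to the code-graph tool.",
        substitute='graphify query "<what you were searching for>"',
    ),
]
grep_decision = evaluate("Grep", rules)
read_decision = evaluate("Read", rules)
```

The key property is that the decision happens in ordinary code outside the model, so it applies every time, whether or not the agent remembered the instruction.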

A few design choices:

* Rules are plain text and git-versioned.
* Enforcement runs locally.
* The same rules work across Claude Code, Codex, Cursor, Gemini, and MCP-based agents.
* Bypasses are allowed, but recorded.
* The goal is not to sandbox a hostile model; it's to prevent routine agent mistakes and shortcuts.

I found that many existing controls operate either too early or too late:

* Prompt files influence behavior but don't enforce it.
* Tool allowlists are often all-or-nothing.
* Pre-commit hooks catch problems after files have already been written.
* Harness-specific permissions don't travel with the repository. (What if your team does not use the same harness?)

If you do not live in the terminal, SigmaShake Desktop is the same governance dashboard with no CLI required: a free direct download for macOS, Windows, and Linux. The Mac App Store and Microsoft Store also carry it as a paid, sandboxed build that auto-updates and skips the Gatekeeper and SmartScreen prompts, if you would rather pay once for the managed install (or just want to support the project):

https://apps.apple.com/us/app/sigmashake-desktop/id676990115...
https://apps.microsoft.com/detail/9N2CHV3STGS4

I've been building this in public for a few months:

https://twitch.tv/sigmashake
https://youtube.com/@sigmashakeinc

What instructions do your agents consistently ignore?