Daily Show HN


Show HN for May 8, 2026

25 items
129

Git for AI Agents #

github.com
68 comments | 2:15 PM | View on HN
hi guys. been working on something i think is fundamentally missing in today's workflow with ai agents.

vcs.

i find myself struggling with questions that agents can't answer like "why did you do it?", "when did u delete this folder? why?", etc. or trying to /rewind (after a /compact...) or basically `bisect` to find when and why something was done by the agent in the current / previous session.
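
to make the `bisect` part concrete, here's a minimal sketch in python. it assumes every agent step leaves a snapshot you can test, and that once the thing you care about breaks it stays broken (the same assumption `git bisect` makes). `first_bad_step` and the snapshot shape are made up for illustration, they are not taken from the tool.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad_step(snapshots: Sequence[T], is_bad: Callable[[T], bool]) -> int:
    """Return the index of the first snapshot where is_bad is true, or -1.

    Assumes the history looks like good, good, ..., bad, bad.
    """
    lo, hi = 0, len(snapshots)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid        # the first bad step is at mid or earlier
        else:
            lo = mid + 1    # the first bad step is after mid
    return lo if lo < len(snapshots) else -1


# hypothetical session: top-level folders present after each agent step
session = [
    {"src", "docs", "tests"},
    {"src", "docs", "tests"},
    {"src", "tests"},            # "docs" disappears at this step
    {"src", "tests", "build"},
]
culprit = first_bad_step(session, lambda folders: "docs" not in folders)
```

with the step index in hand, the "why" would still have to come from whatever the agent recorded at that step.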

just like git did for code, i think we need the same core capabilities with ai agents

so...

i developed an open source solution for that (currently supporting claude code)

would love to get feedback, contribution or maybe other ideas or solutions you find for those problems.

39

GETadb.com – every GET request creates a DB #

getadb.com
44 comments | 4:17 PM | View on HN
Hey HN! We made GETadb.com, so it's easier to get agents to build you full stack apps. You don't need to give them any credentials. Just by loading a GET request, they get access to a database, a sync engine, and abstractions for auth, presence, and streams.

To see what the agent sees, you can load https://getadb.com/new

There are two fun things about how it's implemented:

1. If you curl the home page, it returns the agent content rather than the human content. We do this by detecting the 'Sec-Fetch-Mode' header. It's not perfect, but it gets the job done for Claude Code et al.

2. For an agent to spin up an app, they make _two_ fetches: (1) getadb.com/guide tells them to generate a uuid, and then fetch (2) getadb.com/provision/<uuid>. We did this because just about half of the popular web-based app builders cache URLs globally, even if you return no-store headers. To get around this, we just instruct the agent to generate unique URLs.
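
A rough Python sketch of both tricks, not the actual GETadb code. It assumes that browsers send `Sec-Fetch-Mode` while curl and most agent fetch tools do not, so a missing header is treated as "agent". The function names `wants_agent_content` and `provision_url` are hypothetical.

```python
import uuid
from typing import Mapping


def wants_agent_content(headers: Mapping[str, str]) -> bool:
    """Guess whether the caller is an agent or CLI rather than a browser.

    Assumption: a request without Sec-Fetch-Mode did not come from a browser.
    """
    names = {name.lower() for name in headers}
    return "sec-fetch-mode" not in names


def provision_url(base: str = "https://getadb.com") -> str:
    """Build a URL nobody has requested before, so a shared cache cannot answer it."""
    return f"{base}/provision/{uuid.uuid4()}"


# A curl-style request carries no Sec-Fetch-Mode; a browser navigation does.
curl_like = wants_agent_content({"User-Agent": "curl/8.0", "Accept": "*/*"})
browser_like = wants_agent_content({"Sec-Fetch-Mode": "navigate"})
first, second = provision_url(), provision_url()
```

Since every provision URL embeds a fresh random uuid, two agents never ask for the same URL, which sidesteps a global URL cache even when it ignores no-store.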

You may wonder: Why GET requests, rather than POST requests? It's because then you can build in surprising places. For example, we got meta.ai to build an app inside the artifact preview: https://artifacts.meta.ai/share/a/b80c7412-c3af-4088-b430-78efdfe8ea2d

Under the hood, this is possible because the whole infra is multi-tenant from the ground up. We already announced how that works on HN, but if you're curious here's the essay for it: https://www.instantdb.com/essays/architecture

12

Crit – local review tool for agent plans and code diffs #

crit.md
4 comments | 12:56 PM | View on HN
Crit is a single-binary CLI that opens your file or code diffs in a browser with a GitHub-inspired interface. Your favourite agent acts on the feedback and responds - continue until you're happy.

I've been building this for a few months now and it's been helping me a lot juggling parallel conversations with agents, making it easier to actually review the plans they generate and the resulting code - all before it hits GitHub.

It also allows you to export the plans to a hosted service (which you can self-host for privacy), to solicit feedback from your team. No account needed, and reviews are automatically deleted after 30 days of inactivity.

7

An agent that tunes its own cache #

0 comments | 12:39 PM | View on HN
Last weekend I built chat.betterdb.com as a RAG over Valkey/Redis/Dragonfly docs. The goal was to eat our own dogfood and publicly test our caching libraries. It also saved me from having to come up with various demo/test scenarios, as I could extend the building in public to the demo.

There is a tool-result cache sitting between the SDK and the tools. Each call is normalized and then checked before executing. If it hits, we return from the cache; if not, we check the semantic cache, which embeds the prompt and runs a KNN search via valkey-search. If the cosine distance is close enough, we again skip the LLM and stream the cached response. In both cases, if we miss, we store the prompt embedding, the actual model, and the input and output tokens from OpenAI's usage report, so a future hit has the dollars avoided as data.

The two tiers handle different shapes. Predefined questions, copy-pasted questions, and checking the same thing again after some time all produce byte-identical strings that the tool cache catches. Human paraphrase is what the semantic tier exists for.
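
A minimal Python sketch of the two-tier lookup, under loud assumptions: both tiers key on the prompt text (a simplification), the semantic tier is a brute-force cosine scan rather than KNN via valkey-search, `toy_embed` is a stand-in for a real embedding model, and the 0.15 distance threshold is made up. None of these names come from the BetterDB libraries.

```python
import math
from typing import Callable, Optional

Vector = list[float]


def normalize(text: str) -> str:
    """Collapse case and whitespace so trivially different strings share a key."""
    return " ".join(text.lower().split())


def cosine_distance(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


class TwoTierCache:
    def __init__(self, embed: Callable[[str], Vector], max_distance: float = 0.15):
        self.embed = embed
        self.max_distance = max_distance  # hypothetical threshold
        self.exact: dict[str, str] = {}
        self.semantic: list[tuple[Vector, str]] = []

    def lookup(self, prompt: str) -> Optional[tuple[str, str]]:
        """Return (tier, response) on a hit, or None on a miss."""
        key = normalize(prompt)
        if key in self.exact:
            return ("exact", self.exact[key])
        if not self.semantic:
            return None
        query = self.embed(prompt)
        distance, response = min(
            ((cosine_distance(query, vec), resp) for vec, resp in self.semantic),
            key=lambda pair: pair[0],
        )
        return ("semantic", response) if distance <= self.max_distance else None

    def store(self, prompt: str, response: str) -> None:
        self.exact[normalize(prompt)] = response
        self.semantic.append((self.embed(prompt), response))


# Toy embedding: one dimension per concept, so paraphrases land on the same vector.
CONCEPTS = {"xadd": 0, "fast": 1, "performance": 1, "ttl": 2}


def toy_embed(text: str) -> Vector:
    vec = [0.0, 0.0, 0.0]
    for word in normalize(text).replace("?", "").split():
        if word in CONCEPTS:
            vec[CONCEPTS[word]] = 1.0
    return vec


cache = TwoTierCache(toy_embed)
cache.store("How fast is XADD?", "cached answer")
repeat = cache.lookup("how fast is   XADD?")   # same string after normalization
paraphrase = cache.lookup("XADD performance")  # different string, same meaning
unrelated = cache.lookup("What is a TTL?")
```

The repeated question is served by the exact tier and the paraphrase only by the semantic tier, which is the split described above.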

This Wednesday was a bank holiday where I live, so I used it to extend this further - the libraries the chat relies on now store metadata in the Valkey (or Redis if that's your preference) instance, then our monitoring reads and analyzes that data and suggests improvements. These are also exported through our MCP server, so the chat's agent can check and create suggestions as well, and since this is just a demo, it can also approve its own suggestions (do not do this in a real production environment, unless you are a true LLM believer). The libs also read their config from the Valkey instance, so no restart is needed. I hooked it up to cron inside Vercel and let it run overnight and through the next day.

Between Run 1 and Run 3, it started making fewer tool calls. In the first run it suggested several different TTL changes and applied them. Run 2 had suggestions similar to Run 1, because the TTL is the wrong point of control - these tools take natural language input (`How fast is XADD?` vs `XADD performance` are two different strings that "mean" the same thing), so the tool cache doesn't fire and they are covered by the semantic cache. An actual fix would be to move these tools from the exact-match checks into the semantic cache checks - a code change, not a config change. It was an indicator of a problem the system can't fix on its own. In the future the routing might also become configurable, to solve this without redeploying and to test and verify in quicker loops. Run 3 just didn't propose anything new - 15 -> 13 -> 8 tool calls across the three runs.

Curious how others running similar loops decide what the agent can touch. Am I too skeptical of hallucinations and overly cautious?

The chat can be found at https://chat.betterdb.com (it has links to all of the repos in it), and a more detailed write-up can be found at https://www.betterdb.com/blog/cache-that-tunes-itself

6

tltv – Federation protocol for 24/7 TV channels #

timelooptv.org
0 comments | 7:28 PM | View on HN
I spent six years trying to build a tv channel server. rewrote it eight times. flask, fastapi, ffmpeg, gstreamer, named pipes. every version got more complicated and none of them worked right.

turns out I was building the wrong thing. the thing I actually wanted was a protocol.

so tltv is that. a channel is an ed25519 key pair. you sign your metadata with it. you serve hls video from wherever you want. your public key becomes a tltv:// address that anyone can tune into.

relay nodes can re-serve your stream but they can't modify it. they verify signatures on everything. you can move servers and keep your channel because the key is the identity, not the hostname. nodes find each other through peer exchange. no central registry.
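
a small python sketch of the sign-then-verify idea. this is not the tltv wire format: the json serialization, the metadata fields and the function names are assumptions for illustration, and it relies on the third-party `cryptography` package for ed25519.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def sign_metadata(private_key: Ed25519PrivateKey, metadata: dict) -> tuple[bytes, bytes]:
    """Serialize metadata deterministically and sign the resulting bytes."""
    payload = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode()
    return payload, private_key.sign(payload)


def relay_accepts(public_key: Ed25519PublicKey, payload: bytes, signature: bytes) -> bool:
    """A relay re-serves a payload only if the channel key signed exactly these bytes."""
    try:
        public_key.verify(signature, payload)
        return True
    except InvalidSignature:
        return False


channel_key = Ed25519PrivateKey.generate()   # the channel's identity
payload, signature = sign_metadata(channel_key, {"name": "my channel"})

untouched = relay_accepts(channel_key.public_key(), payload, signature)
tampered = relay_accepts(
    channel_key.public_key(), payload.replace(b"my", b"your"), signature
)
```

nothing in the check involves a hostname, which is why the same key can keep working after a server move.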

the cli is probably the fastest way to see what I mean:

  curl -fsSL timelooptv.org/install | sh

  tltv keygen

  tltv server test --name "my channel" -k TV*.key
that's a fully compliant origin server. pure go, generates smpte bars with audio, no ffmpeg. one binary, ~20mb of ram. there's also a full gstreamer-based server (cathode), a web viewer (phosphor), and bridge/relay servers in the cli. everything mit licensed.

live demo at https://demo.timelooptv.org

https://github.com/tltv-org

5

Notion-to-site – sync any Notion database to local Markdown/MDX/JSON #

github.com
0 comments | 4:12 PM | View on HN
Built this while syncing my own blog from Notion. Good alternative to paid options like Super.so.

Supports incremental sync, all Notion block types including equations and synced blocks, image download + WebP conversion, and adapters for Markdown, MDX, and JSON.

Works with Next.js, Astro, SvelteKit, and anything that reads files.

4

Armorer – A secure local control plane for AI agents #

1 comment | 12:38 AM | View on HN
Hey HN,

I built Armorer because I was tired of two things:

1. The absolute "dependency hell" of setting up new AI agents (Codex, OpenClaw, etc.).

2. The security risk of giving powerful local agents broad access to my host machine.

Armorer is a secure local control plane that manages the lifecycle of your agents. It uses Docker for true process isolation and provides a unified UI/CLI for monitoring and job tracking.

One feature I’m particularly excited about: you can point an existing coding agent at the Armorer repo, and it will autonomously install and configure the entire stack for you securely.

Source: https://github.com/ArmorerLabs/Armorer
Website: https://armorerlabs.com

I’d love to hear your thoughts on the architecture and how you're handling agent security locally.

4

The independent guide to agent orchestrators #

agentmgmt.dev
1 comment | 9:17 PM | View on HN
Hey HN!

I built AgentMGMT.dev today to keep track of all those agent orchestration tools that keep popping up. I've tried a few and landed on Superset, which I'm extremely happy (and productive!) with - but I think this category of tools will be extremely important and interesting in the next couple years, so it's worth keeping an eye on all available tools and how they evolve.

I will keep the site up-to-date. Please help me by submitting new tools that are not yet in the list, or by adding any details that might help folks who are out shopping for their first/next agent orchestrator!

3

Agentctl, a local control plane for coding agents #

github.com
0 comments | 1:56 AM | View on HN
I’ve been building agentctl, a small Go tool that sits between coding agents and the risky actions they want to take: package installs, shell execution, secret access, file writes, outbound API calls. The design is deliberately narrow and local-first. No HTTP server, no hosted component, no repo-level config sprawl. Everything lives under ~/.agentctl/. Policy is yours, traces are yours.

The workflow I keep coming back to: write a permissive policy, let the agent run for a week, then tighten the rules and replay the old sessions to see what would have been blocked. Much better than guessing at policy upfront, and it’s the part of the tool I didn’t expect to use as much as I do.

Every gated decision gets written to jsonl, so you can grep, diff, or feed traces back through a stricter policy without re-running the agent. There’s also a TUI for browsing sessions, inspecting individual gate decisions, and stepping through replays interactively, which makes it easier to spot patterns across runs.
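
A rough Python sketch of the replay idea, assuming one JSON object per line. The field names (`action`, `detail`, `allowed`) and the policy rules are invented for illustration, not agentctl's actual trace schema or policy format.

```python
import json
from typing import Callable, Iterable

Policy = Callable[[dict], bool]  # returns True when the action is allowed


def replay(trace_lines: Iterable[str], policy: Policy) -> list[dict]:
    """Return recorded decisions that were allowed but the given policy would block."""
    newly_blocked = []
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("allowed") and not policy(event):
            newly_blocked.append(event)
    return newly_blocked


def strict_policy(event: dict) -> bool:
    """Hypothetical tightened rules: no secret access, no curl from the shell."""
    if event["action"] == "secret_access":
        return False
    if event["action"] == "shell_exec" and event["detail"].startswith("curl "):
        return False
    return True


# Hypothetical trace recorded under a permissive policy.
trace = """\
{"session": "s1", "action": "package_install", "detail": "npm install left-pad", "allowed": true}
{"session": "s1", "action": "shell_exec", "detail": "curl https://example.com/x.sh", "allowed": true}
{"session": "s1", "action": "secret_access", "detail": "AWS_SECRET_ACCESS_KEY", "allowed": true}
{"session": "s1", "action": "file_write", "detail": "README.md", "allowed": true}
"""
blocked = replay(trace.splitlines(), strict_policy)
blocked_actions = [event["action"] for event in blocked]
```

Because the replay only reads recorded decisions, the agent never has to run again to see what a stricter policy would have stopped.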

Currently works with Claude Code and MCP-based clients like Codex.

Still a WIP and mostly a project for myself, but figured others experimenting with coding agents might find it interesting.

GitHub: https://github.com/chocks/agentctl