Daily Show HN


Show HN for April 25, 2026

35 posts
258

A Karpathy-style LLM wiki your agents maintain (Markdown and Git) #

github.com
115 comments · 8:53 AM · View on HN
I shipped a wiki layer for AI agents that uses markdown + git as the source of truth, with a bleve (BM25) + SQLite index on top. No vector or graph db yet.

It runs locally in ~/.wuphf/wiki/ and you can git clone it out if you want to take your knowledge with you.

The shape is the one Karpathy has been circling for a while: an LLM-native knowledge substrate that agents both read from and write into, so context compounds across sessions rather than getting re-pasted every morning. Most implementations of that idea land on Postgres, pgvector, Neo4j, Kafka, and a dashboard.

I wanted to go back to the basics and see how far markdown + git could go before I added anything heavier.

What it does: -> Each agent gets a private notebook at agents/{slug}/notebook/.md, plus access to a shared team wiki at team/.

-> Draft-to-wiki promotion flow. Notebook entries are reviewed (agent or human) and promoted to the canonical wiki with a back-link. A small state machine drives expiry and auto-archive.

-> Per-entity fact log: append-only JSONL at team/entities/{kind}-{slug}.facts.jsonl. A synthesis worker rebuilds the entity brief every N facts. Commits land under a distinct "Pam the Archivist" git identity so provenance is visible in git log.

-> [[Wikilinks]] with broken-link detection rendered in red.

-> Daily lint cron for contradictions, stale entries, and broken wikilinks.

-> /lookup slash command plus an MCP tool for cited retrieval. A heuristic classifier routes short lookups to BM25 and narrative queries to a cited-answer loop.
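The routing heuristic above can be sketched roughly like this (my own illustration of the idea; the threshold and function names are invented, not WUPHF's actual code):

```python
def route_query(query: str) -> str:
    """Illustrative router: short keyword-style lookups go straight to
    BM25; longer narrative questions go to the cited-answer loop.
    The token threshold and question cues are made up for illustration."""
    tokens = query.split()
    looks_like_question = (
        query.rstrip().endswith("?")
        or tokens[:1] in (["how"], ["why"], ["what"])
    )
    if len(tokens) <= 4 and not looks_like_question:
        return "bm25"
    return "cited_answer_loop"
```

A keyword lookup like "pam archivist identity" would route to BM25, while "why did we switch billing providers last quarter?" would go to the cited-answer loop.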

Substrate choices: Markdown for durability. The wiki outlives the runtime, and a user can walk away with every byte. Bleve for BM25. SQLite for structured metadata (facts, entities, edges, redirects, and supersedes). No vectors yet. The current benchmark (500 artifacts, 50 queries) clears 85% recall@20 on BM25 alone, which is the internal ship gate. sqlite-vec is the pre-committed fallback if a query class drops below that.

Canonical IDs are first-class. Fact IDs are deterministic and include sentence offset. Canonical slugs are assigned once, merged via redirect stubs, and never renamed. A rebuild is logically identical, not byte-identical.
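A deterministic fact ID of this kind can be sketched as a content hash over stable fields (the exact fields and encoding here are assumptions for illustration, not the project's actual scheme):

```python
import hashlib

def fact_id(kind: str, slug: str, source_doc: str, sentence_offset: int) -> str:
    """Sketch of a deterministic fact ID: the same inputs always yield
    the same ID, so a rebuild is logically identical even if it is not
    byte-identical. Fields and truncation length are illustrative."""
    key = f"{kind}:{slug}:{source_doc}:{sentence_offset}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]
```

Because the ID includes the sentence offset, the same assertion restated elsewhere in a document gets a distinct ID.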

Known limits: -> Recall tuning is ongoing. 85% on the benchmark is not a universal guarantee.

-> Synthesis quality is bounded by agent observation quality. Garbage facts in, garbage briefs out. The lint pass helps. It is not a judgment engine.

-> Single-office scope today. No cross-office federation.

Demo. 5-minute terminal walkthrough that records five facts, fires synthesis, shells out to the user's LLM CLI, and commits the result under Pam's identity: https://asciinema.org/a/vUvjJsB5vtUQQ4Eb

Script lives at ./scripts/demo-entity-synthesis.sh.

Context. The wiki ships as part of WUPHF, an open source collaborative office for AI agents like Claude Code, Codex, OpenClaw, and local LLMs via OpenCode. MIT, self-hosted, bring-your-own keys. You do not have to use the full office to use the wiki layer. If you already have an agent setup, point WUPHF at it and the wiki attaches.

Source: https://github.com/nex-crm/wuphf

Install: npx wuphf@latest

Happy to go deep on the substrate tradeoffs, the promotion-flow state machine, the BM25-first retrieval bet, or the canonical-ID stability rules. Also happy to take "why not an Obsidian vault with a plugin" as a fair question.

18

VT Code – Rust TUI coding agent with multi-provider support #

github.com
2 comments · 3:17 AM · View on HN
Hi HN, I built VT Code, a semantic coding agent. It supports SOTA and open-source models: Anthropic, OpenAI, Gemini, Codex. It is Agent Skills, Model Context Protocol, and Agent Client Protocol (ACP) ready, and open-source models can run locally via LM Studio and Ollama (experimental). Semantic context understanding comes from ast-grep for structured code search and ripgrep for fast text search.

I built VT Code in Rust on Ratatui. Architecture and agent loop documented in the README and DeepWiki.

Repo: https://github.com/vinhnx/VTCode

DeepWiki: https://deepwiki.com/vinhnx/VTCode

Happy to answer questions!

I believe coding harnesses should be open, and everyone should have a choice of their preferred way to work in this agentic engineering era.

12

LLMs consume 5.4x less mobile energy than ad-supported web search #

dupr.at
1 comment · 8:44 AM · View on HN
The standard AI energy debate compares server-side LLM inference to a server-side Google query. I think this misses most of what actually happens on a mobile device during a real search session.

I built a parametric model of the full end-to-end mobile search session: 4G/5G radio energy, SoC rendering cost for a 2.5MB page, programmatic advertising RTB auctions running in the background, and network transmission costs for both sides. Then compared it to an equivalent LLM session.

Main finding across 10,000 Monte Carlo draws: on mobile, a standard LLM session uses on average 5.4x less energy than a classic ad-supported web search session. Programmatic advertising alone accounts for up to 41% of device battery drain per session.

Caveats I tried to be explicit about:

- Advantage disappears on fixed Wi-Fi/fiber

- Reverses for reasoning models

- Parametric model, not empirical device measurement. Greenspector has offered to run terminal measurements for v2

- Jevons paradox applies

SSRN working paper, not peer-reviewed. Methodology and Monte Carlo distributions fully documented in the paper. Happy to defend the assumptions.
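The Monte Carlo method itself is simple to illustrate (the distributions below are placeholders I made up, not the paper's fitted parameters; only the shape of the procedure matches the description above):

```python
import random

def simulate(draws: int = 10_000, seed: int = 42) -> float:
    """Toy Monte Carlo comparison of per-session energy (mWh) on mobile:
    draw both session types many times and average the ratio. The
    lognormal parameters are invented for illustration only."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(draws):
        # Ad-supported search session: radio + page render + RTB auctions.
        search = rng.lognormvariate(3.0, 0.5)
        # LLM session: mostly a small text payload over the radio.
        llm = rng.lognormvariate(1.3, 0.4)
        ratios.append(search / llm)
    return sum(ratios) / len(ratios)
```

With real fitted distributions per component (radio, SoC, RTB, network), the same loop yields the headline mean ratio.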

DOI: 10.2139/ssrn.6287918

11

Agent MCP Studio – build multi-agent MCP systems in a browser tab #

agentmcp.studio
6 comments · 7:06 AM · View on HN
I built a browser-only studio for designing and orchestrating MCP agent systems for development and experimental purposes. The whole stack — tool authoring, multi-agent orchestration, RAG, code execution — runs from a single static HTML file via WebAssembly. No backend.

The bet: WASM is a hard sandbox for free. When you generate tools with an LLM (or write them by hand), the studio AST-validates the source, registers it lazily, and JIT-compiles into Pyodide on first call. SQL tools run in DuckDB-WASM in a Web Worker. The built-in RAG uses Xenova/all-MiniLM-L6-v2 via Transformers.js for on-device embeddings. Nothing leaves the browser; close the tab and the stack is gone. The WASM boundary is what makes it safe to execute LLM-generated code locally — no Docker, no per-tenant container, no server.
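A minimal sketch of that AST-validation step (illustrative only; the allowlist and checks here are my assumptions, and the studio's real rules are surely richer):

```python
import ast

ALLOWED_IMPORTS = {"json", "math", "re"}  # illustrative allowlist

def validate_tool_source(source: str) -> bool:
    """Sketch of AST-validating LLM-generated tool source before lazy
    registration: reject code that fails to parse or that imports
    modules outside an allowlist."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(a.name.split(".")[0] not in ALLOWED_IMPORTS for a in node.names):
                return False
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
                return False
    return True
```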

Above the tool layer sits an agentic system with 10 orchestration strategies:

- Supervisor (router → 1 expert)

- Mixture of Experts (parallel + synthesizer)

- Sequential Pipeline

- Plan & Execute (planner decomposes, workers execute)

- Swarm (peer handoffs)

- Debate (contestants + judge)

- Reflection (actor + critic loop)

- Hierarchical (manager delegates via ask_<persona> tools)

- Round-Robin (panel + moderator)

- Map-Reduce (splitter → parallel → aggregator)

You build a team visually: drag tool chips onto persona nodes on a service graph, pick a strategy, and the topology reshapes to match. Each persona auto-registers as an MCP tool (ask_<name>), plus an agent_chat(query, strategy?) meta tool. A bundled Node bridge speaks stdio to Claude Desktop and WebSocket to your tab — your browser becomes an MCP server.

When you're done, Export gives you a real Python MCP server: server.py, agentic.py, tools/*.py, Dockerfile, requirements.txt, .env.example. The exported agentic.py is a faithful Python port of the same orchestration logic running in the browser, so the deployable artifact behaves identically to the prototype.

Also shipped: Project Packs. Export the whole project as a single .agentpack.json. Auto-detects required external services (OpenAI, GitHub, Stripe, Anthropic, Slack, Notion, Linear, etc.) by scanning tool source for os.environ.get(...) and cross-referencing against the network allowlist. Recipients get an import wizard that prompts for credentials. Manifests are reviewable, sharable, and never carry secrets.
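The os.environ.get scan can be sketched like this (the helper name and the exact matching are illustrative assumptions; the real scanner likely covers more access patterns):

```python
import ast

def required_env_vars(source: str) -> set:
    """Sketch of detecting required credentials by walking the AST of a
    tool's source for os.environ.get("...") calls with a literal key."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "get"
                and isinstance(node.func.value, ast.Attribute)
                and node.func.value.attr == "environ"
                and node.args
                and isinstance(node.args[0], ast.Constant)):
            found.add(node.args[0].value)
    return found
```

Cross-referencing the keys found against a known-service table (OPENAI_API_KEY → OpenAI, and so on) yields the import wizard's prompt list.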

Some things I'm honestly uncertain about:

- 10 strategies might be too many. My guess is most users only need Supervisor, Mixture of Experts, and Debate. Open to data on which ones actually pull weight.

- Browser cold-starts (Pyodide warm-up on first load) are a real UX hit despite aggressive caching.

- bridge.js is the only non-browser piece. A hosted variant is the obvious next step.

Built with Pyodide, DuckDB-WASM, Transformers.js, OpenAI Chat Completions (or a local Qwen 1.5 0.5B running in-browser via Transformers for fully offline mode). ~5K lines of HTML/CSS/JS in one file.

https://www.agentmcp.studio

Genuinely curious whether running this much LLM-generated code in a browser tab feels reasonable to you, or quietly terrifying.

8

WhiskeySour – A 10x faster drop-in replacement for BeautifulSoup #

1 comment · 2:23 PM · View on HN
The Problem

I’ve been using BeautifulSoup for some time. It’s the standard for ease-of-use in Python scraping, but it almost always becomes the performance bottleneck when processing large-scale datasets.

Parsing complex or massive HTML trees in Python typically suffers from high memory allocation costs and the overhead of the Python object model during tree traversal. In my production scraping workloads, the parser was consuming more CPU cycles than the network I/O. lxml is fast, but it also uses a lot of memory on large documents and can cause trouble with malformed HTML.

The Solution

I wanted to keep the API compatibility that makes BS4 great but eliminate the overhead that slows down high-volume pipelines. That’s why I built WhiskeySour, on top of html5ever. And yes… I *vibe coded the whole thing*.

WhiskeySour is a drop-in replacement. You should be able to swap "from bs4 import BeautifulSoup" for "from whiskeysour import WhiskeySour" and see immediate speedups. Workflows that used to take more than 30 minutes might take less than 5 now.

I have shared the detailed architecture of the library here: https://the-pro.github.io/whiskeySour/architecture/

Here is the benchmark report against bs4 with html.parser: https://the-pro.github.io/whiskeySour/bench-report/

Here is the link to the repo: https://github.com/the-pro/WhiskeySour

Why I’m sharing this

I’m looking for feedback from the community on two fronts:

1. Edge cases: If you have particularly messy or malformed HTML that BS4 handles well, I’d love to know if WhiskeySour encounters any regressions.

2. Benchmarks: If you are running high-volume parsers, I’d appreciate it if you could run a test on your own datasets and share the results.

6

Odozi – open-source iOS journaling app #

odozi.app
0 comments · 3:52 PM · View on HN
Yeah I know I hate the name too but I wasn't about to pay up for odyssey.app.

It's an open source project, so feel free to poke around with it / fork it. I talk about it more on the marketing website, but a few of us have been using it for the past month and it's kind of fun. Obviously there will be a slew of issues / feedback / nits that come from this, but c'est la vie. GH is here: https://github.com/johnlarkin1/odozi

6

Good AI Task – a tool for asking AI what it can and can't do #

goodaitask.com
4 comments · 8:07 PM · View on HN
Describe a task, and AI will give you a breakdown of whether it can do your task well, poorly, or somewhere in between. I built it mostly because I kept getting asked "what is AI even good for" and fumbling the answer.

The most fun use is testing it on things you already know it can't do and seeing how it explains why it can't be done.

4

Quay – Menu-bar Git sync #

code.sailorslog.io
0 comments · 6:23 PM · View on HN
I write Astro blog posts in a text editor; when I'm done I want them pushed to GitHub so Cloudflare deploys the site. To make it comfortable, I built Quay for the menu bar. Also useful for Obsidian vault syncing.

Point it at a folder, connect a GitHub repo, and it stages/commits/pushes/pulls. Multiple repos, editable commit messages, branch switching, merges with conflict detection. Shows open issue and PR counts per repo. But it's not a full Git client (no diffs, blame, cherry-pick, or rebase), and it doesn't create remote repos.

Native macOS app (Swift/SwiftUI). Wraps the local git binary (prompts to install Xcode Command Line Tools if missing). No custom Git implementation. Sandboxed, no telemetry, GitHub-only.

macOS. 7-day trial, €9 one-time on the App Store.

3

Werner – A native Markdown editor for macOS with four view modes #

apps.apple.com
5 comments · 8:08 AM · View on HN
I built Werner because I wanted a Markdown editor that didn't make me choose between plain files I own and a writing experience that actually feels good. Most apps pick one side: either a glorified textarea over .md files, or a "nice" editor that swallows my notes into a proprietary database.

Werner stores plain .md files in folders you choose, but offers four ways to work with them, switchable per-document:

Source — CodeMirror 6 with markdown syntax highlighting

Split — source + live preview

Preview — rendered output only

Interactive — a real WYSIWYG (TipTap 3) that round-trips to clean GFM

The interactive mode was the hardest part. TipTap's default Markdown serializer mangles GFM features I care about — table column alignment (:---:), task lists, strikethrough, hard breaks (two-space convention). So I wrote a custom ProseMirror→Markdown serializer that handles all of them and survives round-trips without diff churn.
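The alignment part of that serialization can be sketched as follows (in Python for illustration, though Werner's actual serializer is ProseMirror/TypeScript; the function name is mine):

```python
def alignment_row(aligns):
    """Sketch of emitting a GFM table delimiter row that preserves
    per-column alignment (the :---: convention a naive serializer
    drops). `aligns` is a list of "left", "center", "right", or None."""
    markers = {
        "left": ":---",
        "center": ":---:",
        "right": "---:",
        None: "---",
    }
    return "| " + " | ".join(markers[a] for a in aligns) + " |"
```

Round-trip stability then means parsing this row back yields the same alignment list, so re-saving an untouched table produces no diff churn.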

A few other things that might be of interest:

Unified search across all four modes. One query, three adapters (CM6 search API, a custom ProseMirror decoration plugin, DOM <mark> injection for the preview). Query persists when you switch modes.

Stack: Tauri (Rust backend), React + TypeScript, Tailwind v4 with an OKLCH-based token system, shadcn/ui. Markdown rendering happens in Rust via IPC.

Mac App Store sandbox done properly. Security-scoped bookmarks for "open this folder and remember it," with a reconnect flow when bookmarks fail (e.g., external drive unmounted).

No lock-in. Files stay where you put them. Delete the app and your notes are still just .md files in folders.

It's free on the Mac App Store: https://apps.apple.com/ua/app/werner/id6758157982?mt=12

Things I know I haven't solved yet: no plugins, no realtime collaboration, no vim mode, no Windows/Linux builds.

Happy to dig into any of it — the GFM serializer, the search architecture, the sandbox bookmarks, the Tauri tradeoffs vs. Electron, or anything else

3

Xtrace – Unix-Style macOS Profiling for Instruments (CPU/GPU/Memory) #

github.com
0 comments · 9:28 AM · View on HN
I built this because I wanted a terminal-first profiling workflow on macOS, without constantly switching into the Instruments GUI.

What it does today:

- CPU analysis: summary, timeline, calltree, collapsed stacks, flamegraph, diff

- GPU analysis (trace-gpu.py): active/idle ratios, command-buffer cadence, process ownership

- Memory analysis (trace-memory.py): summary, leaks, growth, heap/regions

- Recording modes: launch, attach by PID/name, wait-for spawn, system-wide

- Root-aware attach flow for protected/root-owned processes

- JSON output for automation/LLM workflows (I use it in autoresearch loops: hypothesis → experiment → keep/discard)

Other notes:

- macOS only (Instruments/xctrace)

- core analysis is Python stdlib (no pip deps)

- optional tools: speedscope + inferno for best visualization

Would love feedback on workflow, rough edges, and what integrations would make this more useful.

3

SVG Fitter – Rust+WASM Vectorizer #

svg.axk.sh
0 comments · 4:51 PM · View on HN
I went a little crazy with a tool that helps me trace raster images. Thought others might like it.

It doesn't auto-vectorize the image; instead it allows for a guided process. The final SVG should still be edited.

A few fun features: genetic-algorithm fit optimization, semi-manual tracing, and color preservation.
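A genetic-algorithm fit of that flavor can be illustrated with a toy version (purely a sketch of the technique on a made-up objective; the tool presumably scores candidate curves against the raster image instead):

```python
import random

def ga_fit(loss, init, generations=200, pop=30, sigma=0.1, seed=0):
    """Toy genetic-algorithm fit: keep the fittest third of the
    population each generation, refill with Gaussian mutations of the
    survivors, and return the best parameter vector found."""
    rng = random.Random(seed)
    population = [init[:] for _ in range(pop)]
    for _ in range(generations):
        elite = sorted(population, key=loss)[: pop // 3]  # selection
        population = elite + [
            [g + rng.gauss(0, sigma) for g in rng.choice(elite)]  # mutation
            for _ in range(pop - len(elite))
        ]
    return min(population, key=loss)

# Example: recover the target parameters (2.0, -1.0) of a toy objective.
best = ga_fit(lambda p: (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2, [0.0, 0.0])
```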

Perfect if you want a lightweight SVG from a huge PNG.

Note: If there's interest I might open-source it, just not sure if anyone would want to see it :)

1

Auto lock your Mac via phone proximity (No Apple Watch) #

0 comments · 7:39 PM · View on HN
Hi HN,

I built Blunlock to solve the "clamshell mode" problem. Since I use my MacBook with the lid closed, Touch ID is useless. I wanted a way to lock/unlock via proximity without a $300 Apple Watch or a battery-draining phone app.

How it works: It pings your phone’s native Bluetooth LE signal and measures RSSI to determine proximity. Walk away -> Mac locks. Walk back -> it wakes up.
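The proximity logic described above suggests a simple hysteresis state machine, sketched here with invented RSSI thresholds (not Blunlock's tuned values): two separate cutoffs keep signal jitter near a single threshold from making the Mac flap between locked and unlocked.

```python
def make_proximity_monitor(lock_below=-75, unlock_above=-60):
    """Sketch of RSSI-driven lock/unlock with hysteresis. Thresholds
    are in dBm and purely illustrative. Returns a callback that takes
    an RSSI reading and emits "lock", "unlock", or "noop"."""
    state = {"locked": False}

    def on_rssi(rssi: int) -> str:
        if not state["locked"] and rssi < lock_below:
            state["locked"] = True
            return "lock"
        if state["locked"] and rssi > unlock_above:
            state["locked"] = False
            return "unlock"
        return "noop"

    return on_rssi
```

Readings inside the dead band (between the two thresholds) change nothing, which is the point.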

The Tech: Built with Electron to get it out the door quickly. It is currently an unsigned .dmg, so Gatekeeper will flag it (right-click -> Open to bypass). I'm running a free beta to make sure the proximity logic holds up across environments (M-series vs Intel, different BT interference levels) before I buy the $99 Apple Developer Cert.

You can find the site and get early access at: blunlock.com

I’d love any feedback on handling BT MAC address randomization or optimizing the background scanning! If you want to help me break it, you can grab the beta from the website.