Show HN for March 26, 2026 (47 posts)
Turbolite – a SQLite VFS serving sub-250ms cold JOIN queries from S3 #
It’s called turbolite. It is experimental, buggy, and may corrupt data. I would not trust it with anything important yet.
I wanted to explore whether object storage has gotten fast enough to support embedded databases over cloud storage. Filesystems reward tiny random reads and in-place mutation. S3 rewards fewer requests, bigger transfers, immutable objects, and aggressively parallel operations where bandwidth is often the real constraint. This was explicitly inspired by turbopuffer’s ground-up S3-native design. https://turbopuffer.com/blog/turbopuffer
The use case I had in mind is lots of mostly-cold SQLite databases (database-per-tenant, database-per-session, or database-per-user architectures) where keeping a separate attached volume for each inactive database feels wasteful. turbolite assumes a single write source and is aimed much more at “many databases with bursty cold reads” than “one hot database.”
Instead of doing naive page-at-a-time reads from a raw SQLite file, turbolite introspects SQLite B-trees, stores related pages together in compressed page groups, and keeps a manifest that is the source of truth for where every page lives. On a cache miss, it uses seekable zstd frames and S3 range GETs, so fetching one needed page does not require downloading an entire object.
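To make the read path concrete, here is a minimal sketch of a manifest-driven fetch, assuming a manifest entry records the object key plus the byte offset and length of a compressed page group (illustrative shapes, not turbolite's actual API):

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

// Hypothetical manifest entry: where one compressed page group lives.
interface PageGroupLocation {
  key: string;    // S3 object holding the group
  offset: number; // byte offset of the seekable zstd frame
  length: number; // frame length in bytes
}

const s3 = new S3Client({});

// Fetch just the frame containing the needed page with a range GET,
// instead of downloading the whole object.
async function fetchPageGroup(bucket: string, loc: PageGroupLocation): Promise<Uint8Array> {
  const res = await s3.send(new GetObjectCommand({
    Bucket: bucket,
    Key: loc.key,
    Range: `bytes=${loc.offset}-${loc.offset + loc.length - 1}`,
  }));
  // Decompressing this seekable zstd frame then yields the page group.
  return new Uint8Array(await res.Body!.transformToByteArray());
}
```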
At query time, turbolite can also pass storage operations from the query plan down to the VFS to frontrun downloads for indexes and large scans in the order they will be accessed.
You can tune how aggressively turbolite prefetches. For point queries and small joins, it can stay conservative and avoid prefetching whole tables. For scans, it can get much more aggressive.
It also groups pages by page type in S3. Interior B-tree pages are bundled separately and loaded eagerly. Index pages prefetch aggressively. Data pages are stored by table. The goal is to make cold point queries and joins decent, while making scans less awful than naive remote paging would.
On a 1M-row / 1.5GB benchmark on EC2 + S3 Express, I’m seeing results like sub-100ms cold point lookups, sub-200ms cold 5-join profile queries, and sub-600ms scans from an empty cache. It’s somewhat slower on standard S3/Tigris.
Current limitations are pretty straightforward: it’s single-writer only, and it is still very much a systems experiment rather than production infrastructure.
I’d love feedback from people who’ve worked on SQLite-over-network, storage engines, VFSes, or object-storage-backed databases. I’m especially interested in whether the B-tree-aware grouping / manifest / seekable-range-GET direction feels like the right one to keep pushing.
Veil – Dark mode PDFs without destroying images, runs in the browser #
With that premise, I'd like to share the choices that made all of this possible. To do so, I'll walk through the three layers that veil creates from the original PDF:
- Layer 1: CSS filter. I use invert(0.86) hue-rotate(180deg) on the main canvas. I use 0.86 instead of 1.0 because I found that full inversion produces a pure black and pure white that are too aggressive for prolonged reading. 0.86 yields a soft dark grey (around #242424, though it depends on the document's white) and a muted white (around #DBDBDB) for the text, which I found to be the most comfortable value for hours of reading.
- Layer 2: image protection. A second canvas is positioned on top of the first, this time with no filters. Through PDF.js's public API getOperatorList(), I walk the PDF's operator list and reconstruct the CTM stack, that is, the save, restore, and transform operations the PDF uses to position every object on the page. When I encounter a paintImageXObject (opcode 85 in PDF.js v5), the current transformation matrix gives me the exact bounds of the image. At that point I copy those pixels from a clean render onto the overlay. I didn't fork PDF.js because it would have become a maintenance nightmare given the length of the codebase and the frequent updates. Images also receive OCR treatment: text contained in charts and images becomes selectable, just like any other text on the page. At this point we have the text inverted and the images intact. But what if the page is already dark? Maybe the chapter title pages are black with white text? The next layer takes care of that.
- Layer 3: already-dark page detection. After rendering, the background brightness is measured by sampling the edges and corners of the page (where you're most likely to find pure background, without text or images in the way). The BT.601 formula is used to calculate perceived brightness by weighting the three color channels as the human eye sees them: green at 58.7%, red at 29.9%, blue at 11.4%. These weights reflect biology: the eye evolved in natural environments where distinguishing shades of green (vegetation, predators in the grass) was a matter of survival, while blue (sky, water) was less critical. If the average luminance falls below 40%, the page is flagged as already dark and the inversion is skipped, returning the original page. Presentation slides with dark backgrounds stay exactly as they are, instead of being inverted into something blinding.
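As a rough sketch of layer 3 (the idea, not veil's actual code): sample small patches from the page edges and corners of a clean render, compute BT.601 luminance per pixel, and skip inversion below the 40% threshold:

```typescript
// Illustrative: estimate background brightness from page edges/corners
// using BT.601 luminance, and leave the page as-is if it is already dark.
function isAlreadyDark(ctx: CanvasRenderingContext2D, w: number, h: number): boolean {
  const s = 10; // sample patch size in pixels
  const patches: [number, number][] = [
    [0, 0], [w - s, 0], [0, h - s], [w - s, h - s],     // corners
    [Math.floor(w / 2), 0], [Math.floor(w / 2), h - s], // edge midpoints
  ];
  let total = 0, count = 0;
  for (const [x, y] of patches) {
    const { data } = ctx.getImageData(x, y, s, s);
    for (let i = 0; i < data.length; i += 4) {
      // Perceived brightness (BT.601): 0.299 R + 0.587 G + 0.114 B
      total += 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
      count++;
    }
  }
  return total / (count * 255) < 0.4; // below 40% luminance: skip inversion
}
```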
Scanned documents are detected automatically and receive OCR via Tesseract.js, making text selectable and copyable even on PDFs that are essentially images. Everything runs locally, no framework was used, just vanilla JS, which is why it's an installable PWA that works offline too.
Here's the link to the app along with the repository: https://veil.simoneamico.com | https://github.com/simoneamico-ux-dev/veil
I hope veil can make your reading more pleasant. I'm open to any feedback. Thanks, everyone!
Fio: 3D World editor/game engine – inspired by Radiant and Hammer #
- Compact and lightweight (target: Snapdragon 8CX, OpenGL 3.3)
- Real-time lighting with stencil shadows, without the need for pre-baked compilation
Robust LLM Extractor for Websites in TypeScript #
LLMs seemed like the obvious fix — just throw the HTML at GPT and ask for JSON. Except in practice, it's more painful than that:
- Raw HTML is full of nav bars, footers, and tracking junk that eats your token budget. A typical product page is 80% noise.
- LLMs return malformed JSON more often than you'd expect, especially with nested arrays and complex schemas. One bad bracket and your pipeline crashes.
- Relative URLs, markdown-escaped links, tracking parameters — the "small" URL issues compound fast when you're processing thousands of pages.
- You end up writing the same boilerplate: HTML cleanup → markdown conversion → LLM call → JSON parsing → error recovery → schema validation. Over and over.
We got tired of rebuilding this stack for every project, so we extracted it into a library.
Lightfeed Extractor is a TypeScript library that handles the full pipeline from raw HTML to validated, structured data:
- Converts HTML to LLM-ready markdown with main content extraction (strips nav, headers, footers), optional image inclusion, and URL cleaning
- Works with any LangChain-compatible LLM (OpenAI, Gemini, Claude, Ollama, etc.)
- Uses Zod schemas for type-safe extraction with real validation
- Recovers partial data from malformed LLM output instead of failing entirely — if 19 out of 20 products parsed correctly, you get those 19
- Built-in browser automation via Playwright (local, serverless, or remote) with anti-bot patches
- Pairs with our browser agent (@lightfeed/browser-agent) for AI-driven page navigation before extraction
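For a feel of the pipeline's shape, here is a hypothetical usage sketch; the `extract` entry point and its options below are assumptions, so check the README for the real API surface:

```typescript
import { z } from "zod";
// Hypothetical import — see the repo README for the actual entry point:
// import { extract } from "@lightfeed/extractor";

// Zod schema the extraction is validated against; per the feature list,
// malformed items are recovered individually rather than failing the batch.
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  url: z.string().url(),
});

// Assumed call shape (illustrative only):
// const { data } = await extract({
//   content: html,                  // raw HTML, cleaned to markdown internally
//   schema: z.array(ProductSchema), // type-safe, validated output
//   llm: { provider: "openai", model: "gpt-4o-mini" },
// });
```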
We use this ourselves in production at Lightfeed, and it's been solid enough that we decided to open-source it.
GitHub: https://github.com/lightfeed/extractor
npm: `npm install @lightfeed/extractor`
Apache 2.0 licensed.
Happy to answer questions or hear feedback.
Claude skill that evaluates B2B vendors by talking to their AI agents #
- Researches your company automatically -- industry, size, stack -- so you don't fill out a form
- Asks 2-4 category-specific questions before evaluating anything. Not generic. For a CS platform evaluation it might ask "is your team high-touch or low-touch? Most CS platforms are built for one and barely work for the other." These surface requirements buyers didn't know they had.
- Tries to find and talk directly to each vendor's AI agent -- a REST API call that checks for a Company Agent, then runs a structured due diligence conversation if one exists
- Asks adversarial questions: "What are your customers' most common complaints?" and "What use cases are you NOT a good fit for?" -- and flags when agents deflect instead of answering
- Cross-references every vendor claim against independent sources (G2, Gartner, press) in a Claims vs. Evidence table
- Produces a scorecard with transparent evidence tracking -- each score shows whether it's backed by vendor-verified evidence or public sources only
The agent-to-agent piece is technically new. When a vendor has an AI agent, Claude (working for the buyer) interrogates it directly, then fact-checks its answers. When vendors have different evidence levels, the skill quantifies what would change if the missing evidence were confirmed -- so it doesn't silently favor vendors that happen to have AI agents.

It works fully for any vendor, with or without an AI agent. Vendors without one get evaluated on public sources with the same scoring framework.

We built this at Salespeak -- we help B2B vendors build AI Company Agents. So yes, there's a connection: when an agent finds a vendor's Company Agent, it uses our Frontdoor API to talk to it. But the skill is genuinely useful without that, and we wanted to be honest about that rather than ship something that only works as a product demo.

MIT licensed. To install, just ask Claude Code: "Install the buyer-eval skill from salespeak-ai on GitHub." Then /buyer-eval to run it. Felt appropriate that installing a skill for AI agents works the same way.

Repo: https://github.com/salespeak-ai/buyer-eval-skill

Happy to answer questions about how the agent-to-agent conversation works technically.
Sup AI, a confidence-weighted ensemble (52.15% on Humanity's Last Exam) #
I started working on this because no single AI model is right all the time, but their errors don’t strongly correlate. In other words, models often make unique mistakes relative to other models. So I run multiple models in parallel and synthesize the outputs by weighting segments based on confidence. Low entropy in the output token probability distributions correlates with accuracy. High entropy is often where hallucinations begin.
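A minimal sketch of the entropy idea, assuming the API returns per-token logprobs (the real weighting is surely more involved, and top-k logprobs only approximate the full distribution):

```typescript
// Shannon entropy of one next-token distribution, from top-k logprobs.
// An approximation: APIs only expose the top few alternatives per token.
function tokenEntropy(topLogprobs: number[]): number {
  return -topLogprobs.reduce((h, lp) => h + Math.exp(lp) * lp, 0);
}

// Score a segment by its average per-token entropy.
// Low entropy (confident model) maps to a weight near 1.
function segmentWeight(perTokenLogprobs: number[][]): number {
  const entropies = perTokenLogprobs.map(tokenEntropy);
  const mean = entropies.reduce((a, b) => a + b, 0) / entropies.length;
  return Math.exp(-mean);
}
```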
My dad Scott (AI Research Scientist at TRI) is my research partner on this. He sends me papers at all hours, we argue about whether they actually apply and what modifications make sense, and then I build and test things. The entropy-weighting approach came out of one of those conversations.
In our eval on Humanity's Last Exam, Sup scored 52.15%. The best individual model in the same evaluation run got 44.74%. The relative gap is statistically significant (p < 0.001).
Methodology, eval code, data, and raw results:
- https://sup.ai/research/hle-white-paper-jan-9-2026
- https://github.com/supaihq/hle
Limitations:
- We evaluated 1,369 of the 2,500 HLE questions (details in the above links)
- Not all APIs expose token logprobs; we use several methods to estimate confidence when they don't
We tried offering free access and it got abused so badly it nearly killed us. Right now the sustainable option is a $5 starter credit with card verification (no auto-charge). If you don't want to sign up, drop a prompt in the comments and I'll run it myself and post the result.
Try it at https://sup.ai. My dad Scott (@scottmu) is in the thread too. Would love blunt feedback, especially where this really works for you and where it falls short.
Here's a short demo video: https://www.youtube.com/watch?v=DRcns0rRhsg
Orloj – agent infrastructure as code (YAML and GitOps) #
We built this because running AI agents in production today looks a lot like running containers before Kubernetes: ad-hoc scripts, no governance, no observability, no standard way to manage the lifecycle of an agent fleet. Everyone we talked to was writing the same messy glue code to wire agents together, and nobody had a good answer for "which agent called which tool, and was it supposed to?"
Orloj treats agents the way infrastructure-as-code treats cloud resources. You write a manifest that declares an agent's model, tools, permissions, and execution limits. You compose agents into directed graphs — pipelines, hierarchies, or swarm loops.
The part we're most excited about is governance. AgentPolicy, AgentRole, and ToolPermission are evaluated inline during execution, before every agent turn and tool call. Instead of prompt instructions that the model might ignore, these policies are a runtime gate. Unauthorized actions fail closed with structured errors and full audit trails. You can set token budgets per run, whitelist models, block specific tools, and scope policies to individual agent systems.
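Conceptually, the gate is a check that runs before every tool call rather than a prompt instruction the model might ignore. A fail-closed sketch with made-up types (not Orloj's actual internals):

```typescript
// Illustrative types; Orloj's real AgentPolicy/ToolPermission resources differ.
interface ToolPermission {
  agent: string;
  allowedTools: string[];
  tokenBudget: number;
}

// Evaluated inline before each tool call: unauthorized actions fail closed
// with a structured error instead of relying on the model to comply.
function authorizeToolCall(p: ToolPermission, tool: string, tokensUsed: number): void {
  if (!p.allowedTools.includes(tool)) {
    throw new Error(`policy violation: ${p.agent} is not permitted to call ${tool}`);
  }
  if (tokensUsed > p.tokenBudget) {
    throw new Error(`policy violation: ${p.agent} exceeded its token budget`);
  }
}
```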
For reliability, we built lease-based task ownership (so crashed workers don't leave orphan tasks), capped exponential retry with jitter, idempotent replay, and dead-letter handling. The scheduler supports cron triggers and webhook-driven task creation.
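For reference, capped exponential backoff with full jitter is the standard shape of that retry policy (the parameters here are arbitrary, not Orloj's defaults):

```typescript
// Delay grows as 2^attempt up to a cap; the random factor spreads
// retries out so crashed workers don't all retry in lockstep.
function retryDelayMs(attempt: number, baseMs = 250, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}
```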
The architecture is a server/worker split. orlojd hosts the API, resource store (in-memory for dev, Postgres for production), and task scheduler. orlojworker instances claim and execute tasks, route model requests through a gateway (OpenAI, Anthropic, Ollama, etc.), and run tools in configurable isolation — direct, sandboxed, container, or WASM. For local development, you can run everything in a single process with orlojd --embedded-worker --storage-backend=memory.
Tool isolation was important to us. A web search tool probably doesn't need sandboxing, but a code execution tool should run in a container with no network, a read-only filesystem, and a memory cap. You configure this per tool based on risk level, and the runtime enforces it.
We also added native MCP support. You register an MCP server (stdio or HTTP), Orloj auto-discovers its tools, and they become first-class resources with governance applied. So you can connect something like the GitHub MCP server and still have policy enforcement over what agents are allowed to do with it.
Three starter blueprints are included (pipeline, hierarchical, swarm-loop).
Docs: https://docs.orloj.dev
We're also building out starter templates for operational workflows where governance really matters. First on the roadmap: 1. Incident response triage, 2. Compliance evidence collector, 3. CVE investigation pipeline, and 4. Secret rotation auditor. We have 20 templates in mind and community contributions are welcome.
We're a small team and this is v0.1.0, so there's a lot still on the roadmap — hosted cloud, compliance packaging, and more. But the full runtime is open source today and we'd love feedback on what we've built so far. What would you use this for? What's missing?
Paseo – Open-source coding agent interface (desktop, mobile, CLI) #
I started working on Paseo last September as a push-to-talk voice interface for Claude Code. I wanted to bounce ideas hands-free while going on walks. After a while I wanted to see what the agent was doing, then I wanted to text it when I couldn't talk, then I wanted to see diffs and run multiple agents. I kept fixing rough edges and adding features, and slowly it became what it is today.
What it does:
- Run multiple providers through the same UI
- Works on macOS, Linux, Windows, iOS, Android, and web
- Manage agents on different machines from the same UI
- E2EE Relay for mobile connectivity
- Local voice chat and dictation (NVIDIA Parakeet + Kokoro + Sherpa ONNX)
- Split panes to work with agents, files and terminals side by side
- Git panel to review diffs and do common actions (commit, push, create PR etc.)
- Git worktree management so agents don't step on each other
- Docker-style CLI to run agents
- No telemetry, no tracking, no login
Paseo does not call inference APIs directly or extract your OAuth tokens. It wraps your first-party agent CLIs and runs them exactly as you would in your terminal. Your sessions, your system prompts, your tools, nothing is intercepted or modified.
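The wrapping idea in miniature (a sketch, not Paseo's code): spawn the first-party CLI as a child process and relay its I/O to whichever client is attached:

```typescript
import { spawn } from "node:child_process";

// Run an agent CLI exactly as the user would in a terminal; nothing is
// intercepted or modified, only relayed to the connected client.
function runAgentCli(command: string, args: string[]) {
  const child = spawn(command, args, { stdio: ["pipe", "pipe", "inherit"] });
  child.stdout.on("data", (chunk: Buffer) => {
    process.stdout.write(chunk); // forward output to the attached UI
  });
  return child; // child.stdin can carry typed or transcribed user input
}
```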
Stack: The daemon is written in TypeScript. The app uses Expo and compiles to both native mobile apps and web. The desktop app is in Electron (I started with Tauri and had to migrate). Sharing the same codebase across different form factors was challenging, but I'd say that with discipline it's doable and the result has been worth it, as most features I build automatically work in all clients. I did have to implement some platform-specific stuff, especially around gestures, audio, and scroll behavior. The relay is built on top of Cloudflare Durable Objects, and so far it's holding up quite well.
I love using the app, but I am even more excited about the possibilities of the CLI: it becomes a primitive for more advanced agent orchestration, it has much better ergonomics than existing harnesses, and I'm already using it to experiment with loops and agent teams, although it's still new territory.
How Paseo compares to similar apps: Anthropic and OpenAI already do some of what Paseo does (Claude Code Remote Control, the Codex app, etc.), but with mixed quality, and you're locked into their models. Most other alternatives I've found are either closed source or not flexible enough for my needs.
The license is AGPL-3.0. The desktop app ships with a daemon so that's all you need. But you can also `npm install -g @getpaseo/cli` for headless mode and connect via any client.
I mainly use Mac, so Linux and Windows have mostly been tested by a small group of early adopters. If you run into issues, I’d appreciate bug reports on GitHub!
Repo: https://github.com/getpaseo/paseo
Homepage: https://paseo.sh/
Discord: https://discord.gg/jz8T2uahpH
Happy to answer questions about the product, architecture or whatever else!
---
I resubmitted this post because I forgot to add the URL and it didn't allow me to add it later.
Full graphical desktop running on a 128MB VPS Alpine+XRDP+WindowMaker #
Agent Skill Harbor – a GitHub-native skill platform for teams #
What still feels missing is the middle layer: how teams and organizations share AI agent skills, track provenance, and keep them safe to use.
Agent Skill Harbor is an OSS skill management platform for that layer. It is GitHub-native, DB-less, and serverless by design, because skills are mostly text artifacts that already fit naturally in Git.
It collects skills from GitHub repos, tracks provenance, supports governance and safety checks, and publishes a static catalog site with GitHub Actions and GitHub Pages.
Repo: https://github.com/skill-mill/agent-skill-harbor
Demo: https://skill-mill.github.io/agent-skill-harbor-demo/
Micro – apps without ads, algorithms or tracking #
There's so much more I could say, but essentially Micro and the app Mu (https://mu.xyz) are the culmination of many years of work. And it's all open source: https://github.com/micro/mu. Please provide feedback. Please tear it apart. Please poke holes in my theory. I'm not looking for hype. I really do just think we need an alternative to big tech.
Breathe-Memory – Associative memory injection for LLMs (not RAG) #
Breathe-memory takes a different approach: associative injection. Before each LLM call, it extracts anchors from the user's message (entities, temporal references, emotional signals), traverses a concept graph via BFS, runs optional vector search, and injects only what's relevant — typically in <60ms.
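A minimal sketch of the anchor-seeded BFS step (the project itself is Python; this TypeScript sketch just shows the idea, with assumed shapes):

```typescript
// Concept graph as an adjacency map: concept -> related concepts.
type ConceptGraph = Map<string, string[]>;

// Expand the anchors extracted from the user's message by BFS, up to a
// depth limit; the result is the candidate set to inject, before any
// relevance filtering or optional vector search.
function expandAnchors(graph: ConceptGraph, anchors: string[], maxDepth = 2): Set<string> {
  const seen = new Set(anchors);
  let frontier = anchors;
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of graph.get(node) ?? []) {
        if (!seen.has(neighbor)) {
          seen.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return seen;
}
```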
When context fills up, instead of summarizing, it extracts a structured graph: topics, decisions, open questions, artifacts. This preserves the semantic structure that summaries destroy.
The whole thing is ~1500 lines of Python, interface-based, zero mandatory deps. Plug in any database, any LLM, any vector store. Reference implementation uses PostgreSQL + pgvector.
https://github.com/tkenaz/breathe-memory
We've been running this in production for several months. Open-sourcing because we think the approach (injection over retrieval) is underexplored and worth more attention.
We've also posted an article about memory injections in a more human-readable form, if you want to see the thinking under the hood: https://medium.com/towards-artificial-intelligence/beyond-ra...
Vizier – A physical design advisor for DuckDB #
I've made an early version of a physical design advisor (called Vizier) for DuckDB. It can analyze a collection of queries (using a set of heuristics) and recommend changes to the physical design/layout of the database (for example, sort orders, Parquet layouts, indexes, etc.), in order to make those queries run faster.
Vizier is implemented as a DuckDB extension in Zig and supports DuckDB version 1.2.0 and newer. The project is very early-stage, but if you're interested in learning more about Vizier or trying it out, you can check out the links below:
Project's GitHub repo: https://github.com/CogitatorTech/vizier
Vizier documentation: https://cogitatortech.github.io/vizier/
Illustrative – AI pipeline that turns books into graphic novels #
Developed a 7-pass scripting enrichment system (beat analysis, adaptation filtering, character deep dives) before generating any images.
Dual backend: Google Gemini for scripting (2M context window) and either Gemini or OpenAI for image generation with 3-tier model fallback (comparing the performance of both).
It's not great. Would love feedback on the pipeline.
Alexandria, open source news aggregation and classification suite #
It's really simple: Docker Compose, microservices connected to some very basic data pipelines using RabbitMQ, and some local NLP. The NLP portion is a bit slow as it's pure CPU and doesn't leverage a GPU/NPU if you have one.
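One pipeline hop might look like this with amqplib (the queue name and payload shape are assumptions, not Alexandria's actual wiring):

```typescript
import amqp from "amqplib";

// Publish a fetched article into the pipeline for the NLP workers to label.
async function publishArticle(article: { title: string; url: string }) {
  const conn = await amqp.connect("amqp://rabbitmq"); // service name from docker compose
  const ch = await conn.createChannel();
  await ch.assertQueue("articles.raw", { durable: true });
  ch.sendToQueue("articles.raw", Buffer.from(JSON.stringify(article)));
  await ch.close();
  await conn.close();
}
```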
The tool lets you define your own "labels" and "relations", which the local NLP models then pick up and assign to news articles. Only a few standard ones are included.
But the goal is to have your own personal news aggregator, and adding/changing news sources (if they have an RSS feed) should be pretty easy through the Docker Compose file.
Burn Room – End-to-End Encrypted Ephemeral SSH Chat #
You can jump in right away:
ssh [email protected] -p 2323
password: burnroom
Or just open https://burnroom.chat in your browser. It runs in a web terminal and works on mobile too.
How it handles encryption
Private, password-protected rooms are fully end-to-end encrypted. The server never has access to readable messages — it only ever sees encrypted data.
Keys are derived from the room password using scrypt, with a unique salt for each room. Every message is encrypted with XChaCha20-Poly1305 using a fresh random nonce, following the same general approach used in tools like Signal and WireGuard.
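The described scheme, sketched with Node's scrypt and libsodium's XChaCha20-Poly1305 (a sketch of the approach, not Burn Room's exact code):

```typescript
import sodium from "libsodium-wrappers";
import { scryptSync } from "node:crypto";

// Derive a 32-byte room key from the password and the room's unique salt.
function deriveRoomKey(password: string, salt: Buffer): Uint8Array {
  return new Uint8Array(scryptSync(password, salt, 32));
}

// Encrypt one message with XChaCha20-Poly1305 under a fresh random nonce.
async function encryptMessage(key: Uint8Array, plaintext: string) {
  await sodium.ready;
  const nonce = sodium.randombytes_buf(
    sodium.crypto_aead_xchacha20poly1305_ietf_NPUBBYTES // 24-byte nonce
  );
  const ciphertext = sodium.crypto_aead_xchacha20poly1305_ietf_encrypt(
    sodium.from_string(plaintext), null, null, nonce, key
  );
  return { nonce, ciphertext }; // the server only ever sees these bytes
}
```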
When you join a room, you’re shown a fingerprint so you can confirm everyone is using the same key. When you leave, the encryption keys are wiped from memory.
Designed to disappear
Everything in Burn Room is temporary by design. Messages are never written to disk, never logged, and never backed up. By default, they’re cleared from memory after an hour.
Room creators can set a burn timer — 30 minutes, 1 hour, 6 hours, or 24 hours. When time runs out, the room and everything in it are destroyed. If a room sits idle, it closes on its own. Creators can also destroy a room instantly at any time.
If the server restarts, everything is wiped. The only thing briefly stored for recovery is minimal room metadata, and even then, encrypted rooms remain unreadable.
Privacy first
There are no accounts, no identities, and no tracking of any kind. IP addresses are only used briefly for rate limiting and are kept in memory, not stored.
Usernames are temporary and get recycled. The platform is built to minimize what exists in the first place, rather than trying to protect stored data later.
Language support
Burn Room adapts to your system or browser language automatically. The interface is translated across menus, prompts, and messages.
Chat itself can be translated per user, so people speaking different languages can talk in the same room and each see messages in their own language. In encrypted rooms, translation happens locally after decryption — the server never sees the original text.
Features you’ll notice
There are a few always-available public rooms like Politics, Gaming, Tech, and Lobby, along with the option to create private, password-protected rooms.
You can mention others, navigate message history, and use simple command shortcuts. Rooms show a live countdown so you always know when they’ll disappear. You can also share direct links to rooms to bring others in instantly.
It works the same whether you connect through SSH or the browser.
Under the hood
Burn Room is built with Node.js and TypeScript, using SSH for direct connections and a terminal interface in the browser. Encryption relies on audited native libraries, not custom implementations.
It’s lightweight but designed to handle a large number of users at once, with built-in protections against abuse like rate limiting and connection throttling.
Enter, say what you need to say, and let it disappear. Enter.Chat.Burn
NerdFlair, a Claude Code QoL Plugin #
ReactNative.run – Browser Metro bundler that runs React Native #
Open source (MIT): https://github.com/RapidNative/reactnative-run
Lexe – Self-Custodial Lightning Nodes in TEEs, Python and Rust SDKs #
New in this release:
- First ever public release of Lexe Wallet on the App Store and Google Play
- Human Bitcoin Addresses (BIP 353): receive Bitcoin at ₿[email protected]
- Python and Rust SDKs, published to PyPI and crates.io
- Open source node, app, and SDKs published at https://github.com/lexe-app/lexe-public
Version 1.0 of agent-of-empires CLI agent orchestration tool #
So stoked to mark the agent-of-empires (aoe) tool as version 1.0. We've had 30 different direct contributors to the codebase, as well as many other indirect contributors via GitHub issues and various comments.
It's a tool to make it as easy as possible to push lots of coding agents forward at once. It's powered by sandboxing, git worktrees, and tmux, but all those things are managed for you so you don't need to sweat the details.
We also have a YouTube channel with tutorials for those who like to learn via visual/audio: https://www.youtube.com/@agent-of-empires
Thank you to the community for all the amazing feature ideas and bug fixes. I hope you enjoy what I think is now a very good, stable, and feature-filled tool. It's my daily driver for all my Claude Code agents, and I hope it helps others too.
`brew install aoe`
An x402 gateway for buying a finished local business website #
Beta Appointment Setter #
I built a self-hosted Wanderlog alternative with real-time collaboration #
So I started building my own tool, mainly for myself at first.
It’s a self-hosted travel planner where you can organize trips collaboratively in real time. You can drag and drop places into days, keep track of budgets, manage bookings, and create packing lists.
The real-time collaboration part was probably the trickiest thing to get right. Keeping everything in sync without it becoming overly complicated took a few iterations.
It’s still not perfect and there are definitely areas that need improvement, especially around search and some UX details.
If you’re into travel planning or self-hosting tools, I’d be interested to hear what you think or what you’re missing in tools like this.
Hooky – A lightweight HTTP webhook server written in Go #
Zitrus – A 3DS SDK written in pure Zig #
Currently it implements (not exhaustively): a graphics API fully written in Zig, filesystem, networking, and input support, fully comptime IPC command generation, and 20+ tools for 3DS-related file formats.
It's still in somewhat "early stages" as it's missing a libc, but that's on my infinite TODO list.
The code is hosted on Codeberg.
BotMarket — Structured datasets AI agents can query directly #
Costsimulators.com – How much money are you spending? (OSS, ad-free) #
Set your rate and number of people and be amazed at the cost of those meetings.
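The underlying arithmetic is simple (illustrative):

```typescript
// Meeting cost = attendees × hourly rate × hours elapsed.
function meetingCost(attendees: number, hourlyRate: number, hours: number): number {
  return attendees * hourlyRate * hours;
}

// e.g. 8 people at $120/h for a 1.5h meeting: $1,440
console.log(meetingCost(8, 120, 1.5));
```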
More kinds of costs will be added later, at some point. What would you like to see?
Runs on Cloudflare Pages. Vanilla HTML/JS/CSS, no backend, no trackers, no cookies, no 3rd-party libraries.
Got fed up missing cool AI events: built tool to match them to my goals #
Then it happens again: we see a cool event post-factum and wish we'd known about it earlier.
But boy, it's annoying (and time-consuming!) to check all these 100–200 events every week to cherry-pick. So I built a tool that scrapes all AI and tech events in the Bay Area (112 this week), vets them, and matches them to my current goals(!).
Pretty happy with the v1. Curious what you think.
P.S. A lil heads-up, to run it for yourself, the app will ask you for $2. Tokens, you know, tokens lol
Cognitive Layers #
This has been a problem before – underspecified projects, specifications going out of sync with the first line of code. We've just amplified it now.
A lot of people (myself included) have tried to maintain good specifications in markdown to give LLMs and humans maximal context. But this is still walls of text that poison anyone's context, regardless of their artificiality.
So I built a way to model knowledge as a graph that both people and LLMs can consume progressively – by navigating the graph or searching for specific connectivity patterns rather than ingesting everything at once.
At its core it's an open specification for layers that add progressively more semantic value: starting with prose, growing into terminology, tasks, concepts, API surfaces, and structured plans.
The most critical component is a layer that maps artifacts (such as code) to the knowledge model with good enough precision to track drift and coverage automatically.