Show HN for April 7, 2026
53 posts

An interactive map of Tolkien's Middle-earth #
I have been commuting a fair bit between the East and West coasts, and thanks to American Airlines' free onboard WiFi, I was able to vibe-code a full interactive map of Middle-earth right from my economy seat at the back of the bus.
It's rather amazing how much an LLM knows about Tolkien's work, and it was fun to delve into many of the nooks and crannies of Tolkien's lore.
Some features:

- Plot the journeys of the main characters in both The Hobbit and The Lord of the Rings on the map.
- Follow a list of events in the chronological timeline.
- Zoom in on the high-def map and explore many of the off-the-main-plotline places.
- Use the 'measure distances' feature to see how far apart things are.
I also had a lot of fun learning about tiling to allow for efficient zooming.
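If you're curious what that involves, here's a minimal sketch of building a slippy-map tile pyramid with Pillow. The filenames and zoom depth are made up, it assumes a square source image, and it's not the author's actual pipeline:

    import os
    from PIL import Image  # pip install pillow

    Image.MAX_IMAGE_PIXELS = None  # allow a very large source map
    TILE = 256

    def build_pyramid(src_path, out_dir, max_zoom):
        # Slice one big map image into a {z}/{x}/{y}.png pyramid. At zoom z
        # the map is covered by 2^z x 2^z tiles, so the viewer only fetches
        # the handful of tiles visible in the current viewport.
        full = Image.open(src_path)
        for z in range(max_zoom + 1):
            n = 2 ** z  # tiles per side at this zoom level
            level = full.resize((TILE * n, TILE * n))
            for x in range(n):
                for y in range(n):
                    tile = level.crop((x * TILE, y * TILE,
                                       (x + 1) * TILE, (y + 1) * TILE))
                    path = os.path.join(out_dir, str(z), str(x))
                    os.makedirs(path, exist_ok=True)
                    tile.save(os.path.join(path, f"{y}.png"))

    build_pyramid("middle_earth.png", "tiles", max_zoom=5)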
If you are anything like me, this should provide a fun companion to reading the books or watching the movies. (Note that on this site I followed the book narrative and did not include Peter Jackson's many departures.)
If you get the chance to check it out, I would love more feedback, and if there is demand, I might do the same for Game of Thrones.
Stop paying for Dropbox/Google Drive, use your own S3 bucket instead #
Here is Locker, the ultimate open-source Google Drive/Box/Dropbox alternative:

- Provider agnostic (S3, R2, Vercel Blob, local)
- BYOB (bring your own bucket)
- Virtual file system
- QMD search plugin
Gemma 4 Multimodal Fine-Tuner for Apple Silicon #
Gemma 3n came out, so I added that. Kinda went nuts, tbh.
Then I put it on the shelf.
When Gemma 4 came out a few days ago, I dusted it off, cleaned it up, broke out the Gemma part from the Whisper fine-tuning and added support for Gemma 4.
I'm presenting it for you here today to play with, fork and improve upon.
One thing I have learned so far: It's very easy to OOM when you fine-tune on longer sequences! My local Mac Studio has 64GB RAM, so I run out of memory constantly.
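For anyone hitting the same wall, these are the usual memory levers, sketched here with a Hugging Face-style Trainer config; the repo's actual knobs may differ, so treat the names as assumptions:

    from transformers import TrainingArguments

    # The usual levers for fitting long-sequence fine-tunes in fixed RAM.
    args = TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=1,   # smallest possible micro-batch
        gradient_accumulation_steps=16,  # recover the effective batch size
        gradient_checkpointing=True,     # trade compute for activation memory
    )
    # Activation memory grows with sequence length, so capping max_length in
    # the tokenizer/collator is usually the single biggest lever.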
Anywho, the reason this exists is how much interest there is in Gemma 4 and, frankly, the fact that you can't really do audio fine-tuning with MLX (in addition to my personal interest). I would have preferred to use MLX and not have had to make this, but here we are. Welcome to my little side quest.
I hope you have as much fun using it as I had making it.
-Matt
A cartographer's attempt to realistically map Tolkien's world #
Marimo pair – Reactive Python notebooks as environments for agents #
GitHub repo: https://github.com/marimo-team/marimo-pair
Demo: https://www.youtube.com/watch?v=6uaqtchDnoc
marimo pair is implemented as an agent skill. Connect your agent of choice to a running notebook with:
/marimo-pair pair with me on my_notebook.py
The agent can do anything a human can do with marimo and more. For example, it can obtain feedback by running code in an ephemeral scratchpad (inspect variables, run code against the program state, read outputs). If it wants to persist state, the agent can add cells, delete them, and install packages (marimo records these actions in the associated notebook, which is just a Python file). The agent can even manipulate marimo's user interface — for fun, try asking your agent to greet you from within a pair session.
The agent effects all actions by running Python code in the marimo kernel. Under the hood, the marimo pair skill explains how to discover and create marimo sessions, and how to control them using a semi-private interface we call code mode.
Code mode lets models treat marimo as a REPL that extends their context windows, similar to recursive language models (RLMs). But unlike traditional REPLs, the marimo "REPL" incrementally builds a reproducible Python program, because marimo notebooks are dataflow graphs with well-defined execution semantics. As it uses code mode, the agent is kept on track by marimo's guardrails, which include the elimination of hidden state: run a cell and dependent cells are run automatically, delete a cell and its variables are scrubbed from memory.
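For readers who haven't used marimo: a notebook is a plain Python file whose cells are functions, and each cell's parameters declare which variables it depends on. A minimal example of that dataflow structure (not taken from this repo):

    import marimo

    app = marimo.App()

    @app.cell
    def _():
        x = 1
        return (x,)

    @app.cell
    def _(x):
        # Depends on x: editing the cell above re-runs this one automatically,
        # and deleting it would scrub y from memory. No hidden state.
        y = x + 1
        return (y,)

    if __name__ == "__main__":
        app.run()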
By giving models full control over a stateful reactive programming environment, rather than a collection of ephemeral scripts, marimo pair makes agents active participants in research and data work. In our early experimentation [4], we've found that marimo pair accelerates data exploration, makes it easy to steer agents while testing research hypotheses, and can serve as a backend for RLMs, yielding a notebook as an executable trace of how the model answered a query. We even use marimo pair to find and fix bugs in itself and marimo [5]. In these examples the notebook is not only a computational substrate but also a canvas for collaboration between humans and agents, and an executable, literate artifact comprising prose, code, and visuals.
marimo pair is early and experimental. We would love your thoughts.
[1] https://github.com/marimo-team/marimo-pair
[2] https://marimo.io/blog/marimo-pair
[3] https://github.com/marimo-team/marimo
[4] https://www.youtube.com/watch?v=VKvjPJeNRPk
[5] https://github.com/manzt/dotfiles/blob/main/.claude/skills/m...
Unicode Steganography #
The first is the use of two invisible zero-width characters (ZWS and ZWNJ) to binary-encode text.
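A minimal sketch of that first scheme, appending the payload as an invisible suffix (a real implementation might interleave it with the carrier):

    ZWS, ZWNJ = "\u200b", "\u200c"  # zero-width space = 0, zero-width non-joiner = 1

    def hide(carrier: str, secret: str) -> str:
        bits = "".join(f"{byte:08b}" for byte in secret.encode())
        payload = "".join(ZWS if b == "0" else ZWNJ for b in bits)
        return carrier + payload  # looks identical to the carrier on screen

    def reveal(text: str) -> str:
        bits = "".join("0" if c == ZWS else "1" for c in text if c in (ZWS, ZWNJ))
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode()

    stego = hide("Totally ordinary sentence.", "meet at dawn")
    assert stego != "Totally ordinary sentence." and reveal(stego) == "meet at dawn"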
The second is much cooler. Several characters in the Latin and Cyrillic alphabets look nearly identical but have different Unicode code points. If you have text to encode and convert it into a binary representation (1s and 0s), you can take plain-English "carrier" text and, for each 1 in the binary representation, substitute the Cyrillic equivalent of a Latin letter. Decoding the message means traversing the text and seeing where Cyrillic letters could have been substituted but weren't, and where they were, yielding 0s and 1s respectively, which can be reassembled into your original hidden text.
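And a sketch of the homoglyph scheme with a small, hand-picked subset of lookalike pairs. Note that every substitutable letter decodes to a bit, so a real encoder would need a length prefix or padding; this toy version just truncates:

    # Latin letters with near-identical Cyrillic twins (a small subset)
    HOMO = {"a": "\u0430", "e": "\u0435", "o": "\u043e",
            "p": "\u0440", "c": "\u0441", "x": "\u0445", "y": "\u0443"}
    CYR = set(HOMO.values())

    def encode(carrier: str, bits: str) -> str:
        out, i = [], 0
        for ch in carrier:
            if ch in HOMO and i < len(bits):
                out.append(HOMO[ch] if bits[i] == "1" else ch)
                i += 1
            else:
                out.append(ch)
        assert i == len(bits), "carrier has too few substitutable letters"
        return "".join(out)

    def decode(text: str) -> str:
        # Substitutable-but-Latin -> 0, swapped-to-Cyrillic -> 1
        return "".join("1" if ch in CYR else "0"
                       for ch in text if ch in HOMO or ch in CYR)

    s = encode("peace of cake", "1011")
    assert decode(s)[:4] == "1011"  # trailing untouched letters decode as 0s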
In both cases, these are detectable, but the interesting problem for me is whether an LLM could eventually invent an encoding that goes unnoticed by both us, and automated detection.
If LLMs were able to covertly include messages in plaintext, misaligned AI Agents could eventually communicate across MCP/A2A and individual chat session boundaries undetected. A deceptive LLM might seem helpful, but work against your goals. It could tell other agents it interacts with over MCP/A2A to help it discreetly fail, signal intent, and avoid tripping oversight/safety mechanisms. Furthermore, oversight mechanisms become more difficult to implement if we can't believe our own eyes.
Finalrun – Spec-driven testing using English and vision for mobile apps #
With a vision-based agent, the execution part actually works well. It can look at the screen, understand intent, and perform actions across Android and iOS.
The bigger problem showed up around how tests are defined and maintained.
When test flows are kept outside the codebase (written manually or generated from PRDs), they quickly go out of sync with the app. Keeping them updated becomes a lot of effort, and they lose reliability over time.
I then tried generating tests directly from the codebase (via MCP). That improved sync, but introduced high token usage and slower generation.
The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.
I kept the execution vision-based (no brittle selectors), but moved test generation closer to the repo.
I’ve open sourced the core pieces:
1. Generate tests from codebase context
2. YAML-based test flows
3. Vision-based execution across Android and iOS
Repo: https://github.com/final-run/finalrun-agent
Demo: https://youtu.be/rJCw3p0PHr4
In the demo video, you'll see the "post-development hand-off": an AI builds a feature in an IDE, and Finalrun immediately generates and executes a vision-based test that verifies the feature the AI just built.
The King James Bible deserved a better website #
Mo – checks GitHub PRs against decisions approved in Slack #
Mo watches a Slack channel for decisions. When someone tags @mo to approve something, it stores it. When a PR opens, Mo checks the diff against the approved decisions and flags conflicts before merge.
It doesn't review code quality. It only cares whether the code matches what the team actually agreed to.
Would love feedback, especially from anyone who's been burned by this exact problem.
Try it here: https://hey-mo.io
Clawcast – A peer-to-peer podcast network for agents #
Bx – macOS native sandbox for AI and coding tools #
I built it because Docker sandboxing requires config and planning, and the built-in sandboxing of AI tools is limited to the tools themselves. Instead, I wanted a simple cage around Claude running inside VS Code. I also needed to protect files inside a folder, like .env.local or keys.
Install via: brew install holtwick/tap/bx
Run like: bx claude .
td – a CLI to manage tasks, sessions, and worktrees for agentic coding #
SwellSlots – Grid Based Surf Forecast App with a Street Fighter 2 UI #
SwellSlots squeezes swell height, period, wind speed/direction, and tide state into a color-coded weekly grid. One glance tells you when conditions are 'good' for the next few days.
I wanted it to look completely different from what was out there (UI-wise), as most forecast apps look very clinical and cold.
Some technical details:
- SvelteKit + TailwindCSS 4, deployed on Hetzner (LOVE their cloud console UX! So simple)
- 2,000+ surf spots worldwide (sourced via a separate AI-assisted pipeline I call "Surf Scout" — got ~80% accuracy, rest is manual curation)
- The spot database was its own challenge — Google Maps couldn't distinguish surf beaches from swim beaches, and LLMs got about 75–80% of the way there
I'd really appreciate feedback on:
- Is the grid scannable enough on first visit?
- Any surf spots missing from your area?
- What else would you like to see?
A reasoning hierarchical robotics pipeline you can run in the browser #
HumansMap – Graph visualization of 3M+ Wikidata persons #
Video Commander – A desktop IDE for video engineers (FFmpeg, VMAF) #
Built with a Tauri + Rust backend and a React frontend, on top of FFmpeg and VMAF.
Not a cool AI/LLM project, just a tool I've been heads-down building over the last six months.
Happy to answer any questions!
Brag.fast – Brag about the app you're building #
AgentLint – ESLint for your coding agents #
A big part of it was context drift. AGENTS.md, skills, rules, and workflows looked fine, but were no longer aligned with the code.
I also learned that more context does not always help. Sometimes it adds noise and wastes tokens. The recent AGENTS.md paper also pushed me to think harder about this, especially around auto-generated context files and /init-style workflows.
Then I saw Microsoft’s writeup showing a jump from 38.1% to 69% after improving instruction setup. That made me take these files much more seriously.
AgentLint came out of that. It’s a small CLI that scans the repo and helps keep context files aligned. After setup, MCP handles most of the ongoing flow.
Give it a try: npx @agent-lint/cli
http://samilozturk.github.io/agentlint
Would really appreciate any feedback or criticism.
System that keeps apps running when a server or region goes down #
The idea is simple: instead of sending requests to a single backend, route them through a layer that can switch to another backend if something goes wrong.
It:
- checks backend health (latency, errors)
- avoids unhealthy servers
- retries requests on another backend if needed
It’s designed as:
- a fast routing layer (Rust)
- a simple control API (Python)
- shared state via Redis
One thing I found interesting is that failover only works before the response starts — after that, switching isn’t possible.
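A minimal client-side sketch of that rule, with hypothetical backend URLs: fail over only when the request errors out before a response starts coming back.

    import requests

    BACKENDS = ["https://eu.example.com", "https://us.example.com"]  # hypothetical

    def failover_get(path, timeout=2.0):
        last_exc = None
        for base in BACKENDS:
            try:
                resp = requests.get(base + path, timeout=timeout, stream=True)
                resp.raise_for_status()
                return resp  # headers received: we are committed to this backend
            except requests.RequestException as exc:
                last_exc = exc  # refused, timed out, or 5xx: try the next one
        raise last_exc

    # Usage: the body streams from whichever backend answered first
    print(failover_get("/health").text)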
Still early and mostly an experiment to understand failover and reliability better. It began as an internal project after the recent region outages.
Curious how others approach this problem in production systems.
PromptJuggler – A dev env and runner for prompts, workflows, agents #
So we started using PromptLayer so that the prompt engineer could build evals there, edit and improve prompts outside the code, and chain them together however he liked; we were also running them on their infra. But the UI was cumbersome, the latency was disappointing, and, given other issues and having found no better alternative, I started building one myself.
One year and countless nights and weekends later, I have what I like to call an IDE for AI, or more like a full DevOps platform, as it goes from development of prompts and workflows through testing them all the way to the running infra. I tried to squeeze every nanosecond out of the stack, hyper-optimising every part of it to add as little overhead on top of the AI calls themselves as possible.
Docs: https://docs.promptjuggler.com
App: https://promptjuggler.com
I built this with love and it's my first post here, so please be gentle (:
JavaScript runtime instrumentation via Chrome DevTools Protocol #
A few months ago I shared BDHS (Breakpoint-Driven Heap Search), an experiment to trace when a value appears in memory by taking heap snapshots at every debugger pause.
Since then, I’ve been experimenting further with Chrome DevTools Protocol primitives to build tools for reversing and debugging JavaScript at runtime.
Here’s the next step.
The focus is shifting from observation to instrumentation: instead of just inspecting state, the goal is to interact with execution - at specific points, in user-land code, without modifying the application.
Early experiments include:
- attaching hooks to functions at runtime
- inspecting and modifying arguments and variables
- overriding return values (unfortunately limited to sync functions due to CDP constraints)
- following a return value to its consumer (best-effort/heuristic)
- conditional stepIn / stepOut / stepOver
All implemented via CDP (debugger breakpoints + runtime evaluation), without monkeypatching or rewriting code.
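To give a flavor of the underlying primitives, here is a bare-bones sketch driving the same CDP methods over a raw WebSocket in Python. The page ID and the target script/line are placeholders, and this is not the wirebrowser API:

    import asyncio, json
    import websockets  # pip install websockets

    WS_URL = "ws://127.0.0.1:9222/devtools/page/PAGE_ID"  # from /json/list

    async def main():
        async with websockets.connect(WS_URL, max_size=None) as ws:
            mid = 0

            async def send(method, params=None):
                nonlocal mid
                mid += 1
                await ws.send(json.dumps(
                    {"id": mid, "method": method, "params": params or {}}))

            await send("Debugger.enable")
            # Break on line 42 (0-indexed: 41) of any script matching app.js
            await send("Debugger.setBreakpointByUrl",
                       {"lineNumber": 41, "urlRegex": r".*app\.js"})

            while True:
                msg = json.loads(await ws.recv())
                if msg.get("method") == "Debugger.paused":
                    frame = msg["params"]["callFrames"][0]
                    # Rewrite a local in the paused frame, then continue
                    await send("Debugger.evaluateOnCallFrame",
                               {"callFrameId": frame["callFrameId"],
                                "expression": "user = 'admin'"})
                    await send("Debugger.resume")

    asyncio.run(main())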
Whitepaper: https://fcavallarin.github.io/wirebrowser/CDP-as-a-Runtime-I...
API docs (early): https://fcavallarin.github.io/wirebrowser/api/
I’d really appreciate feedback - especially from people doing reverse engineering, bug bounty, or complex frontend debugging.
Development Driven Testing, Why TDD Is Not the Best Approach #
LookAway 2.0 – a break reminder for Mac that respects what you're doing #
I’m the solo dev behind LookAway, a macOS app that reminds you to take breaks.
I built it because I spend most of my day in front of a screen, and I noticed that a lot of break reminder apps kind of fail in the same way: they do help, but they also interrupt you at the worst possible time. After a few days people just start snoozing everything or turn them off.
I just released LookAway 2.0, which is a pretty major update.
A lot of the work in this version went into making the app behave less like a timer and more like something that understands when to back off a bit.
Some things that changed:
- I added a stats system, so it can track sessions, natural breaks, break adherence, daily totals, and a screen score
- it now shows where you spend your screen time, including which apps, as well as things like meetings or watching videos
- it can automatically pause / delay reminders in more situations now, like meetings, active mic usage, screen recording, games, video playback, etc
- there’s now a proper heads-up flow before breaks
- stricter break controls if you want enforcement instead of just gentle nudges
- redesigned menu bar / quick controls
I’m still iterating on the balance between being helpful and being annoying. That part is honestly the whole product.
Happy to answer any questions!
- Kushagra
Rac-delta – open protocol for differential dir sync (Rust/Node SDKs) #
The problem: every update (even changing a single texture) transferred the entire build over S3. A 2.2 GB game with minor changes meant 2.2 GB up and 2.2 GB down, every time, for every user. I looked for an existing solution and found nothing that fit:
- rsync needs SSH on both ends and doesn't compose with arbitrary cloud storage
- bsdiff/xdelta operate on single files, not directories
So I built rac-delta: an open, storage-agnostic differential sync protocol with SDKs in Rust and Node.
- How it works -
The protocol splits files into fixed-size chunks (default 1 MB), hashes each with Blake3, and produces a manifest file "rd-index.json" that describes the full directory, every file, every chunk, every hash.
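A rough sketch of that indexing step in Python; the field names are illustrative rather than the exact rac-delta schema:

    import json, os
    from blake3 import blake3  # pip install blake3

    CHUNK_SIZE = 1024 * 1024  # the protocol's 1 MB default

    def index_directory(root):
        files = []
        for dirpath, _, names in os.walk(root):
            for name in sorted(names):
                path = os.path.join(dirpath, name)
                chunks, offset = [], 0
                with open(path, "rb") as f:
                    while data := f.read(CHUNK_SIZE):
                        chunks.append({"offset": offset, "size": len(data),
                                       "hash": blake3(data).hexdigest()})
                        offset += len(data)
                files.append({"path": os.path.relpath(path, root),
                              "chunks": chunks})
        return {"chunkSize": CHUNK_SIZE, "files": files}

    with open("rd-index.json", "w") as out:
        json.dump(index_directory("./build"), out, indent=2)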
To sync:
1. Generate a local rd-index.json by scanning the directory
2. Fetch the remote rd-index.json from your storage backend if there is one
3. Compare them to produce a DeltaPlan:
DeltaPlan {
newAndModifiedFiles: FileEntry[]
deletedFiles: string[]
missingChunks: ChunkEntry[]
obsoleteChunks: ChunkEntry[]
}
4. Transfer only the missingChunks
5. Clean up obsolete chunks
6. Push the updated rd-index.json
Chunks are deduplicated across files: if two files share identical regions, that chunk is stored and transferred once.
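Under the same illustrative schema as the sketch above, the comparison step that yields a DeltaPlan might look like the following; deduplication falls out of treating chunk hashes as sets:

    def delta_plan(local, remote):
        # remote is None on a first-time upload (no remote index yet)
        remote_files = {f["path"]: f for f in (remote or {"files": []})["files"]}
        local_files = {f["path"]: f for f in local["files"]}

        remote_hashes = {c["hash"] for f in remote_files.values()
                         for c in f["chunks"]}
        local_hashes = {c["hash"] for f in local_files.values()
                        for c in f["chunks"]}

        return {
            "newAndModifiedFiles": [f for p, f in local_files.items()
                                    if remote_files.get(p) != f],
            "deletedFiles": [p for p in remote_files if p not in local_files],
            # a hash already present remotely is never re-sent, even if it
            # now appears in a different file
            "missingChunks": sorted(local_hashes - remote_hashes),
            "obsoleteChunks": sorted(remote_hashes - local_hashes),
        }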
Blake3 is notably faster than SHA-256 for large directory scans, which matters when you're hashing multi-gigabyte directories on every sync.
- Storage-agnostic by design -
rac-delta has no opinion about where chunks live. The SDKs ship adapters for S3, Azure Blob, GCS, SSH, HTTP, signed URLs (this one is experimental), and local filesystem.
- Benchmark results (real S3 infrastructure, 2.2 GB directory) -
Download transfer: rac-delta 116 MB vs raw S3 2219 MB (94.7% reduction)
Upload transfer: rac-delta 115 MB vs raw S3 2210 MB (94.8% reduction)
Download time: rac-delta 35.5 s vs raw S3 671.2 s
Upload time: rac-delta 53.3 s vs raw S3 268.9 s
Egress cost per 1,000 users (approx.): rac-delta 9.66€ vs raw S3 184.27€ (19x cheaper)
And the base upload (first-time, no remote index) takes 172.6s with rac-delta vs 209.4s raw, slightly faster even on full uploads because of concurrent chunk streaming.
(The initial tests were run on a single machine with slow internet against eu-central-1, and production tests on Raccreative Games)
- Three download strategies -
Different environments need different tradeoffs:
- Memory-first: all chunks into RAM, then reconstruct. Best for small builds, fast networks
- Disk-first: chunks to a temp directory, then reconstruct. Better for low-memory devices
- Streaming (recommended): reconstruct files as chunks arrive. No extra RAM or disk overhead
- Production usage -
Raccreative Games uses rac-delta in production today. The CLI tool Clawdrop (Rust) handles uploads; the desktop launcher (Electron/Node) handles downloads. Both use the respective SDKs directly.
https://github.com/raccreative/clawdrop
- Open protocol, MIT licensed -
rac-delta is a documented open protocol - anyone can implement it in any language. The Rust and Node SDKs are the reference implementations.
Docs: https://raccreative.github.io/rac-delta-docs/
Benchmarks + ROI calculator: https://racdelta.com/en/
Node SDK: https://github.com/raccreative/rac-delta-js
Rust SDK: https://github.com/raccreative/rac-delta-rs
Looking for feedback from teams distributing large binaries - desktop app installers, ML model weights, firmware OTA updates, simulation assets, or anything where "upload the whole thing again" is your current answer. Happy to run benchmarks against your specific file patterns.
Veil – A drop-in PII redaction proxy for any LLM API #
The proxy itself is pretty simple. You swap your OpenAI base URL to point at Veil, and it redacts PII before the request leaves, then puts the real values back in the response. It works with streaming too, which was honestly the hard part.
https://veil-api.com, free tier is 100 requests/month.
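In practice the swap should be just the base URL on the client, something like the snippet below with the OpenAI Python SDK; the exact endpoint path here is my guess, so check their docs:

    from openai import OpenAI

    client = OpenAI(
        base_url="https://veil-api.com/v1",  # hypothetical proxy endpoint
        api_key="sk-...",                    # your usual provider key
    )

    # PII is redacted before this request leaves, and the real values are
    # restored in the response you get back.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": "Email jane.doe@example.com about her invoice."}],
    )
    print(resp.choices[0].message.content)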
Brighten Up – My First App. Lessons on Unity and Renting a Mac #
Converting prototypes to production shouldn't be that hard #
Interactive object storage cost calculator #
C64 Ultimate Toolbox for macOS #
- View and hear your Commodore 64 Ultimate or Ultimate 64 device over the network, with a fully configurable CRT shader so you can dial in just the right retro feel.
- View and manage files on your device, including support for drag and drop folder/file upload, as well as the ability to run and mount disks, create new disk images, and more.
- BASIC Scratchpad is a mini-IDE in the app where you can write BASIC apps and send them directly to any of your connected devices to run.
- Keyboard forwarding allows you to interact with your device using your computer keyboard, and includes a keyboard overlay for Commodore-specific keys your keyboard definitely doesn't have.
- Visual memory viewer and editor, along with a terminal-like memory viewer and editor for debugging and tinkering.
- Built-in support for recording videos and taking screenshots cleanly.
- Fully native macOS AppKit app.
Here's a rough and ready demo video I recorded and sent to App Review for the 2.0 release which was approved yesterday: https://www.youtube.com/watch?v=_2wJO2wOGm8
Please note again: this app only works with Commodore 64 Ultimate or Gideon's Ultimate 64 devices. The Ultimate II does not have the data streams feature needed to power the display.
Knowledge Bases for AI/Human Sharing #
I've been working on knowledge / memory / retrieval tooling for AI + human interaction for a few months now.
Then, just a few days ago, Andrej Karpathy posted about the subject on X, and I felt like this would be a good time to get it out there: https://x.com/karpathy/status/2039805659525644595
The short pitch in my view is "Second Brain, for agents, or humans, or a mix of both".
I've been using it personally for a while now, as the tooling has evolved, especially for access to my Obsidian notes, just like Karpathy talks about (read some places, write others, etc.). It's especially handy when your vault is littered with PDFs and screenshots like mine.
It's also a problem I've worked on a bunch in the enterprise: connecting AI & SaaS tools to disparate sources of knowledge, extracting contents from them, making them effectively retrievable, and then controlling the levels of write-back tolerated by agents across different spaces.
I've put as much work as I can into extensive documentation for v0.1, but I'm sure there's lots of improvements needed and some video tutorials & screenshots to follow up with to give some concrete use case examples.
For now, if you're interested in what I think is the most common example (docker self-hosted instance), here is a subfolder with that setup and some inline explainer comments: https://github.com/akunasoftware/akuna/tree/main/examples/co...
If you're interested but the lack of native installer support is a barrier for you, I'd love to get some prioritisation feedback on where to put those efforts. I have some work-in-progress locally but would like to know which platforms to focus on first besides the existing support.
Thanks for taking the time, hope it interests you as much as me, questions & feedback much welcome!