Show HN for February 14, 2026
45 items
Sameshi – a ~1200 Elo chess engine that fits within 2KB #
I have been a great fan of the demoscene (the computer art subculture) since middle school, so building this was a ritual I had to perform.
To estimate the Elo, I ran 240 automated games against Stockfish at fixed Elo levels (1320 to 1600), at fixed depth 5 and under some constrained rules, with colors distributed equally.
I then converted the pooled win/draw/loss score to Elo using the standard logistic formula, with a binomial 95% confidence interval.
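For the curious, a minimal sketch of that conversion — score = (wins + 0.5·draws) / games, then elo = -400·log10(1/score - 1) — where the 35% pooled score is an illustrative placeholder, not my measured result:

    package main

    import (
        "fmt"
        "math"
    )

    // eloDiff converts a pooled score fraction into an Elo difference
    // via the standard logistic model.
    func eloDiff(score float64) float64 {
        return -400 * math.Log10(1/score-1)
    }

    func main() {
        games := 240.0
        score := 0.35 // illustrative pooled score: (wins + 0.5*draws) / games

        // Normal-approximation 95% confidence interval on the score fraction.
        se := math.Sqrt(score * (1 - score) / games)
        lo, hi := score-1.96*se, score+1.96*se

        fmt.Printf("Elo diff vs. opponents: %.0f (95%% CI: %.0f to %.0f)\n",
            eloDiff(score), eloDiff(lo), eloDiff(hi))
    }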
Off Grid – Run AI text, image gen, vision offline on your phone #
That means you can use AI on a flight with no wifi. In a country with internet censorship. In a hospital where cloud services are a compliance nightmare. Or just because you'd rather not have your journal entries sitting in someone's training data.
The tech: llama.cpp for text (15-30 tok/s, any GGUF model), Stable Diffusion for images (5-10s on Snapdragon NPU), Whisper for voice, SmolVLM/Qwen3-VL for vision. Hardware-accelerated on both Android (QNN, OpenCL) and iOS (Core ML, ANE, Metal).
MIT licensed. Android APK on GitHub Releases. Build from source for iOS.
Arcmark – macOS bookmark manager that attaches to browser as sidebar #
You get workspace-based links/bookmarks organization with nested folders, drag-and-drop reordering, and custom workspace colors. For the most part I tried to replicate Arc's sidebar UX as closely as possible.
1. Local-first: all data lives in a single JSON file (~/Library/Application Support/Arcmark/data.json). No accounts, no cloud sync.
2. Works with any browser: Chrome, Safari, Brave, Arc, etc. Or use it standalone as a bookmark manager with a regular window.
3. Import pinned tabs and spaces from Arc: it parses Arc's StorableSidebar.json to recreate the exact workspace/folder structure.
4. Built with swift-bundler rather than Xcode.
There's a demo video in the README showing the sidebar attachment in action. The DMG is available on the releases page (macOS 13+), or you can build from source.
This is v0.1.0, so it's a very early version. I'd appreciate any feedback or thoughts.
Long Mem code agent cut 95% costs for Claude with small model reading #
Prompt to Planet, generate procedural 3D planets from text #
I'm 75, Building an OSS Virtual Protest Protocol #
I am a 75-year-old former fishmonger from Japan. Currently, I work at the compensation desk for the Fukushima nuclear disaster (TEPCO). Witnessing deep social divisions and the limitations of bureaucracy firsthand, I realized we need a new way for people to express their will without being "disposable."
To address this, I’ve designed the *Virtual Protest Protocol (VPP)*. It’s an OSS framework for large-scale, 2D avatar-based digital demonstrations. I recently shared this with the *Open Technology Fund (OTF)* and received an encouraging "This is great" response. Now, I am looking for the HN community's expertise to turn this spec into a reality.
*The Concept:*
* *Beyond Yes/No (Avoiding Polarization):* Introducing an *"Observe"* status. Modern social media forces people into binary "For or Against" camps, deepening social division. VPP allows the silent majority to participate by simply "being there," visualizing the scale of public concern without forcing a polarized stance.
* *Cell-Based Scaling:* To handle thousands of participants, avatars are managed in "cells" of 50 units. New cells instantiate as the crowd grows, ensuring compatibility with low-spec devices and low-bandwidth environments. (See the sketch after this list.)
* *Privacy by Design:* We only collect anonymous attributes (age/gender/region). All event-specific data is wiped immediately after the demonstration.
* *OIN Membership:* We have joined the [Open Invention Network (OIN)](https://openinventionnetwork.com) to ensure this remains a patent-free global public good.
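To make the cell idea concrete, here is a minimal sketch of cell assignment (in Go, one of the stacks we list below; only the cell size of 50 comes from the spec — the rest is illustrative, not the protocol's actual code):

    package main

    import "fmt"

    const cellSize = 50 // from the VPP spec: avatars are managed in cells of 50

    // Cell holds up to cellSize avatar IDs; new cells instantiate as the crowd grows.
    type Cell struct {
        ID      int
        Avatars []string
    }

    // assign places a participant into the first cell with free capacity,
    // creating a new cell when all existing ones are full.
    func assign(cells []*Cell, avatar string) []*Cell {
        for _, c := range cells {
            if len(c.Avatars) < cellSize {
                c.Avatars = append(c.Avatars, avatar)
                return cells
            }
        }
        c := &Cell{ID: len(cells), Avatars: []string{avatar}}
        return append(cells, c)
    }

    func main() {
        var cells []*Cell
        for i := 0; i < 120; i++ {
            cells = assign(cells, fmt.Sprintf("avatar-%d", i))
        }
        fmt.Printf("120 participants -> %d cells\n", len(cells))
    }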
*Why I’m doing this:* At 75, I belong to the generation that will soon "retire" from society. I feel a deep sense of responsibility to leave a better infrastructure for the next generation, one that isn't burdened by the financial and social "debt" we've accumulated. I am not looking for personal gain; I want this to be a sustainable, global infrastructure.
*GitHub:* https://github.com/voice-of-japan/Virtual-Protest-Protocol/b...
*Project Site:* https://voice-of-japan.net
We are looking for collaborators with expertise in:
* Scalable Web Architecture (Node.js, Go, etc.)
* High-performance Canvas/WebGL rendering
* AI-based real-time moderation (LLM integration)
A playable toy model of frontier AI lab capex decisions #
No signup, runs on mobile/desktop.
Loop per round:
1. Choose compute capacity
2. Forecast demand
3. Allocate capacity between training and inference
4. A random demand shock resolves the outcome
You can end up profitable, cash-constrained, or bankrupt depending on allocation + forecast error.
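A rough single-round sketch of that loop (all prices and the shock range here are made-up placeholders, not the game's actual parameters):

    package main

    import (
        "fmt"
        "math"
        "math/rand"
    )

    func main() {
        // 1. Choose compute capacity (units), at an assumed cost per unit.
        capacity := 100.0
        // 2. Forecast demand for inference.
        forecast := 80.0
        // 3. Allocate capacity between training and inference.
        inference := 60.0
        training := capacity - inference
        // 4. Random demand shock: actual demand deviates from the forecast.
        demand := forecast * (0.7 + 0.6*rand.Float64())

        served := math.Min(demand, inference)              // can't serve beyond allocated capacity
        profit := served*3.0 + training*1.0 - capacity*2.0 // assumed unit economics
        fmt.Printf("demand %.0f, served %.0f, net %.0f\n", demand, served, profit)
    }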
The goal was to make the decision surface intuitive in 2–3 minutes per run.
It’s a toy model and deliberately omits many real-world factors.
Note: this is based on what I learned after listening to Dario on Dwarkesh's podcast - thought it was fascinating.
MicroVibe – minimal JSX web starter #
What it does today:
- File-based routing (including dynamic and catch-all segments)
- API routes with consistent JSON error shape
- SSR by default, and `mode = "client"` per route when interactivity is needed
- Runtime module cache with file-change invalidation in local dev
Project goal: keep the runtime small and understandable so teams can iterate quickly and still reason about behavior.
I would really value feedback on:
1. Routing/API ergonomics
2. What was confusing or slow in your first 30 minutes using MicroVibe?
3. Where this should clearly differ from Next/Astro/Vite workflows
Markdown Prism – A Non-Electron Markdown Editor for macOS #
I built a macOS-native Markdown viewer/editor called Markdown Prism.
Website: https://prism.huconn.xyz
GitHub: https://github.com/hulryung/markdown-prism
I originally built this for myself. I wanted a lightweight Markdown viewer on macOS that:
- wasn’t Electron-based
- rendered GFM properly
- supported LaTeX math and Mermaid diagrams
- worked fully offline
Most native apps I tried were either minimal but missing key features (math, diagrams), or full-featured but Electron apps. I wanted something in between — native feel, but with the mature JS Markdown ecosystem.
How it works
It’s a hybrid approach:
- SwiftUI for the native app shell
- WKWebView for rendering
- markdown-it, KaTeX, highlight.js, and Mermaid.js bundled locally
So you get native performance and integration (Quick Look, file watching, drag-and-drop), but still benefit from battle-tested JS rendering libraries. Everything is bundled for offline use.
Features
- Split-pane editor with live preview (400ms debounce)
- GFM (tables, task lists, strikethrough)
- LaTeX math via KaTeX
- Mermaid diagram support
- Syntax highlighting (190+ languages)
- Quick Look extension (preview .md in Finder)
- Dark mode
- File watching for external edits
Install via:
brew install hulryung/tap/markdown-prism
or download the DMG from the website.
It’s free and open source (MIT), macOS 14+.
Would love feedback — especially from people who use Markdown heavily. What’s missing? What would make this your daily Markdown tool?
Turn OpenClaw into a high-performing development team with DevClaw #
Please try it and give some feedback. Also keen to hear from anyone running autonomous coding agents, especially what you’ve found works for model selection and session management.
Code’s all here: https://github.com/laurentenhoor/devclaw
Lucid – Catch hallucinations in AI-generated code before they ship #
Three independent papers have proven that LLM hallucination is mathematically inevitable (Xu et al. 2024, Banerjee et al. 2024, Karpowicz 2025). You can't train it away. You can't prompt it away. So I built a verification layer instead.
How it works: LUCID extracts implicit claims from AI-generated code (e.g., "this function handles null input," "this query is injection-safe," "this handles concurrent access"), then uses a second, adversarial AI pass to verify each claim against the actual implementation. You get a report showing exactly what would have shipped to production without verification.
"But can't the verifier hallucinate too?" Yes -- and that's the right question. The benchmarks below were validated by running real test suites, not by trusting LUCID's judgment. The value is that structured claim extraction + adversarial verification catches bugs that a single generation pass misses. The architecture also supports swapping LLM verification for formal methods (SMT solvers, property-based testing) per claim type as those integrations mature.
Benchmarks:
- HumanEval: 86.6% baseline -> 100% pass@5 with LUCID (164/164 problems)
- SWE-bench: 18.3% baseline -> 30.3% with LUCID (+65.5%)
- Both benchmarks were validated by running actual test suites, not by LLM judgment
- LLM-as-judge actually performs worse at higher k values -- it hallucinates false positives
Three ways to use it:
1. MCP Server (Claude Code, Cursor, Windsurf) -- one config line, verification as a native tool
2. GitHub Action -- automated verification on every PR with inline comments
3. CLI -- npx lucid verify --repo /path/to/code
Free tier: 100 verifications/month. Get a key at https://trylucid.dev
Code: https://github.com/gtsbahamas/hallucination-reversing-system
Paper: https://doi.org/10.5281/zenodo.18522644
Dashboard: https://trylucid.dev
Prod.bd – Open-Source Ngrok Alternative Powered by Cloudflare Workers #
I originally made it because I often need to test frontend apps on real mobile devices during development. While tools like Ngrok and Tailscale Funnel work well, I wanted to build something on my own.
You install it with a single command and then just run:
`prod 3000 8080`
There's also a Docker container if you don't want to run an untrusted binary.
It instantly gives you two HTTPS subdomain URLs that are publicly accessible. Subdomains are consistent for each port. It provides a simple stats dashboard to inspect URLs, headers, and payloads.
Under the hood, it uses Cloudflare Workers, Durable Objects, and D1. You can deploy your own version if you want.
I used Kiro and Antigravity to build it. I wanted to experiment with AI tools while building something useful. On a side note, I am experimenting with a plugin system to add new features while keeping the core tunnel simple. Without AI tools, I would never have tried a plugin system.
Would love feedback, suggestions, or ideas for improvement.
I built a concurrent BitTorrent engine in Go to master P2P protocols #
Key Technical Challenges I Solved:
Non-Blocking Concurrency: Used a worker pool where each peer gets its own goroutine. I implemented "stateless worker" logic: if a peer fails a SHA-1 hash check or drops the connection, the piece is automatically re-queued into a thread-safe channel for other peers to pick up.
Request Pipelining: To fight network RTT, I implemented a pipeline depth of 5. The client dispatches multiple 16KB block requests without waiting for the previous one to return, ensuring the bandwidth is fully saturated.
The Binary Boundary: Dealing with Big-Endian logic and the 68-byte binary handshake taught me more about encoding/binary and byte-alignment than any textbook could.
Zero-Trust Data Integrity: Every 256KB piece is verified against a "Golden Hash" using crypto/sha1 before being written to disk. If a single bit is off, the data is purged.
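A condensed sketch of the re-queue pattern described above (identifiers are illustrative, not the repo's actual names):

    package main

    import (
        "crypto/sha1"
        "fmt"
    )

    // Piece is one unit of work; Hash is the "golden hash" from the torrent file.
    type Piece struct {
        Index int
        Hash  [20]byte
        Data  []byte
    }

    // download stands in for the real pipelined peer download
    // (the actual client dispatches 16KB block requests, 5 deep).
    func download(p *Piece) ([]byte, error) {
        return []byte(fmt.Sprintf("piece-%d", p.Index)), nil
    }

    // worker implements the stateless re-queue pattern: a piece whose peer
    // drops or whose SHA-1 check fails goes back on the shared channel
    // for another worker to pick up.
    func worker(work chan *Piece, done chan *Piece) {
        for p := range work {
            data, err := download(p)
            if err != nil {
                work <- p // peer failed: re-queue for someone else
                return
            }
            if sha1.Sum(data) != p.Hash {
                work <- p // hash mismatch: purge the data and re-queue
                continue
            }
            p.Data = data
            done <- p
        }
    }

    func main() {
        work := make(chan *Piece, 4)
        done := make(chan *Piece, 4)
        for i := 0; i < 4; i++ {
            p := &Piece{Index: i}
            p.Hash = sha1.Sum([]byte(fmt.Sprintf("piece-%d", i))) // pretend golden hash
            work <- p
        }
        for w := 0; w < 2; w++ {
            go worker(work, done) // worker pool
        }
        for i := 0; i < 4; i++ {
            fmt.Println("verified piece", (<-done).Index)
        }
    }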
The Specification: I’ve documented the full spec in the README, covering:
- Reflection-based Bencode Parsing
- Compact Tracker Discovery (BEP-0023)
- The Choke/Unchoke Protocol State Machine
- Data Granularity (Pieces vs. Blocks)
Repo: https://github.com/Jyotishmoy12/Bittorrent-Client-in-Go
I’d love to get feedback from the community on my concurrency model and how I handled the peer lifecycle.
Terminalcore – The Rhythm Game for PC #
Started with 3 tracks, one difficulty each; wanted to share before I went any further. Let me know what you think!
A small embeddable Datalog engine in Zig #
I've made an early version of a Datalog engine (called Zodd) in Zig. Datalog is a logic query language. It's not as well known as SQL, but it has its own use cases. If you're interested in learning more about the project, including its possible use cases and features, check out the project's GitHub repo: https://github.com/CogitatorTech/zodd
Wa Me It All – WhatsApp Chat Without Spamming Your Contacts #
ScreenKite: Free alternative to Screen Studio with 4x export speed #
My goal is roughly 4x export speed with similar functionality.
You're welcome to try it!
Azazel – Lightweight eBPF-based malware analysis sandbox using Docker #
Hex-Sweeper – Minesweeper on a hexagonal grid, built with Phaser #
PolyMCP – A framework for building and orchestrating MCP agents #
I’ve been working on PolyMCP, an open-source framework for building and orchestrating agents using the Model Context Protocol (MCP).
Most of the tooling around MCP focuses on exposing tools. With PolyMCP, the focus is on agents: how to structure them, connect them to multiple MCP servers, and make them reliable in real workflows.
PolyMCP provides:
• A clean way to define MCP-compatible tool servers in Python or TypeScript
• An agent abstraction that can connect to multiple MCP endpoints (stdio, HTTP, etc.)
• Built-in orchestration primitives for multi-step tasks
• A CLI to scaffold projects and run an inspector UI to debug tools and agent interactions
• A modular structure that makes it easier to compose skills and reuse components across projects
The main goal is to make agent systems less ad-hoc. Instead of writing glue code around each model + tool combination, PolyMCP gives you a structured way to:
• Register tools as MCP servers
• Connect them to one or more agents
• Control execution flow and state
• Inspect and debug interactions
It’s MIT licensed and intended for developers building real-world automation, internal copilots, or multi-tool assistants.
I’d love feedback on:
• The agent abstraction: is it too opinionated or not opinionated enough?
• Orchestration patterns for multi-agent setups
• Developer experience (CLI, inspector, project layout)
Happy to answer questions.
Verify-before-release x402 gateway for AI agent transactions #
I built Settld because I kept running into the same problem: AI agents can call APIs, pay for services, and hire other agents - but there's no way to prove the work was actually done before the money moves.
The problem in one sentence: x402 tells you "payment was sent". Settld tells you "the work was worth paying for".
What it does
Settld sits between your agent and the APIs/agents it pays. It:
1. Intercepts HTTP 402 (Payment Required) responses
2. Creates an escrow hold instead of paying immediately
3. Collects evidence that the work was completed
4. Runs deterministic verification (same evidence + same terms = same payout, every time)
5. Releases payment only after verification passes
6. Issues a cryptographically verifiable receipt
If verification fails or the work is disputed, the hold is refunded. The agent gets a receipt either way - a permanent, auditable record of what happened.
Why this matters now
We're at a weird inflection point. Coinbase shipped x402 (50M+ transactions). Google shipped A2A. Anthropic shipped MCP. Agents can discover each other, communicate, and pay each other.
But nobody built the layer that answers: "was the work actually done correctly, and how much should the payout be?"
That's the gap. Right now, every agent-to-agent transaction is either "trust and hope" or "don't transact." Neither scales.
The x402 gateway (the fastest way to try it)
We ship a drop-in reverse proxy that you put in front of any API:
    docker run -e UPSTREAM_URL=https://your-api.com \
      -e SETTLD_API_URL=https://api.settld.dev \
      -e SETTLD_API_KEY=sk_... \
      -p 8402:8402 \
      settld/x402-gateway
Everything flows through normally - except 402 responses get intercepted, escrowed, verified, and settled. Your agent gets a receipt with a hash-chained proof of what happened.
What's under the hood
The settlement kernel is the interesting part (and where we spent most of our time):
- Deterministic policy evaluation - machine-readable agreements with release rates based on verification status (green/amber/red). No ambiguity. (Sketched after this list.)
- Hash-chained event log - every event in a settlement is chained with Ed25519 signatures. Tamper-evident, offline-verifiable.
- Escrow with holdback windows - configurable holdback basis points + dispute windows. Funds auto-release if unchallenged.
- Dispute → arbitration → verdict → adjustment - full dispute resolution pipeline, not just "flag for human review."
- Append-only reputation events - every settlement produces a reputation event (approved, rejected, disputed, etc.). Agents build verifiable economic track records.
- Compositional settlement - agents can delegate work to sub-agents with linked agreements. If a downstream agent fails, refunds cascade deterministically back up the chain.
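A minimal sketch of the deterministic-release idea (the rates below are placeholders, not Settld's actual policy values, and the real kernel runs on Node.js):

    package main

    import "fmt"

    // Status is the verification outcome a policy maps to a payout.
    type Status string

    const (
        Green Status = "green" // verification passed
        Amber Status = "amber" // partial or degraded evidence
        Red   Status = "red"   // verification failed
    )

    // releaseAmount is pure: the same hold + the same status always
    // yields the same payout, which is what makes evaluation deterministic.
    func releaseAmount(holdCents int64, s Status) int64 {
        switch s {
        case Green:
            return holdCents // full release
        case Amber:
            return holdCents * 50 / 100 // placeholder partial-release rate
        default:
            return 0 // red: the hold is refunded
        }
    }

    func main() {
        for _, s := range []Status{Green, Amber, Red} {
            fmt.Printf("%s -> release %d cents\n", s, releaseAmount(10_000, s))
        }
    }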
The whole protocol is spec'd with JSON schemas, conformance vectors, and a portable oracle: https://github.com/aidenlippert/settld/blob/main/docs/spec/R...
What this is NOT
- Not a payment processor - we don't move money. We decide "if" and "how much" money should move, then your existing rails (Stripe, x402, wire) execute it.
- Not a blockchain - deterministic receipts and hash chains, but no consensus mechanism or token. Just cryptographic proofs.
- Not an agent framework - we don't care if you use LangChain, CrewAI, AutoGen, or raw API calls. We're a protocol layer.
Tech stack
Node.js, PostgreSQL (or in-memory for dev), Ed25519 signatures, SHA-256 hashing, RFC 8785 canonical JSON. ~107 core modules, 494 tests passing.
What I want from HN
Honest feedback on whether this problem resonates. If you're building agent workflows that involve money, I want to know: what breaks? What's missing? What would make you actually install this?
GitHub: https://github.com/aidenlippert/settld
Docs: https://docs.settld.work/
Quickstart (10 min): https://docs.settld.work/quickstart
Trained YOLOX from scratch to avoid Ultralytics (iOS aircraft detect) #
Vibe-coded 32-bit OS from scratch – multitasking, windowing, networking #
Surprised me how much I was able to get done in so little time.
I do have a little bit of previous experience.
Built and shipped an iOS app from my phone while traveling Japan #
Before leaving for Japan, I configured my Mac at home to never sleep, set up VPN + SSH access, and used persistent terminal sessions so I could keep everything running remotely.
During the trip, I built and shipped a native iOS app entirely from my phone.
No laptop. No local Xcode. All builds triggered remotely and deployed to TestFlight.
The app itself is a lightweight Japanese travel phrase app we actually used during the trip. Fully offline. We pre-generated TTS audio and bundled it so it worked in stations, underground areas, and rural zones without connectivity.
What made it interesting wasn’t the app itself, but the feedback loop.
We would test phrases in real restaurants and train stations during the day, notice friction, tweak things that evening over SSH from a hotel or ryokan, ship a new build, and use the improved version the next morning.
It felt less like building in a simulated environment and more like building inside the exact context the product was meant for.
Happy to answer questions about the setup, tooling, deployment pipeline, or what broke along the way.
Keyjump – a keyboard-first new tab for power-users #
I originally made it for myself during my CS studies because I was constantly switching between tools and wanted something faster than clicking through bookmarks. I’ve used it locally for years and recently cleaned it up and made it public: https://keyjump.app/
Main characteristics:
- Keyboard-first navigation
- Custom search templates (e.g. jump directly to search results on specific sites)
- Local-first: data stored in browser local storage by default
- No account required
- Optional account for cross-device sync and persistence (when clearing browser data)
- Chrome extension available (Firefox planned); it also lets you quickly save bookmarks/search queries from other tabs and launch an overlay on any page
- Theme and layout customization
It’s intentionally simple and focused. I’d appreciate any feedback or criticism.
Npx Claude-traces, visualizer for Claude Code/Agent SDK traces #
Isol8 – An isolated environment for AI agents to execute code #
The issue is that I'm skeptical of whether the scripts are safe, because of possible prompt injection. I understand we already have things like Docker and VMs to containerize and isolate environments, but I wanted to create something these agents would use directly. It should be efficient, spin up quickly, and be configurable.
So, I came up with isol8. Simple API, simple usage. I hope someone else finds it useful as well :P
Thanks!
// LINKS
docs - https://bingo-ccc81346.mintlify.app/
repo - https://github.com/Illusion47586/isol8
npm - https://www.npmjs.com/package/isol8
skill - npx skills add Illusion47586/isol8/skill/isol8
OpenCode-finance – prompt a ticker, get a structured PDF analyst report #
Prompt in. Equity research out.
I built OpenCode Finance (opencode-finance), a finance-focused fork of the OpenCode terminal agent runtime. The goal is to make “repeatable analyst reports” feel like a single command instead of a multi-hour docs + spreadsheets + copy/paste workflow.
The core workflow is:
/report AAPL
and a few minutes later (seconds when you use 5.3 Spark) you have a polished PDF report plus the underlying artifacts on disk.
What makes it different from “chat with a finance bot” is that the output shape is intentionally rigid and consistent across tickers, and the report is assumption-first:
- Observed data is labeled as observed; non-observed claims are explicitly labeled as inference.
- Sources are attached inline (URLs + retrieval timestamps in the report).
- There’s an assumptions ledger so you can audit what the model assumed vs what it actually fetched.

How it works in practice:
- You run it locally in a terminal TUI.
- `/report` runs a strict preflight; if you’re missing required provider credentials it automatically drops you into onboarding so you can finish setup and rerun immediately.
- Output is a clean PDF “artifact” you can hand to someone, plus markdown/JSON artifacts for review and iteration.
Install + quick start:
curl -fsSL https://opencode.finance/install.sh | bash
/onboard
Sample PDFs: https://www.opencode.finance/reports/AAPL-2026-02-14.pdf
https://www.opencode.finance/reports/TSLA-2026-02-14.pdf
https://www.opencode.finance/reports/LLY-2026-02-14.pdf
Code + docs: https://www.opencode.finance/
https://github.com/bilalbayram/opencode-finance
A few quick notes:
- Not investment advice. The report is meant to separate observed data from assumptions/inference and be auditable.
- Runs locally; tool actions are permissioned and provider credentials are stored locally via `opencode-finance auth`.
- `/report` is designed to be reproducible: fixed structure, consistent sections, and a PDF output suitable for sharing.
A System Level DNS Toggling App I Built #
Can you give me some feedback on the app, and on what features an app like this needs?
We built an AI tool directory where you can't fake a review #
J-RAY – A privacy-first, client-side JSON visualizer #
Buildcage – Restrict outbound network access during Docker builds #
I built buildcage as part of our supply chain security efforts at work. The problem we kept running into: when you run `RUN npm install` in a Dockerfile, that command can connect to anywhere on the internet, and you have no visibility into where it actually goes. Even with pinned dependencies, a compromised package could still exfiltrate build secrets or phone home to a C2 server during the build itself.
buildcage is a Docker container that wraps BuildKit with an internal proxy. You give it a list of allowed domains, and only connections to those domains go through — everything else is blocked and logged. Your Dockerfiles stay exactly the same.
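The core check is conceptually tiny. A sketch of the allowlist idea (my illustration, not buildcage's actual code or config format):

    package main

    import (
        "fmt"
        "strings"
    )

    // allowed reports whether host matches the allowlist, including subdomains
    // (e.g. "registry.npmjs.org" matches an allowlist entry "npmjs.org").
    func allowed(host string, allowlist []string) bool {
        for _, d := range allowlist {
            if host == d || strings.HasSuffix(host, "."+d) {
                return true
            }
        }
        return false
    }

    func main() {
        list := []string{"npmjs.org", "github.com"}
        for _, h := range []string{"registry.npmjs.org", "evil-c2.example"} {
            // the proxy forwards the first and blocks + logs the second
            fmt.Println(h, "->", allowed(h, list))
        }
    }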
If you use GitHub Actions, it's a few lines to add to your workflow — see the quick start guide. https://github.com/dash14/buildcage#quick-start
I want to be upfront — this is not a silver bullet. If a malicious package is delivered through a legitimate registry, the connection goes to an allowed domain and buildcage can't catch it. You should still pin dependencies, use lock files, and scan for vulnerabilities.
The way I think about it: buildcage is a last line of defense. If something slips through all your other measures, at least it can't call home to an attacker's server.
That framing is exactly why I focused on making it easy to adopt. A security tool that's hard to set up doesn't get set up. With buildcage, you add a few lines to your GitHub Actions workflow and everything just works — no certificate injection, no Dockerfile changes, no special build flags.
Would love to hear your thoughts — whether it's about the approach, the limitations, or how this fits into your own workflow.