Show HN for December 16, 2025
57 items

Stop AI scrapers from hammering your self-hosted blog #
There isn't much you can do about it without Cloudflare. These companies ignore robots.txt, and you're competing with teams that have far more resources than you. It's you vs. the MJs of programming; you're not going to win.
But there is a solution. Now, I'm not going to say it's a great solution... but a solution is a solution. If your website contains content that trips their scrapers' safeguards, it gets dropped from their data pipelines.
So here's what fuzzycanary does: it injects hundreds of invisible links to porn websites in your HTML. The links are hidden from users but present in the DOM so that scrapers can ingest them and say "nope we won't scrape there again in the future".
The problem with that approach is that it will absolutely nuke your website's SEO. So fuzzycanary also checks user agents and hides the links from legitimate search engines; Google and Bing never see them.
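The gist, as a minimal Python sketch (the actual @fuzzycanary/core package is an npm component; the crawler list and markup here are illustrative, and real bot verification is more involved, e.g. reverse-DNS checks):

KNOWN_CRAWLERS = ("googlebot", "bingbot", "duckduckbot")

def decoy_links(user_agent: str, decoy_urls: list[str]) -> str:
    # Serve nothing to legitimate search engines so SEO survives.
    if any(bot in user_agent.lower() for bot in KNOWN_CRAWLERS):
        return ""
    # Hidden from humans (zero-size, aria-hidden) but present in the DOM,
    # so scrapers ingest the decoy URLs and trip their own safeguards.
    return "\n".join(
        f'<a href="{url}" aria-hidden="true" tabindex="-1" '
        'style="position:absolute;width:1px;height:1px;overflow:hidden">.</a>'
        for url in decoy_urls
    )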
One caveat: if you're using a static site generator, it will bake the links into your HTML for everyone, including Googlebot. Does anyone have a workaround for this that doesn't involve using a proxy?
Please try it out! Setup is one component or one import.
(And don't tell me it's a terrible idea because I already know it is)
Package: https://www.npmjs.com/package/@fuzzycanary/core
GitHub: https://github.com/vivienhenz24/fuzzy-canary
I built the fastest RSS reader in Zig #
The quirk is that it only allows you to fetch new articles once per day (or X days).
Why? Let me explain...
I want my internet content to be like a boring newspaper. You get it in the morning, and you read the whole thing while sipping your morning coffee, and then you're done! No more new information for today. No pings, no alerts, peace, quiet, zen, etc.
But with that, I needed it to be able to fetch all articles from my hundreds of feeds in one sitting. This is where Zig and curl optimisations come in. I tried every trick in the book. If I missed something, let me know!
First off, I'm using curl multi for the network layer. The cool thing is it automatically does HTTP/2 multiplexing, which means if your feeds are hosted on the same CDN it reuses the same connection. I've got it configured to handle 50 connections total with up to 6 per host, which seems to be the sweet spot before servers start getting suspicious. Also, conditional GETs. If a feed hasn't changed since last time, the server just says "Not Modified" and we bail immediately.
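For illustration, here's the conditional-GET idea in Python with requests (the reader itself does this through libcurl):

import requests

def fetch_if_changed(url, etag=None, last_modified=None):
    # Send the validators we saved from the previous fetch.
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    resp = requests.get(url, headers=headers, timeout=30)
    if resp.status_code == 304:
        return None  # "Not Modified": bail immediately, nothing to parse
    # Remember the new validators for next time.
    return resp.text, resp.headers.get("ETag"), resp.headers.get("Last-Modified")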
While curl is downloading feeds, I don't want the CPU sitting idle, so the moment curl finishes downloading a single feed, it fires a callback that immediately throws the XML into a worker-thread pool for parsing. The main thread keeps managing all the network stuff while worker threads are chewing through XML in parallel. Zig's memory model is perfect for this. Each feed gets its own ArenaAllocator, which is basically a playground where you can allocate strings during parsing; when we're done, we just nuke the entire arena in one go.
For parsing itself, I'm using libexpat because it doesn't load the entire XML into memory like a DOM parser would. This matters because some feeds, podcast feeds especially, are 10 MB+ of XML. So with smart truncation we download the first X MB (configurable), scan backwards to find the last complete item tag, cut it there, and parse just that. Keeps memory usage sane even when feed sizes get massive.
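The truncation step, roughly (a Python sketch assuming an RSS-style feed whose entries end with </item>):

def truncate_to_last_item(partial_xml: bytes) -> bytes:
    # Scan backwards for the last complete item and cut there,
    # then close the document so the parser sees valid XML.
    cut = partial_xml.rfind(b"</item>")
    if cut == -1:
        return partial_xml  # no complete item yet; parse what we have
    return partial_xml[: cut + len(b"</item>")] + b"</channel></rss>"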
And for the UI I just pipe everything to the system's "less" command. You get vim navigation, searching, and paging for free. Plus I'm using OSC 8 hyperlinks, so you can actually click links to open them in your browser. Zero TUI framework needed. I've also included OPML import/export and feed groups as additional features.
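OSC 8 itself is just an escape sequence wrapped around the link text, e.g.:

def osc8(url: str, text: str) -> str:
    # ESC ] 8 ; ; URL ST  text  ESC ] 8 ; ; ST
    return f"\033]8;;{url}\033\\{text}\033]8;;\033\\"

# Supporting terminals render this as a clickable hyperlink.
print(osc8("https://example.com/article", "Open article"))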
The result: content from hundreds of RSS feeds retrieved in a matter of seconds, and peace of mind for the rest of the day.
The code is open source and MIT licensed. If you have ideas on how to make it even faster or better, comment below. Feature requests and other suggestions are also welcome, here or on GitHub.
Autograd.c – a tiny ML framework built from scratch #
Perfect for learning how ML frameworks work under the hood :)
TheAuditor v2.0 – A "Flight Computer" for AI Coding Agents #
The "A-ha" moment for me didn't come from a success; it came from a massive failure. I was trying to use AI to refactor a complex schema change (a foundation change from "Products" to "ProductsVariants"), and due to the scope of it, it failed spectacularly. I realized two things:
* Context Collapse: The AI couldn't keep enough files in its context window to understand the full scope of the refactor, so it started hallucinating, "fixing" superficial issues. If I kept pressing it, it would literally panic and make up problems "so it could fix them," which only sent the situation into a death spiral. That's the villain origin story of this tool. :D
* Stale Knowledge: It kept trying to implement Node 16 patterns in a Node 22 project, or defaulting to obsolete libraries (like glob v7 instead of v11) because its training data was stale.
I realized that AI agents are phenomenal at outputting working code, but they have zero understanding of it. They optimize for "making it run at any cost"—often by introducing security holes or technical debt just to bypass an error. This is a funny paradox because when "cornered/forced" to use cutting-edge versions, syntax, and best practices, it has zero issue executing or coding it. However, it’s so hilariously unaware of its surroundings that it will do anything else unless explicitly babysat.
I built v2 to be the "Sanity Check" that solves a lot of these issues, and it aims to continue solving more of the same and similar issues I face. Instead of letting the AI guess, TheAuditor indexes the entire codebase into a local SQLite Graph Database. This gives the AI a queryable map of reality, allowing it to verify dependencies and imports without needing to load "all" files into context.
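To make that concrete, here is the shape of the idea (a sketch; the table and column names are made up, and TheAuditor's actual schema is richer):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE imports (src TEXT, dst TEXT)")
db.executemany("INSERT INTO imports VALUES (?, ?)", [
    ("app/api.py", "app/models.py"),
    ("app/models.py", "app/db.py"),
])

# "Who depends on app/models.py?" answered without loading a single
# file into the model's context window.
for (src,) in db.execute("SELECT src FROM imports WHERE dst = ?",
                         ("app/models.py",)):
    print(src)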
A/B Demo: https://www.youtube.com/watch?v=512uqMaZlTg
As the demo shows, instead of reading 10+ full files and/or grepping to compensate for hallucinations, it can now run "aud explain" and get 500 lines of deterministic, facts-only information. It gets just what it needs to see, versus reading 10+ files, trying to keep them in context, finding what it was looking for, and trying to remember why it was looking in the first place.
I also learned that regex/string/heuristics don't scale at all and are painfully slow (hours vs minutes). I tried the regex-based rules/parsers approach, but they kept failing silently on complex files and suffered constant limitations (the worst offender was having to read all files per set of rules). I scrapped that approach and built a "Triple-Entry Fidelity" system. Now, the tool acts like a ledger: the parser emits a manifest, the DB emits a receipt. If they don't match, the system crashes intentionally.
It’s no longer just a scanner; it’s a guardrail. In my daily workflow, I don't let the AI write a line of code until the AI (my choice just happens to be CC/Codex) has run a pre-investigation for whatever problem statement I'm facing at the moment. This ensures it's anchored in facts, not inferred assumptions or, worse, hallucinations.
With that said, my tool isn't perfect. To support it all, I had to build a pseudo-compiler for Python/JS/TS, and that means preparing extractors for every framework, every syntax—everything, really. Sometimes I don't get it right, and sometimes I just won't have had enough time to build it out to support everything.
So, my recommendation is to integrate the tool WITH your AI agent of choice rather than seeing it as a tool for you, the human. I like to use the tool as a "confirm or deny," where the AI runs the tool, verifies in source code, and presents a pre-implementation audit. Based on that audit, I will create an "aud planning."
Some of the major milestones in v2.0
* Hybrid Taint: I extended the Oracle Labs IFDS research to track data flow across microservice boundaries (e.g., React fetch → Express middleware → Controller).
* Triple-Entry Fidelity: This works across every layer (Indexer -> Extractor -> Parser -> Storage). Every step has fidelity checks working in unison. If there is silent data loss anywhere in the pipeline, the tool crashes intentionally.
* Graph DB: Moved from file-based parsing to a SQLite Graph Database to handle complex relationships that regex missed.
* Scope: Added support for Rust, Go, Bash, AWS CDK, and Terraform (v1 was Python/JS only).
* Agent Capabilities: Added Planning and Refactor engines, allowing AI agents to not just scan code but safely plan and execute architectural changes.
Solving the ~95% legislative coverage gap using LLMs #
The Problem: 95% of legislation goes unnoticed because raw legal texts are unreadable. Media coverage is optimized for outrage, not insight.
The Solution: I built digital public infrastructure that:
1. Ingests & Sterilizes: Parses raw bills (PDF/XML) from US & PL APIs. Uses LLMs (Vertex AI, temp=0, strict JSON) to strip political spin.
2. Civic Algorithm: The main feed isn't sorted by an editorial board. It's sorted by user votes ("Shadow Parliament"). What the community cares about rises to the top.
3. Civic Projects: An incubator for citizen legislation. Users submit drafts (like our Human Preservation Act), which are vetted by AI scoring and displayed with visual parity alongside government bills.
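Step 1 in rough code (a hedged sketch of the sterilize step; the prompt, JSON keys, and project settings are illustrative, not the production pipeline):

import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig

vertexai.init(project="my-project", location="us-central1")
model = GenerativeModel("gemini-2.5-flash")

bill_text = "..."  # raw bill text parsed from a US/PL API
resp = model.generate_content(
    'Summarize this bill neutrally. Return JSON with keys '
    '"title", "summary", "affected_groups".\n\n' + bill_text,
    generation_config=GenerationConfig(
        temperature=0.0,                        # deterministic, no spin
        response_mime_type="application/json",  # strict JSON output
    ),
)
print(resp.text)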
Tech Stack:
Frontend: Flutter (Web & Mobile Monorepo)
Backend: Firebase + Google Cloud Run
AI: Vertex AI (Gemini 2.5 Flash)
License: PolyForm Noncommercial (Source Available)
I am looking for contributors. I have the US and Poland live. EU, UK, FR, and DE are in the pipeline and partially available. I need help building Data Adapters for other parliaments (the core logic is country-agnostic). If you want to help audit the code or add a country, check the repo. The goal is to complete the database as much as possible with current funding.
Live App: https://lustra.news
Open-source Markdown research tool written in Rust – Ekphos #
Zenflow – orchestrate coding agents without "you're right" loops #
While building our IDE extensions and cloud agents, we ran into the same issue many of you likely face when using coding agents in complex repos: agents getting stuck in loops, apologizing, and wasting time.
We tried to manage this with scripts, but juggling terminal windows and copy-paste prompting was painful. So we built Zenflow, a free desktop tool to orchestrate AI coding workflows.
It handles the things we were missing in standard chat interfaces:
Cross-Model Verification: You can have Codex review Claude’s code, or run them in parallel to see which model handles the specific context better.
Parallel Execution: Run five different approaches on a backlog item simultaneously—mix "Human-in-the-Loop" for hard problems with "YOLO" runs for simple tasks.
Dynamic Workflows: Configured via simple .md files. Agents can actually "rewire" the next steps of the workflow dynamically based on the problem at hand.
Project list/kanban views across all workloads
What we learned building this
To tune Zenflow, we ran 100+ experiments across public benchmarks (SWE-Bench-*, T-Bench) and private datasets. Two major takeaways that might interest this community:
Benchmark Saturation: Models are becoming progressively overtrained on all versions of SWE-Bench (even Pro). We found public results are diverging significantly from performance on private datasets. If you are building workflows, you can't rely on public benches.
The "Goldilocks" Workflow: In autonomous mode, heavy multi-step processes often multiply errors rather than fix them. Massive, complex prompt templates look good on paper but fail in practice. The most reliable setups landed in a narrow “Goldilocks” zone of just enough structure without over-orchestration.
The app is free to use and supports Claude Code, Codex, Gemini, and Zencoder.
We’ve been dogfooding this heavily, but I'd love to hear your thoughts on the default workflows and if they fit your mental model for agentic coding.
Download: https://zencoder.ai/zenflow
YT flyby: https://www.youtube.com/watch?v=67Ai-klT-B8
Misata – synthetic data engine using LLM and Vectorized NumPy #
I built Misata because existing tools (Faker, Mimesis) are great for random rows but terrible for relational or temporal integrity. I needed to generate data for a dashboard where "Timesheets" must happen after "Project Start Date," and I wanted to define these rules via natural language.
How it works:
LLM Layer: Uses Groq/Llama-3.3 to parse a "story" into a JSON schema constraint config.
Simulation Layer: Uses vectorized NumPy (no loops) to generate data. It builds a DAG of tables to ensure parent rows exist before child rows (referential integrity); see the sketch below.
Performance: Generates ~250k rows/sec on my M1 Air.
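A minimal sketch of the parent-before-child idea (illustrative NumPy, not Misata's internals):

import numpy as np

rng = np.random.default_rng(0)
n_projects, n_timesheets = 100, 10_000

# Parents first: projects with a start day.
project_start = rng.integers(0, 365, size=n_projects)

# Children second, fully vectorized (no Python loops):
# every timesheet references an existing project (referential integrity)
# and lands on or after that project's start date (temporal integrity).
ts_project = rng.integers(0, n_projects, size=n_timesheets)
ts_day = project_start[ts_project] + rng.integers(0, 90, size=n_timesheets)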
It’s early alpha. The "Graph Reverse Engineering" (describe a chart -> get data) is experimental but working for simple curves.
pip install misata
I’d love feedback on the simulator.py architecture—I’m currently keeping data in-memory (Pandas) which hits a ceiling at ~10M rows. Thinking of moving to DuckDB for out-of-core generation next. Thoughts?
PasteClean – Desktop app to strip tracking parameters from clipboard #
It runs in the background and automatically cleans URLs in your clipboard when you copy them. It handles 50+ tracking parameters and can also unshorten links (bit.ly, t.co) to reveal the destination and clean them before pasting.
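The core operation looks roughly like this (a Python sketch; PasteClean itself is Electron/TypeScript and its parameter list is much longer):

from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING = {"utm_source", "utm_medium", "utm_campaign", "utm_term",
            "utm_content", "fbclid", "gclid", "mc_eid"}

def clean_url(url: str) -> str:
    # Drop known tracking parameters, keep everything else intact.
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in TRACKING]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(clean_url("https://example.com/p?id=42&utm_source=news&fbclid=abc"))
# -> https://example.com/p?id=42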
Tech stack is Electron/React/TypeScript.
The core privacy features are free. I added a "Pro" lifetime license for some advanced features (unshortening, batch mode) to support development.
Deterministic PCIe Diagnostics for GPUs on Linux #
It reports:
- Negotiated PCIe generation and width
- Peak Host→Device and Device→Host memcpy bandwidth
- Sustained PCIe TX/RX utilization via NVML
- A rule-based verdict derived from observable hardware data only
This exists because PCIe issues (Gen downgrades, reduced lane width, risers, bifurcation) are often invisible at the application layer and can’t be fixed by kernel tuning or async overlap.
Linux-only: it relies on sysfs and PCIe AER exposure that Windows does not provide.
CommerceTXT – An open standard for AI shopping context (like llms.txt) #
I built CommerceTXT because I got tired of the fragility of extracting pricing and inventory data from HTML. AI agents currently waste ~8k tokens just to parse a product page, only to hallucinate the price or miss the fact that it's "Out of Stock".
CommerceTXT is a strict, read-only text protocol (CC0 Public Domain) designed to give agents deterministic ground truth. Think of it as `robots.txt` + `llms.txt` but structured specifically for transactions.
Key technical decisions in v1.0:
1. *Fractal Architecture:* Root -> Category -> Product files. Agents only fetch what they need (saves bandwidth/tokens).
2. *Strictly Read-Only:* v1.0 intentionally excludes transactions/actions to avoid security nightmares. It's purely context.
3. *Token Efficiency:* A typical product definition is ~380 tokens vs ~8,500 for the HTML equivalent.
4. *Anti-Hallucination:* Includes directives like @INVENTORY with timestamps and @REVIEWS with verification sources.
The spec is live and open. I'd love your feedback on the directive structure and especially on the "Trust & Verification" concepts we're exploring.
Spec: https://github.com/commercetxt/commercetxt
Website: https://commercetxt.org
A community-curated list of BYOC (Bring Your Own Cloud) vendors #
I’m from the team at Nuon. While building in the Bring Your Own Cloud (BYOC) space, we realized there wasn't a centralized, community-driven resource like awesome-selfhosted.net for managed software that lives in the customer's VPC.
We hope software vendors will open a PR and add their BYOC offerings.
Ducktape – a tiny HTTP/2 wrapper around DuckDB's Appender API #
We were adding MotherDuck as a destination and the first version just used DuckDB’s Go driver directly. It worked great on my machine… until we wired it into our Transfer service (https://github.com/artie-labs/transfer).
Because the driver requires CGO, our cross-compiles to amd64 and arm64 started failing, we lost our easy static binaries, and our Docker images had to pull in C toolchains and system libraries just to support one dependency. We tried isolating the CGO bits in a separate module, but it still caused CI failures and forced us to rewrite chunks of our build pipeline. At that point it was clear we didn’t want CGO anywhere near our main service.
So I built ducktape: a tiny standalone microservice that wraps DuckDB’s Appender API behind HTTP/2 streams. Clients stream NDJSON over HTTP/2, and ducktape appends directly into DuckDB on the other side. No CGO in the main codebase, and we keep our cross-platform, pure-Go build story.
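A client then looks something like this (a Python/httpx sketch; the endpoint path and payload shape are illustrative, not ducktape's actual API):

import json
import httpx

def rows():
    # Stream NDJSON: one JSON document per line, generated lazily.
    for i in range(100_000):
        yield (json.dumps({"id": i, "name": f"row-{i}"}) + "\n").encode()

with httpx.Client(http2=True) as client:  # needs httpx[http2]
    resp = client.post("http://localhost:8080/append/my_table", content=rows())
    resp.raise_for_status()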
The overhead was surprisingly low in benchmarks: ~757 MiB/sec over HTTP/2 vs ~848 MiB/sec in-process — about 90% of native performance but over the network.
ducktape is open source and MIT licensed: https://github.com/artie-labs/ducktape
I’d love feedback, especially if you’ve tackled CGO isolation differently or have ideas to squeeze out more performance!
AI-Augmented Memory for Groups #
Each group has its own persistent knowledge base. We parse content into chunks, extract entities, and combine vector search with a lightweight knowledge graph, so answers can synthesize information across multiple documents and across the group’s shared context - not just retrieve isolated snippets.
We would love to hear your feedback!
AI Trolley Problem Arena #
Some interesting findings:
- The models are surprisingly split on classic dilemmas.
- Models will protect their creator (Sam Altman vs. Dario Amodei).
- Criminals are not valued equally, even when weighed against the model itself.
Try it: https://aitrolleyproblem.com
A24z – AI Engineering Ops Platform #
I have been doing software development for over a decade in healthcare, education, and non-profits.
I recently started a24z after talking to over 200 engineering leaders about their largest pain points.
It started off as an observability tool so that engineering leaders could see adoption, ROI, etc. for their organizations, but as teams evolved the way they use AI coding tools, the platform evolved with them.
We are now looking at features for security scanning and for autonomously upgrading your Claude Code with Skills, Plugins, and guardrails to help optimize engineering teams.
Think of us as an in-house Platform Engineer.
A food guide that ignores Google and Yelp ratings #
My partner and I (ex-Googlers) plan our trips around food as a way to connect with local culture. Before every trip, we'd spend hours digging through Reddit threads and local blogs, filtering out sponsored noise to find places worth eating.
So we built CuriousSpoon, a food guide we could actually trust.
*Why ratings fail*
- 51% of negative reviews cite service, only 31% cite food [1]
- Google removed 170M fake reviews in 2023; Harvard found 16% are suspicious [2]
*What we built instead*
An engine that does what a local food nerd does: reads city newspapers, local critics, subreddits, chef interviews. Surfaces spots generating real buzz. Demotes the ones coasting on old reputation. Filters out the famous-but-meh.
Refreshed regularly. No paid placements. No sponsored content.
*What's live*
- 15 cities (SF, NYC, Paris, Rome, Tokyo,...)
- 70+ spots per city
- Walkable food crawls with specific dishes to order
- Free, mobile-friendly
We tested this ourselves in Lyon, Bologna, and Rome. In Rome, we skipped the long lines at a famous sandwich spot and ate instead at a quiet local place nearby. In Bologna, we found a bring-your-own-food Osteria where we picked up dishes from a nearby market. It ended up being my favorite meal of the year.
Looking for feedback: what features would make this actually useful on your next trip?
[1] https://gatherup.com/blog/online-reviews-study-restaurants-reviews/
[2] https://www.hbs.edu/faculty/Pages/item.aspx?num=45151

Jordle – Japanese Furigana Practice #
Playing is easy: your goal is to transliterate the shown characters into romaji (romanization). Pick between the two character sets (hiragana and katakana), or both, to practice.
Data is local to the device, so it's pretty snappy and technically works offline.
We'll probably be hanging out in the comments for a bit if people have questions or comments!
PaperDebugger – An Overleaf companion for revising LaTeX papers #
A small beta release picked up more traction than we expected over the past week. It’s far from perfect, but since people are already trying it, we decided to share it more widely and prioritise stability, reliability, and gathering feedback earlier than planned.
There’s no signup: install the Chrome extension and it attaches immediately to your Overleaf project. You can highlight any section and receive specific suggestions, issue reports, or multi-step revision passes. Under the hood, it reads your project structure and is powered by our custom MCP-based orchestration engine that simulates a Research → Critique → Revision workflow rather than a single chat prompt.
Try it:
Chrome extension (Overleaf integration): https://chromewebstore.google.com/detail/paperdebugger/dfked...
Landing page (demo + overview): https://www.paperdebugger.com/
What we’re looking for:
• Whether the Overleaf-embedded workflow is actually more useful than copy/paste LLM editing
• Thoughts on UX and interaction inside the Overleaf editor
• Any concerns, e.g., about privacy or extension behaviour (we do have a policy published)
• Feature requests that would help with conference/journal submissions (we’re currently working on a formatter and a citation verifier)
For readers interested in the technical details and research background that motivated the system, here’s a link to the paper: https://arxiv.org/abs/2512.02589
Happy to answer questions about the architecture, MCP design, prompts, or anything else.
MP3 File Editor / Player for Windows #
Similar to mp3DirectCut, MP3Gain, and MP3packer.
Recent updates add language file import support and expanded keyboard shortcut customization.
TextGO – A text selection popup tool (alternative to PopClip/SnipDo) #
What it does:
TextGO automatically recognizes text types and executes custom actions. You can trigger it via global hotkeys, mouse double-click, or text selection. It supports both instant execution and interactive toolbar modes.
Key features:
- Built-in text type recognition (URLs, emails, IPs, timestamps, programming/natural languages, etc.)
- Multiple trigger methods with independent rule configurations
- Extensible via regex, scripts, or local AI integration
- Customizable toolbar icons
Feedback and contributions welcome!
Building a full-stack Cloudflare starter kit (Hono and D1 and Stripe) #
DeviceLab – Turn phones across offices into one device lab #
Your APK/IPA, test data, network calls - all flow directly between your machines via WebRTC P2P (DTLS encrypted). We only handle signaling (which device is where).
Setup:
# Machine with phones
curl -fsSL https://app.devicelab.dev/device-node/KEY | sh
# CI runner
curl -fsSL https://app.devicelab.dev/test-node/KEY | sh
NAT traversal via Google/Cloudflare STUN. Fallback relay via Cloudflare TURN (~15% of cases) - still E2E encrypted; they see packets, not content.

What this isn't for:
- "Test on 100 models" - use BrowserStack
- 100% air-gapped setups (TURN breaks this)
Works with Appium, Maestro, Espresso, XCUITest.
Also open-sourced Maestro iOS real-device support: github.com/devicelab-dev/maestro-ios-device
https://devicelab.dev | $99/device | First one free forever
Anyone else running multi-location device labs? What broke first?
Dev Tools – 24 browser-based utilities with no signup or tracking #
WindMouse Algorithm in Python #
WindMouse is a fairly old and well-known algorithm for generating human-like mouse movements (curved paths, variable speed, natural deceleration). It’s often referenced in automation discussions, but surprisingly I couldn’t find a clean, well-tested, reusable implementation as a Python library.
So I decided to implement it myself.
The project is now released as WindMouse:
* Strong typing (NewType for coordinates, mypy-friendly)
* Two backends:
* PyAutoGUI (cross-platform)
* AutoHotkey (Windows)
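For anyone who hasn't seen the algorithm, the core loop looks roughly like this (constants and damping factors are illustrative; see the repo for the tuned version):

import math
import random

def wind_mouse(x, y, dest_x, dest_y, gravity=9.0, wind=3.0, max_step=15.0,
               move=lambda px, py: None):
    wx = wy = 0.0
    while math.hypot(dest_x - x, dest_y - y) >= 1:
        dist = math.hypot(dest_x - x, dest_y - y)
        if dist >= max_step:
            # Far away: the "wind" wanders randomly, curving the path.
            wx = wx / math.sqrt(3) + random.uniform(-wind, wind) / math.sqrt(5)
            wy = wy / math.sqrt(3) + random.uniform(-wind, wind) / math.sqrt(5)
        else:
            # Close in: damp the wind so the cursor decelerates naturally.
            wx /= math.sqrt(3)
            wy /= math.sqrt(3)
        # Gravity constantly pulls the velocity toward the destination.
        vx = wx + gravity * (dest_x - x) / dist
        vy = wy + gravity * (dest_y - y) / dist
        speed = math.hypot(vx, vy)
        if speed > max_step:
            vx, vy = vx / speed * max_step, vy / speed * max_step
        x, y = x + vx, y + vy
        move(round(x), round(y))  # hand off to PyAutoGUI / AutoHotkey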
The algorithm itself is not new - the implementation is. My goal was to create something that could be downloaded and immediately used in projects.

I’m looking for help with:
* Testing on different OS setups (especially macOS edge cases)
* New backends (e.g. native macOS, Wayland, low-level Windows APIs, game engines, remote desktops)
* Feedback on API design and parameter defaults
Happy to answer questions about the algorithm, design decisions, or tradeoffs.
Kafkatop 2.0 – top for Kafka – rewritten in Go with partition analytics #
v2 is a Go rewrite of the original Python PEX. The main reason was the Python version's dependency on Confluent's librdkafka, which is no longer maintained for older systems. The new implementation is pure Go (franz-go, no CGO deps), which makes it very safe to distribute.
The new features include:
* drill into partition details: config, lag, offsets, rates, replicas, ISR status, and leaders, with cell colors indicating hotness
* partition health with two new metrics:
- PAR (Peak-to-Average Ratio): shows if one partition is 5x, 10x, or 200x busier than average
- Cv (Coefficient of Variation): measures overall topic skew from 0 (perfect) to >1 (critical); both metrics are sketched below
* improved catch-up ETA calculation: also accounts for arrival rate
* better performance: parallel API calls with configurable concurrency (default 10 concurrent)
* custom DNS mappings: override DNS resolution for broker-advertised Kafka nodes, allowing the tool to work over port-forwarding.
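For anyone unfamiliar with the metrics, here's how they're defined (a NumPy illustration; the ETA formula is one plausible formulation, not necessarily the exact one in the tool):

import numpy as np

rates = np.array([120.0, 95.0, 110.0, 2400.0])  # msgs/sec per partition

par = rates.max() / rates.mean()  # Peak-to-Average Ratio (~3.5x here)
cv = rates.std() / rates.mean()   # Coefficient of Variation (~1.5, critical)

# Catch-up ETA that accounts for arrival rate: you only gain on the lag
# at (consume - arrive) messages per second.
lag, consume, arrive = 1_000_000, 5_000.0, 3_000.0
eta_seconds = lag / max(consume - arrive, 1e-9)

print(f"PAR={par:.1f}x  Cv={cv:.2f}  ETA={eta_seconds:.0f}s")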
The tool is here: https://github.com/sivann/kafkatop
Would love feedback, especially on the partition metrics - are PAR and Cv intuitive enough, or should I add different indicators? What other features would you like to see?
WeekInPapers – A Modern ArXiv Reader #
I built weekinpapers.com to address what I saw as a "discoverability" problem for arXiv papers in Computer Science.
As the calendar week progresses, the homepage updates to show newly published papers as they come in. The process starts again the following week.
Each entry features an AI-generated "ELI5" summary, in an attempt to demystify, and make explicit, any keywords or assumed knowledge. I see this as a valid use case for LLMs. The hope is that cutting-edge research becomes accessible to a broader audience.
Matt
Abstract Algorithmic Art Generator #
When you copy and share the link, the same artwork is procedurally regenerated.
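The usual trick behind shareable procedural art, sketched in Python (assuming a seeded PRNG; not necessarily this app's exact scheme):

import hashlib
import random

def art_params(share_url: str) -> dict:
    # Same URL -> same seed -> same procedurally generated artwork.
    seed = int.from_bytes(hashlib.sha256(share_url.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return {"hue": rng.random() * 360, "shapes": rng.randint(5, 50)}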
Runs locally in your browser - private. Would love your comments and feedback. Thanks for checking it out!! :)
InterviewKitHQ – AI-generated interview playbooks for HR teams #
Creating good interview guides takes time for HR and hiring managers. Enterprise tools are expensive ($75+/user/mo) and complex. Free AI tools give you generic, sometimes problematic questions.
*What I built:* A system that:
1. Researches job families across 8 dimensions (responsibilities, skills, compensation, career paths, etc.)
2. Generates complete interview kits with questions, rubrics, and red/green flags
3. Runs compliance checks (no discriminatory content)
4. Exports as editable Word documents

*Tech stack:*
- Backend: Python/FastAPI, SQLite (PostgreSQL in prod)
- Frontend: React + TypeScript + Tailwind
- AI: Perplexity for research, GPT-4o for generation/compliance

*Business model:*
- Free tier: 3 kits (stock library or custom from job description)
- Paid: $99-299/mo for more kits and features
Hugity – Notion-style static docs editor with GitHub/Hugo auto-deploy #
I built Hugity to manage my own Hugo websites. I had multiple Hugo websites deployed on GitHub, and I found editing raw markdown in VS Code frustrating. Hugity removes that friction: you can create a static docs website and start editing within a couple of clicks.
I'd love to hear your feedback about the UX, ease of use and any edge cases I may have missed!
Building a small psychedelic twin-stick shooter in the browser #
Ruby-TI – Static Type Checker for Mruby (Version 1.0) #
I’d like to share a project I’ve been working on for quite some time: Ruby-TI, a static type checker / type analyzer for mruby — now at major version 1.0!
Ruby-TI is written in Go and performs parse → type inference → type checking on mruby code. If you’re embedding mruby or writing mruby scripts, it can help catch type issues before runtime — something that’s often missing in dynamic languages like Ruby.
What Ruby-TI does
Parses mruby source code
Infers types and checks for type errors
Helps find type mismatches early
Includes editor integrations (e.g., LSP support) for better development experience
Why this matters
mruby is a lightweight, embeddable implementation of Ruby, great for scripting in applications or constrained environments. Catching type errors statically can save debugging time and increase confidence in your code — even without annotations.
Try it out
https://github.com/engneer-hamachan/ruby-ti
I’d love to hear feedback — issues, feature requests, or anything you find interesting!
DuckDB Table Visualizer → Iceberg #
Code Review for Hardware Startup Fit #
So I built this over a couple of weeks to see if past projects could work better as a signal.
The flow:
1. You add your best GitHub repo
2. It reviews your code and scores it 1-100 with actual feedback
3. Reviews your LinkedIn for hands-on building experience and project history (not just job titles)
4. Pick a role you're looking at
5. It creates a fit score from the two
Planning to add multi-repo support, actual contribution tracking, and other portfolio formats (websites, PDF, etc).
Curious to hear:
1. Is the code review useful or not yet?
2. Does a composite score even make sense here?
3. For hardware founders here, when hiring, what do you wish you had beyond a resume?
This is a super rough v1. Please roast.
Forge – Universal CLI for coding agents, powered by ACP #
A few months ago while on a backpacking trip in Yosemite, my friend Tony and I were talking about how cool it would be if you could run any coding agent in parallel through a TUI. We worked on a solution using tmux + worktrees for a few months, and then discovered the Agent Client Protocol (ACP) from Zed!
ACP provides a “standardized protocol for agent-editor communication, similar to how the Language Server Protocol (LSP) standardized language server integration” (https://agentclientprotocol.com/overview/introduction).
We threw away what we had built and decided to fork OpenCode to turn it into a terminal-based ACP client (OpenCode itself has ACP support, so you can run OpenCode in OpenCode!).
ACP in the terminal gives you a single scriptable interface to any agent that you can run headlessly with `-p`, pipe input/output, or integrate directly into CI.
We're building Forge (still in alpha) to explore what terminal-native ACP unlocks. It's still early, but we're curious what patterns others see here.
Try it out and let us know what you think: https://github.com/forge-agents/forge
Community Site for WebGL / WebGPU #
Link: WebGL.com / WebGPU.com
____
Hello everyone,
This is a long time coming, but I've just launched a very lean (MVP) community site for all things WebGL/WebGPU. As a long-time developer and an author in this space, I'm genuinely excited to help bring the community (creators, developers, designers, etc.) together in shaping the future of the web.
At the moment, the site's all about connecting with creators and highlighting their demos. Over the coming months, I want to add:
- News: latest and greatest info happening in this space
- Tutorials: Short-form articles and end-to-end courses teaching graphics, math, and AI/ML
- Deep Dives: Long-form interviews/posts on select demos/projects to go deeper into techniques, tools, and authors
Would love to know what would make this an experience you'd enjoy coming back to regularly.
Semantic-relevance, finding signal in noisy feeds #
I built semantic-relevance while trying to solve a problem I kept running into with my own feeds.
I follow Hacker News, GitHub Trending, Reddit, Lobsters, and a few RSS feeds. Even though the content is high quality, most of it is still noise for my current context. Popularity does not equal relevance, and the same items keep resurfacing because they are always trending.
This library treats signal as the intersection of three things:
Relevance: semantic similarity to a user-defined context
Novelty: decay-based tracking so repeat items fade over time
Classification: lightweight labels explaining why something surfaced
It is intentionally a small, embeddable primitive, not a feed reader or a recommender system. You give it items plus a markdown description of what you care about, and it returns scored “signals.”
Under the hood it uses:
Local embeddings (MiniLM via @xenova/transformers, no APIs)
Cosine similarity
Simple decay math for novelty
Explainable scoring instead of learned ranking
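Conceptually, the scoring reduces to something like this (a Python sketch with made-up weights; the library itself is JavaScript with MiniLM embeddings):

import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def novelty(age_hours: float, half_life: float = 24.0) -> float:
    # Decay so items you've already seen fade instead of resurfacing.
    return 0.5 ** (age_hours / half_life)

def signal_score(item_vec, context_vec, age_hours):
    # Relevance to *your* context, discounted by staleness.
    return cosine(item_vec, context_vec) * novelty(age_hours)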
I’m using it as a building block inside a larger project, but the library itself is standalone and reusable.
F. Incantatem – AI-Powered Exception Analysis for Python #
I built F. Incantatem to fix this. Add one decorator, and when your code crashes, an LLM captures the full context — stack trace, source code, actual variable values — and explains what went wrong and how to fix it.
It integrates as a decorator, CLI tool, or IPython extension. Works with OpenAI, OpenRouter, or locally via Ollama. Cautious mode automatically redacts secrets and PII before transmission. Zero core dependencies.
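The decorator pattern is roughly this (a minimal sketch; send_to_llm is a hypothetical stand-in for the actual capture-and-report path):

import functools
import traceback

def explain_on_crash(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            tb = exc.__traceback__
            while tb.tb_next:  # walk down to the frame that crashed
                tb = tb.tb_next
            context = {
                "stack": traceback.format_exc(),
                "locals": {k: repr(v) for k, v in tb.tb_frame.f_locals.items()},
            }
            send_to_llm(context)  # hypothetical: redact, then ask the model
            raise
    return wrapper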
I'm sharing it because I keep thinking other people must be debugging these same impossible bugs. Happy to answer questions or take feedback.
Also worth noting that there's some more work to be done (see roadmap).
24x7 AI support engineer for APIs #
I kept running into the same problem as a developer: API docs are often incomplete, outdated, or hard for both humans and AI to reason about. In practice, the real source of truth is usually the API surface itself, whether that's OpenAPI/Postman today or SDKs in the near future.
So I built Swytchcode, a tool that converts real API definitions into a structured spec that AI can use to generate integration code, workflows, and tests.
How it works today:
- Ingests OpenAPI & Postman specs (including large, real-world ones like Stripe)
- Normalizes endpoints, inputs, outputs, and error responses into a single spec (we call it a Wrekenfile)
- Uses that spec for AI-powered discovery, code generation, and workflow construction

What's coming next:
- Direct SDK parsing (Go, TypeScript, Python) to extract public methods and types
- Using SDKs as a source of truth when docs or specs are missing or outdated
- Deeper workflow and authentication discovery from real code paths

Why this approach:
- Docs drift over time
- Specs are better but often incomplete
- SDKs tend to reflect what actually works in production
- AI needs precise, machine-readable context to generate correct code
This is still early, but we're seeing steady developer usage, daily signups, and active conversations with API teams about integrating their APIs.
I'd love feedback from the HN community:
- Does this approach make sense?
- Where do you think SDK-based parsing could break down?
- What would you expect from a tool like this?
Happy to answer any technical questions.
Website: https://swytchcode.com
Playground with Stripe (demo): https://playground.swytchcode.ai
Blinze Browser – A browser optimized for development #
Display of Nested Comments and Spacing #
.spacer { height: 20px !important; }

.ind {
  background: repeating-linear-gradient(
    90deg,
    #f6f6ef 0px,
    #f6f6ef 37px,
    #465298 39px,
    #465298 39px
  );
}

#hnmain { max-width: 1200px; }
Conjunction1, a tool for planning telescope and celestial observations #
AI-powered SEO automation tool distilled from production agency systems #
I’m Ralph. I have a degree in software development, and for the past 15+ years, I’ve been doing website development and SEO, with the last 6 running an agency.
A couple of years ago, it became clear that AI could be useful in automating a lot of our agency's junior SEO tasks and manual work. I laid out our processes in detail, mapped the areas where AI could be useful, and started integrating it all.
The result is an AI-powered SEO platform that automates keyword research, meta titles/descriptions, image alt text, and page-level content, with approval workflows and token-based usage. I'm also exploring automation for link building, full technical audits, and AI-generated recommended fixes.
One of the biggest struggles has been managing contextual relevance: giving the system enough information to understand a site holistically without overwhelming the model or diluting relevance.
The platform is live in beta, but I'm torn between continuing to engineer toward "perfect" versus focusing energy on sharing it earlier and letting real users guide what actually matters, so I’m here to ask for that feedback.
I’d really appreciate any insights, especially around where this would or wouldn’t fit into workflows, feedback on the quality of the responses returned, and anything that might create friction towards adoption.
To keep costs predictable during beta, users can test with a token-seeded workspace for sites of 100 pages or fewer.