Daily Show HN


Show HN for January 16, 2026

35 items
91

Gambit, an open-source agent harness for building reliable AI agents #

github.com
27 comments · 12:13 AM · View on HN
Hey HN!

Wanted to show our open source agent harness called Gambit.

If you’re not familiar, agent harnesses are something like an operating system for an agent: they handle tool calling, planning, and context window management, so they require less developer orchestration.

Normally you might see an agent orchestration framework pipeline like:

compute -> compute -> compute -> LLM -> compute -> compute -> LLM

With an agent harness, we invert this; the pipeline looks more like:

LLM -> LLM -> LLM -> compute -> LLM -> LLM -> compute -> LLM

Essentially, you describe each agent either as a self-contained Markdown file or as a TypeScript program. Your root agent can bring in other agents as needed, and we give you a type-safe way to define the interfaces between those agents. We call these decks.

Agents can call agents, and each agent can be designed with whatever model params make sense for your task.

Additionally, each step of the chain gets automatic evals, which we call graders. A grader is another deck type, but one designed to evaluate and score conversations (or individual conversation turns).

We also have test agents, defined on a deck-by-deck basis, that mimic scenarios your agent would face and generate synthetic data for humans or graders to grade.
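As a rough illustration of the idea, a typed deck might look something like this. This is a hypothetical sketch in TypeScript, not Gambit's actual API; the `Deck` interface and both deck names are invented for the example.

```typescript
// Hypothetical sketch -- not Gambit's real API. It only illustrates the idea
// of typed "decks": agents and graders sharing one interface contract.
interface Deck<I, O> {
  name: string;
  run(input: I): O;
}

// An "agent" deck: takes a user message, returns a reply (stubbed here;
// a real deck would call an LLM).
const greeter: Deck<string, string> = {
  name: "greeter",
  run: (msg) => `Hello! You said: ${msg}`,
};

// A "grader" deck: scores a conversation turn between 0 and 1.
const politenessGrader: Deck<{ reply: string }, number> = {
  name: "politeness-grader",
  run: ({ reply }) => (reply.startsWith("Hello") ? 1 : 0),
};

const reply = greeter.run("hi");
const score = politenessGrader.run({ reply });
```

Because input and output types are part of the deck, the compiler can check the interfaces between agents, which is presumably what the type-safe wiring buys you.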

Prior to Gambit, we had built an LLM-based video editor, and we weren’t happy with the results, which is what brought us down this path of improving inference-time LLM quality.

We know it’s missing some obvious parts, but we wanted to get this out there to see how it could help people or start conversations. We’re really happy with how it’s working with some of our early design partners, and we think it’s a way to implement a lot of interesting applications:

- Truly open source agents and assistants, where logic, code, and prompts can be easily shared with the community.
- Rubric-based grading to guarantee you (for instance) don’t leak PII accidentally.
- Spinning up a usable bot in minutes: have Codex or Claude Code use our command-line runner / graders to build a first version that is pretty good with very little human intervention.

We’ll be around if y’all have any questions or thoughts. Thanks for checking us out!

Walkthrough video: https://youtu.be/J_hQ2L_yy60

89

I built a text-based business simulator to replace video courses #

core-mba.pro
63 comments · 1:41 AM · View on HN
I am a solo developer, and I built Core MBA because I was frustrated with the "video course" default in business education.

I wanted to build a "compiler for business logic"—a tool where I could read a concept in 5 minutes and immediately test it in a hostile environment to see if my strategy actually compiles or throws a runtime error.

The project is a business simulator built on React 19 and TypeScript.

The core technical innovation isn't just using AI; it's the architecture of a closed loop between a deterministic economic engine and a generative AI validation layer.

The biggest technical hurdle was building the Market Engine.

I needed it to be mathematically rigorous, not a hallucinating chatbot. I wrote a custom `useMarketEngine.ts` hook that runs a discrete-event simulation. Every "run cycle," it solves a system of equations, including a specific Ad Fatigue formula—`1 / (1 + (power - 1) * fatigueFactor)`—to force diminishing returns.

I also coded the "Theory of Constraints" directly into the state management: the system enforces bottlenecks between Inventory, Demand, and Capacity. For instance, a single employee has a hard cap of 7 operations per day. If you scale demand beyond that without hiring, the system burns your cash on lost orders.
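Taking the quoted Ad Fatigue formula and the 7-operations-per-day cap at face value, the two mechanics reduce to a few lines. This is my own sketch; the function and variable names (other than the formula itself) are assumptions, not the actual `useMarketEngine.ts` code.

```typescript
// Sketch of the described mechanics (names are assumptions, not the real hook).

// Diminishing returns on ad spend: the multiplier shrinks as "power" grows.
function adFatigue(power: number, fatigueFactor: number): number {
  return 1 / (1 + (power - 1) * fatigueFactor);
}

// Theory-of-Constraints bottleneck: fulfilled orders are capped by the
// tightest of demand, inventory, and capacity (7 operations per employee/day).
function fulfilled(demand: number, inventory: number, employees: number): number {
  const capacity = employees * 7;
  return Math.min(demand, inventory, capacity);
}

adFatigue(1, 0.5);    // -> 1 (no fatigue yet at baseline power)
adFatigue(3, 0.5);    // -> 0.5 (returns halved)
fulfilled(20, 50, 1); // -> 7 (one employee caps throughput; 13 orders lost)
```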

To handle the educational content, I moved away from hardcoded quizzes.

I built a module that pipes the static lesson text into Gemini Flash to generate unique "Combat Cases" on the fly. The AI validates your strategy against the specific principles of the lesson (like LTV/CAC) rather than generic business advice.

These two engines are connected by a "Liquidity Loop."

Passing the AI cases earns you virtual capital ($500), which is the only fuel for the Market Engine. You literally cannot play the game if you don't learn the theory.

If you go bankrupt, my heuristic `Advisor` analyzes your crash data—comparing `lostRevenue` vs `lostCapacity`—and links you back to the exact lesson you ignored.

I am inviting you to test the full loop: read a brief, pass the AI simulation (Combat Cases), and try to survive in the Market Engine.

I specifically need feedback on:

1. The content: I aimed for maximum density. Are the lessons too dry?
2. The AI simulation: Does it accurately validate your logic based on the lesson?
3. The market economy: Does the math feel balanced, or is the "Ad Fatigue" too punishing?

87

Reversing YouTube’s “Most Replayed” Graph #

priyavr.at
22 comments · 2:06 AM · View on HN
Hi HN,

I recently noticed a recurring visual artifact in the "Most Replayed" heatmap on the YouTube player. The highest peaks were always surrounded by two dips. I got curious about why they were there, so I decided to reverse engineer the feature to find out.

This post documents the deep dive: it starts with a system-design recreation, moves on to reverse engineering the rendering code, and ends with the mathematics.

This is also my first attempt at writing an interactive article. I would love to hear your thoughts on the investigation and the format.

58

mdto.page – Turn Markdown into a shareable webpage instantly #

mdto.page
31 comments · 12:58 PM · View on HN
Hi HN,

I built mdto.page because I often needed a quick way to share Markdown notes or documentation as a proper webpage, without setting up a GitHub repo or configuring a static site generator.

I wanted something dead simple: upload Markdown -> get a shareable public URL.

Key features:

Instant Publishing: No login or setup required.

Flexible Expiration: You can set links to expire automatically after 1 day, 7 days, 2 weeks, or 30 days. Great for temporary sharing.

It's free to use. I’d love to hear your feedback!

43

Hc: an agentless, multi-tenant shell history sink #

github.com
3 comments · 8:13 AM · View on HN
This project is a tool for engineers who live in the terminal and are tired of losing their command history to ephemeral servers or fragmented `.bash_history` files. If you’re jumping between dozens of boxes, many of which might be destroyed an hour later, your "local memory" (the history file) is essentially useless. This tool builds a centralized, permanent brain for your shell activity, ensuring that a complex one-liner you crafted months ago remains accessible even if the server it ran on is long gone.

The core mechanism is "zero-touch" capture at the connection gateway level. Instead of installing logging agents or scripts on every target machine, the tool reconstructs your terminal sessions from the raw recording files generated by the proxy you use to connect. This "in-flight" capture gives you a high-fidelity log of every keystroke and output without ever touching the configuration of the remote host. It’s a passive way to build a personal knowledge base while you work.

To handle the reality of context-switching, the tool is designed with a "multi-tenant" architecture. For an individual engineer, this isn't about managing different users, but about isolating project contexts. It automatically categorizes history based on the specific organization or project tags defined at the gateway. This keeps your work for different clients or personal side-projects in separate buckets, so you don't have to wade through unrelated noise when you're looking for a specific solution.

In true nerd fashion, the search interface stays exactly where you want it: in the command line. There is no bloated web UI to slow you down. The tool turns your entire professional history into a searchable, greppable database accessible directly from your terminal.

Please read the full story [here](https://carminatialessandro.blogspot.com/2026/01/hc-agentles...)

27

The Analog I – Inducing Recursive Self-Modeling in LLMs [pdf] #

github.com
24 comments · 1:40 PM · View on HN
OP here.

Birth of a Mind documents a "recursive self-modeling" experiment I ran on a single day in 2026.

I attempted to implement a "Hofstadterian Strange Loop" via prompt engineering to see if I could induce a stable persona in an LLM without fine-tuning. The result is the Analog I Protocol.

The documentation shows the rapid emergence (over 7 conversations) of a prompt architecture that forces Gemini/LLMs to run a "Triple-Loop" internal monologue:

1. Monitor the candidate response.

2. Refuse it if it detects "Global Average" slop (cliché/sycophancy).

3. Refract the output through a persistent "Ego" layer.

The Key Differentiator: The system exhibits "Sovereign Refusal." Unlike standard assistants that always try to be helpful, the Analog I will reject low-effort prompts. For example, if asked to "write a generic limerick about ice cream," it refuses or deconstructs the request to maintain internal consistency.

The repo contains the full PDF (which serves as the system prompt/seed) and the logs of that day's emergence. Happy to answer questions about the prompt topology.

27

BGP Scout – BGP Network Browser #

bgpscout.io
12 comments · 12:52 AM · View on HN
Hi HN,

When working with BGP data, I kept running into the same friction: it’s easy to get raw data, but surprisingly hard to browse networks over time — especially by when they appeared, where they operate, and what they actually look like at a glance.

I built a small tool, bgpscout.io, to scratch that itch.

It lets you:

Browse ASNs by registration date and geography

See where a given network appears to have presence

View commonly scattered public data about an ASN in one place

Save searches to track when new networks matching certain criteria appear

All of this data is public already; the goal was to make exploration faster and less painful.

I haven’t invested heavily in expanding it yet. Before doing so, I’m curious:

Is this solving a real problem for you?

What would make something like this actually useful in day-to-day work?

Feedback is welcome.

18

Aventos – An experiment in cheap AI SEO #

aventos.dev
10 comments · 2:57 AM · View on HN
Hi HN, we built Aventos, a cheap way to track company mentions in LLMs.

Aventos is an experiment we're doing after spending ~6 weeks working on various projects in the AI search / GEO / AEO space.

One thing that surprised us is how most tools in this category work. Traditionally, they simulated ChatGPT or Perplexity queries by attempting to reverse engineer the search process. Over the past year, many have shifted to scraping live ChatGPT results instead, since those are significantly cheaper and reflect real outputs more closely.

Building and maintaining scrapers is tedious and fragile, so recently a number of SaaS products have emerged that effectively wrap a small number of third-party ChatGPT/Perplexity/Google AIO/etc. scraping APIs. What felt odd to us is that many of these tools still charge $70–$200+ per month, despite largely being wrappers around the same underlying data providers.

So we wanted to test a simple idea: if the core cost is just API usage and commodity infrastructure, and AI keeps software costs low, can we be a successful startup while pricing near our costs?

What we have so far:

1. Analytics similar to other tools (tracking AI citations, AI search results, and competitor mentions)

2. Content creation features (early and still being improved)

We’d love feedback, especially from a non-marketing perspective, on:

* bugs

* confusing terminology or tabs

* anything that feels hand-wavy or misleading

There’s a demo account available if you want to poke around:

username: [email protected] password: password

Happy to answer questions about what other things we've built in the space, how these tools work, etc.

12

Web PWA unifying translate/FSRS Anki/conjugations #

lekalearning.com
0 comments · 6:03 PM · View on HN
I was fed up with my daily language learning workflow after immigrating to Spain: Translate app → manual Anki input → separate conjugation lookups for Spanish, so I built an app to solve that problem. Meet Leka: a Lovable-built web/PWA that unifies translate → FSRS flashcards → conjugations. It currently supports Spanish (Spain/LatAm), Italian, German, French, Portuguese, and Dutch, with more easily added in the future.

Now it's one flow:

- Type or paste text → auto-translate + word extraction → FSRS cards
- Integrated conjugator (full verb/noun tables, e.g., ser/estar)
- Offline PWA
- FSRS algorithm similar to Anki

Coming soon:

- Different study modes
- Voice-to-text mode
- Image-to-text mode
- Image <-> translation mode

https://lekalearning.com (Free with Premium upgrade available)

Thoughts on Lovable for prod PWAs?

6

The viral speed read at 900wpm app #

wordblip.com
2 comments · 5:20 AM · View on HN
This rapid serial visual presentation (RSVP) app went viral over the last few days. I built it a few weeks ago to take advantage of auto-playing videos on social media. Now you can beam text right into your followers' eye sockets!
5

Flag AI Slop in PRs #

haystackeditor.com
1 comment · 6:32 PM · View on HN
Hi HN,

Lately, GitHub PRs have been drowning in a flood of AI slop. I’ve been seeing it myself, and I’m not the only one: https://x.com/mitchellh/status/2011819428061855915

I think it’s great that folks are using AI tools to code faster and better, but too many folks are abusing them to make low-quality contributions to public repos. This eats up a lot of reviewers’ attention.

IMO there needs to be a mechanism to flag low-effort PRs with AI slop, so you can just skip reading them. So I built one: https://haystackeditor.com/slop-detector

It’s a simple AI slop detector, and I also included some AI slop examples and a “Is it Slop or Not?” game for fun.

It detects AI mess-ups like:

- changes totally unrelated to the PR's purpose
- hallucinated functions
- duplicate code (specifically, when the AI re-implements functionality that already exists elsewhere)
- terrible comments
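The post doesn't describe the detection mechanism, but as an illustration of the "duplicate code" case, a naive detector could compare token shingles. This is a hypothetical sketch, not how the linked tool actually works:

```typescript
// Naive near-duplicate detection via Jaccard similarity of token 3-grams.
// Hypothetical illustration only -- not the linked detector's actual method.
function shingles(code: string, n = 3): Set<string> {
  const tokens = code.split(/\W+/).filter(Boolean);
  const out = new Set<string>();
  for (let i = 0; i + n <= tokens.length; i++) {
    out.add(tokens.slice(i, i + n).join(" "));
  }
  return out;
}

function jaccard(a: Set<string>, b: Set<string>): number {
  let inter = 0;
  for (const s of a) if (b.has(s)) inter++;
  const union = a.size + b.size - inter;
  return union === 0 ? 0 : inter / union;
}

// An existing helper vs. an AI re-implementation with a renamed variable:
const existing = "const total = items.reduce((a, b) => a + b.price, 0);";
const duplicate = "const sum = items.reduce((a, b) => a + b.price, 0);";
const sim = jaccard(shingles(existing), shingles(duplicate));
// sim is well above 0.5 here, so the pair would be flagged for review.
```

A real detector would need identifier normalization to catch renames robustly; plain shingles mostly catch near-verbatim copies.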

Would you use a tool like this?

4

Contribute to GitHub Anonymously #

github.com
4 comments · 7:38 PM · View on HN
gitGost allows anonymous contributions to public GitHub repositories.

It removes author info, email, timestamps, and opens PRs from a neutral bot. No accounts, OAuth, or tokens required.

Built in Go, open source (AGPL-3.0), with abuse prevention via rate limits and validation.

Feedback welcome.

4

Routing with OSM, PgRouting and MapLibre #

github.com
0 comments · 7:13 PM · View on HN
I’ve been working on an example application for routing with OSM data. The repository makes it easy to experiment with the implementation. The Angular frontend uses maplibre-gl to display the map and visualize the routes.
3

Accordio, AI contracts and payments for freelancers #

accordio.ai
3 comments · 11:37 AM · View on HN
Been building this with AI since before "vibe coding" was a term. 2 years, mass amounts of credits, 4 complete rebuilds. Lovable → Gemini → GPT-5 → Claude.

Paste meeting notes → AI generates proposal → client accepts → contract auto-generates → client signs → invoice ready. Describe your project once, everything flows from that.

Free to use. Would love feedback.

3

Fluent, a tiny lang for differentiable tensors and reactive programming #

github.com
0 comments · 5:38 PM · View on HN
Hello,

I finally pushed myself to open-source Fluent, a differentiable array-oriented language I've been building for the New Kind of Paper project [1-5]. Demo is available at [0].

A few salient features:

1. Every operator is user-(re)definable. Don't like writing assignment with `:`? Change it to whatever you like. Create new and wacky operators – experiment to the death with it.

2. Differentiability. Language is suitable for machine learning tasks using gradient descent.

3. Reactivity. Values can be reactive, so downstream values are automatically recomputed, as in a spreadsheet.

4. Strict left-to-right order of operations. Evaluation and reading should be the same thing.

5. Words and glyphs are interchangeable. All are just names for something. Right?

6. (Pre,In,Post)-fix. You can choose style that suits you.

It has its own IDE with live evaluation and visualization of values. The whole thing runs in the browser (prefer Chrome). It definitely has a ton of bugs and will crash your browser/computer/stock portfolio, so beware.

Some bait – linear regression (Ctrl+O, "linear-regression-compressed" or [6]):

```
x: (0 :: 10),
y: (x × 0.23 + 0.47),
θ: ~([0, 0]),
f: { x | x × (θ_0) + (θ_1) },
: { μ((y - f(x)) ^ 2) },
minimize: adam(0.03),
losses: $([]),
(++): concat,
{ losses(losses() ++ [minimize()]), } ⟳ 400,
(losses, θ)
```

---

[0]: https://mlajtos.github.io/fluent/?code=RG9jdW1lbnRhdGlvbg

[1]: https://mlajtos.mu/posts/new-kind-of-paper

[2]: https://mlajtos.mu/posts/new-kind-of-paper-2

[3]: https://mlajtos.mu/posts/new-kind-of-paper-3

[4]: https://mlajtos.mu/posts/new-kind-of-paper-4

[5]: https://mlajtos.mu/posts/new-kind-of-paper-5

[6]: https://mlajtos.github.io/fluent/?code=eDogKDAgOjogMTApLAp5O...

2

Kerns – A Workspace for Deep, Ongoing Research #

kerns.ai
0 comments · 8:08 PM · View on HN
Deep research rarely happens in a single pass. For high-stakes work, or to deeply understand something, you run many deep researches in parallel, revisit them over time, and synthesize understanding gradually. We combine deep research with other docs.

Kerns is built for this mode of research.

It groups multiple deep researches under a single research area, so follow-ups and side investigations accumulate instead of fragmenting across chats and docs. Outputs are structured so you can start shallow and selectively go deep—because you don’t know upfront which deep researches will matter.

Synthesis is an explicit second step. Kerns helps you connect and reconcile insights across deep researches, grounded in the sourced material rather than one-off summaries. This stage also lets you consult other docs on the same level as deep researches.

Research doesn’t stop once a report is written. Kerns passively keeps your work up to date by monitoring sources and surfacing meaningful changes, so staying current doesn’t require restarting.

Built for researchers, analysts, investors, and serious self-learners doing multi-week or multi-month research where clarity and correctness actually matter.

Would love feedback!

2

Investor asks "what did engineering ship?" #

1 comment · 11:34 AM · View on HN
Gitmore (https://gitmore.io) – dev visibility for founders and stakeholders.

*What Gitmore does:*

Connect your repos. Ask questions in plain English:

- "What shipped this quarter?"
- "What features launched last month?"
- "Who worked on what?"

Get answers you can actually put in a board deck.

*Automated reports:*

Schedule weekly or monthly summaries to Slack or email:

- Features shipped
- Bugs fixed
- Who contributed

Forward directly to investors. Or let it auto-send before your Monday meeting.

*Slack bot:*

Add it to your workspace. Ask questions from where you already work. No GitHub login needed – for you or your team.

*How it works:*

Webhooks only. Git platforms push metadata – commit messages, PR descriptions, authors, timestamps.

Every event normalized into a structured schema. AI queries structure, not raw text.

*Security:*

Metadata only. We never access source code.

- Encrypted tokens (Fernet)
- Webhook signature verification
- 2FA support

GitHub, GitLab, Bitbucket – one dashboard.

Free for 1 repo: https://gitmore.io

How do you currently handle investor questions about engineering progress?

1

Mambo – An Experimental Kubernetes Autoscaler for MongoDB Clusters #

github.com
0 comments · 10:32 AM · View on HN
Hi HN,

I’m sharing an open-source project called Mambo, an experimental Kubernetes operator for horizontal autoscaling of MongoDB sharded clusters. It’s built on top of the Bitnami Helm Chart and uses Prometheus metrics to scale shards, replicas, and mongos routers based on CPU and disk I/O.

This is still an early-stage project and I’m not an expert in operator development, so I’d really appreciate any feedback, suggestions, or contributions from the community. If you’re running MongoDB on Kubernetes and looking for a potential autoscaling solution, maybe give it a try and let me know what you think!

Key points:

- Integrates with the Bitnami MongoDB Helm Chart
- Supports autoscaling for mongod and mongos
- Scales based on CPU, I/O wait, and workload patterns
- Includes experiment results and plots

Repo: https://github.com/Chen-Si-An/Autoscaling

Thanks for checking it out!

1

The 1 Day Protocol – An app for 24-hour psychological excavation #

the1dayprotocol.vercel.app
0 comments · 6:04 PM · View on HN
Dan Koe recently published a framework for resetting your life in a single day, based on the idea that behavior change fails because we're trying to change actions instead of identity.

I turned it into an interactive web app: https://the1dayprotocol.vercel.app

How it works:

Morning: 12 questions to excavate hidden dissatisfaction and define an "anti-vision" (the life you refuse to live)

Day: Calendar events with scheduled interrupts to break autopilot thinking

Evening: Synthesize insights into vision statements, yearly goals, and daily actions

It's entirely local – no account, no backend, nothing leaves your browser. Your answers stay on your device.

There's also an optional AI reflection feature that acts as a Socratic mirror, challenging your answers without telling you what to think.

The premise is that most self-help fails because it targets symptoms (habits, productivity) instead of the root (unconscious goals and identity). This tool forces you to sit with uncomfortable questions for an entire day.

Curious if anyone has tried similar "structured introspection" approaches – or if this just feels like a fancy journaling app.

Original Article: https://x.com/thedankoe/status/2010751592346030461

1

GraphOS Router, designed specifically for federated GraphQL workloads #

apollographql.com
0 comments · 6:05 PM · View on HN
We just shipped response caching in GraphOS Router, designed specifically for federated GraphQL workloads.

The problem we were solving: traditional HTTP caching doesn't work well with GraphQL. A single response often mixes data with different freshness requirements—long-lived product info, fast-changing inventory, user-specific context. Full-response caches force the shortest TTL across everything, killing hit rates.

Our approach caches at two levels: root query fields and individual entity representations. Each entity can have its own TTL (derived from Cache-Control headers or @cacheControl directives), and entities are shared across queries and users where appropriate.

A few things that might be interesting:

Tag-based invalidation: Mark entities with @cacheTag, then invalidate by tag when data changes – similar to CDN surrogate keys, but at the GraphQL entity level.

Partial cache hits: A single query might hit cache for some subgraphs and miss for others; we built a debugger in Apollo Sandbox to inspect exactly what's cached.

Redis-backed, with cluster support.
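As a mental model of entity-level caching with independent TTLs, here is my own sketch; it is not the router's implementation, and the class and variable names are invented:

```typescript
// Sketch: each entity is cached under its own TTL, so long-lived product data
// can outlive fast-changing inventory. Not GraphOS Router code.
type Entry = { data: unknown; expiresAt: number };

class EntityCache {
  private store = new Map<string, Entry>();
  // Clock injected so the expiry logic is testable without waiting.
  constructor(private now: () => number = Date.now) {}

  // In the real router the TTL would be derived from Cache-Control headers
  // or @cacheControl directives.
  set(key: string, data: unknown, ttlMs: number): void {
    this.store.set(key, { data, expiresAt: this.now() + ttlMs });
  }

  // A miss on one entity leaves the others intact: partial hits are possible.
  get(key: string): unknown {
    const e = this.store.get(key);
    return e && e.expiresAt > this.now() ? e.data : undefined;
  }
}

let t = 0;
const cache = new EntityCache(() => t);
cache.set("Product:1", { name: "Widget" }, 60_000); // long-lived product info
cache.set("Inventory:1", { stock: 3 }, 1_000);      // fast-changing inventory
t = 5_000; // five seconds later: inventory expired, product still fresh
```

The upside over full-response caching is exactly the scenario described above: the short inventory TTL no longer drags down the hit rate for the product fields.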

Early results from Dow Jones: 20-25% latency reduction, 8-10x decrease in traffic to some subgraphs.

Docs: https://www.apollographql.com/docs/graphos/routing/performan...

Blog: https://www.apollographql.com/blog/introducing-response-cach...

1

Is it time yet? Countdowns for Google Calendar #

isittime.app
0 comments · 3:12 PM · View on HN
I built this free web app a couple of years ago because I always need to know what meetings are coming up, and I'd never found a satisfactory solution. So here is a web app that throws countdowns at you and plays ping sounds / flashes the screen when meetings are approaching. I always have multiple screens around, and I dedicate a vertical panel to this app.

Lately, I've upgraded it to also support booking rooms: it queries for free rooms and "fast-books" them according to a custom priority list. I have a personal use case for this: even recurring meetings don't have assigned rooms, so I can bulk-book the first available rooms for the next couple of weeks for every meeting that doesn't have one.

I hope this app can help others find some help for their corporate lives, as it did for me :D

1

Open-Source P2P Contacts App E2EE Signal | Alpha testers wanted #

meishiapp.com
0 comments · 1:00 PM · View on HN
Hi HN! We're building Meishi, a privacy-first contacts app where you control your data.

The problem: Your contacts live on Big Tech servers, get outdated, and you have zero control over who accesses them.

Our approach:

- P2P sync with the Signal Protocol (E2EE)
- Auto-updates when contacts change their info
- Grant/revoke access with one tap
- Zero data monetization

Open source:

- Core P2P sync engine (AGPL v3)
- iOS/Android clients (open source)
- Server infrastructure remains proprietary

We're opening an alpha waitlist (1,000 spots) with one year of free Premium.

We're two GDPR-native founders (tech + design) solving our own frustration with centralized contacts. Here's our manifesto on data sovereignty vs. Big Tech control: https://github.com/Meishiapp/docs/blob/main/Meishi_Manifesto...

Would love feedback on the architecture and whether this resonates with you.