Daily Show HN


Show HN for January 22, 2026

57 items
158

Text-to-video model from scratch (2 brothers, 2 years, 2B params) #

huggingface.co
24 comments · 4:31 PM
Writeup (includes good/bad sample generations): https://www.linum.ai/field-notes/launch-linum-v2

-------

We're Sahil and Manu, two brothers who spent the last 2 years training text-to-video models from scratch. Today we're releasing them under Apache 2.0.

These are 2B param models capable of generating 2-5 seconds of footage at either 360p or 720p. In terms of model size, the closest comparison is Alibaba's Wan 2.1 1.3B. From our testing, we get significantly better motion capture and aesthetics.

We're not claiming to have reached the frontier. For us, this is a stepping stone towards SOTA - proof we can train these models end-to-end ourselves.

--------------------------------

Why train a model from scratch?

--------------------------------

We shipped our first model in January 2024 (pre-Sora) as a 180p, 1-second GIF bot, bootstrapped off Stable Diffusion XL. Image VAEs don't understand temporal coherence, and without the original training data, you can't smoothly transition between image and video distributions. At some point you're better off starting over.

For v2, we use T5 for text encoding, Wan 2.1 VAE for compression, and a DiT-variant backbone trained with flow matching. We built our own temporal VAE but Wan's was smaller with equivalent performance, so we used it to save on embedding costs. (We'll open-source our VAE shortly.)
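The flow-matching objective mentioned above can be sketched conceptually. This is a generic rectified-flow training step in numpy (illustrative only; Linum's actual training code, model, and framework are not public in this post):

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(model, x1):
    """One rectified-flow training step (conceptual sketch).

    x1: batch of clean latents; noise x0 ~ N(0, I).
    The model is trained to predict the velocity x1 - x0 at the
    interpolated point x_t = (1 - t) * x0 + t * x1.
    """
    x0 = rng.standard_normal(x1.shape)        # noise sample
    t = rng.uniform(size=(x1.shape[0], 1))    # per-sample time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # linear interpolation path
    v_target = x1 - x0                        # ground-truth velocity
    v_pred = model(xt, t)                     # network prediction
    return np.mean((v_pred - v_target) ** 2)  # MSE objective

# toy stand-in "model" that predicts zero velocity, just to exercise the loss
loss = flow_matching_loss(lambda xt, t: np.zeros_like(xt),
                          rng.standard_normal((4, 8)))
```

In a real DiT pipeline, `model` would be the transformer backbone operating on VAE latents rather than raw arrays.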

The bulk of development time went into building curation pipelines that actually work (e.g., hand-labeling aesthetic properties and fine-tuning VLMs to filter at scale).

What works: Cartoon/animated styles, food and nature scenes, simple character motion

What doesn't: Complex physics, fast motion (e.g., gymnastics, dancing), consistent text

-----------------------------------

Why build this when Veo/Sora exist?

-----------------------------------

Products are extensions of the underlying model's capabilities. If users want a feature the model doesn't support (character consistency, camera controls, editing, style mapping, etc.), you're stuck.

To build the product we want, we need to update the model itself. That means owning the development process.

It's a bet that will take time (and a lot of GPU compute) to pay off, but we think it's the right one.

What’s next?
- Post-training for physics/deformations
- Distillation for speed
- Audio capabilities
- Model scaling

Happy to answer questions about building a model from 0 → 1. We kept a “lab notebook” of all our experiments in Notion and we'll be blogging about our learnings throughout the year.

93

I've been using AI to analyze every supplement on the market #

pillser.com
47 comments · 2:09 PM
Hey HN! This has been my project for a few years now. I recently brought it back to life after taking a pause to focus on my studies.

My goal with this project is to separate fluff from science when shopping for supplements. I am doing this in 3 steps:

1.) I index every supplement on the market (extract each ingredient, normalize by quantity)

2.) I index every research paper on supplementation (rank every claim by effect type and effect size)

3.) I link data between supplements and research papers
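Step 1's "normalize by quantity" can be sketched like this. The unit table and conversion factors here are illustrative (IU conversions are ingredient-specific), not the site's actual pipeline:

```python
# Hypothetical sketch of normalizing labeled ingredient quantities to mg
# so products can be compared. 1 IU of vitamin D = 0.025 mcg = 0.000025 mg.
UNIT_TO_MG = {"mg": 1.0, "g": 1000.0, "mcg": 0.001, "iu_vitamin_d": 0.000025}

def normalize(quantity: float, unit: str) -> float:
    """Convert a labeled quantity to milligrams for comparison."""
    return quantity * UNIT_TO_MG[unit.lower()]

assert normalize(2, "g") == 2000.0
assert normalize(1000, "iu_vitamin_d") == 0.025
```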

Early last year, I put the project on pause because I ran into a few issues:

Legal: Shady companies send C&D letters demanding their products be taken down from the website. That is not something I had the mental capacity to respond to while also going through my studies. Not coincidentally, these are usually brands with big marketing budgets and poor ingredient-to-price ratios.

Technical: I started this project when the first LLMs came out. I've built extensive internal evals to understand how LLMs perform. At the time, hallucinations were simply too frequent to pass this data through to visitors. However, I recently re-ran my evals with Opus 4.5 and was very impressed. I am running out of scenarios I can think of or find where LLMs are bad at interpreting the data.

Business: I still haven't figured out how to monetize it or even who the target customer is.

Despite these challenges, I decided to restart my journey.

My mission is to bring transparency (science and price) to the supplement market. My goal is NOT to increase the use of supplements, but rather to help consumers make informed decisions. Often, supplementation is not necessary, or there are natural ways to supplement (that's my focus this quarter – better education about natural supplementation).

Some things are helping my cause: Bryan Johnson's journey (Blueprint) has drawn a lot more attention to healthy supplementation. Thanks to his efforts, many people have reached out in recent months to ask about the state of the project – interest I've not had before.

I am excited to restart this journey and to share it with HN. Your comments on how to approach this would be massively appreciated.

Some key areas of the website:

* Example of navigating supplements by ingredient https://pillser.com/search?q=%22Vitamin+D%22&s=jho4espsuc

* Example of research paper analyzed using AI https://pillser.com/research-papers/effect-of-lactobacillus-...

* Example of looking for very specific strains or ingredients https://pillser.com/probiotics/bifidobacterium-bifidum

* Example of navigating research by health-outcomes https://pillser.com/health-outcomes/improved-intestinal-barr...

* Example of product listing https://pillser.com/supplements/pb-8-probiotic-663

87

BrowserOS – "Claude Cowork" in the browser (open source) #

github.com
35 comments · 4:30 PM
Hey HN! We're Nithin and Nikhil, twin brothers building BrowserOS (YC S24). We're an open-source, privacy-first alternative to the AI browsers from big labs.

On BrowserOS, we provide first-class support for bringing your own LLMs (either local models or via API keys) and run the agent entirely on the client side, so your data stays on your machine!

Today we're launching filesystem access... just like Claude Cowork, our browser agent can read files, write files, run shell commands! But honestly, we didn't plan for this. It turns out the privacy decision we made 9 months ago accidentally positioned us for this moment.

--- The architectural bet we made 9 months ago Unlike other AI browsers (ChatGPT Atlas, Perplexity Comet) where the agent loop runs server-side, we decided early on to run our agent entirely on your machine (client side).

But building everything on the client side wasn't smooth.

We initially built our agent loop inside a Chrome extension. But we kept hitting walls:

1) JS (background service worker) is single-threaded, so we couldn't start multiple agents in parallel.

2) Not having access to a NodeJS-like runtime meant we couldn't use many great npm packages (Vercel AI SDK, Anthropic's MCP SDK, etc)

3) And finally, there was no good way to expose our agent and tools as an API

So we made the hard decision 2 months ago to throw away everything we built and start from scratch.

In the new architecture, we went with a sidecar approach. We put our agent loop in a standalone Bun binary and ship it alongside our Chromium binary. We also decided not to rewrite our own agent loop, but borrowed gemini-cli's loop with some tweaks! We wrote a neat adapter to translate between Gemini format and Vercel AI SDK format. You can look at our entire codebase here: https://git.new/browseros-agent

--- How this helped build filesystem access When Claude Cowork launched, we realized something: because Atlas and Comet run their agent loop server-side, there's no good way for their agent to access your files without uploading them to the server first.

But our agent was already local. Adding filesystem access meant just... opening the door (with your permissions ofc). Our agent can now read and write files just like Claude Code. No uploads, no cloud storage, no sync.

--- What you can actually do today

a) Organize files in my desktop folder https://youtu.be/NOZ7xjto6Uc

b) Open top 5 HN links, extract the details and write summary into a HTML file https://youtu.be/uXvqs_TCmMQ

--- Where we are now If you haven't tried us since the last Show HN, give us another shot. The new architecture unlocked a ton of new features, and we've grown to 8.5K GitHub stars and 100K+ downloads:

c) You can now build more reliable workflows using n8n-like graph https://youtu.be/H_bFfWIevSY

d) You can also use BrowserOS as an MCP server in Cursor or Claude Code https://youtu.be/5nevh00lckM

e) You can also schedule repetitive tasks!

--- Why we think the browser is the right platform We are very bullish on the browser being the right platform for a Claude Cowork-like agent. The browser is the most commonly used app by knowledge workers (emails, docs, spreadsheets, research, etc). And it seems like even Anthropic recognizes this -- for Claude Cowork, they have a janky integration with the browser via a Chrome extension. But owning the entire stack allows us to provide a much smoother experience. It also lets us build differentiated features that wouldn't be possible otherwise. One example: browser ACLs.

Agents can do dumb or destructive things, so we're adding browser-level guardrails (think IAM for agents): "role(agent): can never click buy" or "role(agent): read-only access on my bank's homepage." We have a prototype already—curious to hear your take on this and the overall thesis.
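The ACL idea above can be sketched as a small rule matcher. The rule format and names below are invented for illustration, not BrowserOS's actual prototype:

```python
# Hypothetical "IAM for agents" sketch: per-role rules gate what an agent
# may do on which pages. Rule schema is illustrative only.
RULES = [
    {"role": "agent", "url_pattern": "mybank.com", "allow": {"read"}},
    {"role": "agent", "url_pattern": "*", "deny_actions": {"click:buy"}},
]

def is_allowed(role: str, url: str, action: str) -> bool:
    """Check an action ("verb:target") against every matching rule."""
    for rule in RULES:
        if rule["role"] != role:
            continue
        if rule["url_pattern"] != "*" and rule["url_pattern"] not in url:
            continue
        if action in rule.get("deny_actions", set()):
            return False                      # explicit deny wins
        if "allow" in rule and action.split(":")[0] not in rule["allow"]:
            return False                      # allow-list: everything else blocked
    return True

assert is_allowed("agent", "https://mybank.com/home", "read:page")
assert not is_allowed("agent", "https://mybank.com/home", "click:transfer")
assert not is_allowed("agent", "https://shop.example.com", "click:buy")
```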

We’ll be in the comments. Thanks for reading!

GitHub: https://git.new/browseros
Download: https://browseros.com (available for Mac, Windows, Linux!)

55

Bible translated using LLMs from source Greek and Hebrew #

biblexica.com
73 comments · 4:00 PM
Built an auditable AI (Bible) translation pipeline: Hebrew/Greek source packets -> verse JSON with notes rolling up to chapters, books, and testaments. Final texts compiled with metrics (TTR, n-grams).
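The compilation metrics mentioned (TTR, n-grams) are standard text statistics; a minimal sketch, not the site's actual pipeline:

```python
# Type-token ratio and n-gram counts, the kinds of metrics used to sanity-check
# compiled translation output. Tokenization here is naive whitespace splitting.
from collections import Counter

def ttr(text: str) -> float:
    """Type-token ratio: unique words / total words (lexical diversity)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens)

def ngrams(text: str, n: int) -> Counter:
    """Count contiguous n-grams over whitespace tokens."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

verse = "in the beginning God created the heavens and the earth"
assert round(ttr(verse), 2) == 0.8            # 8 unique words / 10 tokens
assert ngrams(verse, 2)[("the", "heavens")] == 1
```

An abnormally low TTR or repeated n-grams across verses is a cheap way to flag degenerate or looping model output.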

This is the first full-text example as far as I know (Gen Z bible doesn't count).

There are hallucinations and issues, but the overall quality surprised me.

LLMs show a lot of promise for translating ancient texts and making them more accessible.

The technology has a lot of benefit for the faithful that, I think, is only beginning to be explored.

45

Postgres and ClickHouse as a unified data stack #

9 comments · 6:21 PM
Hello HN, this is Sai and Kaushik from ClickHouse. Today we are launching a Postgres managed service that is natively integrated with ClickHouse. It is built together with Ubicloud (YC W24).

TL;DR: NVMe-backed Postgres + built-in CDC into ClickHouse + pg_clickhouse so you can keep your app Postgres-first while running analytics in ClickHouse.

Try it (private preview): https://clickhouse.com/cloud/postgres
Blog w/ live demo: https://clickhouse.com/blog/postgres-managed-by-clickhouse

Problem

Across many fast-growing companies using Postgres, performance and scalability commonly emerge as challenges as they grow. This is for both transactional and analytical workloads. On the OLTP side, common issues include slower ingestion (especially updates, upserts), slower vacuums, long-running transactions incurring WAL spikes, among others. In most cases, these problems stem from limited disk IOPS and suboptimal disk latency. Without the need to provision or cap IOPS, Postgres could do far more than it does today.

On the analytics side, many limitations stem from the fact that Postgres was designed primarily for OLTP and lacks several features that analytical databases have developed over time, for example vectorized execution, support for a wide variety of ingest formats, etc. We’re increasingly seeing a common pattern where many companies like GitLab, Ramp, Cloudflare etc. complement Postgres with ClickHouse to offload analytics. This architecture enables teams to adopt two purpose-built open-source databases.

That said, if you’re running a Postgres based application, adopting ClickHouse isn’t straightforward. You typically end up building a CDC pipeline, handling backfills, and dealing with schema changes and updating your application code to be aware of a second database for analytics.

Solution

On the OLTP side, we believe that NVMe-based Postgres is the right fit and can drastically improve performance. NVMe storage is physically colocated with compute, enabling significantly lower disk latency and higher IOPS than network-attached storage, which requires a network round trip for disk access. This benefits disk-throttled workloads and can significantly (up to 10x) speed up operations incl. updates, upserts, vacuums, checkpointing, etc. We are working on a detailed blog examining how WAL fsyncs, buffer reads, and checkpoints dominate on slow I/O and are significantly reduced on NVMe. Stay tuned!

On the OLAP side, the Postgres service includes native CDC to ClickHouse and unified query capabilities through pg_clickhouse. Today, CDC is powered by ClickPipes/PeerDB under the hood, which is based on logical replication. We are working to make this faster and easier by supporting logical replication v2 for streaming in-progress transactions, a new logical decoding plugin to address existing limitations of logical replication, working toward sub-second replication, and more.

Every Postgres comes packaged with the pg_clickhouse extension, which reduces the effort required to add ClickHouse-powered analytics to a Postgres application. It allows you to query ClickHouse directly from Postgres, enabling Postgres for both transactions and analytics. pg_clickhouse supports comprehensive query pushdown for analytics, and we plan to continuously expand this further (https://news.ycombinator.com/item?id=46249462).

Vision

To sum it up - Our vision is to provide a unified data stack that combines Postgres for transactions with ClickHouse for analytics, giving you best-in-class performance and scalability on an open-source foundation.

Get Started

We are actively working with users to onboard them to the Postgres service. Since this is a private preview, it is currently free of cost. If you’re interested, please sign up here: https://clickhouse.com/cloud/postgres

We’d love to hear your feedback on our thesis and anything else that comes to mind, it would be super helpful to us as we build this out!

44

First Claude Code client for Ollama local models #

twitter.com
27 comments · 5:26 PM
Just to clarify the background a bit. This project wasn’t planned as a big standalone release at first. On January 16, Ollama added support for an Anthropic-compatible API, and I was curious how far this could be pushed in practice. I decided to try plugging local Ollama models directly into a Claude Code-style workflow and see if it would actually work end to end.

Here is the release note from Ollama that made this possible: https://ollama.com/blog/claude

Technically, what I do is pretty straightforward:

- Detect which local models are available in Ollama.

- When internet access is unavailable, the client automatically switches to Ollama-backed local models instead of remote ones.

- From the user’s perspective, it is the same Claude Code flow, just backed by local inference.
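The detection and fallback steps above can be sketched like this. Ollama's `GET /api/tags` endpoint really does list installed models; the fallback policy and function names are illustrative:

```python
# Sketch: probe the local Ollama server for available models, and fall back
# to a local model when the remote API is unreachable.
import json
import urllib.error
import urllib.request

def local_models(host: str = "http://localhost:11434") -> list[str]:
    """List locally installed models via Ollama's /api/tags endpoint."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=2) as resp:
            return [m["name"] for m in json.load(resp)["models"]]
    except (urllib.error.URLError, OSError):
        return []  # Ollama not running / unreachable

def pick_backend(remote_ok: bool, preferred_local: str = "qwen3-coder:30b"):
    """Use the remote model when online; otherwise an Ollama-backed local one."""
    if remote_ok:
        return ("remote", "claude")
    models = local_models()
    if preferred_local in models:
        return ("ollama", preferred_local)
    return ("ollama", models[0] if models else None)

backend, model = pick_backend(remote_ok=True)
```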

In practice, the best-performing model so far has been qwen3-coder:30b. I also tested glm-4.7-flash, which was released very recently, but it struggles with reliably following tool-calling instructions, so it is not usable for this workflow yet.

39

Synesthesia, make noise music with a colorpicker #

visualnoise.ca
13 comments · 5:52 AM
This is a (silly, little) app which lets you make noise music using a color picker as an instrument. When you click on a specific point in the color picker, a bit of JavaScript maps the binary representation of the clicked-on color's hex-code to a "chord" in the 24 tone-equal-temperament scale. That chord is then played back using a throttled audio generation method which was implemented via Tone.js.
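A rough sketch of that mapping in Python (the real app uses Tone.js in the browser, and its exact byte-to-step scheme may differ):

```python
# Map a hex color's RGB bytes to a "chord" of frequencies in 24-tone equal
# temperament. Folding each byte into a quarter-tone step above A4 is a guess
# at the mapping, for illustration only.
A4 = 440.0

def color_to_chord(hex_color: str) -> list[float]:
    """Return one 24-TET frequency per RGB byte of the clicked color."""
    rgb = [int(hex_color.lstrip("#")[i:i + 2], 16) for i in (0, 2, 4)]
    # 24-TET: each step multiplies frequency by 2**(1/24) (a quarter tone)
    return [round(A4 * 2 ** ((b % 24) / 24), 2) for b in rgb]

assert color_to_chord("#000000") == [440.0, 440.0, 440.0]
chord = color_to_chord("#ff8000")  # three quarter-tone-spaced partials
```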

NOTE! Turn the volume way down before using the site. It is noise music. :)

31

High speed graphics rendering research with tinygrad/tinyJIT #

github.com
10 comments · 3:26 AM
I saw a tweet that tinygrad is so good you could make a graphics library that wraps tg. So I’ve been hacking on gtinygrad, and honestly it convinced me it could be used for legit research.

The JIT + tensor model ends up being a really nice way to express light transport in simple Python, so I reimplemented some recent SIGGRAPH research papers like ReSTIR PG and SZ, and it just works. Instead of complicated C++, it's just ~200 LOC of Python.

28

Dotenv Mask Editor: No more embarrassing screen leaks of your .env #

marketplace.visualstudio.com
27 comments · 12:04 AM
Hi HN,

I built this because I often work in coworking spaces or do screen sharing, and I've always had this fear of accidentally flashing my .env file with production secrets to the whole room (or recording).

It’s a simple VS Code extension that opens .env files in a custom grid editor. It automatically masks any value longer than 6 characters so I can safely open the file to check keys without exposing the actual secrets.

It runs 100% locally with zero dependencies (I know how sensitive these files are). It just reads the file, renders the grid, and saves it back as standard text.
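The masking rule described ("any value longer than 6 characters") can be sketched in a few lines. Showing the first two characters before the mask is my assumption, not necessarily the extension's exact display format:

```python
# Sketch of the described masking rule: short values stay visible, anything
# longer than 6 characters is hidden behind a mask.
def mask_env_line(line: str) -> str:
    """Mask the value of a KEY=VALUE line for safe on-screen display."""
    if "=" not in line or line.lstrip().startswith("#"):
        return line  # comments and blank-ish lines pass through untouched
    key, _, value = line.partition("=")
    shown = value if len(value) <= 6 else value[:2] + "*" * 6
    return f"{key}={shown}"

assert mask_env_line("PORT=5432") == "PORT=5432"
assert mask_env_line("API_KEY=sk-live-abcdef123") == "API_KEY=sk******"
```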

It's open source (MIT) and I'd love any feedback on the masking logic or other features that would make it safer to use.

Marketplace: https://marketplace.visualstudio.com/items?itemName=xinbenlv...
GitHub: https://github.com/xinbenlv/dotenv-mask-editor

15

I'm writing an alternative to Lutris #

github.com
4 comments · 8:30 PM
It's free and open source. The aim is to give more transparent access to wine prefixes and the surrounding tooling (winetricks, Proton configuration, etc.) per game, compared to Lutris. Features like per-game statistics (time played, times launched, times crashed, and so on) are also available in the app.
10

Figr – AI that thinks through product problems before designing #

figr.design
5 comments · 8:12 PM
Built Figr AI because I got tired of AI builder tools that market themselves as design tools but end up skipping the hard part.

Every tool I tried would jump straight to screens. But that's not how product design actually works. You don't just design screens. You think through the problem first: the flows, the edge cases, the user journey, where people will get stuck. Only then does the design come.

Figr does that thinking layer first. It parses your existing product via a chrome extension or takes in screen-records, then works through the problem with you before designing. Surfaces edge cases, maps flows, generates specs, reviews UX. The design comes after the thinking.

It is able to do this because we trained it on 200k+ real UX patterns and UX principles. Our major focus is on helping build the right UX by understanding the product.

The difference from Lovable/Bolt/V0: I think those are interface builders. They are good when you know exactly what you want to build but they don't truly help in finding the right solution to the problem. Our aim with Figr is to be more like an AI PM that happens to also design.

Some difficult UX problems we've worked through with it: https://figr.design/gallery

Would love feedback, especially from folks who've hit the same wall with other AI builder/design tools.

9

OpenSkills – Stop bloating your LLM context with unused instructions #

3 comments · 6:41 AM
Hello HN,

I’ve been building AI agents lately and ran into a common "Context Bloat" problem. When an agent has 20+ skills, stuffing every system prompt, reference doc, and tool definition into a single request quickly hits token limits and degrades model performance (the "lost in the middle" problem).

To solve this, I built OpenSkills, an open-source SDK that implements a Progressive Disclosure Architecture for agent skills.

The Core Concept: Instead of loading everything upfront, OpenSkills splits a skill into three layers:

Layer 1 (Metadata): Light-weight tags and triggers (always loaded for discovery).

Layer 2 (Instruction): The core SKILL.md prompt (loaded only when the skill is matched).

Layer 3 (Resources): Heavy reference docs or scripts that are conditionally loaded based on the specific conversation context.
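The three layers above can be sketched as a tiny context builder. The skill schema and trigger logic here are a conceptual illustration, not the SDK's actual API:

```python
# Progressive disclosure sketch: layer 1 is always in context, layer 2 loads
# when the skill's triggers match, layer 3 loads only when the query also
# mentions a resource's own trigger terms. Field names are illustrative.
skill = {
    "metadata": {"name": "finance", "triggers": ["tax", "invoice"]},       # layer 1
    "instruction": "You are a finance assistant. Follow local tax rules.",  # layer 2
    "resources": {"tax-code.pdf": {"when": ["tax compliance", "tax"]}},     # layer 3
}

def build_context(query: str, skills: list[dict]) -> list[str]:
    context = [s["metadata"]["name"] for s in skills]   # layer 1: always loaded
    q = query.lower()
    for s in skills:
        if any(t in q for t in s["metadata"]["triggers"]):
            context.append(s["instruction"])            # layer 2: on match
            for name, res in s["resources"].items():
                if any(w in q for w in res["when"]):
                    context.append(f"<contents of {name}>")  # layer 3: conditional
    return context

ctx = build_context("help with tax compliance", [skill])
```

With an unrelated query, only the lightweight layer-1 names would reach the prompt.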

Why this matters:

Scalability: You can have hundreds of skills without overwhelming the LLM's context window.

Markdown-First: Skills are defined in a simple SKILL.md format. It’s human-readable, git-friendly, and easy for the LLM to parse.

Conditional Resources: For example, a "Finance Skill" only pulls in the tax-code.pdf reference if the query actually mentions tax compliance.

Key Features:

Python 3.10+ SDK.

Automatic skill matching and invocation.

Support for script execution (via [INVOKE:script_name] syntax).

Multimodal support (Images via URL/base64).

GitHub: https://github.com/twwch/OpenSkills
PyPI: pip install openskills-sdk

7

A Node Based Editor for Three.js Shading Language (TSL) #

tsl-graph.xyz
1 comment · 5:05 PM
Three.js recently introduced TSL (Three.js Shading Language), a way to write shaders in pure JavaScript/TypeScript that compiles to both GLSL and WGSL. I built this editor to provide a visual interface for the TSL ecosystem. It allows developers to prototype shaders for WebGPU/WebGL and see the results in real time. This is a beta release and I'm looking for feedback.
7

FOMOmeter – crypto sentiment based on crowd emotions, not price #

fomometer.ai
1 comment · 9:26 PM
Hi HN,

Over the past few years I kept running into the same problem while investing in crypto: most “sentiment” tools either lag price or quietly repackage price data itself.

I wanted something simpler and more honest — a way to see how the crowd actually feels, even when price is flat or misleading.

So I built FOMOmeter.

It’s a sentiment platform that analyzes real social signals (X, Reddit, etc.) and maps crowd mood.

What I’m testing right now:
• Does pure crowd sentiment add value during uncertainty and sideways markets?
• Can it help avoid emotional entries instead of chasing narratives?
• Is this something experienced investors would actually use regularly?

There’s a free public version and an early Pro version. I’m mainly looking for thoughtful feedback — what feels useful, what feels redundant, and what’s missing.

Happy to answer questions and share how the sentiment model works under the hood if there’s interest.

Thanks for checking it out.

7

Laptop Stickers – cheap individual short run stickers #

laptopstickers.store
1 comment · 3:26 AM
My friends and I love putting stickers on our laptops, but most printing companies have minimum order quantities per design. I have a little hobby where I print and cut multiple single designs in one order at home and get great results. Perfect for folks who want to decorate and personalise their laptops!

You can get a dozen or so stickers for just AUD$20 (approx USD$14, €12, £10) and free worldwide shipping.

5

I'm tired of my LLM bullshitting. So I fixed it #

9 comments · 4:50 PM
As a handsome local AI enjoyer™ you’ve probably noticed one of the big flaws with LLMs:

It lies. Confidently. ALL THE TIME.

I’m autistic and extremely allergic to vibes-based tooling, so … I built a thing. Maybe it’s useful to you too.

The thing: llama-conductor

llama-conductor is a router that sits between your frontend (eg: OWUI) & backend (llama.cpp + llama-swap). Local-first but it should talk to anything OpenAI-compatible if you point it there (note: experimental so YMMV).

LC is a glass-box that makes the stack behave like a deterministic system, instead of a drunk telling a story about the fish that got away.

TL;DR: “In God we trust. All others must bring data.”

Three examples:

1. KB mechanics (markdown, JSON, checksums)

You keep “knowledge” as dumb folders on disk. Drop docs (.txt, .md, .pdf) in them. Then:

>>attach <kb> - attaches a KB folder

>>summ new - generates SUMM_.md files with SHA-256 provenance baked in + moves the original to a sub-folder

Now, when you ask something like:

> “yo, what did the Commodore C64 retail for in 1982?”

..it answers from the attached KBs only.

If the fact isn’t there, it tells you - explicitly - instead of winging it. Eg:

"The provided facts state the Commodore 64 launched at $595 and was reduced to $250, but do not specify a 1982 retail price. The Amiga’s pricing and timeline are also not detailed in the given facts.

Missing information includes the exact 1982 retail price for Commodore’s product line and which specific model(s) were sold then."

[Confidence: medium | Source: Mixed]

No vibes. Just: here’s what’s in your docs, here’s what’s missing, don't GIGO yourself into stupid.

Then, if you're happy with the summary, you can:

>>move to vault
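The "SHA-256 provenance baked in" part is the key trick: every summary records the digest of its source doc, so answers can be traced back. A toy Python version (file naming and header format are guesses, not LC's actual output):

```python
# Sketch of SUMM generation with provenance: hash the source file and embed
# the digest in the summary so grounded answers are auditable.
import hashlib
from pathlib import Path

def write_summ(doc: Path, summary: str) -> Path:
    """Write SUMM_<name>.md with the source file's SHA-256 baked into a header."""
    digest = hashlib.sha256(doc.read_bytes()).hexdigest()
    out = doc.with_name(f"SUMM_{doc.stem}.md")
    out.write_text(f"<!-- source: {doc.name} sha256: {digest} -->\n{summary}\n")
    return out

doc = Path("c64.txt")
doc.write_text("The Commodore 64 launched at $595.")
summ = write_summ(doc, "C64 launch price: $595.")
```

If the source doc later changes, the stored digest no longer matches, so stale summaries are detectable.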

2. Mentats: proof-or-refusal mode (Vault-only)

Mentats is the “deep think” pipeline against your curated sources.

* no chat history

* no filesystem KBs

* no Vodka

* Vault-only grounding (Qdrant)

It runs a triple-pass (thinker → critic → thinker). It’s slow on purpose. You can audit it. And if the Vault has nothing relevant? It refuses and tells you to go pound sand:

FINAL_ANSWER:

The provided facts do not contain information about the Acorn computer or its 1995 sale price.

Sources: Vault

FACTS_USED: NONE

[ZARDOZ HATH SPOKEN]

Also yes, it writes a mentats_debug.log. Go look at it any time you want.

The flow is basically:

Attach KBs → SUMM → Move to Vault → Mentats.

No mystery meat. No “trust me bro, embeddings.”

3. Vodka: deterministic memory on a potato budget

Potato PCs have two classic problems: goldfish memory + context bloat that murders your VRAM.

Vodka fixes both without extra model compute.

* !! stores facts verbatim (JSON on disk)

* ?? recalls them verbatim (TTL + touch limits so memory doesn’t become landfill)

* CTC (Cut The Crap) hard-caps context (last N messages + char cap) and creates a concatenated summary (not LLM-generated) so you don’t get VRAM spikes after 400 messages

So instead of:

“Remember my server is 203.0.113.42” → “Got it!” → [100 msgs later] → “127.0.0.1”

you get:

!! my server is 203.0.113.42
?? server ip → 203.0.113.42 (with TTL/touch metadata)

And because context stays bounded: stable KV cache, stable speed, your potato PC stops crying.
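The `!!`/`??` mechanics are just a deterministic key-value store with expiry. A toy Python version (field names and limits are illustrative, not Vodka's actual schema):

```python
# Verbatim fact store sketch: store exactly, recall exactly, expire by TTL
# and touch count. No model in the loop, so no paraphrased "127.0.0.1".
import time

class FactStore:
    def __init__(self, ttl_seconds: int = 3600, max_touches: int = 100):
        self.facts, self.ttl, self.max_touches = {}, ttl_seconds, max_touches

    def store(self, key: str, value: str):            # the `!!` path
        self.facts[key] = {"value": value, "stored_at": time.time(), "touches": 0}

    def recall(self, key: str):                       # the `??` path
        f = self.facts.get(key)
        if not f or time.time() - f["stored_at"] > self.ttl:
            return None                               # expired or unknown: say so
        f["touches"] += 1
        if f["touches"] > self.max_touches:
            del self.facts[key]                       # retire over-touched facts
            return None
        return f["value"]                             # verbatim, no paraphrase

mem = FactStore()
mem.store("server ip", "203.0.113.42")
assert mem.recall("server ip") == "203.0.113.42"
assert mem.recall("gateway ip") is None
```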

There’s more (a lot more) in the README, but I’ve already over-autism’ed this post.

TL;DR:

If you want your local LLM to shut up when it doesn’t know and show receipts when it does, come poke it:

Primary (Codeberg) https://codeberg.org/BobbyLLM/llama-conductor

Mirror (GitHub): https://github.com/BobbyLLM/llama-conductor

PS: Sorry about the AI slop image. I can't draw for shit.

PPS: A human with ASD wrote this using Notepad++. If the formatting or language is weird, now you know why.

5

ProblemHunt – A place to collect real problems before building startups #

problemhunt.pro
4 comments · 8:23 PM
Hi HN,

I’m a solo founder. I built ProblemHunt after noticing a common pattern: many founders start with ideas and solutions, but don’t spend enough time understanding real problems.

ProblemHunt is a simple place where people share problems they actually face in work or life. Developers and founders can browse these problems and use them as a starting point for building products.

So far:
– ~5,000 users
– 150+ real problems submitted
– 100% organic growth

This is still early, and I’m trying to understand if this is genuinely useful.

I’d really appreciate feedback:
– Would you use something like this?
– What feels unnecessary or confusing?
– What would make it more valuable?

Link: https://problemhunt.pro

5

Lima-devbox – Claude skill for creating a VM dev sandbox on your Mac #

github.com
0 comments · 5:54 PM
As I've increasingly pursued orchestrated agentic coding, I needed a safe way to run Claude Code in YOLO mode on my Mac.

I looked at a lot of options, including Docker, but then came across Lima (Linux VMs on macOS) and this great guide by Chris Hager on how to create a dev sandbox for your Mac using Lima. It creates an Ubuntu VM mapped to your dev directory with RW access, port and SSH forwarding, Mise support for language installs, and automatic code-agent installs.

https://www.metachris.dev/2025/11/sandbox-your-ai-dev-tools-...

I followed his guide, got it working on my own, then thought I'd try packaging it up as a Claude skill to share. And holy yep, works great!

To setup just go to /plugin -> marketplace in Claude Code and add

recodelabs/lima-devbox

Then install the skill and ask Claude to create you a VM. Claude will ask you some questions and then proceed to build you your VM.

Side note - this really hits home for me that Agent Skills are really all you need. Instead of writing a guide or wrapping this up in an app/TUI, just bake any knowledge or process you want to automate and share into a skill. Definitely feels like the future.

Would love your feedback and PRs if you have ideas on how to improve.

https://github.com/recodelabs/lima-devbox

Thanks! Matt

4

SGR – A Linear-Complexity "Living Cell" Outperforming Transformers #

0 comments · 12:03 PM
I am developing an architecture called Sparse Gated Resonance (SGR). It is a sequence modeling approach designed to avoid the quadratic scaling of traditional Self-Attention. I have been benchmarking a 722k-parameter SGR against a 921k-parameter Transformer on Victor Hugo’s "Notre-Dame de Paris" (English).

The SGR replaces the attention mechanism with a "Causal Pulse." It uses gated 1D convolutions to generate a navigation vector that resonates against a brain-map of character embeddings. This allows the model to maintain a "Living Cell" state that updates with linear complexity.

Full source and implementation: https://github.com/MrPan2048/GeometricTransformer/

Benchmarking Data (Notre-Dame de Paris):

STEP 3900
ARCH | LOSS   | PPL  | ENT    | TIME
SGR  | 1.4481 | 4.26 | 1.5476 | 19.0ms
STD  | 2.0275 | 7.59 | 2.1476 | 40.3ms

Semantic Comparison (Generation from "Quasimodo"):

SGR: "Quasimodo. Then minds that the accasteady which which the"
STD: "Quasimododo ng, o uer tre the todo hemo’He wand at tine."

Technical Observations:

Computational Efficiency: SGR maintains a significant latency advantage, consistently running at ~19ms compared to the Transformer's ~40ms. This confirms the efficiency of the linear pulse over quadratic attention.

Convergence Quality: By Step 3700, SGR reached a Perplexity (PPL) of 4.46, whereas the Transformer lagged at 8.36. SGR successfully produces recognizable English phrases and punctuation, while the Transformer still exhibits "stuttering" artifacts (e.g., "Quasimodododod").

Entropy Stability: SGR has stabilized at an entropy of ~1.54, which represents the optimal "Mastery Zone" for English text. The Transformer’s higher entropy (~2.14) correlates with its lack of structural coherence.

I am seeking an endorsement to publish a formal paper on this architecture to arXiv (CS.ML). I believe these results demonstrate that "Living Cell" resonance models can outperform Attention in parameter-constrained and latency-sensitive environments. If you are a researcher willing to endorse or review the mathematical formalization, please contact me via GitHub.

3

SkillLens – scan and audit locally installed agent skills #

skilllens.run
0 comments · 2:37 PM
Hi HN — I built a small CLI called SkillLens to help answer: “What agent skills do I have installed, and are any of them sketchy?”

A lot of agent ecosystems (Claude/Codex/OpenCode, etc.) store skills as folders with a SKILL.md. These files can contain surprisingly powerful instructions (and sometimes unsafe patterns), but they're easy to forget once installed. We also tend to run them with --dangerously-skip-permissions and let them install whatever they want. I got a bit anxious about that, so I decided to build a tool to get some peace of mind.

I decided not to go with static AST checks, but instead to use whatever CLI you have locally to validate the skills.

SkillLens does two things:

1. Discovery: it scans common local skill locations (configurable) and lists what it finds.

2. Optional audit: if you have an auditor CLI installed (claude or codex), it sends each SKILL.md (currently truncated to ~12k chars) to the auditor and asks for structured JSON output:

- verdict: safe | suspicious | unsafe
- risk: 0–10
- summary + issues with evidence

It also caches audit results locally, so reruns won't re-check skills unless they were updated, you installed something new, or you explicitly ask it to with the --force flag.
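The audit contract (verdict / risk / summary plus issues with evidence) can be sketched as a validation step over the auditor's raw output. This is a hypothetical helper, not SkillLens source: the field names mirror the post, but the function and its error handling are my assumptions.

```python
import json

VERDICTS = {"safe", "suspicious", "unsafe"}

def parse_audit(raw: str) -> dict:
    """Parse and sanity-check one auditor response.

    Expects JSON with a verdict, a 0-10 risk score, and a summary;
    an 'issues' list (with evidence) is filled in if absent.
    Field names follow the post but are illustrative.
    """
    report = json.loads(raw)
    if report.get("verdict") not in VERDICTS:
        raise ValueError(f"unexpected verdict: {report.get('verdict')!r}")
    risk = report.get("risk")
    if not isinstance(risk, (int, float)) or not 0 <= risk <= 10:
        raise ValueError(f"risk out of range: {risk!r}")
    report.setdefault("issues", [])
    return report

sample = '{"verdict": "suspicious", "risk": 6, "summary": "writes outside workspace", "issues": [{"evidence": "rm -rf ~/"}]}'
print(parse_audit(sample)["verdict"])  # suspicious
```

Caching reruns would then only need to key these parsed reports by a hash of each SKILL.md.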

Install/run:

npx skilllens scan # or pnpm dlx skilllens scan

Notes / caveats:

- v0.1; I’m still iterating on the prompt/schema and the “what counts as suspicious” heuristics.
- Today it sends the skill text to whatever your auditor CLI uses (so treat it like sharing the skill contents with that provider). “Redacted evidence extraction” is planned, but not implemented yet.
- If the auditor CLI isn’t installed, it still produces a scan report and marks audits as skipped.

3

Wisp: Stateful Claude Code Management #

github.com favicongithub.com
0 comments3:34 AMView on HN
yeah, the readme is vibed, but the idea and the compression kept the whole session alive for like 30 minutes from start to finish with the HinkyPunk VPN. I'm sure there are improvements.
3

I built a JSON viewer that decodes Base64 media inline #

viewjson.net faviconviewjson.net
0 comments2:45 PMView on HN
I built this for myself while debugging multimodal AI APIs.

The problem: I kept running into JSON responses stuffed with Base64-encoded images. Every time, I had to copy the string, find an online decoder, paste, preview. Repeat for each image. It got old fast.

So I made ViewJSON. Paste your JSON, and it automatically detects and renders Base64 media inline: images, audio, video, even PDFs. No more copy-paste decode loops.

Other things it does:
- Format/minify JSON
- Paste an image, get Base64 string (reverse direction)
- Build API requests, copy as cURL
- Variable substitution for testing
- Share JSON via URL
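Detecting Base64 media inside JSON string values can be done by sniffing well-known prefixes: the Base64 encodings of the PNG, JPEG, and PDF magic bytes are stable, and data URIs carry their type explicitly. A minimal sketch of that idea, not ViewJSON's actual implementation:

```python
import base64

# Magic-byte prefixes as they appear after Base64 encoding. A real
# tool would likely decode and sniff raw bytes; prefix matching is a
# shortcut for illustration.
SIGNATURES = {
    "iVBORw0KGgo": "image/png",    # \x89PNG\r\n\x1a\n
    "/9j/": "image/jpeg",          # \xff\xd8\xff
    "JVBERi0": "application/pdf",  # %PDF-
}

def detect_media(value: str):
    """Guess a media type for a JSON string value, or return None."""
    if value.startswith("data:"):              # data URI: type is explicit
        return value[5:].split(";", 1)[0]
    for prefix, mime in SIGNATURES.items():
        if value.startswith(prefix):
            return mime
    return None

png = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16).decode()
print(detect_media(png))  # image/png
```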

No login, completely free. Would love feedback on edge cases I might have missed!

3

I built a sandboxed VM for letting AI agents go wild without risks #

0 comments9:28 PMView on HN
I wanted to run Claude Code in full autonomous mode (--dangerously-skip-permissions) but was terrified of what it might do to my system. One hallucinated rm -rf and my weekend is gone.

So I built YOLO Kingdom: a pre-configured Ubuntu VM (UTM/Apple Silicon) with Claude Code + Chrome DevTools MCP for autonomous browser testing. Share your project folders via virtiofs, SSH in, run `yolo`, and let it cook.

The VM is your sacrificial sandbox. It can delete everything inside and you just restore from snapshot.

V1 includes: Ubuntu 24.04, Claude Code, headless Chromium with DevTools protocol, smart welcome screen that guides setup.

Future plans: autonomous agents that can architect and build entire projects while you sleep.

https://github.com/pancakeInDev/yolo-kingdom

3

Aident, agentic automations as plain-English playbooks #

aident.ai faviconaident.ai
0 comments6:04 AMView on HN
Hi HN — I’m Kimi, founder of Aident AI.

A couple years ago, I cold-started a startup and tried to automate my outreach pipeline with a patchwork: RPA + Zapier + "just some ChatGPT".

It worked… until it didn’t. Every time I tweaked the strategy, the automation snapped. The system wasn’t built for change — it was built for wiring.

And that’s when it hit me: AI is finally smart enough to do real work, but we still don’t have a clean way to communicate the work to it: Drag-and-drop flows are too rigid; Scripts are too fragile; Prompts are too ephemeral. I kept thinking: why can’t automation be something you can simply read, edit, and trust?

So we built Aident around one idea: the source of truth is a document.

You write a Playbook in plain English (objectives, roles, handoffs), Aident helps refine it, then we compile it into a reliable agent team (with built-in PM + QA roles) that executes skills across 250+ tools (Gmail, Slack, Notion, Google Sheets/Calendar, GitHub, etc.). You can test it, schedule it, and then forget about it - it will keep delivering, even while you sleep.

It’s still early beta and can be bumpy sometimes, but I’d really appreciate it if you gave it a try and let me know your feedback:

What workflow would you try first? Where would this be a bad fit (compliance, data sensitivity, team size)? What would make you trust it enough to run unattended?

Try @ https://app.aident.ai

3

LaReview, local open-source CodeRabbit alternative #

github.com favicongithub.com
0 comments6:46 PMView on HN
hihi,

LaReview is a dev-first code review workbench for complex changes.

You give it a PR (GitHub/GitLab) or a diff, and it builds a structured review plan grouped by flows (auth, API, billing) and ordered by risk. The goal is to make big reviews feel like a plan you can actually follow, not an endless scroll.

It runs locally and is designed to work with your existing AI coding agent (bring your own agent). No bot comment spam. You decide what feedback gets posted back to the PR.

Highlights:

  - AI review planning: flow-based tasks + risk ordering
  - Task-focused diffs: isolate only the hunks relevant to one concern
  - Custom rules: enforce standards like “DB queries must have timeouts”
  - Optional diagrams to understand flows before reading code (requires D2)
  - GitHub/GitLab sync to submit selected feedback + generate a summary
  - Export summary to Markdown
CLI:

  lareview
  lareview pr owner/repo#123
  git diff | lareview
  lareview --agent claude
Install:

  brew install --cask puemos/tap/lareview
Repo:

  https://github.com/puemos/lareview
I would love feedback on the workflow, what you would want it to catch, and what would make you trust it in real reviews.
3

A Gmail extension that stops you from sending emails you'll regret #

chromewebstore.google.com faviconchromewebstore.google.com
3 comments6:55 PMView on HN
Hi HN,

I built a tiny Gmail extension that adds a quick safety check before emails are sent.

It catches common mistakes like:
- mentioning an attachment but forgetting to attach it
- sending emails late at night
- messages that sound more heated than intended

There’s no AI rewriting or analytics — it just pauses the send when something looks off and lets you decide.
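Checks like these are simple enough to sketch. The real extension presumably runs as JavaScript inside Gmail; this Python sketch with hypothetical names and word lists just illustrates the kind of non-AI heuristics described.

```python
import re
from datetime import datetime

# Illustrative patterns -- the extension's actual rules are not published.
ATTACH_RE = re.compile(r"\b(attached|attachment|enclosed)\b", re.I)
HEATED = {"ridiculous", "unacceptable", "incompetent", "furious"}

def presend_warnings(body: str, has_attachment: bool, when: datetime):
    """Return a list of warnings for one draft before it is sent."""
    warnings = []
    if ATTACH_RE.search(body) and not has_attachment:
        warnings.append("mentions an attachment but none is attached")
    if when.hour >= 23 or when.hour < 6:
        warnings.append("sending late at night")
    words = {w.strip(".,!?").lower() for w in body.split()}
    if words & HEATED:
        warnings.append("tone may read as heated")
    return warnings

draft = "This delay is unacceptable. Report attached."
print(presend_warnings(draft, False, datetime(2026, 1, 22, 23, 30)))  # all three checks fire
```

An empty list means the send proceeds untouched, matching the "pause only when something looks off" behavior.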

This started as a personal itch after sending one too many “oops” emails. Would love feedback from people who live in Gmail all day.

Link: https://chromewebstore.google.com/detail/jnlgfolocbbfhmgaljk...

2

An AI-powered web video editor built with Next.js and Fabric.js #

pablituuu.space faviconpablituuu.space
0 comments6:27 PMView on HN
Key Features:
- *AI-Powered*: Uses Gemini Pro for intelligent chat and content manipulation.
- *Tech Stack*: Built with Next.js 15, Fabric.js for canvas operations, and Tailwind CSS 4.
- *Responsive*: Engineered to handle complex timelines directly in the browser with high performance.

I'm particularly looking for feedback on performance and the AI interaction flow. Happy to answer any technical questions about the implementation!
2

Infrastructure for multi-agent AI memory #

nexuswaitlist.framer.website faviconnexuswaitlist.framer.website
0 comments7:45 PMView on HN
I built a memory layer so multiple AI agents can share context, decisions, and files instead of forgetting and duplicating work.

It's not a chatbot. It's literally startup infrastructure.

Demo attached (waitlist is open)

2

Meter – data feed monitoring changes on any site #

meter.sh faviconmeter.sh
0 comments6:30 PMView on HN
Officially launching meter today - it tracks the changes on any website and sends you a webhook with the changes.

Supports proxies, antibot handling, and schedule management.

We’ve had a large number of users migrate from traditional scrapers to tracking only the content changes.

We have a generous free tier - give it a try!

2

Take a Break – a gentle extension to stop autoplay late at night #

hardiksondagar.me faviconhardiksondagar.me
0 comments1:24 PMView on HN
Hey HN — I built Take a Break, a Chrome extension that starts a timer on streaming sites and gently reminds you when it’s time to sleep. It shows a countdown with snooze, and a midnight reminder without auto‑close. You can customize timers and sites.

Would love feedback on UX and if the reminders feel too aggressive or too soft.

Demo / listing: https://chromewebstore.google.com/detail/ehhhemnkemficpheoid...

Repo: https://github.com/hardiksondagar/take-a-break

2

A quiet, offline-first reading app for reading books #

deep-reader-page.vercel.app favicondeep-reader-page.vercel.app
0 comments1:38 PMView on HN
I built this after realizing that most reading tools optimize for extracting information, not for reading itself.

This app is designed for staying with a book:
- minimal, distraction-free UI
- offline-first, local data
- no forced notes, no summaries, no gamification

You open a book, read a chapter, and think. Nothing interrupts you unless you ask.

It’s not about reading faster. It’s about reading more attentively.

I’m sharing this to get feedback from people who care about books and deep reading.

2

Website template for iOS apps #

appview.dev faviconappview.dev
0 comments9:23 PMView on HN
I’m a web developer at my day job and an iOS dev on evenings and weekends.

I often see that websites for mobile apps are an afterthought: they go stale quickly as the app progresses, and there's no time or energy left to update the website as well. Which is a shame, as my analytics show that around 15% of all downloads originate from web sources, mainly my app's homepage, search, and, recently, AI bots.

I decided to solve this issue for myself (and hopefully others) with a clean, customizable website template that I can reuse across my apps and that requires minimal effort to update and maintain. I hope you give it a try, and if you have any feedback, feel free to let me know.

2

Browser Code – Coding Agent Inside a Web Page #

github.com favicongithub.com
1 comments9:05 PMView on HN
I (vibe)coded a Claude Code-like agent that runs directly in the web browser.

With it you can pretty much have a coding agent for the currently opened website. You can ask it things like:

- Extract all links from this page and save them to CSV
- Switch this site to dark mode
- Copy the page content into a Google Sheet
- Remove ads

The agent writes a JS script that automatically loads every time you visit the page. It leans heavily on the userScripts API, so you need to enable a lot of permissions to run the extension, and I'm not sure it can be published anywhere.

Under the hood, scripts and styles live in a virtual filesystem where each website is a directory. The agent can search and edit the DOM as a file, which makes it work more or less reliably, similar to Claude Code. Currently it only supports Claude models; I've tested it on Opus 4.5.

Demo video: https://x.com/ichebykin/status/2014258108500177255

2

AI Coding Toolkit. Low-overhead workflow for reliable AI coding #

benjaminshoemaker.github.io faviconbenjaminshoemaker.github.io
0 comments11:51 PMView on HN
I started using CLI tools (Claude Code, Codex CLI) after vibe coding tools like Replit and Lovable left me wanting more.

Most of the workflows I found online and in AI coding communities were both complex (87 agents! Ralph! Run 10 Claudes at once!) and fairly opinionated. I found them too restrictive, particularly in the dynamic, fast-moving AI coding space.

This workflow has a few goals:

  - Facilitate consistent AI-assisted coding that follows SDLC best practices.

  - Accentuate the strengths of AI coding agents while mitigating their weaknesses.

  - Accomplish that with low mental overhead and complexity.
That's why it's a Git repo that you can clone, fork, inspect, and modify as you see fit.

There are a number of features - the link & the README have more information.

If folks have thoughts or feedback, I'd be interested to hear it.

1

Chatter: Automated feedback aggregation for dev communities #

demo.chatter.plus favicondemo.chatter.plus
0 comments9:06 PMView on HN
I used to do DevRel at Roblox and Rec Room. Way too much of my week went to copying feedback from Discord, forums, and support tickets into a spreadsheet. Then trying to figure out if the complaint on Discord was the same thing someone posted about on the forum three days ago. Built Chatter to automate that:

- Pulls feedback from Discourse, GitHub Issues, Discord, Webhooks, with more integrations on the way
- Clusters related items using AI (batch classification on Haiku)
- Can find community solutions and mark them for internal visibility
- Custom scoring axes for prioritization if needed
- Can email users when their reported issues get resolved

Tech stack: Node/Express, React, PostgreSQL, Railway, and Claude Haiku. I got classification costs to pennies a day with filtering and batching.
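As a stand-in for the Haiku batch classification, the grouping step (is this Discord complaint the same as last week's forum post?) can be illustrated with a plain token-overlap (Jaccard) baseline. This is not Chatter's implementation, just a sketch of the clustering idea:

```python
def jaccard(a: set, b: set) -> float:
    """Token-set overlap: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_feedback(items, threshold=0.4):
    """Greedy single-pass clustering of feedback strings by token
    overlap -- a simple baseline in place of LLM classification."""
    clusters = []  # list of (representative token set, member list)
    for text in items:
        tokens = set(text.lower().split())
        for rep, members in clusters:
            if jaccard(tokens, rep) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((tokens, [text]))
    return [members for _, members in clusters]

reports = [
    "login button broken on mobile",
    "mobile login button is broken",
    "crash when saving replay",
]
print(cluster_feedback(reports))  # two clusters: the login pair, then the crash report
```

An LLM classifier replaces the `jaccard` test with semantic judgment, which is what lets near-duplicates with no shared words still merge.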

Demo (no login): https://demo.chatter.plus (Pulls data from the Godot engine community (Github Issues and Forums) every morning)

https://chatter.plus

Really excited to start sharing this wider. I started with a much bigger idea a few months ago and feel like I've honed it to a clean and lean offering that just needs some tweaking.

Curious how others handle this? This is definitely a problem for projects with big followings, but I think people would be surprised by how fast feedback can get out of hand.

1

Markdown TOC generator for JavaScript builds. Works recursively across a repo #

0 comments3:37 AMView on HN
Unlike most tools that auto-generate Table Of Contents (TOCs) for a single Markdown file, update-markdown-toc also supports:

– recursive TOC updates across all Markdown files within a folder hierarchy

– a --check mode for CI use, which only passes if all TOCs in all Markdown files are up-to-date with respect to the section headers in their respective files

Single-file operation is still supported, but the primary motivation for this tool was repository-wide correctness in CI and build-system workflows.

Behavior is intentionally conservative:

– explicit start/end markers (no heuristic detection)

– deterministic output

– idempotent writes

– strict single-file mode vs lenient recursive traversal
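The marker-based, idempotent core of such a tool fits in a few lines. A minimal sketch under assumptions: the marker strings, the simplified slug rule, and the function names here are mine, not the tool's actual implementation.

```python
import re

START, END = "<!-- toc -->", "<!-- tocstop -->"
HEADING = re.compile(r"^(#{2,6})\s+(.*)$", re.M)  # h2-h6; h1 is usually the title

def slug(text: str) -> str:
    """GitHub-style anchor slug (simplified assumption)."""
    return re.sub(r"[^\w\- ]", "", text).strip().lower().replace(" ", "-")

def update_toc(markdown: str) -> str:
    """Rewrite the region between explicit markers with a TOC built
    from the file's own headings. Idempotent: running it twice gives
    the same text. Raises when markers are missing, mirroring the
    'explicit markers, no heuristic detection' stance."""
    try:
        head, rest = markdown.split(START, 1)
        _, tail = rest.split(END, 1)
    except ValueError:
        raise ValueError("missing TOC markers") from None
    entries = [
        f"{'  ' * (len(h) - 2)}- [{title}](#{slug(title)})"
        for h, title in HEADING.findall(head + tail)
    ]
    toc = "\n".join([START, *entries, END])
    return head + toc + tail

doc = "# Pkg\n<!-- toc -->\nstale\n<!-- tocstop -->\n## Install\n### From npm\n"
print(update_toc(doc))
```

A --check mode then reduces to `update_toc(text) == text` per file, failing CI on any mismatch.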

Posting in case this is useful to others with similar workflows.

Repo + docs: https://github.com/datalackey/build-tools/tree/main/javascri...

npm: https://www.npmjs.com/package/@datalackey/update-markdown-to...

1

Roo Code Slack: end to end agentic workflow in Slack #

youtube.com faviconyoutube.com
0 comments3:40 AMView on HN
Roo Code integrates directly with Slack to create an end to end agentic workflow. Start a discussion in Slack, get an actionable plan, modify the plan if needed, and then generate code all within Slack. You can preview changes, push to GitHub, and run tests without leaving your chat. Watch the video linked above for a full demonstration of how it works.
1

Cursor/Lovable for Writing #

bluefeather.ai faviconbluefeather.ai
0 comments2:38 PMView on HN
Cursor changed how I write code, and I wanted the same for writing. Bluefeather lives in your doc and suggests edits inline for you to accept/reject, instead of copy-pasting to ChatGPT. For papers, contracts, etc. Still early in building, so alpha testers are welcome.
1

Open-source-ish chart pattern detection using Gemini Vision API #

trinith-ai.vercel.app favicontrinith-ai.vercel.app
0 comments4:01 PMView on HN
I built an AI that detects chart patterns to fight my own confirmation bias. I kept losing money on trades because I'd "see" patterns that weren't there. Classic confirmation bias: when you're already in a position, your brain lies to you.

So I built a tool: upload any chart screenshot, get pattern detection in seconds.

Why Gemini over GPT-4V? Tested both. Gemini 1.5 Flash is:
- Faster (~2s vs ~5s)
- Cheaper (~$0.0001 per analysis)
- More consistent structured outputs for this specific task

GPT-4V sometimes gave me essays. Gemini stayed focused.

Stack: Next.js 14, Supabase (auth + postgres), Stripe, Vercel

The hard parts weren't the AI:
- Auth edge cases (email confirmation flows, session refresh)
- Credit system (what happens when the API fails mid-request? refund?)
- Making it not look like a hackathon project (I'm a backend guy, CSS is pain)

Is it always right? No. TA itself is debatable. But it's a second opinion that doesn't care what positions I'm holding. That's the value.

https://trinith-ai.vercel.app

Would love feedback from:
- Traders who can tell me if the output format is useful
- Anyone who's built with vision APIs (optimization tips?)
- Skeptics who think this is dumb (genuinely want to hear why)

1

VibeFarm – A non-generative IDE for composing AI prompts #

vibefarm.ai faviconvibefarm.ai
0 comments4:02 PMView on HN
Creator here.

I built VibeFarm because prompt work kept collapsing into scattered notes, version chaos, and lost “recipes” across Midjourney, Sora, DALL·E, ChatGPT, etc. I wanted something closer to an IDE: structured, reusable, and model-agnostic, not a chat wrapper.

VibeFarm is a non-generative prompt composition workspace. Prompts are built from semantic slots (subject/context/style/etc.), optional layers, and saved snapshots (“VibeCards”) that export clean prompts to any model or to a portable .vibe JSON format.

Design choices:
- No generation inside the app; it's intentionally just composition (no model lock-in, no API costs).
- Static curated vocabulary: 20,000+ palettes, ~1M fragments, instant drag-drop, no runtime calls.
- Reuse-first: variables for series swaps, video-mode timeline overrides, and versioned snapshots.
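The slot-and-layer composition model can be sketched in a few lines. The merge and join rules here are assumptions for illustration, not VibeFarm's actual behavior:

```python
def compose_prompt(slots, layers=(), order=("subject", "context", "style")):
    """Merge base semantic slots with optional override layers, then
    join the filled slots into one clean prompt string. Slot names
    follow the post; the merge/join rules are assumptions."""
    merged = dict(slots)
    for layer in layers:  # later layers override earlier ones
        merged.update({k: v for k, v in layer.items() if v})
    parts = [merged[name] for name in order if merged.get(name)]
    return ", ".join(parts)

# A saved snapshot ("VibeCard") is then just this dict plus its layers.
card = {"subject": "a red fox", "context": "in fresh snow", "style": "watercolor"}
print(compose_prompt(card))                                    # a red fox, in fresh snow, watercolor
print(compose_prompt(card, layers=[{"style": "ink sketch"}]))  # a red fox, in fresh snow, ink sketch
```

Because composition is pure string assembly, no model calls are needed, which matches the no-generation design choice.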

Try it instantly (guest mode): https://app.vibefarm.ai
Homepage: https://vibefarm.ai
Optional demo video: https://youtu.be/IgEly7VpwwI

Happy to answer questions or hear feedback, especially from people doing structured or repeatable prompt work.

1

Swish Finance – a zero-click way to watch your portfolio #

swishfinance.app faviconswishfinance.app
0 comments5:07 PMView on HN
The goal of Swish is to provide a zero-click way to view your portfolio.

I made Swish after finding it too annoying and tedious to check my portfolio in my broker or Google Finance. This app removes the conscious decision to "poll" your portfolio by _always_ showing it on your desktop. While this might not be everyone's cup of tea, I hope that you find it interesting nevertheless.

After graduating high school (in 2020) and first getting started with stocks, I was surprised there were no simple "widget" apps for stock tickers. Somehow, what used to exist in the Windows Vista era ceased to exist. So, I made this app purely to satisfy my own needs, but then I quickly started getting email suggestions from customers to add new features. Even though my initial goal was to create something similar to the widgets in Windows Vista, it has now evolved into a much more refined experience.

I render the widgets using native Direct2D/Direct3D11/DComp APIs. There are still optimizations to be made in the rendering pipeline (hardware accelerated tessellation, using DirectComposition swap chain, etc.). I strive for high performance as I know many folks use my app on their laptop, where battery life is important. There are also a ton of features I'd like to add, but haven't had the time.

Let me know what you think!

1

Presidents Flights – map of world leaders' trips #

presidentsflights.com faviconpresidentsflights.com
0 comments9:23 PMView on HN
I built an interactive map of international trips by world leaders with filters for departures, arrivals, and time period. Data is compiled from public sources (Wikipedia and official press releases), and each trip links to its source. Feedback welcome.