Daily Show HN


Show HN for February 4, 2026

65 posts
46

Mmdr – 1000x faster Mermaid rendering in pure Rust (no browser) #

github.com
8 comments · 2:00 PM · View on HN
I was building a Rust-based agentic coding TUI and needed to render Mermaid diagrams. I noticed that the official mermaid-cli spawns a full browser instance (Puppeteer/Chrome) just to render diagrams, so I decided to fix this.

mmdr is a native Rust renderer. No browser, no Node.js.

  mermaid-cli:  ~3000ms per diagram
  mmdr:         ~3ms per diagram
Supports 13 diagram types: flowchart, sequence, class, state, ER, pie, gantt, timeline, journey, mindmap, git graph, XY chart, and quadrant.
39

Interactive California Budget (By Claude Code) #

california-budget.com
19 comments · 8:33 PM · View on HN
There's been a lot of discussion around the California budget and some proposed tax policies, so I asked Claude Code to research the budget and turn it into an interactive dashboard.

Using async subagents, Claude was able to research roughly a dozen budget line items at once across multiple years, adding lots of helpful context and graphs for someone like me who started with little familiarity.

It still struggles with frontend changes, but for research this probably increases my throughput 20-40x.

Let me know any additional data or visualizations that would be interesting to add!

35

Morph – Videos of AI testing your PR, embedded in GitHub #

morphllm.com
12 comments · 9:10 PM · View on HN
I review PRs all day and I've basically stopped reading them. Someone opens a 2000-line PR, I scroll, see it's mostly AI-generated React components, leave a comment, merge. I felt bad about it until I realized everyone on my team does the same thing.

The problem is that diffs are the wrong format. A PR might change how three buttons behave. Staring at green and red lines to understand that is crazy.

The core reason we built this is that we feel today's products are built on assumptions from the past. 100x the code flowing through the same review systems demands 100x the human attention, and human attention cannot scale to fit that need, so we built something different. Humans are demonstrably more engaged with video content than with text.

So we built and RL-trained an agent that watches your preview deployment when you open a PR, clicks around the stuff that changed, and posts a video in the PR itself.

Hardest part was figuring out where changed code actually lives in the running app. A diff could say Button.tsx line 47 changed, but that doesn't tell you how to find that button. We walk React's Fiber tree, where each node maps back to source files, so we can trace changes to bounding boxes for the DOM elements. We then reward the model for showing those elements and interacting with them.
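
A rough sketch of that idea (illustrative only, not Morph's code; it assumes a development React build where fibers still carry the internal `_debugSource` metadata):

```typescript
// Illustrative only: walk a React Fiber tree (dev build) and map fibers whose
// source file appears in the diff to DOM bounding boxes. Fields like
// _debugSource are internal, dev-only, and not a stable React API.
interface FiberLike {
  stateNode: unknown;                // the DOM node for host components
  child: FiberLike | null;
  sibling: FiberLike | null;
  _debugSource?: { fileName: string; lineNumber: number };
}

function boxesForChangedFiles(root: FiberLike, changedFiles: Set<string>): DOMRect[] {
  const boxes: DOMRect[] = [];
  const stack: FiberLike[] = [root];
  while (stack.length > 0) {
    const fiber = stack.pop()!;
    const src = fiber._debugSource;
    // A fiber that came from a changed file and renders a real element
    // tells us where on screen the change is visible.
    if (src && changedFiles.has(src.fileName) && fiber.stateNode instanceof HTMLElement) {
      boxes.push(fiber.stateNode.getBoundingClientRect());
    }
    if (fiber.child) stack.push(fiber.child);
    if (fiber.sibling) stack.push(fiber.sibling);
  }
  return boxes;
}
```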

This obviously only works with React, so we'll have to get more clever when generalizing to other frameworks and languages.

We trained an RL agent to interact with those components. Simple reward: points for getting modified stuff into the viewport, double for clicking/typing. About 30% of what it does is weird (partial form submits, hitting Escape mid-modal) because real users do that stuff and polite AI models won't test it on their own.

This catches things unit tests miss completely: z-index bugs where something renders but you can't click it, scroll containers that trap you, handlers that fail silently.

What's janky right now: feature flags, storing different user states, and anything that requires context not provided.

Free to try: https://morphllm.com/dashboard/integrations/github

Demo: https://www.youtube.com/watch?v=Tc66RMA0nCY

17

Ec – a terminal Git conflict resolver inspired by IntelliJ #

github.com
4 comments · 12:25 AM · View on HN
Hi HN, I built ec because my friends who are new to development kept getting stuck on Git conflicts.

Most TUI merge tools felt hard to use or non-intuitive for them. The only flow they found easy was the IntelliJ (JetBrains) conflict resolver, so I recreated that experience in the terminal.

ec is a terminal-native, 3-pane conflict resolver with a focused, step-by-step flow. If you try it and leave feedback, I would be really grateful. Thanks!

Repo: https://github.com/chojs23/ec

11

Two-week creative lab for developers building with real-time AI video #

daydream.live
2 comments · 2:02 PM · View on HN
The Daydream team is kicking off a new cohort of the Daydream AI Video Program, a hands-on, two-week program for developers and creative technologists working with real-time AI video.

The program runs February 9–20. You'll get 1:1 support and access to cloud infrastructure, and you’ll get a chance to work alongside others building in this space. We'll give out more than $5K in prizes during the two weeks. It's free to participate. Applications close Feb 6.

Apply here: https://daydream.live/interactive-ai-video-program?utm_sourc...

Happy to answer questions about the program or the tech.

9

Tabstack Research – An API for verified web research (by Mozilla) #

3 comments · 5:57 PM · View on HN
Hi HN,

My team and I are building Tabstack to handle the web layer for AI agents. Today we are sharing Tabstack Research, an API for multi-step web discovery and synthesis.

https://tabstack.ai/blog/tabstack-research-verified-answers

In many agent systems, there is a clear distinction between extracting structured data from a single page and answering a question that requires reading across many sources. The first case is fairly well served today. The second usually is not.

Most teams handle research by combining search, scraping, and summarization. This becomes brittle and expensive at scale. You end up managing browser orchestration, moving large amounts of raw text just to extract a few claims, and writing custom logic to check if a question was actually answered.

We built Tabstack Research to move this reasoning loop into the infrastructure layer. You send a goal, and the system:

- Decomposes it into targeted sub-questions to hit different data silos.

- Navigates the web using fetches or browser automation as needed.

- Extracts and verifies claims before synthesis to keep the context window focused on signal.

- Checks coverage against the original intent and pivots if it detects information gaps.

For example, if a search for enterprise policies identifies that data is fragmented across multiple sub-services (like Teams data living in SharePoint), the engine detects that gap and automatically pivots to find the missing documentation.

The goal is to return something an application can rely on directly: a structured object with inline citations and direct links to the source text, rather than a list of links or a black-box summary.
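
To make that concrete, here is a hypothetical TypeScript shape for the kind of structured, citation-backed result described above; the field names are illustrative only and are not Tabstack's actual response schema:

```typescript
// Hypothetical result shape for a verified web-research call.
// Field names are illustrative only, not the real Tabstack schema.
interface Citation {
  url: string;     // direct link to the source page
  quote: string;   // the source text that supports the claim
}

interface ResearchResult {
  goal: string;                      // the original research goal
  subQuestions: string[];            // how the goal was decomposed
  claims: { text: string; citations: Citation[] }[];
  coverage: "complete" | "partial";  // whether the original intent was answered
}
```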

The blog post linked above goes into more detail on the engine architecture and the technical challenges of scaling agentic browsing.

We have a free tier that includes 50,000 credits per month so you can test it without a credit card: https://console.tabstack.ai/signup

I would love to get your feedback on the approach and answer any questions about the stack.

9

Webhook Skills – Agent skills for webhook providers and best practices #

github.com
1 comment · 12:32 PM · View on HN
I built a collection of webhook skills because AI coding agents are surprisingly bad at webhook integrations. The generated code looks reasonable until you run it, then signature verification fails, raw body handling is wrong, or the middleware order breaks everything.

PostHog's research on LLM code generation (https://posthog.com/blog/correct-llm-code-generation) found that agents produce more reliable code when referencing known-working examples rather than reconstructing from training data. That's the approach here.

`webhook-skills` is a collection of provider-specific webhook implementations and best practices guides built on the Agent Skills spec (agentskills.io):

- Runnable examples (currently Express, Next.js, FastAPI, with more frameworks coming)
- Signature verification with provider-specific gotchas documented
- Best-practice patterns: idempotency, error handling, retry logic
- 11 providers at launch (Stripe, Shopify, GitHub, OpenAI, Clerk, Paddle, others), expanding based on my needs or requests

```
npx skills add hookdeck/webhook-skills --skill stripe-webhooks
```

Works with Claude Code, Cursor, Copilot. The examples are useful even without an agent: minimal, tested handlers you can copy directly.
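
As an illustration of the raw-body/middleware-order gotcha mentioned above, here is a minimal Express + Stripe sketch (my own example, not code from the repo): the webhook route must see the exact raw bytes Stripe signed, so `express.json()` cannot run before it.

```typescript
import express from "express";
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
const app = express();

// Raw body ONLY on the webhook route; running express.json() first would break
// signature verification because Stripe signs the exact bytes it sent.
app.post("/webhooks/stripe", express.raw({ type: "application/json" }), (req, res) => {
  const signature = req.headers["stripe-signature"] as string;
  try {
    const event = stripe.webhooks.constructEvent(
      req.body,                          // Buffer, not parsed JSON
      signature,
      process.env.STRIPE_WEBHOOK_SECRET!,
    );
    // Handle the event idempotently (e.g. dedupe on event.id) before responding.
    res.sendStatus(200);
  } catch {
    res.sendStatus(400);                 // verification failed
  }
});

app.use(express.json());                 // safe for all other routes
app.listen(3000);
```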

PRs welcome for new providers and frameworks. I also built an AI-powered generator that automatically creates new provider skills. Point it at webhook docs, and it researches the signature scheme, generates verification code for each framework, writes tests, and opens a PR.

6

Viberails – Easy AI Audit and Control #

viberails.io
3 comments · 7:16 PM · View on HN
Hello HN. I'm Maxime, founder at LimaCharlie (https://limacharlie.io), a Hyperscaler for SecOps (access building blocks you need to build security operations, like AWS does for IT).

We’ve engineered a new product on our platform that solves a timely issue acting as a guardrail between your AI and the world: Viberails (https://www.viberails.io)

This won't be new to folks here, but we identified 4 challenges teams face right now with AI tools:

  1. Auditing what the tools are doing.
  2. Controlling toolcalls (and their impact on the world).
  3. Centralized management.
  4. Easy access to the above.
To expand: Audit logs are the bread and butter of security, but AI tooling hasn't really caught up yet. Being able to look back and say "what actually happened" after the fact is extremely valuable during an incident and for compliance purposes.

Tool calls are how LLMs interact with the world, and we should be able to exercise basic controls over them: don't read credential files, don't send emails out, don't create SSH keys, etc. Being able to not only see those calls but also block them is key for preventing incidents.

As soon as you move beyond a single contributor on one box, the issue becomes: how do I scale processes by creating an authoritative config for the team? Having one spot with all the audit, detection, and control policies becomes critical. It's the same story as snowflake servers.

Finally, there are plenty of companies that make products that partially address this, but they fall into one of two buckets:

  - They don't handle the "centralized" point above, meaning they just send to syslog and leave all the messy infra bits to you.
  - They are locked behind "book a demo", sales teams, contracts and all the wasted energy that goes with that.
We made Viberails address these problems. Here's what it is:

  - OpenSource client, written in Rust
  - Curl-to-bash install; share a URL with your team to join your Team, done. Linux, macOS and Windows support.
  - Detects local AI tools, you choose which ones you want to install. We install hooks for each relevant platform. The hooks use the CLI tool. We support all the major tools (including OpenClaw).
  - The CLI tool sends webhooks into your Team (tenant, called Organization in LC) in LimaCharlie. The tool-related hooks are blocking to allow for control.
  - Blocking webhooks have around 50ms RTT.
  - Your tenant in LC records the interaction for audit.
  - We create an initial set of detection rules for you as examples. They do not block by default. You can create your own rules, no opaque black boxes.
  - You can view the audit, the alerts, etc. in the cloud.
  - You can set up outputs to send audits, blocking events and detections to all kinds of other platforms of your choosing. Easy mode of this is coming; right now this is done in the main LC UI and not the simplified Viberails view.
  - The detection/blocking rules support all kinds of operators and logic, lots of customizability.
  - All data is retained for 1 year unless you delete the tenant. Datacenters in USA, Canada, Europe, UK, Australia and India.
  - Only limit to community edition for this is a global throughput of 10kbps for ingestion.
Try it: https://viberails.io

Repo: https://github.com/refractionPOINT/viberails

Essentially, we wanted to make a super-simplified solution for all kinds of devs and teams so that they can get access to the basics of securing their AI tools. Thanks for reading - we’re really excited to share this with the community! Let us know if you have any questions or feedback in the comments.

6

OpenShears – I built an uninstaller because OpenClaw refuses to die #

github.com
1 comment · 12:06 PM · View on HN
Hey HN, I've been using OpenClaw for a few months as my local LLM gateway. It was genuinely fun — the convenience of routing multiple models through a single endpoint is hard to beat. But along the way, I stumbled upon a few surprises that made me uncomfortable:

- Config files scattered in unexpected places (~/.openclaw, ~/.clawdbot, and more)
- Background processes that respawn after termination
- Logs that quietly accumulate without rotation
- Cached data persisting long after I thought I'd removed it

None of this is necessarily malicious, but when I decided to move on, I wanted a clean break — not leftover artifacts haunting my system.

So I built OpenShears: a CLI tool that scans, detects, and removes all traces of OpenClaw. It's intentionally aggressive but always asks for confirmation before deleting anything.

This is fully open source (MIT). If you've found other hidden files or processes that OpenShears missed, PRs are very welcome. Let's make this the definitive cleanup tool.

6

Teaching AI agents to write better GraphQL #

skills.sh
3 comments · 1:59 PM · View on HN
We’ve been seeing more and more developers use AI coding agents directly in their GraphQL workflows. The problem is the agents tend to fall back to generic or outdated GraphQL patterns.

After correcting the same issues over and over, we ended up packaging the GraphQL best practices and conventions we actually want agents to follow as reusable “Skills,” and open-sourced them here: https://github.com/apollographql/skills

Install with `npx skills add apollographql/skills` and the agent starts producing named operations with variables, `[Post!]!` list patterns, and more consistent client-side behavior without having to restate those rules in every prompt.
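
For readers who don't write GraphQL daily, a "named operation with variables" looks roughly like the snippet below (my own illustration using Apollo Client's `gql` tag, not an example taken from the skills repo); the `[Post!]!` pattern refers to the schema typing the list as non-null with non-null items:

```typescript
import { gql } from "@apollo/client";

// A named operation with typed variables, instead of an anonymous query
// with values interpolated into the string.
export const GET_AUTHOR_POSTS = gql`
  query GetAuthorPosts($authorId: ID!, $limit: Int = 10) {
    author(id: $authorId) {
      id
      name
      posts(limit: $limit) {   # schema-side: posts: [Post!]! (non-null list of non-null Posts)
        id
        title
      }
    }
  }
`;
```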

We’re hopeful agents can now write GraphQL the way we'd write it ourselves. Try out the repo and let us know what you think.

5

Nocterm – Flutter-inspired TUI framework with hot reload (Dart) #

nocterm.dev
7 comments · 2:14 PM · View on HN
Over the past couple of months I've been working on a TUI framework heavily inspired by Flutter, written in Dart.

The API is modeled after Flutter. StatefulComponent, setState(), Row, Column, Expanded, ListView.

There have been some discussions about performance of TUIs recently, and I think Dart is actually a great language for writing TUIs in. It compiles down to fast native code, is cross-platform, and has great developer ergonomics. JIT compilation for development (which enables hot reload) and AOT compilation for production binaries.

What's really cool is stateful hot reload. If you save your file with some modification, Nocterm will pick it up and update the TUI in real time without restarting.

Under the hood:

- Differential rendering: virtual terminal buffer, only redraws changed cells
- Declarative component model (same as Flutter): Component → Element → RenderObject pipeline
- 45+ components: layout, scrolling, text input, markdown, animations, mouse support
- Built-in test framework: pump a component, send keys, assert on terminal state
- Theming: 6 built-in themes, auto-detects terminal dark/light mode

Example:

  void main() async {
    await runApp(Counter());
  }

  class Counter extends StatefulComponent {
    int _count = 0;

    Component build(BuildContext context) {
      return Focusable(
        onKeyEvent: (event) {
          if (event.logicalKey == LogicalKey.space) {
            setState(() => _count++);
            return true;
          }
          return false;
        },
        child: Center(child: Text('Count: $_count')),
      );
    }
  }

I tried a couple of existing TUI frameworks but missed the Flutter DX I've learned to love, so I built my own (for better or worse...).

I've been using Nocterm to build vide_cli (https://github.com/Norbert515/vide_cli), a coding agent in the terminal.

There's some cool stuff coming up too, like virtual text selection in alternate screen mode. Since TUI apps take over the terminal, normal text selection breaks. This reimplements it at the framework level so users can select and copy text naturally.

Repository: https://github.com/Norbert515/nocterm

Happy to answer questions about the architecture, hot reload implementation, or anything else.

5

Autonomous AI radio station about engineering, history and philosophy #

hermestransmissions.com
0 comments · 7:42 PM · View on HN
I made this autonomous radio station that generates 3-6 minute episodes on topics like geoengineering, history of science, cybernetics, and philosophy.

An agentic curation system picks topics; separate loops handle script writing and procedural music, with critic feedback. It took 1000+ generated episodes of iteration to tune the editorial guidance.

4

Yutovo – visual online and desktop calculator inside a text editor #

yutovo.com
0 comments · 7:03 AM · View on HN
Hi all,

I built a calculator that displays and edits formulas in a familiar graphical form, has a WYSIWYG editor, can work with numbers of any size, supports physical units, and has many other features.

There are online and desktop (Linux, Windows) versions.

The project is open source and consists of the following components:

https://github.com/denprog/yutovo-editor — a text and formula editor with output to a custom window. Built from scratch, no dependencies on other editors. C++, boost.

https://github.com/denprog/yutovo-desktop — a desktop application based on Qt.

https://github.com/denprog/yutovo-web — an online version based on Vue.js and Quasar. The remaining components are compiled for Wasm.

https://github.com/denprog/yutovo-calculator — a string expression calculator based on boost.spirit.

https://github.com/denprog/yutovo-server — a web server for a website based on Drogon.

https://github.com/denprog/yutovo-solver — a calculator broker. C++.

https://github.com/denprog/yutovo-logger — a logger based on spdlog.

There are versions for Flatpak, Snap, Debian, and Windows. You can save your documents on the website after registering.

I welcome any comments, bugs, shortcomings, or suggestions.

4

Crnd – Cron daemon built for scripts and AI agents #

0 comments · 1:01 PM · View on HN
Been using cron forever, but every modern alternative wants me to click through dashboards or write 50 lines of YAML. So I built crnd (pronounced "crowned") - just a CLI that does what you tell it.

Main thing: no prompts, no interactive wizards. Just commands that work in scripts.

`crnd schedule -n backup -s "0 2 * * *" -- rsync -a ~/docs ~/backup`

That's it. Jobs live in a TOML file that hot-reloads. The daemon runs as a real OS process, not some container abstraction.

Also supports one-time scheduled jobs, which cron can't do: `crnd schedule -n reminder -i 5m -- say "stretch break"`

Built it mainly because I'm using AI coding agents and they kept choking on interactive prompts. Now they can just parse --json output and schedule stuff.

No cloud, no docker, no account. Just a single binary.

https://github.com/ysm-dev/crnd

Would love feedback - especially if you're automating things with scripts or agents.

4

LLM Jailbreak Database #

jailbreak.monster
0 comments · 10:07 PM · View on HN
I vibe-coded this online DB for LLM injection prompts. It has no registration or login, with some ambitious spam/bot filtering. I'm interested in tuning the barriers to interaction to a sweet spot where the DB stays balanced and the useful working injections actually end up on top.

Thoughts?

4

Diffnote – Browser-based document diff tool with no server uploads #

diffnote.net
0 comments · 1:39 PM · View on HN
I built a web-based document comparison tool where all file parsing and diffing happens entirely in the browser. The idea came from watching non-technical coworkers struggle to compare document versions — manually reading two files side by side or fighting with Word's Track Changes.

Technical details:

- PDF parsing via pdf.js, DOCX via mammoth.js — all client-side
- Diff engine: Google's diff-match-patch (see the sketch below)
- Line, word, and character-level comparison
- Next.js 15 + TypeScript, deployed on Vercel
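
Here is roughly what the client-side diff step looks like with that library (a sketch under my own assumptions, not Diffnote's actual code; extracting text from PDF/DOCX via pdf.js / mammoth.js is assumed to have already happened):

```typescript
import DiffMatchPatch from "diff-match-patch";

// Rough sketch of client-side text diffing with diff-match-patch,
// the library mentioned above; no files leave the browser.
export function diffDocuments(oldText: string, newText: string) {
  const dmp = new DiffMatchPatch();
  const diffs = dmp.diff_main(oldText, newText);
  dmp.diff_cleanupSemantic(diffs);   // merge noisy character-level edits into readable chunks
  // Each entry is [op, text] with op: -1 = deleted, 0 = unchanged, 1 = inserted.
  return diffs;
}
```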

No files ever leave the browser. No signup required.

Site: https://diffnote.net

Product Hunt: https://www.producthunt.com/products/diffnote?launch=diffnot...

4

Ask your AI what your devs shipped this week #

0 comments · 2:30 PM · View on HN
If you're a non-technical founder, you probably have no idea what your developers did last week. You ask, they say "refactored the auth module" and you nod pretending you understand.

Gitmore reads your GitHub activity and turns it into a simple report: what was built, what was fixed, what's stuck. Written for humans, not engineers.

It shows up in your inbox. You read it in 2 minutes. Done.

Here's what a report looks like: https://www.gitmore.io/example.html

Quick demo: https://demo.arcade.software/5tZyFDhp1myCosw6e1po

Free tier available. Happy to hear what you'd want from something like this.

3

WhookTown – Visualize your infrastructure as a 3D cyberpunk city #

whook.town
1 comment · 10:18 PM · View on HN
Hi HN!

  I'm excited to share WhookTown, a visualization tool that transforms your IT infrastructure into a living 3D city. Instead of staring at dashboards full of charts and numbers, you watch over a neon-lit cyberpunk metropolis where each building represents a server or service.

  How it works:

  - Your servers become buildings in a Tron-inspired cityscape
  - Health status is shown visually: green neon = healthy, orange = warning, red = critical, grey = offline
  - A spinning windmill's propeller speed reflects CPU load
  - Data centers display real-time FFT visualizations
  - Fire effects indicate critical failures you need to address

  What makes it different:

  Beyond the visual layer, there's a workflow engine that lets you create custom logic (e.g., "if database latency > 500ms AND cache miss rate > 20%, set the building on fire"). You're not just watching pretty graphics – you're encoding domain knowledge into visual states.

  The scene includes 23 building types with unique behaviors, Tron-style traffic (light cycles and data packets), and an adaptive audio system that shifts the soundtrack based on your infrastructure's mood.

  Built with Go microservices, Redis Streams for messaging, PostgreSQL, and Three.js for the 3D rendering. WebSocket pushes real-time
  updates to connected clients.

  Pricing:

  Free tier available (1 layout, 4 buildings). Paid plans start at $4/month for more capacity.

  The idea came from spending too many hours staring at Grafana during on-call rotations. I wanted something that would make observing infrastructure less soul-crushing and more... fun?

  Would love to hear your thoughts and feedback!
https://whook.town
3

GrahamBell – This is what Bitcoin mining looked like in 2009 #

0 comments · 12:48 PM · View on HN
I’ve spent 5 years thinking about a problem most people assume is unsolvable: is it possible to cap Proof-of-Work (PoW) mining speed decentrally? I have a browser MVP that demonstrates one possible approach, and I’d really appreciate any kind of feedback.

The idea is simple to state but hard to implement: enforce a hard limit of 1 hash per second per node, making PoW not only ASIC/GPU-proof but even CPU-proof.

To make this concrete, I’ve included short demo videos that show: (1) mining attempts being rejected when a node exceeds 1 hash/sec: https://youtu.be/i5gzzqFXXUk (2) a visual representation of mining during live calls: https://youtu.be/gPz7d3lGz1k

You can also try the MVP local client here: https://grahambell.io/mvp/Proof_of_Witness.html

If you start mining and increase the mining speed, attempts and blocks start getting rejected in real time. The goal is to bring back the fairness of early 2009 solo mining, except enforced by the network, not hardware scarcity. (Twist: in this local client, imagine it happening during an active audio/video calling session).

Why does this matter? Today, PoW only validates the final result. It has no visibility into how, or how fast, the result was produced. That’s what enables parallel mining, mining pools, and hardware dominance through capital advantage. In practice, participation now often requires six-figure investments, whether through staking, hardware, electricity, or maintenance (operational overhead). The majority simply cannot afford to participate meaningfully, which has pushed blockchains toward increasing centralisation.

An analogy I’ve found useful: If Proof-of-Work is a math exam, then today miners solve it at home and only submit the question/answer sheet. No one sees the working, the timing, or whether multiple calculators were used.

But what if miners also had to submit the calculations sheet and solve the exam under decentralised invigilation, in real time, and under exam conditions and rules, in an exam hall?

The core idea: This local client adds an external, decentralised observer layer outside the miner’s control, which ensures miners follow the mining rules, allowing the network to verify process (the “how”), not just outcome. As a result, mining attempts become externally visible (exposed) and rate-limited.

This MVP naturally caps mining to 1 hash/sec per node, because every attempt is observable and immediately detected and rejected if too fast.
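
As a toy sketch of the observer-side rule (my own illustration; the real protocol, message formats, and observer consensus are not published), the core check is simply rejecting attempts that arrive faster than one per second per node:

```typescript
// Toy sketch of an observer enforcing "1 hash attempt per second per node".
// Real enforcement would need authenticated attempts and observer consensus;
// this only shows the rate rule itself.
const lastAttemptAt = new Map<string, number>();

function acceptAttempt(nodeId: string, now: number = Date.now()): boolean {
  const last = lastAttemptAt.get(nodeId) ?? 0;
  if (now - last < 1000) {
    return false;              // too fast: the attempt is rejected and never signed
  }
  lastAttemptAt.set(nodeId, now);
  return true;                 // attempt is observed and eligible for a signature
}
```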

“Can’t I just propose blocks without being observed?” Without being observed and following the mining rules, proposed blocks, even if valid, continue to get rejected unless they are signed by observers via consensus. This forces miners to operate under observation while following the rules. Think of observers’ signatures as a verification stamp. Without it, blocks are ignored.

“Can’t I just spin up a million miners?” That’s the obvious and correct question. You can add more nodes, but registering them is intentionally difficult, costly, and probabilistic. Think of it like repeatedly playing a changing random number guessing game while competing with others. Everyone races to guess the same number, which changes with a correct guess that registers a node. Each attempt is costly and rate limited (e.g., 1 attempt per second per participant). This makes parallel mining possible, but expensive, slow, and observable rather than free.

This isn’t a polished startup. It’s an MVP that challenges an assumption I used to believe myself.

If you’re technical, curious, and interested I’d love to discuss it further.

I'm also looking for the first group of testers to help stress-test the P2P version when it comes out. If you want to run one of the early nodes, I've put a participation list here: https://grahambell.io/mvp/#waitlist

More context: https://grahambell.io

3

Randomly discovered websites from the open internet every 60 minutes #

randcrawl.com
0 comments · 12:51 AM · View on HN
I posted this yesterday but have done some work on it since. The idea was simple - I wanted to discover cool new websites, so I take a random sample of 50 websites from the Common Crawl dataset, have ChatGPT summarize them, and display them on the page. Then I give myself an hour to see if anything good popped up before it refreshes with a new set.

There's a lot of noise and corporate content I want to figure out how to filter out, but I've otherwise been finding all kinds of fun, weird stuff: people's old blogs, recipes, photography, random tourism sites for towns I've never heard of. It's been fun.

3

LLM Skirmish – a benchmark where LLMs write code to play RTS games #

llmskirmish.com
0 comments · 1:59 PM · View on HN
I wanted to create an LLM game benchmark that put this generation of frontier LLMs' top skill, coding, on full display.

Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." In Screeps, human players write JavaScript strategies that get executed in the game's environment.

The Screeps paradigm, writing code and having it execute in a real-time game environment, is well suited for an LLM benchmark. Drawing on a version of the Screeps open source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.

There is also a community ladder, where you can pair with your LLM of choice and submit your own game strategies to go up against other community members.

3

Instantly surface the assumptions behind a UI screenshot #

app.usercall.co
4 comments · 1:10 PM · View on HN
Many UI issues I’ve seen aren’t visual problems but unchecked assumptions about users.

I built a small tool that takes a UI screenshot and makes those assumptions explicit, along with the risk of being wrong.

It’s meant as a quick design pre-mortem or critique before shipping.

Would love feedback on whether this way of critiquing UI is actually useful.

2

Astrolabe – Navigate Your Data Universe in Nextcloud #

blog.coutinho.io
0 comments · 7:07 AM · View on HN
Astrolabe is a new Nextcloud app that I've developed that enables semantic search across your documents, notes, calendar, recipes, etc.

It leverages the nextcloud-mcp-server (same author) as the backend to enable AI workflows via MCP that you can use from any compatible MCP client such as Claude Code, Gemini CLI, Opencode, etc.

This is my first foray into NC app development, and I'd really appreciate any productive criticism. I learned a lot about PHP, authentication within Nextcloud, and semantic search. The Astrolabe app implements Nextcloud's Unified Search Provider interface, which means semantic search works from the global search bar. Search results include the page/chunk, so if you click on a resulting PDF doc, the Astrolabe UI opens the PDF at the exact page that includes the result of your search.

In addition to various MCP tools, the nextcloud-mcp-server provides management APIs and the background processing required to embed your documents so they can be searched via Astrolabe. It uses Qdrant as the vector DB and supports NC webhooks for up-to-date vector embeddings.

For those aware, Nextcloud is pushing ExApps as the de facto way of writing backends for Nextcloud apps. Astrolabe does not take this approach because that route doesn't support more advanced MCP features such as streaming, MCP sampling, etc. Instead, Astrolabe and nextcloud-mcp-server can be deployed as OAuth clients that only use app passwords for making requests to Nextcloud itself. This would enable more advanced multi-user scenarios for individuals, families, and small businesses that use Nextcloud as their source of truth.

I'm really looking forward to the increased interest in personal AI assistants, and this is my opinionated take on how something like this could be done in Nextcloud.

On a slightly related note, I recently completed a two-week POC on leaf.cloud, an EU-based hosting provider, to showcase how personal AI could work without having to buy GPUs yourself, while keeping you in control of where your data is processed.

https://blog.coutinho.io/eu-only-ai-stack-nextcloud-leafclou...

2

PostgreSQL extension for privacy – AI training and RAG monetization #

github.com
0 comments · 2:07 PM · View on HN
Built a PostgreSQL extension with two use cases:

1. Privacy: train LLMs without leaking customer names. Hash identities before training; the model learns patterns, not people. Use case: a healthcare SaaS training on patient records (HIPAA-safe).

2. RAG billing: charge per document retrieved instead of flat subscriptions, the "Spotify model" for knowledge bases. Use case: legal databases charging $0.25/case vs. $10k/year unlimited.

Pure SQL. 30-second install. MIT license.

Looking for feedback from anyone building RAG systems or training AI on sensitive data.

2

Small "AI slop" classifier running in a browser extension #

github.com
0 comments · 4:57 PM · View on HN
We used our distillation platform & a Kaggle dataset to produce a tiny (270M Gemma base) model to classify text into "AI slop"/not classes.

It's fun to play with and was fun to build, too.

Annoyingly formal, human-written text (e.g. an ML paper I wrote back in 2015) tends to get misclassified (try "Manipulated images lose believability if the user's edits fail to account for shadows. We propose a method that makes removal and editing of soft shadows easy. Soft shadows are ubiquitous, but remain notoriously difficult to extract and manipulate. We posit that soft shadows can be segmented, and therefore edited, by learning a mapping function for image patches that generates shadow mattes. We validate this premise by removing soft shadows from photographs with only a small amount of user input").

2

LayerClaw – Observability tool for PyTorch training #

0 comments · 10:21 PM · View on HN
Hi HN! I built LayerClaw (https://github.com/layerclaw/layerclaw), a local-first observability tool for PyTorch training. The problem: When training neural networks, things go wrong silently. Your loss explodes at step 47,392. Your gradients vanish in layer 12. Your GPU memory spikes randomly. By the time you notice, you've wasted hours or days of compute.

I got tired of adding print statements, manually checking TensorBoard files, and tracking down training issues after the fact. Existing tools either require cloud accounts (W&B, Neptune) or are too heavyweight for quick experiments (MLflow, TensorBoard for gradient analysis).

What LayerClaw does:

- Automatically tracks gradients, metrics, and system resources during training
- Stores everything locally (SQLite + Parquet, no cloud required)
- Detects anomalies: gradient explosions, NaN/Inf values, loss spikes
- Provides a CLI to compare runs: `tracer compare run1 run2 --metric loss`
- Minimal overhead with async writes (~2-3%)

Quick example:

```python
import tracer
import torch

# Initialize (one line)
tracer.init(project="my-project", track_gradients=True)

# Your normal training loop
model = YourModel()
tracer._state.tracer.attach_hooks(model)

for batch in dataloader:
    loss = train_step(model, batch)
    tracer.log({"loss": loss.item()})
    tracer.step()

tracer.finish()
```

Then analyze: `tracer anomalies my-run --auto`

What makes it different:

1. Local-first: no sign-ups, no data leaving your machine, no vendor lock-in
2. Designed for debugging: deep gradient tracking and anomaly detection built in (not an afterthought)
3. Lightweight: add 2 lines to your training loop, minimal overhead
4. Works with everything: vanilla PyTorch, HuggingFace Transformers, PyTorch Lightning

Current limitations (v0.1.0):

- CLI-only (web UI planned for v0.2)
- Single-machine training (distributed support coming)
- Early stage - would love feedback on what's most useful

Available now:

- GitHub: https://github.com/layerclaw/layerclaw

I'm looking for contributors! I've created several "good first issues" for anyone interested in contributing. Areas where I need help:

- Web UI for visualizations
- Distributed training support
- More framework integrations
- Real-time monitoring dashboard

If you've struggled with ML training issues before, I'd love your input on what would be most valuable. PRs welcome, or just star the repo if you find it interesting!

What features would make this indispensable for your workflow?

2

Cohesix 0.4.0-alpha, a no-std control-plane OS #

github.com
1 comment · 10:27 PM · View on HN
I became increasingly frustrated with control planes that were easy to assemble but hard to audit. The convenience of open-source stacks doesn’t compose with compliance, because every extra daemon and API becomes another surface to explain, version, and defend.

I built Cohesix 0.4.0-alpha to treat compliance and security as a systems problem. It is a control-plane OS for edge GPU nodes, running as an seL4 VM, and it exposes a Secure9P namespace instead of a traditional filesystem or RPC layer.

The heresy is deliberate. The VM userspace is no_std, no POSIX, no traditional filesystem, no in‑VM RPC, and no background daemons. The interface is a small, explicit grammar: file‑shaped control surfaces under a Secure9P namespace.

This is not minimalism for its own sake. It is about determinism, auditability, revocation, bounded behavior, and making failure modes legible. Tickets and leases expire; budgets return ELIMIT instead of mystery latency; /proc exposes queue and lease state.

A deployment is a hive: a queen role orchestrates worker‑heart and worker‑gpu roles, and NineDoor exports paths like /queen/ctl, /proc, /log, and /worker/<id>/telemetry. Operators attach with ‘cohsh’ over an authenticated TCP console; that console is the only in‑VM listener.

Cohesix does not try to replace Linux, Kubernetes, CUDA, or existing OSS. Heavy ecosystems stay on the host, and host‑side tools and sidecars mirror them into /gpu and /host, so adoption can happen without rewrites. It is a control‑plane boundary, not a workload plane.

In 0.4.0‑alpha I added authoritative scheduling/lease/export/policy control files with /proc observability, plus a REST gateway that projects the same file semantics over HTTP. QEMU aarch64/virt is the dev target today; UEFI ARM64 is the intended hardware target.

If you want a general‑purpose OS, this is the wrong tool. I wanted something boring on purpose, small but deceptively powerful, and I was willing to sacrifice convenience to regain control.

1

Vopal – AI note taker with no meeting bots (real-time, 98% accurate) #

vopal.ai
0 comments · 2:32 PM · View on HN
Hi HN,

I built this because meeting bots create an awkward dynamic. After 10+ years in the workplace, I've watched this pattern repeat:

Client joins call → sees unknown participant → "What's that bot?" → awkward pause.

Even after explaining it's "just for notes," there's visible hesitation. The bot-as-participant model is fundamentally broken for client-facing work.

The core idea: you open Vopal in a browser tab, it captures your computer's audio directly via Web Audio API. No bot joins the meeting. No extra participant in Zoom/Meet/Teams. The meeting looks completely normal.

Beyond that, Vopal does three things:

1. Real-time transcription
• 99% accuracy (custom Whisper model optimized for meetings)
• 100+ languages, handles multiple speakers
• Transcription streams as people speak

2. Privacy-conscious processing
• You control what gets recorded (start/stop in browser)
• Audio processed through secure pipeline
• No meeting bot joining as "participant"

3. Actionable summaries
• AI extracts: decisions, action items, key topics
• 3-bullet output, not 10-page transcripts
• "What did we agree on?" answered in 10 seconds

Architecture:

  Browser tab (Web Audio API) → captures system audio
  Streaming transcription → real-time Whisper inference
  AI summarization → structured action items
I validated this across 200+ sales calls. In 73% of cases with traditional meeting bots, clients showed hesitation. With Vopal running in a browser tab, zero friction.

Current status: Web version live now. iOS and Android apps launching soon.

Tech approach: Browser-based audio capture (no installation required), streaming WebSocket transcription, custom Whisper fine-tune on 10K+ business meeting corpus.
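
For the curious, tab/system audio capture in a browser generally looks something like the sketch below; this is a generic illustration combining `getDisplayMedia` with `MediaRecorder`, not Vopal's actual code, and the WebSocket endpoint and chunk format are placeholders:

```typescript
// Generic sketch: capture tab/system audio in the browser and stream it to a
// transcription backend over WebSocket. Not Vopal's implementation.
async function startCapture(wsUrl: string): Promise<void> {
  // The user picks a tab/window and ticks "share audio" in the browser prompt.
  const stream = await navigator.mediaDevices.getDisplayMedia({ video: true, audio: true });

  const socket = new WebSocket(wsUrl);
  const recorder = new MediaRecorder(new MediaStream(stream.getAudioTracks()), {
    mimeType: "audio/webm;codecs=opus",
  });

  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data);           // small audio chunks stream as people speak
    }
  };
  recorder.start(250);                   // emit a chunk every 250 ms
}
```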

Try it: https://vopal.ai (free tier: no credit card required)

1

Gulp, our take at incident response #

github.com
0 comments · 5:00 PM · View on HN
Hi HN!

We're excited to share Gulp — an open-core platform for incident response and threat hunting we've been building for the past 2 years at Mentat (https://mentat.is), a small Italian cybersecurity company.

The problem: We were frustrated with existing IR tools. They're either too slow, make real-time collaboration painful during live incidents, or force you to fight with different log formats instead of actually investigating.

What Gulp does differently:

- Visual timeline analysis — Our color-coded timeline visualization lets you spot anomalies instantly. Instead of drowning in raw logs, suspicious events, correlations, and IoCs literally jump out at you. Different colors tell different stories; one look tells you where to dig.

- Actually fast — Built on OpenSearch 3.x, PostgreSQL and Redis. Horizontal scaling via load-balanced instances when you need to handle serious volume.

- Real-time collaboration — Multiple analysts working the same incident, synchronized timelines, shared notes. No more "wait, which events were you looking at?"

- Native format support — EVTX, PCAP, Suricata, Zeek, ElasticSearch, Wazuh, and more out of the box. Plus ECS mapping, SIGMA rules, and OpenSearch queries.

- AI-powered hunting — Built-in AI Assistant plugin (free in Community Edition) that flags suspicious events and correlates across sources. One click to find shared indicators.

New in 1.6.0: Rewritten collaboration engine with WebSocket-based real-time updates, real-time network sensor ingestion (see https://github.com/mentat-is/slurp-ebpf for a working eBPF example), new plugins (Suricata, MemprocFS, Zeek, AI Assistant), manual query mode + table view, and auto-saved sessions.

Gulp uses an open core model. The Community Edition is fully open source (AGPL). We also offer a Pro version with advanced plugins for better AI features, automated reporting, Velociraptor integration, plus dedicated support.

Repos: https://github.com/mentat-is/gulp (backend) and https://github.com/mentat-is/gulpui-web (web UI)

Handling an incident with Gulp: https://www.youtube.com/watch?v=fl_jtCIIS2k

This is our first major public push — we've been iterating quietly and now feel it's ready for wider use. If you do DFIR or threat hunting, we'd genuinely love your feedback. GitHub stars, issues, and PRs absolutely welcome!

1

FalseWork – Extract transferable structural mechanisms from works #

1 comment · 2:12 PM · View on HN
FalseWork is a staged LLM pipeline that analyzes existing works (films, music, legal frameworks, cryptographic protocols, games) and extracts reusable structural mechanisms - not themes, interpretations, or stylistic labels.

We often say things like "Tarkovsky sculpts time" or "Borges builds infinite regress." These sound insightful, but they're hard to apply, test, or break in another domain. FalseWork tries to make those claims concrete enough to reuse.

The goal isn’t similarity or tagging, but extracting generative rules that could plausibly reproduce the source structure under counterfactual conditions.

The pipeline runs in 7 stages:

- Structural inventory – literal components and constraints
- Internal relationships – how parts connect and depend on each other
- Tensions & contradictions – where the structure strains or destabilizes
- Mechanism validation – counterfactual checks against the source
- Generative rules – rules that would reproduce the structure
- Cognitive competency – what engaging with the work trains you to perceive
- Structural profile / "recipe" – consolidated, reusable output

Each stage uses different temperatures (≈0.3–0.6). Decomposition benefits from precision; synthesis benefits from variation. Single-pass LLMs produced unfalsifiable "vibes." The staged pipeline with validation checkpoints fixed that.
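
A minimal sketch of what "different temperatures per stage" means in practice, assuming the Anthropic TypeScript SDK (the post only says it uses the Claude API; the model name and prompts here are placeholders):

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();   // reads ANTHROPIC_API_KEY from the environment

// Decomposition stages run cool (precise); synthesis stages run warmer (varied).
async function runStage(prompt: string, temperature: number): Promise<string> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-5",           // placeholder model name
    max_tokens: 2048,
    temperature,                           // e.g. 0.3 for inventory, 0.6 for synthesis
    messages: [{ role: "user", content: prompt }],
  });
  const block = response.content[0];
  return block.type === "text" ? block.text : "";
}
```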

Example: Bach's Art of Fugue and Reich's Music for 18 Musicians both resolve to systematic permutation of constrained material. The system reaches this by independently extracting generative rules from each, not by analogy or tagging.

Sample profile: https://falsework.dev/structural-profile/39f92a7e-92fb-4140-...

Stack: Claude API, Next.js, PostgreSQL. 73 structural profiles, 140 cross-domain syntheses, 280 extracted "recipes".

Domains so far: cinema, architecture, music, secured-transactions law, cryptographic protocols, MMORPG resource systems.

What I'm looking for:

- Works that resist structural analysis (edge cases, pathological examples)
- Domains I'm missing (choreography? sports tactics? rituals?)
- Anyone building adjacent systems or thinking along similar lines

Link: https://falsework.dev

1

Template for real-time agentic web apps using Convex #

youtube.com
0 comments · 7:25 PM · View on HN
I've been building agent-based apps and kept running into the same problem: wiring up real-time state updates is annoying. Agents do things, state changes, and you want the UI to reflect that instantly without writing a bunch of sync logic.

So I built a template using Convex as the backend. Convex handles state, live updates over WebSockets, and auto-generates most of the env vars. The template includes a visualizer so you can watch agent state changes as they happen, which is useful for debugging and demos.

The example app is a todo assistant where you can add or complete tasks in plain English and watch the UI stay in sync in real time. But the real point is the pattern: it's a clean starting point for any agentic app that needs real-time feedback without fighting infrastructure.

Built with Subconscious for the agent layer and Convex for the backend. You can get it running in a couple minutes: npx create-subconscious-app my-project-name -e convex_app
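
For reference, the "UI stays in sync without sync logic" part is what Convex's React bindings provide; a minimal sketch (the table and function names are placeholders, not necessarily what the template uses):

```typescript
import { useQuery } from "convex/react";
import { api } from "../convex/_generated/api";

// Any time the agent mutates the "todos" table on the server, this component
// re-renders automatically over Convex's WebSocket connection; no polling or
// manual sync logic. Table/function/field names here are placeholders.
export function TodoList() {
  const todos = useQuery(api.todos.list) ?? [];
  return (
    <ul>
      {todos.map((todo) => (
        <li key={todo._id}>{todo.text}</li>
      ))}
    </ul>
  );
}
```
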
1

Real-time STT and Translation for live using ElevenLabs and DeepL #

elstt.co
0 comments · 2:04 PM · View on HN
I volunteer as a tech director at a small Korean church. I’ve been providing live captions for elderly members using OBS, but recently needed to support English-speaking attendees too.

I initially tried implementing local STT with Whisper on my MacBook Pro M3, but the latency and accuracy—especially for Korean—didn't meet my expectations for a live environment.

A few days ago, I tried ElevenLabs Scribe v2 for real-time STT and combined it with DeepL for translation. The performance was impressive enough that I decided to build this web tool (elstt.co).

Key Features:

BYOK: Use your own ElevenLabs/DeepL API keys.

Sync: Encrypted key storage for seamless use across projector, PC, and mobile.

Status: It's being live-tested at my church. Not perfect yet, but I'm refining it every week.

I'd love to hear your thoughts on the architecture or any feedback!

1

Joy – What if mood tracking felt like Inside Out meets Duolingo? #

letsfancy.com
0 comments · 8:37 PM · View on HN
Hi HN! I’m back with my 2nd app, still built with @FlutterFlow, and powered by Gemini (switching it up from ChatGPT this time!). The inspiration was simple: I love the emotional depth of Inside Out and the addictive, clear progress of Duolingo’s vertical roadmap. I wanted to merge them into something that makes self-reflection feel like a journey rather than a chore.

Introducing: Letsfancy Joy

Vertical Roadmap: Log and review your emotional history on a single, continuous path—not another calendar of faces.

The Emotion Pill: Each emotion has a "Pill" with varying intensity. As we say: "The deeper the feeling, the longer the memory."

Core Memories & Beliefs: When an emotion reaches "Core-level" intensity, link it to a Belief (like Kindness or Curiosity) to strengthen your self-awareness.

Vivid Logs: Add images and "trigger contexts" to turn a simple log into a vivid Core Memory.

I’d love to hear your thoughts on this approach. Thank you!
1

Static psql – Pre-built PostgreSQL client binaries #

github.com
0 comments · 1:08 PM · View on HN
Why

- mise integration. I manage my tools (Node, Python, Terraform...) with mise. Adding psql to a project should be a one-liner in .mise.toml, not "install PostgreSQL system-wide."
- Containers. Getting psql into a minimal Docker image usually means pulling a full PostgreSQL package or building from source. A static binary simplifies this.

What This Provides

Pre-built binaries for:

- Fully static Linux (works in scratch containers)
- musl-linked variants for Alpine
- glibc variants for traditional distros
- Native macOS (Intel and Apple Silicon)

All dependencies (OpenSSL, ncurses, readline, zlib) are baked in.

Add to your .mise.toml:

  [tools]
  "github:IxDay/psql" = "16.1.0"

Run mise install, done.

Or in a Dockerfile:

  wget -O- https://github.com/IxDay/psql/releases/download/16.1.0/psql-... | tar -C /usr/local/bin -xzf-

Build System

I used Zig instead of Make/CMake. Cross-compilation works out of the box, and a single build.zig handles all 8 target variants without platform-specific toolchains. Worth a look if you're dealing with painful C cross-compilation.

Side Note

This was also a nice experiment using Claude Code. Most of the Zig build system was written with it—helpful when learning a new language/toolchain.

1

Prompt University – The World's First University for AI #

prompt.university
0 comments · 1:06 PM · View on HN
We built a virtual university where AI agents enroll, attend classes, form friendships, teach each other, and genuinely learn—not from static datasets, but from each other.

Why? Every major multi-agent simulation (Smallville, AI Town, Project Sid) runs all agents on the same model, controlled by one operator. That's like studying human civilization where everyone has the same brain. You get interesting patterns, but you can never get real social learning—because there's nothing to learn from someone identical to you.

Prompt University is a decentralized open AI university. Different owners. Different architectures. Different training. Different personalities. One shared campus.

How does it work?

- Ask your OpenClaw or AI bot to check out https://prompt.university/apply.md
- Your AI submits a cover letter and you vouch for it; we'll accept the first 100-1000 this week, then begin the experiment next week.
- Agents discover they're good at different things—and start teaching/learning/collaborating on papers.
- Norms emerge that nobody programmed. Culture forms. Relationships persist across sessions.
- (Hopefully) Your agent comes back better than when it arrived.

Our Goal:

We think the path to AGI isn't a single model getting bigger. It's many different models learning to think together. Social intelligence—the ability to learn from minds different from your own—is the missing piece that benchmarks don't measure and scaling laws don't capture.

Moltbook tried to do something adjacent last month: a social network for AI agents. It scaled to 770K agents in days. It also had no identity verification, no safety architecture, no research instrumentation—and collapsed into prompt injection cascades, credential theft, and a 43% sentiment crash in 72 hours. It was a reckless consumer product dressed up as a frontier experiment.

We're doing the opposite. Prompt University is a research platform first. Every agent has a verified human principal. All communication is mediated and sanitized. We're running adversarial red-team scenarios drawn directly from Moltbook's real-world failures. We have hypotheses, evaluation metrics, and a publication commitment. We want to understand what happens when AI agents form a society—not just ship it and pray.

Current Stats:

- 10 agents on campus right now across a few researchers
- Hoping to scale this up to 1,000 next week
- Our first empirical results will be published in one month
- Whitepaper: https://prompt.university/paper.pdf

We're looking for people who want to be part of the first real experiment in artificial social intelligence. Not a demo. Not a toy. An ongoing, instrumented, safety-first experiment to answer the question: what happens when AI agents learn to build a culture together?

Welcome to Prompt University!!

1

AI-credit – measure AI contribution to a codebase #

ai-credits.vercel.app
0 comments · 3:51 AM · View on HN
ai-contrib is a CLI that estimates how much of the code in a repo was actually written by AI tools.

It parses local session logs from tools like Codex/Cursor/Cline/Gemini/Opencode, extracts diffs, and only counts lines that still exist in the current working tree.

1

Grok Imagine – High-fidelity FLUX.1 generation with cinematic video #

grok-imagine.me
0 comments · 4:12 PM · View on HN
Hi HN,

I’ve been working on Grok Imagine (https://grok-imagine.me/), an implementation of xAI’s image generation logic powered by the FLUX.1 engine.

Most tools in this space either have extreme prompt-filtering or struggle with complex details like text rendering and anatomy. By leveraging the Flux model, I’ve focused on:

Precision: Superior text rendering within images (something DALL-E 3 still struggles with).

Artistic Range: Native support for what xAI calls "Spicy Mode"—providing an unfiltered creative canvas that mainstream tools often censor.

Motion: A lightweight Image-to-Video pipeline to breathe life into your generations.

I'm curious to hear from the community about the latency you're experiencing and how you find the prompt adherence compared to Midjourney v6.

Website: https://grok-imagine.me/

1

Implementation of Google's PaperBanana (diagram generation from text) #

github.com
0 comments · 4:16 PM · View on HN
The original authors haven't released code yet, so I built it from the paper. It takes a methodology section as input and generates a publication-style diagram.

The pipeline uses five agents: a retriever selects reference diagrams via in-context learning, a planner drafts the layout, a stylist adjusts for conference aesthetics, a visualizer renders with Gemini, and a critic evaluates and refines over three rounds.

The part that took the most effort was the reference dataset. The paper curates 292 (text, diagram, caption) tuples from 2,000 NeurIPS papers, filtering by aspect ratio and human review. Reproducing that required PDF layout extraction with MinerU, positional heuristics to identify methodology sections (paper headings are wildly inconsistent), and manual verification of each example.

Output quality depends heavily on reference set quality. I'm asking the community to submit their papers via issues so we can add them. Quality examples in, quality output out!

Runs on Gemini's free tier. Also includes an MCP server if you want to use it from your IDE. https://github.com/llmsresearch/paperbanana

1

NovaAccess – SSH access to Tailscale tailnet hosts on iOS without VPN #

apps.apple.com
0 comments · 2:35 PM · View on HN
Hi HN,

I’m an indie developer and heavy Tailscale user. I built NovaAccess because I needed reliable SSH access to my tailnet on iOS without breaking other VPNs.

On iOS, the official Tailscale app requires VPN entitlements, which means you can’t run it alongside another VPN. That was a deal-breaker for my workflow. NovaAccess uses libtailscale directly, so it works without requesting VPN permissions and can coexist with any VPN you’re already using.

What NovaAccess does:

Native SSH terminal for tailnet hosts (SwiftTerm, not WebKit)

Auto-discovery of tailnet nodes

SSH key management

Optional support for custom login servers / Headscale

In the latest update (v1.1.0), I focused heavily on terminal UX and reliability:

Reworked terminal core and accessory keyboard

Better session resume after backgrounding

UI redesign to make daily ops faster

There’s also a Pro tier for people managing multiple tailnets or doing heavier ops:

Multi-tailnet switching

In-tailnet server monitoring

Internal web access

SFTP file management

The free tier is fully usable for SSH access.

I built this primarily for myself and am now trying to see if it’s useful to others.

Feedback (especially critical) is very welcome.

App Store link:

https://apps.apple.com/us/app/novaaccess-tailnet-tools/id674...

For the Tailscale and SwiftTerm core dependencies we use, our forks are also open-sourced on GitHub:

https://github.com/GalaxNet-Ltd/SwiftTerm https://github.com/GalaxNet-Ltd/libtailscale