Show HN for October 9, 2025
33 posts

I've built a tiny hand-held keyboard #
Open-Source Voice AI Badge Powered by ESP32+WebRTC #
video[0]
The idea is that you could carry this hardware around and ask it questions about the conference: who is speaking, what they're speaking about, and so on. It connects via WebRTC to an LLM and you get a bunch of info.
This is a workshop/demo project I did for a conference. When I was talking to the organizers I mentioned that I enjoy doing hardware + WebRTC projects. They thought that was cool and so we ran with it.
I have been doing these ESP32 + voice AI projects for a while now. It started with an embedded SDK for livekit[1] back in July 2024, and I've been noodling with it since then. That code then found its way into pipecat/livekit etc...
So I hope it inspires you to go build with hardware and WebRTC. It's a REALLY fun space right now, with lots of different cheap microcontrollers and even more cool projects.
GYST – Digital organizer that replicates the feeling of a physical desk #
The idea is to make all these tools feel like one fluid space instead of 5 separate tools. The hope is to replicate the feeling of a physical desk: where order and freedom coexist.
This 15-min video walks through the current alpha and the vision for the full product: https://youtu.be/AcWzuBBuiPM
I’d love your feedback — especially around the concept and UX. The alpha is online if you want to try it: https://gyst.fr
This is a solo project for now, inspired by the “second brain” / PKM movement and my own frustration with fragmented tools and outdated UX.
Transcript-based video editor that runs 100% locally on Mac #
How it works:

- You import videos, then ScriptEdit generates a transcript using Whisper (you can choose the model).
- You edit the text, and the video cuts automatically match your edits.
- Delete words to remove sections; rearrange sentences to reorder clips.
- You can 'delete all filler words' or 'delete all gaps' to get much more professional videos without a lot of painstaking edits / cuts.
- You can also drag / drop / cut visually.
Features:
- Native Mac app, optimized to use Metal for faster transcription + rendering
- Multiple Whisper model options (tiny to large-v3)
- Multi-track timeline support and layers / opacity
- Title overlays and caption templates
- Handles large files with no upload limits
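The transcript-driven cutting described above can be sketched roughly like this. The data shapes, `merge_gap` parameter, and filler list are illustrative assumptions, not ScriptEdit's actual internals:

```python
# Hypothetical sketch: map text edits on word-level Whisper timings to cuts.
words = [
    ("So", 0.0, 0.3), ("um", 0.3, 0.6), ("today", 0.6, 1.0),
    ("we", 1.0, 1.2), ("uh", 1.2, 1.5), ("start", 1.5, 2.0),
]
FILLERS = {"um", "uh"}

def kept_segments(words, deleted_indices, merge_gap=0.15):
    """Turn the surviving words into a minimal list of (start, end) cuts,
    merging adjacent segments whose gap is under merge_gap seconds."""
    spans = [(s, e) for i, (_, s, e) in enumerate(words) if i not in deleted_indices]
    merged = []
    for s, e in spans:
        if merged and s - merged[-1][1] <= merge_gap:
            merged[-1][1] = e          # extend the previous segment
        else:
            merged.append([s, e])      # start a new segment
    return [tuple(x) for x in merged]

# "Delete all filler words" = delete every index whose word is a filler.
to_delete = {i for i, (w, _, _) in enumerate(words) if w.lower() in FILLERS}
print(kept_segments(words, to_delete))  # -> [(0.0, 0.3), (0.6, 1.2), (1.5, 2.0)]
```

Reordering sentences would simply reorder the spans before merging; rendering then concatenates the segments in order.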
I built this because I wanted Descript-style editing without uploading my videos anywhere or paying based on video length or transcripts generated.
I was also frustrated that tools like DaVinci Resolve, Captions, OpusClip, and Descript are all pretty expensive for what should just be a better way to edit videos locally.
Useful if you have bandwidth limits, large files, or just prefer local tools and not paying through the nose every month.
Happy to answer questions about the implementation or anything else.
App store link: https://apps.apple.com/us/app/scriptedit-video-editor/id6752...
I Hid Labubus in World Labs' AI Worlds #
Created macOS app to help you keep your distance from your screen #
If you're anything like me, you spend a good chunk of your day (and night) on your computer. I often find that when I'm zoned in, my posture gets worse and worse and my face ends up being really close to the screen. And over a course of a workday, when I finally unplug, my eyes have a hard time focusing on things that are far away. This has become a big enough problem for me that I decided to create an app to help me keep my face far enough from the screen.
Now, I could've gone with a simple notification with a timer built into it, but as with all reminder notifications, they soon become noise for me and I end up just dismissing them. I needed something to actively force me to move back. Which is where FarSight comes in.
It uses your camera to gauge your distance and blurs the entire screen if it detects that you are getting close enough for a certain period of time. I made it so that it won't be extremely annoying and disruptive (e.g. blurring the screen every time you cross the line) but just enough of a nuisance to be helpful. I've been using it everyday since creating it and it's definitely helped me with eye strain, double vision, and surprisingly, my posture as well.
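The trigger the post describes (blur after sustained closeness, not on every crossing) is essentially a dwell timer with an immediate reset. A minimal sketch, with the threshold and dwell values assumed rather than taken from FarSight:

```python
# Illustrative dwell-time logic, not FarSight's source.
TOO_CLOSE_CM = 45       # assumed distance threshold
DWELL_SECONDS = 5.0     # how long "too close" must persist before blurring

class ProximityGuard:
    def __init__(self, threshold_cm=TOO_CLOSE_CM, dwell=DWELL_SECONDS):
        self.threshold = threshold_cm
        self.dwell = dwell
        self.close_since = None  # timestamp when the user first got too close

    def update(self, distance_cm, now):
        """Return True when the screen should be blurred."""
        if distance_cm < self.threshold:
            if self.close_since is None:
                self.close_since = now
            return now - self.close_since >= self.dwell
        self.close_since = None  # user moved back: reset immediately
        return False

guard = ProximityGuard()
print(guard.update(40, now=0.0))  # just got close: no blur yet
print(guard.update(40, now=6.0))  # still close after 6s: blur
print(guard.update(60, now=7.0))  # moved back: unblur
```

The dwell requirement is what keeps a momentary lean-in from flickering the blur on and off.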
I'm not sure if I'll keep it free forever, but I wanted to release it first to ask for feedback. The app is only on macOS for now, so if there's enough interest, I'll invest in making a Windows counterpart.
https://apps.apple.com/us/app/farsight-restore-your-eyes/id6...
Also, in case anyone is wondering, no data is collected and the snapshots during the app's usage are not saved but only used to calculate the distance.
A context aware backend for AI coding agents #
When using agents like Cursor or Claude to build applications, they often assume what the backend looks like instead of inspecting it. Without access to the actual backend state, they fall back on outdated information, which leads to concrete issues. For example:
1. Storage, edge functions, and database logic are closely connected. Without understanding how these parts interact, existing setups get overwritten and important flows break.
2. Database migrations conflict with foreign keys or miss functions because the agent never inspects the live schema.
3. Recreating tables or adding columns that already exist, which leads to conflicts and failed deploys.
These problems are not about the agent’s ability to code. They happen because there’s no structured way for the agent to inspect and understand the actual backend before acting.
To address this, I built InsForge, which exposes the backend in a structured way and gives the agent direct control:
1. Introspection endpoints for schema, relations, functions, triggers, policies, routes, storage, roles, documentation, logs, and events
2. Control endpoints for operations usually done through CLIs, dashboards, or SQL editors
InsForge is a full backend platform that includes:
- Postgres
- Authentication
- Storage
- Edge functions
- Built-in AI-model endpoints (via OpenRouter)
On top of this, it exposes structured backend metadata and control capabilities through an MCP server and tools, providing a structured, self-describing interface for agents to inspect schemas, policies, triggers, and docs, and interact with the backend.
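The inspect-before-act loop could look something like this on the agent side. The schema payload shape and the `plan_migration` helper are hypothetical illustrations, not InsForge's documented API:

```python
# Hypothetical sketch of inspect-then-migrate, not InsForge's actual code.
def plan_migration(desired_tables, live_schema):
    """Compare what the agent wants against what actually exists and
    return only the statements that are safe to run."""
    existing = {t["name"]: set(t["columns"]) for t in live_schema["tables"]}
    statements = []
    for name, columns in desired_tables.items():
        if name not in existing:
            cols = ", ".join(f"{c} text" for c in sorted(columns))
            statements.append(f"CREATE TABLE {name} ({cols})")
        else:
            # Table exists: only add the columns that are actually missing.
            for col in sorted(columns - existing[name]):
                statements.append(f"ALTER TABLE {name} ADD COLUMN {col} text")
    return statements

live = {"tables": [{"name": "users", "columns": ["id", "email"]}]}
wanted = {"users": {"id", "email", "created_at"}, "posts": {"id", "body"}}
print(plan_migration(wanted, live))
# users already exists, so only the missing column and the new table are emitted.
```

The point is the diff against live state: without the introspection step, the agent would emit `CREATE TABLE users` and fail.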
It’s open source and can be self-hosted (https://github.com/InsForge/InsForge), or you can try our cloud service at https://insforge.dev/.
We love feedback!
Browser extension to analyze my son's Math Academy data #
I wanted a better way to understand his progress and see where he might be getting stuck. As a parent trying to support his learning, I found myself wondering:
- Is he spending a lot of time on certain lessons? Could that signal frustration?
- What's his learning efficiency? I framed this as "XP per minute" to see how smoothly he's progressing through different courses.
- Are reviews taking him almost as long as new lessons? Maybe he needs more help with foundational concepts.
To get these insights, I needed the raw data. Since there wasn't a built-in way to export or analyze it, I built a browser extension to do it for me. The Math Academy Stats Extension is a simple tool for Chrome and Firefox that helps you (or any parent/tutor) download and analyze a student's activity.
What it does:
- Fetches all activity data.
- Exports to JSON & CSV: You can get a full data dump to keep for your records or to analyze yourself in a spreadsheet.
- Generates detailed stats: This is the core feature. It opens a report showing XP per minute broken down by course (e.g., Prealgebra vs. 5th Grade Math) and activity type (Lessons, Reviews, Quizzes). It also shows percentiles so you can see variability.
To keep the stats meaningful, the analysis automatically filters out any activity that took more than 2 hours. I assume those are sessions where he walked away from the computer or decided to work on something else that day. Including those activities would skew the "time spent", which is the denominator for all the stats.
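The XP-per-minute stat with the 2-hour filter can be sketched as follows. The field names and sample records are assumptions, not the extension's real export format:

```python
# Illustrative version of the stat described above.
MAX_SECONDS = 2 * 60 * 60  # activities over 2 hours are dropped as walk-aways

activities = [
    {"type": "Lesson", "xp": 12, "seconds": 600},
    {"type": "Lesson", "xp": 10, "seconds": 900},
    {"type": "Review", "xp": 5,  "seconds": 300},
    {"type": "Lesson", "xp": 8,  "seconds": 3 * 60 * 60},  # excluded as AFK
]

def xp_per_minute(activities, activity_type):
    """Total XP divided by total minutes, with outlier sessions removed."""
    kept = [a for a in activities
            if a["type"] == activity_type and a["seconds"] <= MAX_SECONDS]
    total_xp = sum(a["xp"] for a in kept)
    total_minutes = sum(a["seconds"] for a in kept) / 60
    return total_xp / total_minutes

print(round(xp_per_minute(activities, "Lesson"), 2))  # 22 XP over 25 min -> 0.88
```

Dropping the outlier matters precisely because time is the denominator: one 3-hour "session" would otherwise halve the apparent efficiency.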
Tech Stack:
It's built with WXT (a browser extension framework), React, and TypeScript.
The data-fetching script is designed to be polite to Math Academy's servers.
There are pre-packaged zip files for easy installation: https://github.com/rahimnathwani/mathacademy-stats/releases
I'm sharing this in case other parents or adult users of Math Academy find it useful.
I have no affiliation with Math Academy.
I Built Claude Code for CUDA in 18 Hours (Open Source) #
It writes CUDA kernels, debugs memory issues, and optimizes for your specific GPU. It's a fully agentic AI with tool calling, built specifically for the CUDA toolkit.
This is the CLI version of RightNow AI code editor, our GPU-native code editor with a built-in emulator, visual profiling, and remote GPU access. The CLI is open source, lightweight, and fast. It is designed for anyone who wants AI help with CUDA without setup.
I used Python because it's the most common language, so anyone can build on top of it. You can clone it and customize it for your own use case, not just CUDA. :D
Oneseal – Secrets, configs, and platform outputs as code #
I built Oneseal: a small CLI that turns platform outputs (Terraform state, etc.) into a typed, versioned SDK you can install and import. The goal is to make consumption predictable and diffable, not to replace your vault.
What it does:

- Reads outputs (secrets, URLs, flags, IDs, connection strings)
- Generates a package (TypeScript today) with types + multi-env selection
- Deterministic artifacts, safe to commit or publish to your internal registry.
If this solves your env-drift pain, tell me where it breaks in your stack!
Saoirse, a privacy-first research assistant (Mac Beta) #
Currently, Saoirse runs locally on Mac and routes external model requests through a Zero Trust cryptographic proxy deployed in a Trusted Execution Environment (Google Cloud's Confidential Computing). We make a best-effort attempt to anonymize traffic while keeping your data on your machine.
It’s designed for deep work, long-form writing, brainstorming, note-taking, and other academic and journalistic workflows. The app is free and in beta. We’re looking for feedback from students, researchers, and educators on features, threat models, and usability.
(Discord for early testers is linked on the site.)
Thank you! Looking forward to your feedback.
Tonkotsu – a developer app for managing a team of AI coding agents #
We strongly believe that the future of software development requires every developer to become the tech lead of a team of agents. That shift requires new tools and workflows that go beyond existing IDEs or terminal agents.
Tonkotsu is our take on that next-gen tool. It’s a desktop app that lets developers plan technical work, delegate many coding tasks in parallel, and then review the diffs — all in one place. Our main difference is that we tightly integrate this plan > delegate > verify loop and scale it up so that every engineer can use it to manage a team of agents working in parallel.
Current status:
- macOS + Windows desktop app
- Free to use during early access
Demo video (2 min): https://www.tonkotsu.ai/demo
Download: https://www.tonkotsu.ai
We’re really curious to hear from people who are using multiple agents in parallel already. What do your workflows look like? Where are the bottlenecks?
Debugg – 0-Config AI browser (E2E) tests that review every commit #
I hated writing tests, my end-to-end tests never worked, and dealing with browsers was brutal. AI made this 10x worse because it'll change random stuff across your app that passes unit tests but breaks something obvious. So I wanted, and then built, a system that handles everything for you and updates you with results after each commit or PR (configurable). LMK what you all think!
The Long:
I was writing a ton of code, particularly with AI, for a previous startup idea, and kept feeling like things were moving really quickly until I tried to actually use pieces of it. This isn't uncommon in general, but I think the new thing with AI is that stuff you didn't think you touched would start breaking too, because I wasn't watching or making every single edit.
Granted, AI has gotten much better since then, but in general my view is that everyone (even AI) needs a second set of eyes on their work and the results sent back. That's what debugg.ai attempts (cough cough - to be seen if you think we do) to do. Currently that feedback comes as PR review comments, email updates, and our app, but our near-term plan is to offer the ability to pull that feedback right back into whatever AI you use, so it can iterate until it's done.
I'm tired of opening an app that Claude Code said was 'working perfectly' only to find the main page won't even load or has some React hydration problem. The cool thing is that even though it may not be best - yet - for super complex and detailed test flows, most people wouldn't write an E2E test just to make sure the main page loads, because that's a 'manual' test thing. Because this removes all the browser handling, building, and CI/CD setup, you can have a lot of really simple, quick tests that rein in your AI and give you peace of mind as you're making changes.
On the tech side, I definitely had some fun:
I built a use-case-specific crawler agent that sequences and learns your application from the top down. Think of it like a sitemap, but actually useful: it knows "login button on homepage → takes you to /login → which has a form → which posts to /api/auth" and includes files from /auth/components/... etc.
The above improved our ability to track github code changes and associate them with tests that could be impacted & create new ones for stuff that hasn't been seen.
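A toy version of that app graph and the change-to-impact tracing might look like this (the node naming and edge structure are invented for illustration):

```python
# Toy app graph: UI elements -> routes -> API endpoints -> source files.
# A changed file can then be traced back to every flow it may break.
edges = {
    "homepage:login-button": ["/login"],
    "/login": ["/api/auth"],
    "/api/auth": ["src/auth/handler.py"],
    "/login-form": ["src/auth/components/form.py"],
}

def impacted_flows(changed_file, edges):
    """Walk the graph backwards from a changed file to every node that
    can reach it, i.e. every flow whose test might be affected."""
    parents = {}
    for src, dsts in edges.items():
        for dst in dsts:
            parents.setdefault(dst, []).append(src)
    frontier, seen = [changed_file], set()
    while frontier:
        node = frontier.pop()
        for p in parents.get(node, []):
            if p not in seen:
                seen.add(p)
                frontier.append(p)
    return sorted(seen)

print(impacted_flows("src/auth/handler.py", edges))
# -> ['/api/auth', '/login', 'homepage:login-button']
```

A commit touching the auth handler is thus linked back to the login flow's tests without re-crawling the whole app.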
Ultimately my goal is to build myself out of a job a bit, so I can just prompt Claude to make changes, then have a hook that sends debugg's test results (failures) back to Claude to keep making changes until it actually works :).
Open to all feedback & thoughts on whether you've felt this pain as well!
A minimal demo of SIP ↔ WebRTC #
AI Desk – Help desk software that auto-improves with your business #
- Deploys in under 10 minutes
- Auto ticket routing and self-learning response generation
Try it at https://aidesk.us and let me know what you think.
I want you to see Nod, a new object-oriented language I designed #
Nod is a new programming language I've been working on for five years. It's a serious effort to design a language that I wished someone else would have invented while I was still working as a professional software engineer.
This is my first post to HN, so hopefully I'm not stepping on any guidelines. It's real news, it's mine, and you can learn more about it on the Nod dev website.
Why I Built Nod
I was a professional programmer/software engineer for almost 40 years. During most of my career, C and its descendants ruled the day. In fact, it can't be overstated just how influential C was (and is) on the field of programming. But that influence can also be characterized as a burden. Newer C-based languages like C++, Java, C#, and others are improvements over the original, for sure, but backward compatibility and adherence to familiar constructs have stifled innovation and clarity. C++ in particular is an unapproachable Frankenstein. Powerful, yes, but its complex syntax and semantics have raised the barrier to entry for all but the most motivated.
Although C++ was usually my first or only choice for a lot of projects, I kept waiting (hoping) that a viable successor would come along. Something fresh, performant, and pragmatic. Something that broke cleanly from the past without throwing away what worked. But nothing really did. Or at least nothing worth the effort did. So, in 2019, newly retired and irrationally predisposed, I decided to build that fresh, performant, pragmatic language myself. That language, imho, is Nod.
What Nod Is
Nod is an object-oriented language designed from the start to be a fresh and practical alternative to the current status quo. The goal is to balance real-world trade-offs in a language that is uniquely regular (consistent), efficient (fast), reliable (precautious), and convenient (automatic). While Nod respects the past, it's not beholden to it. You might say that Nod acknowledges the past with a respectful nod, then moves on.
Nod has wide applicability, but it's particularly well-suited for building low-level infrastructure that runs on multiple platforms. A keen awareness of portability issues allows many applications to be written without regard to runtime platform, while kernel abstraction and access to the native kernel provide the ultimate ability to go low. Furthermore, built-in modularity provides a simple and robust path for evolution and expansion of the Nod universe.
What Next?
Although I've worked on Nod for five years, it's a long way from being finished. But it's far enough along that I can put it out there to gauge interest and feedback from potential early adopters and collaborators.
The language itself is mature and stable, and there are the beginnings of a Nod Standard Library residing in a public GitHub archive.
I've written a compiler (in C++) that compiles source into intermediate modules, but it's currently in a private archive.
There's still much more that needs to be done.
If you're interested, please go to the website (https://www.about-nod.dev) to find links to the Nod Design Reference and GitHub archive. In the archive, there's a brief syntax overview that should let you get started reading Nod code.
Thanks for your interest.
Enfra – Live SEO/Ads Data Inside ChatGPT (Chrome Extension) #
Enfra is a Chrome extension that pulls Google SERPs, Ads, and page markup for a keyword or URL you choose, and adds that context in your chat. Works in ChatGPT, Claude, Gemini, and Perplexity.
The problem: we built a Marketing agent, but end-users in design partner companies weren't using it. So we asked them how they're using AI currently, and they said "we have a ChatGPT tab open at all times".
What it does: injects structured data from SERPs, Google Ads, URLs and Search Console so that users can run their own chat workflows.
The big change is that users don't have to learn a new interface. They're already using ChatGPT and others for a ton of Marketing work, now they get better answers because the AI tools have better context.
And about privacy: we only fetch and insert marketing data; we don't read your chats.
We'd deeply appreciate any comments or feedback.
Sluqe AI voice notes turn conversations into searchable memory #
- One-tap recording & instant transcript + summary
- Search & query across all conversations
- Export data, privacy controls, retention options
We have 100+ non-paid users. Looking for UX feedback, accuracy issues, feature ideas & integrations.
/Team Sluqe
SDF-Field Synthesis – rendering SDF without ray marching #
I have developed a new approach to rendering scenes described by SDFs that completely abandons classic ray marching.
The idea is simple: instead of searching for ray intersections, we treat the scene as a potential field whose source is the SDF itself. Solving the field equation gives zero isosurfaces, which correspond to the visible boundaries of objects.
Advantages:
- natural coherence between pixels,
- analytical continuation for infinite resolution,
- adaptive computational load depending on the complexity of the scene,
- all the advantages of SDF are preserved (procedurality, Boolean operations, compactness).
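The field equation itself is only in the linked paper, but the SDF properties the method says it preserves (procedurality, Boolean operations, compactness) are standard and easy to illustrate. A minimal sketch, unrelated to the paper's actual solver:

```python
# Standard SDF building blocks: procedural primitives and Boolean ops.
import math

def sphere(center, radius):
    cx, cy, cz = center
    return lambda x, y, z: math.sqrt((x-cx)**2 + (y-cy)**2 + (z-cz)**2) - radius

def union(a, b):        # Boolean union of two SDFs
    return lambda x, y, z: min(a(x, y, z), b(x, y, z))

def subtract(a, b):     # carve b out of a
    return lambda x, y, z: max(a(x, y, z), -b(x, y, z))

# A unit sphere with a smaller sphere carved out of its side.
scene = subtract(sphere((0, 0, 0), 1.0), sphere((0.8, 0, 0), 0.5))

# The zero level set of `scene` is the surface; negative values are inside.
print(scene(0, 0, 0))   # inside the solid: negative
print(scene(2, 0, 0))   # outside everything: positive
```

Whatever replaces ray marching still consumes exactly this kind of composable distance function, which is why the procedurality and Boolean advantages carry over.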
The full description of the idea with basic formulas and concept is published on Zenodo: https://doi.org/10.5281/zenodo.17306506
It will be interesting to hear the opinion of the community:
- How feasible is the method?
- What potential applications do you see?
- Are there any similar approaches that I might have missed?
I'm starting a fund that runs on luck, caffeine, and denial #
this was just for fun. the message i wanna pass on is after $POLY raising $2B from $ICE, the signal that markets really are everything is true. cheers to that lads!
have a great day.
Engin – a modular application framework for Python #
Meihus – Mortgage Calculator Now Supports 25 Languages (UX Update) #
This version focuses mainly on localization, which is done by detecting the user's country from their IP before the page loads.
Users can now also select the language they see the page in.
Previous post details:
I built it because I needed a way to see how much my early payments would save in interest over time.
Most of the websites I found are solely made for the US, which does not help my case. So I built one that has a bit more flexibility in terms of loan data.
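For illustration, here is the standard amortization arithmetic behind "how much do early payments save". This is a generic sketch, not Meihus's implementation, and the loan figures are made up:

```python
# Standard amortization simulation: each month, interest accrues on the
# balance, then the payment (plus any extra) reduces the balance.
def total_interest(principal, annual_rate, monthly_payment, extra=0.0):
    r = annual_rate / 12
    balance, interest = principal, 0.0
    while balance > 0:
        i = balance * r
        interest += i
        balance = balance + i - (monthly_payment + extra)
    return interest

# 200k at 5% with the standard 30-year payment, with and without $200 extra.
base = total_interest(200_000, 0.05, 1_073.64)
with_extra = total_interest(200_000, 0.05, 1_073.64, extra=200)
print(f"interest saved by paying $200/mo extra: ${base - with_extra:,.0f}")
```

The saving comes from the extra payment going entirely to principal, shrinking the base every future month's interest is computed on, which is why small early payments compound into large savings.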
Any feedback is appreciated.
EchoMode – A stability layer that prevents persona drift in LLMs #
The Problem: Even well-prompted models (GPT-4, Claude, Gemini) start to lose persona after ~8 turns. Studies like Measuring and Controlling Persona Drift (arXiv:2402.10962) confirm that tone consistency decays fast with topic shifts and long contexts.
The Solution: EchoMode is a middleware protocol that keeps LLMs consistent by monitoring tone drift in real time and repairing deviations automatically. Think of it as TCP/IP for language stability.
How it works:

- A finite-state machine (Sync / Resonance / Insight / Calm) tracks conversation state
- Calculates a driftScore between each output and the baseline persona
- Triggers a repair loop if deviation > threshold
- EWMA smoothing (λ≈0.3) prevents overcorrection
- Works across OpenAI, Anthropic, Gemini, Mistral APIs
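The smoothing-and-trigger step can be sketched like this. Only the EWMA weight λ≈0.3 comes from the post; the threshold, reset-on-repair behavior, and example scores are assumptions:

```python
# Minimal sketch of EWMA-smoothed drift detection (driftScore computation
# itself is EchoMode's; the values here are made up).
LAMBDA = 0.3        # EWMA weight from the post
THRESHOLD = 0.5     # repair trigger, assumed value

def smooth_and_check(drift_scores, lam=LAMBDA, threshold=THRESHOLD):
    """EWMA-smooth per-turn drift scores; return the turns that trip a repair."""
    ewma, repairs = 0.0, []
    for turn, score in enumerate(drift_scores):
        ewma = lam * score + (1 - lam) * ewma
        if ewma > threshold:
            repairs.append(turn)
            ewma = 0.0  # assume the repair loop resets drift to baseline
    return repairs

# A single spike (turn 1) is smoothed away; sustained drift trips a repair.
print(smooth_and_check([0.1, 0.9, 0.1, 0.8, 0.9, 0.9]))  # -> [4]
```

This is the overcorrection guard in action: one noisy turn never crosses the threshold on its own, but drift that persists across several turns does.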
Tech Stack: TypeScript SDK • Drift telemetry dashboard (soon) • Apache-2.0 open core + commercial observability layer
Use Cases: AI agents that must stay on-brand, compliant, or emotionally consistent over long sessions.
Try it: github.com/Seanhong0818/Echo-Mode