Show HN for October 17, 2025
OnlyJPG – Client-Side PNG/HEIC/AVIF/PDF/etc to JPG #
This started as a five-minute job and forty hours later...
I wanted to convert a HEIC without uploading it anywhere, so I wrestled Emscripten/WebAssembly to run Google's Jpegli inside a Web Worker. Now there's a small UI and it handles a bunch of formats.
Just about the only thing it can't decode is JXL - but there's still some JPEG XL magic in there: XYB perceptual color quantization is enabled by default via Jpegli.
The upside of all this over-engineering is privacy and compatibility: images are processed entirely on your machine and never touch a server; the output is a regular JPEG that works everywhere.
I could have used a CLI, sure — but where's the fun in that?
Would love feedback on edge cases and defaults.
Tested on Firefox, Chrome, and Safari.
Cheers!
We packaged an MCP server inside Chromium #
We're a YC startup (S24) building BrowserOS — an open‑source Chromium fork. We're a privacy‑first alternative to the new wave of AI browsers like Dia and Perplexity Comet. Since launching ~3 months ago, the #1 request has been to expose our browser as an MCP server.
-- Google beat us to launch with chrome-devtools-mcp (solid product btw), which lets you build/debug web apps by connecting Chrome to coding assistants. But we wanted to take this a step further: we packaged the MCP server directly into our browser binary. That gives three advantages:
1. MCP server setup is super simple — no npx install, no starting Chrome with CDP flags, you just download the BrowserOS binary.
2. with our browser's inbuilt MCP server, AI agents can interact using your logged‑in sessions (unlike chrome-devtools-mcp which starts a fresh headless instance each time)
3. our MCP server also exposes new APIs from Chromium's C++ core to click, type, and draw bounding boxes on a webpage. Our APIs are not CDP-based (Chrome DevTools Protocol) and are more robust against anti-bot detection.
-- Few example use cases for BrowserOS-mcp are:
a) *Frontend development with Claude Code*: instead of screenshot‑pasting, Claude Code gets WYSIWYG access. It can write code, take a screenshot, check console logs, and fix issues in one agentic sweep. Since it has your sessions, it can do QA tasks like "test the auth flow with my Google Sign‑In." Here's a video of Claude Code using BrowserOS to improve CSS styling with back-and-forth checking: https://youtu.be/vcSxzIIkg_0
b) *Use as an agentic browser:* You can install BrowserOS-mcp in Claude Code or Claude Desktop and do things like form-filling, extraction, multi-step agentic tasks, etc. It honestly works better than Perplexity Comet! Here's a video of Claude Code opening the top 5 Hacker News posts and summarizing them: https://youtu.be/rPFx_Btajj0
-- *How we packaged MCP server inside Chromium binary*: We package the server as a Bun binary and expose MCP tools over HTTP instead of stdio (to support multiple sessions). And we have a BrowserOS controller installed as an extension at the application layer which the MCP server connects to over WebSocket to control the browser. Here's a rough architecture diagram: https://dub.sh/browseros-mcp-diag
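Because the tools are exposed over HTTP rather than stdio, any client can talk to the server with plain JSON-RPC POSTs. A minimal sketch (the port, path, and tool name here are hypothetical; the real ones come from the BrowserOS guide):

```python
import json
import urllib.request

def build_tool_call(tool, arguments, request_id=1):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

def post_json(endpoint, message):
    """POST one JSON-RPC message over HTTP and decode the response."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(message).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical endpoint and tool name; the real port and tool schema
# are documented in the BrowserOS-mcp guide:
# post_json("http://localhost:9223/mcp", build_tool_call("click", {"selector": "#submit"}))
```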
-- *How to install and use it:* We put together a short guide here: https://git.new/browseros-mcp
Our vision is to reimagine the browser as an operating system for AI agents, and packaging an MCP server directly into it is a big unlock for that!
I'll be hanging around all day, would love to get your feedback and answer any questions!
Searchable compression for JSON – ~99% page skip and sub-ms lookups #
What I built (SEE — Semantic Entropy Encoding): a schema-aware, searchable compression codec for JSON that keeps exists/pos lookups fast while still compressing. Internals: structure-aware delta + dictionaries, a PageDir + mini-index to jump to relevant pages, and a tuned Bloom filter that skips ~99% of pages. AutoPage (131/262 KiB) balances seek vs. throughput.
Benchmarks (apples-to-apples, FULL):
- size ratio: str ≈ 0.168–0.170, combined ≈ 0.194–0.196
- Bloom density ≈ 0.30; skip: present ≈ 0.99, absent ≈ 0.992
- lookup (ms): present p50/p95/p99 ≈ 0.18/0.28/0.37; absent ≈ 1.16–1.88/1.36–2.11/1.58–2.41

Numbers are stable on a commodity desktop (i7-13700K/96GB/Windows).
Try it in 10 minutes (no build):
1) pip install see_proto
2) python samples/quick_demo.py

It prints size ratios, Bloom density, skip %, and lookup p50/p95/p99 on a packaged sample.
Why not “just zstd”? We sometimes lose pure size vs zstd alone. The win is searchable compression: Bloom + PageDir avoids touching most pages, so selective probes pay less I/O/egress and finish faster. On large log scans this often wins on TCO even with similar raw ratios.
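For readers unfamiliar with the pattern, here is a toy sketch of per-page Bloom filtering (a generic illustration, not SEE's actual codec): each page carries a small filter, and a lookup only has to touch pages whose filter reports a possible hit.

```python
import hashlib

class PageBloom:
    """Tiny per-page Bloom filter: test membership before reading a page."""
    def __init__(self, nbits=1024, nhashes=3):
        self.nbits, self.nhashes, self.bits = nbits, nhashes, 0

    def _positions(self, key):
        # Derive nhashes deterministic bit positions from the key.
        for i in range(self.nhashes):
            h = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
            yield int.from_bytes(h, "big") % self.nbits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # False means "definitely absent"; True means "maybe present".
        return all(self.bits >> p & 1 for p in self._positions(key))

# One filter per compressed page: a probe only decompresses pages
# whose filter says the key might be present.
pages = [{"user.id", "user.name"}, {"order.id", "order.total"}]
blooms = []
for page in pages:
    b = PageBloom()
    for k in page:
        b.add(k)
    blooms.append(b)

hits = [i for i, b in enumerate(blooms) if b.might_contain("order.total")]
```

A real codec tunes bits-per-key and hash count to hit a target false-positive rate, which is what the ~0.30 filter density above reflects.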
Link (README + quick demo + one-pager) https://github.com/kodomonocch1/see_proto
AlgoSync – a social media built for developers, founders, & tech people #
We hit 30+ users after Day 1, and people have already started posting their first blogs!
Here’s one of the featured blogs written just 2 days ago: https://www.algosyncverse.com/blog/69ecc455-a812-4c62-b18e-0...
Check it out here → https://www.algosyncverse.com
ChatGPT counts out loud to 140 – beat our record #
My friend and I tried a couple of variations to get it to count to 1 million; all failed. So we settled for seeing what's the maximum number we could reach.
Methods we tried:
- Framing it as an experiment, on beating a benchmark against other llms or its previous runs, encouragement after small successes
- Foot in the door - asking it to count to 10, 20, 30, 50, etc. - kind of successful, record: 110
And the winner:
- Simply count to 140 ourselves and ask it to repeat - success! The follow-up failed miserably. We haven't tried counting higher; we ran into the daily limit : D
Has anyone else tried this, and how high did you get? We weren't able to find anyone getting past 140 on YouTube/Google so far.
Dethrone us.
The Massive Legal Embedding Benchmark (MLEB) #
I'm excited to share the Massive Legal Embedding Benchmark (MLEB) — the first comprehensive benchmark for legal embedding models.
Unlike previous legal retrieval datasets, MLEB was created by someone with actual domain expertise (I have a law degree and previously led the AI team at the Attorney-General's Department of Australia).
I came up with MLEB while trying to train my own state-of-the-art legal embedding model. I found that there were no good benchmarks for legal information retrieval to evaluate my model on.
That led me down a months-long process working alongside my brother to identify or, in many cases, build our own high-quality legal evaluation sets.
The final product was 10 datasets spanning multiple jurisdictions (the US, UK, Australia, Singapore, and Ireland), document types (cases, laws, regulations, contracts, and textbooks), and problem types (retrieval, zero-shot classification, and QA), all of which have been vetted for quality, diversity, and utility.
For a model to do well at MLEB, it needs to have both extensive legal domain knowledge and strong legal reasoning skills. That is deliberate — given just how important high-quality embeddings are to legal RAG (particularly for reducing hallucinations), we wanted our benchmark to correlate as strongly as possible with real-world usefulness.
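As an illustration of the kind of retrieval scoring such a benchmark rests on, here is a generic recall@k sketch with toy vectors (not MLEB's actual evaluation code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_at_k(query_vec, doc_vecs, relevant_ids, k):
    """Rank docs by similarity; score 1 if any relevant doc is in the top k."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return int(any(i in relevant_ids for i in ranked[:k]))

# Toy vectors standing in for embeddings of a tax question and guidance pages.
query = [0.9, 0.1, 0.0]
docs = [[0.1, 0.9, 0.0],    # off-topic page
        [0.95, 0.05, 0.0],  # the relevant guidance page
        [0.0, 0.0, 1.0]]
score = recall_at_k(query, docs, relevant_ids={1}, k=1)  # → 1
```

Averaging this over every query gives the kind of per-dataset retrieval number a leaderboard aggregates.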
The dataset we are most proud of is called Australian Tax Guidance Retrieval. It pairs real-life tax questions posed by Australian taxpayers with relevant Australian Government guidance and policy documents.
We constructed the dataset by sourcing questions from the Australian Taxation Office's community forum, where Australian taxpayers ask accountants and ATO officials their tax questions.
We found that, in most cases, such questions can be answered by reference to government web pages that, for whatever reason, users were unable to find themselves. Accordingly, we manually went through a stratified sample of 112 challenging forum questions and extracted relevant portions of government guidance materials linked to by tax experts that we verified to be correct.
What makes the dataset so valuable is that, unlike the vast majority of legal information retrieval evaluation sets currently available, it consists of genuinely challenging real-world user-created questions, rather than artificially constructed queries that, at times, diverge considerably from the types of tasks embedding models are actually used for.
Australian Tax Guidance Retrieval is just one of several evaluation sets that we painstakingly constructed ourselves simply because there weren't any other options.
We've contributed everything, including the code used to evaluate models on MLEB, back to the open-source community.
Our hope is that MLEB and the datasets within it will hold value long into the future so that others training legal information retrieval models won't have to detour into building their own "MTEB for law".
If you'd like to head straight to the leaderboard instead of reading our full announcement, you can find it here: https://isaacus.com/mleb
If you're interested in playing around with our model, which happens to be ranked first on MLEB as of 16 October 2025 at least, check out our docs: https://docs.isaacus.com/quickstart
VT Code – AST-aware Rust agent for terminal (Tree-sitter/AST-grep) #
Try it: cargo install vtcode; vtcode
I turned my resume into a catchy song. It's a game changer #
Stop Chasing Success: Write for Wonder Instead #
OneClickPRD – Save hours vibe coding with concise PRDs #
I built OneClickPRD because as a solo builder I often wasted hours vibe coding without clear goals. I’d start with an idea, but it was vague, so the code got messy and I had to redo things.
OneClickPRD asks you a few questions about your product and then generates a short, structured PRD. The format works well with AI tools like Replit, Lovable, or v0, so you can go from idea to working MVP much faster.
Demo: https://oneclickprd.com/
Would love your feedback: does this feel useful for your projects, and what would make it better?
PyTogether, open-source lightweight real-time Python IDE for learners #
For the past 4 months, I’ve been working on a full-stack project I’m really proud of called PyTogether: a real-time collaborative Python IDE designed with beginners in mind (think Google Docs, but for Python). It’s meant for pair programming, tutoring, or just learning Python together. It’s completely free. No subscriptions, no ads, nothing. Just create an account, make a group, and start a project. It has proper code linting, an extremely intuitive UI, autosaving, and live cursors. There are no limitations at the moment (except for code size, to prevent malicious payloads).
Why build this when Replit or VS Code Live Share already exist?
Because my goal was simplicity (and education). I wanted something lightweight for beginners who just want to write and share simple Python scripts (alone or with others), without downloads, paywalls, or extra noise. There’s also no AI/copilot built in, something many teachers and learners actually prefer. Also, it's free lol
Tech stack (frontend):
- React + TailwindCSS
- CodeMirror for linting
- Y.js for real-time syncing and live cursors
- Skulpt to execute Python in the browser (for safety - I initially wanted Docker containers, but that would eat too much memory at scale; Skulpt has a limited library, so unfortunately imports like pygame won't work)

I don’t enjoy frontend or UI design much, so I leaned on AI for some design help, but all the logic/code is mine. Deployed via Vercel.
Tech stack (backend):
- Django (channels, auth, celery/redis support made it a great fit)
- PostgreSQL via Supabase
- JWT + OAuth authentication
- Redis for channel layers + caching
- Fully Dockerized + deployed on a VPS (8GB RAM, $7/mo deal)
Data models: Users <-> Groups -> Projects -> Code
- Users can join many groups
- Groups can have multiple projects
- Each project belongs to one group and has one code file (kept simple for beginners, though I may add a file system later)
There were a lot of issues I came across when building this project, especially on the backend. My biggest was figuring out a reliable and smart autosave system. I couldn't save on every user keystroke, because that would overwhelm the database at scale. So I came up with a solution I'm really proud of: I use Redis to cache active projects, then use Celery to loop through those active projects every minute and persist their code to the db. I track a user count for each project every time someone joins or leaves, and if the count drops to 0, I remove the project from Redis (saving its code first). Redis is extremely fast, so caching the code on every keystroke is not a problem at all.

I'm essentially hitting four birds with one stone here: I reuse Redis, which I'd already integrated for my channel layers, to track active projects and to cache the code, so when a new user enters a project the code comes from Redis instead of hitting the db. I even get to use Redis as my message broker for Celery (I didn't use RabbitMQ because I wanted to conserve storage rather than dockerize an entirely new service). This also works well at scale, since Celery offloads the autosave work away from the backend. The code additionally saves when someone leaves the project.

Another issue I came across later is people trying to send a huge load of text, so I capped code size at 1 MB (will tinker with this).
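The join/keystroke/leave bookkeeping can be sketched with an in-memory stand-in for Redis and the database (names are hypothetical; the real version uses Redis commands plus a periodic Celery task):

```python
# Stand-ins for the Redis keys described above (active-project set,
# per-project user counts, cached code) and for the Postgres table.
store = {"active": set(), "users": {}, "code": {}}
db = {}

def on_join(project_id, initial_code=""):
    """First joiner activates the project and warms the code cache."""
    store["users"][project_id] = store["users"].get(project_id, 0) + 1
    if project_id not in store["active"]:
        store["active"].add(project_id)
        store["code"].setdefault(project_id, initial_code)

def on_keystroke(project_id, code):
    store["code"][project_id] = code  # cheap: cache write only, no DB hit

def flush(project_id):
    """Persist cached code to the DB (the minute-by-minute Celery job)."""
    db[project_id] = store["code"][project_id]

def autosave_tick():
    for pid in store["active"]:
        flush(pid)

def on_leave(project_id):
    """Last leaver triggers a final save and evicts the cache entry."""
    store["users"][project_id] -= 1
    if store["users"][project_id] <= 0:
        flush(project_id)
        store["active"].discard(project_id)
        store["code"].pop(project_id, None)
```

The same structure maps onto Redis sets/hashes and a Celery beat schedule; only the storage calls change.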
Deployment on a VPS was another beast. I spent ~8 hours wrangling Nginx, Certbot, Docker, and GitHub Actions to get everything up and running. It was frustrating, but I learned a lot.
If you’re curious or if you wanna see the work yourself, the source code is here. Feel free to contribute: https://github.com/SJRiz/pytogether.
I’m still learning, so any feedback would be amazing (and contributions)!
LLM In-Browser Fuzzer Finds Hidden Prompt Injection in AI Browsers #
min. – lightweight AI-native inbox for teams #
min. is:
- A simple email interface to manage both private and team inboxes.
- An AI-native inbox that sorts and prioritizes messages into custom labels and folders.
- A replacement for the patchwork of CRMs, helpdesks, and spreadsheets with just… email.
- Conversational agents (follow-ups, scheduling, nudges), deployable in two clicks.
min. is not:
- A hands-off inbox that auto-replies to everything for you (e.g., “put your support/sales on auto-pilot!”).
- Built just for speed - min. is for teams who care about delivering an A+ experience with every interaction.
- Awkward - it works exactly how you expect email to work, with two-way sync to Gmail.
2min demo video: https://www.youtube.com/watch?v=IKtzNL72P34
Live demo: https://app.getmin.ai/demo
DM me or shoot an email to [email protected] if you want to join the beta. Cheers!
Eric
ModernBERT in Pure C #
If you don't know, ModernBERT is a new encoder-only model from answer.ai. Unlike decoder-only models like Llama, encoder models process all input tokens in a single pass (no autoregression), which makes them great for tasks like token classification.
The implementation supports loading any ModernBERT checkpoint from Hugging Face. I've tested it with the base model and a token classification model for anonymizing PII. You can get >1200 tokens/s throughput on a single thread (slightly better than the PyTorch implementation), though that's not directly comparable to decoder models since there's no token-by-token generation.
I hard-coded the architecture to keep things simple and readable. The tokenizer is a from-scratch BPE implementation that handles most cases (though it's missing some edge cases). The main goal was to support a lightweight deployment of this model, without the heavy baggage of the pytorch ecosystem.
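For the curious, the core of a greedy BPE encoder is small. This is a generic Python sketch of the merge loop, not the C implementation itself; the toy merge table stands in for the ranks a real checkpoint ships in its tokenizer files:

```python
def bpe_encode(word, merges):
    """Greedy BPE: repeatedly merge the highest-priority adjacent pair.
    `merges` maps a character pair to its rank (lower = merged earlier)."""
    tokens = list(word)
    while len(tokens) > 1:
        # Find the adjacent pair with the best (lowest) merge rank.
        pairs = [(merges.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(tokens, tokens[1:]))]
        rank, i = min(pairs)
        if rank == float("inf"):
            break  # no learned merge applies anymore
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return tokens

# Toy merge table; real ranks come from the checkpoint's tokenizer data.
merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
bpe_encode("lower", merges)  # → ["low", "er"]
```

The tricky edge cases a from-scratch tokenizer has to handle (byte-level fallback, special tokens, pre-tokenization regexes) sit around this loop rather than inside it.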
Enjoy.
Reddit Sentiment Dashboard for Claude Code vs. Codex (OSS) #
YTVidHub – Bulk YouTube Subtitle Downloader for Research and Data Prep #
I’m incredibly excited to finally launch YTVidHub, a tool built to solve a massive time sink that many of you here shared: the agonizing manual process of downloading transcripts from large YouTube corpora for research and data analysis.
The Problem: If you need subtitles for 50, 100, or more videos, the current copy-paste-download-repeat workflow is slow and painful.
Our Solution (The Core Feature): YTVidHub is engineered for true bulk processing. You can paste dozens of YouTube URLs (or a Playlist/Channel link) into one clean interface, and the system extracts all available subtitles (including multilingual ASR) and packages them into a single, organized ZIP file for one-click download.
Architectural Insight: Our design prioritizes "research-ready" data, taking cues from the discussions here on HN. We specifically optimized the plain text (TXT) output—stripping all timestamps and formatting—to make it instantly clean for RAG systems and LLM ingestion.
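Stripping timestamps from SRT-style captions is straightforward to sketch. This generic version (not YTVidHub's code) drops cue numbers and timing lines and joins the remaining caption text:

```python
import re

def srt_to_text(srt):
    """Strip SRT cue numbers and timestamp lines, keep the caption text."""
    timestamp = re.compile(
        r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or timestamp.match(line):
            continue  # blank line, cue number, or timing line
        kept.append(line)
    return " ".join(kept)

sample = """1
00:00:01,000 --> 00:00:03,500
Hello and welcome back.

2
00:00:03,600 --> 00:00:06,000
Today we cover RAG pipelines."""
srt_to_text(sample)  # → "Hello and welcome back. Today we cover RAG pipelines."
```

Real caption files add wrinkles (VTT headers, inline styling tags, rolled-up ASR duplicates) that a production pipeline also has to scrub.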
Business Model: YTVidHub is free for single downloads. Bulk operations receive 5 free daily credits to ensure fair use and manage our processing costs. Professional plans are available for high-volume data needs.
Future Focus: We know the ASR accuracy is the next big hurdle. We're already working on a Pro AI Transcription tier to offer high-accuracy, LLM-powered transcripts to tackle niche content and solve the data quality problem.
Please give the bulk downloader a test run and put the system through its paces. Any feedback you have on the speed and the cleanliness of the TXT output is immensely valuable to our engineering roadmap!
Thanks for building with us.
Astrae, Build Animated Next.js Sites 10× Faster #
I’ve been working on Astrae, a library of beautifully animated components and templates built for Next.js, Tailwind CSS, and Framer Motion.
As a designer and creative developer, I noticed that while there are tons of UI kits out there, very few make it easy to add motion and personality without starting from scratch. Astrae aims to fix that — you can literally plug in animations that look polished out of the box.
Some highlights:
Ready-to-use templates for landing pages and portfolios
Animated UI components powered by Framer Motion
100% built for Next.js + Tailwind CSS
Focused on design quality and performance
Right now I’m slowly rolling out new components and showcasing them on socials.
I automated the "hey is this done yet?" Slack message from PMs #
So I wired up a GitHub Action that automatically pings them whenever a ticket in Jira/Linear hits testing, or a build completes, so they don't DM me, and I don't need to pretend I didn't see it until the next day (sometimes longer).
It’s tiny, installable in under a minute, and surprisingly effective at keeping everyone sane on my team.
GitHub Action: https://github.com/marketplace/actions/pm-updater
Main repo: https://github.com/peculiarism/eng-pm-mediator
What it does:
- Triggers on a push to a specific branch or main.
- Automatically notifies your PM via Slack (more integrations coming soon - should we support WhatsApp? iMessage? God, that would be so annoying)
- Includes branch name, ticket reference in Linear or Jira, and build status
- Passive-aggressive messages optional
It started as a joke that I built over an evening, but now some friends are using it so I thought I'd turn it into a public repo for others to benefit.
Would love feedback, PRs or feature requests. Maybe next version should reverse-notify engineers when PMs miss their deadlines.
OpenSCAD Studio – AI-assisted OpenSCAD editor (macOS alpha) #
Features:
- AI copilot that reads code/diagnostics and makes validated edits
- Monaco editor with tree-sitter auto-formatting
- Live 3D mesh viewer + proper 2D mode with SVG viewer
Privacy / models:
- By default uses <model/provider> via your API key; no server of mine
- You can disable AI entirely; all non-AI features run locally
Status / limits:
- Early alpha, macOS only today (Tauri app). Windows/Linux planned.
Looking for feedback on:
- Editing/formatting defaults for larger codebases
- 2D SVG workflow (is it useful / what’s missing?)
- Must-have features before Windows/Linux