Show HN for October 17, 2025
OnlyJPG – Client-Side PNG/HEIC/AVIF/PDF/etc to JPG #
This started as a five-minute job and forty hours later...
I wanted to convert a HEIC without uploading it anywhere, so I wrestled Emscripten/WebAssembly to run Google's Jpegli inside a Web Worker. Now there's a small UI and it handles a bunch of formats.
Just about the only thing it can't decode is JXL - but there's still some JPEG XL magic in there: XYB perceptual color quantization is enabled by default via Jpegli.
The upside of all this over-engineering is privacy and compatibility: images are processed entirely on your machine and never touch a server; the output is a regular JPEG that works everywhere.
I could have used a CLI, sure — but where's the fun in that?
Would love feedback on edge cases and defaults.
Tested on Firefox, Chrome, and Safari.
Cheers!
We packaged an MCP server inside Chromium #
We're a YC startup (S24) building BrowserOS — an open‑source Chromium fork. We're a privacy‑first alternative to the new wave of AI browsers like Dia and Perplexity Comet. Since launching ~3 months ago, the #1 request has been to expose our browser as an MCP server.
-- Google beat us to launch with chrome-devtools-mcp (solid product btw), which lets you build/debug web apps by connecting Chrome to coding assistants. But we wanted to take this a step further: we packaged the MCP server directly into our browser binary. That gives three advantages:
1. MCP server setup is super simple — no npx install, no starting Chrome with CDP flags, you just download the BrowserOS binary.
2. with our browser's inbuilt MCP server, AI agents can interact using your logged‑in sessions (unlike chrome-devtools-mcp which starts a fresh headless instance each time)
3. our MCP server also exposes new APIs from Chromium's C++ core to click, type, and draw bounding boxes on a webpage. Our APIs are not CDP-based (Chrome DevTools Protocol) and are more robust against anti-bot detection.
-- Few example use cases for BrowserOS-mcp are:
a) *Frontend development with Claude Code*: instead of screenshot‑pasting, Claude Code gets WYSIWYG access. It can write code, take a screenshot, check console logs, and fix issues in one agentic sweep. Since it has your sessions, it can do QA tasks like "test the auth flow with my Google Sign‑In." Here's a video of Claude Code using BrowserOS to improve CSS styling with back-and-forth checking: https://youtu.be/vcSxzIIkg_0
b) *Use as an agentic browser:* You can install BrowserOS-mcp in Claude Code or Claude Desktop and do things like form-filling, extraction, multi-step agentic tasks, etc. It honestly works better than Perplexity Comet! Here's a video of Claude Code opening the top 5 Hacker News posts and summarizing them: https://youtu.be/rPFx_Btajj0
-- *How we packaged MCP server inside Chromium binary*: We package the server as a Bun binary and expose MCP tools over HTTP instead of stdio (to support multiple sessions). And we have a BrowserOS controller installed as an extension at the application layer which the MCP server connects to over WebSocket to control the browser. Here's a rough architecture diagram: https://dub.sh/browseros-mcp-diag
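Because the tools are exposed over HTTP rather than stdio, any client can talk to the server with plain JSON-RPC POSTs. A minimal sketch (the port, path, and tool name here are hypothetical; the real ones come from the BrowserOS guide):

```python
import json
import urllib.request

def build_tool_call(tool, arguments, request_id=1):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

def post_json(endpoint, message):
    """POST one JSON-RPC message over HTTP and decode the response."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(message).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical endpoint and tool name; the real port and tool schema
# are documented in the BrowserOS-mcp guide:
# post_json("http://localhost:9223/mcp", build_tool_call("click", {"selector": "#submit"}))
```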
-- *How to install and use it:* We put together a short guide here: https://git.new/browseros-mcp
Our vision is to reimagine the browser as an operating system for AI agents, and packaging an MCP server directly into it is a big unlock for that!
I'll be hanging around all day, would love to get your feedback and answer any questions!
Searchable compression for JSON – ~99% page skip and sub-ms lookups #
What I built (SEE — Semantic Entropy Encoding): a schema-aware, searchable compression codec for JSON that keeps exists/pos lookups fast while still compressing. Internals: structure-aware delta + dictionaries, a PageDir + mini-index to jump to relevant pages, and a tuned Bloom filter that skips ~99% of pages. AutoPage (131/262 KiB) balances seek vs. throughput.
Benchmarks (apples-to-apples, FULL):
- size ratio: str ≈ 0.168–0.170, combined ≈ 0.194–0.196
- Bloom density ≈ 0.30; skip: present ≈ 0.99, absent ≈ 0.992
- lookup (ms): present p50/p95/p99 ≈ 0.18/0.28/0.37; absent ≈ 1.16–1.88/1.36–2.11/1.58–2.41

Numbers are stable on a commodity desktop (i7-13700K/96GB/Windows).
Try it in 10 minutes (no build):
1) pip install see_proto
2) python samples/quick_demo.py

It prints size ratios, Bloom density, skip %, and lookup p50/p95/p99 on a packaged sample.
Why not “just zstd”? We sometimes lose pure size vs zstd alone. The win is searchable compression: Bloom + PageDir avoids touching most pages, so selective probes pay less I/O/egress and finish faster. On large log scans this often wins on TCO even with similar raw ratios.
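For readers unfamiliar with the pattern, here is a toy sketch of per-page Bloom filtering (a generic illustration, not SEE's actual codec): each page carries a small filter, and a lookup only has to touch pages whose filter reports a possible hit.

```python
import hashlib

class PageBloom:
    """Tiny per-page Bloom filter: test membership before reading a page."""
    def __init__(self, nbits=1024, nhashes=3):
        self.nbits, self.nhashes, self.bits = nbits, nhashes, 0

    def _positions(self, key):
        # Derive nhashes deterministic bit positions from the key.
        for i in range(self.nhashes):
            h = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
            yield int.from_bytes(h, "big") % self.nbits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # False means "definitely absent"; True means "maybe present".
        return all(self.bits >> p & 1 for p in self._positions(key))

# One filter per compressed page: a probe only decompresses pages
# whose filter says the key might be present.
pages = [{"user.id", "user.name"}, {"order.id", "order.total"}]
blooms = []
for page in pages:
    b = PageBloom()
    for k in page:
        b.add(k)
    blooms.append(b)

hits = [i for i, b in enumerate(blooms) if b.might_contain("order.total")]
```

A real codec tunes bits-per-key and hash count to hit a target false-positive rate, which is what the ~0.30 filter density above reflects.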
Link (README + quick demo + one-pager) https://github.com/kodomonocch1/see_proto
AlgoSync – a social media built for developers, founders, & tech people #
We hit 30+ users after Day 1, and people have already started posting their first blogs!
Here’s one of the featured blogs written just 2 days ago: https://www.algosyncverse.com/blog/69ecc455-a812-4c62-b18e-0...
Check it out here → https://www.algosyncverse.com
ChatGPT counts out loud to 140 – beat our record #
My friend and I tried a couple of variations to get it to count to 1 million; all failed. So we settled for seeing what's the maximum number we could reach.
Methods we tried:
- Framing it as an experiment, on beating a benchmark against other llms or its previous runs, encouragement after small successes
- Foot in the door - asking it to count to 10, 20, 30, 50, etc. - kind of successful, record: 110
And the winner:
- Simply count to 140 ourselves and ask it to repeat - success! The follow-up failed miserably. We haven't tried counting higher; we ran into the daily limit : D
Has anyone else tried this, and how high did you get? We weren't able to find anyone getting past 140 on YouTube/Google so far.
Dethrone us.
The Massive Legal Embedding Benchmark (MLEB) #
I'm excited to share the Massive Legal Embedding Benchmark (MLEB) — the first comprehensive benchmark for legal embedding models.
Unlike previous legal retrieval datasets, MLEB was created by someone with actual domain expertise (I have a law degree and previously led the AI team at the Attorney-General's Department of Australia).
I came up with MLEB while trying to train my own state-of-the-art legal embedding model. I found that there were no good benchmarks for legal information retrieval to evaluate my model on.
That led me down a months-long process working alongside my brother to identify or, in many cases, build our own high-quality legal evaluation sets.
The final product was 10 datasets spanning multiple jurisdictions (the US, UK, Australia, Singapore, and Ireland), document types (cases, laws, regulations, contracts, and textbooks), and problem types (retrieval, zero-shot classification, and QA), all of which have been vetted for quality, diversity, and utility.
For a model to do well at MLEB, it needs to have both extensive legal domain knowledge and strong legal reasoning skills. That is deliberate — given just how important high-quality embeddings are to legal RAG (particularly for reducing hallucinations), we wanted our benchmark to correlate as strongly as possible with real-world usefulness.
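As an illustration of the kind of retrieval scoring such a benchmark rests on, here is a generic recall@k sketch with toy vectors (not MLEB's actual evaluation code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_at_k(query_vec, doc_vecs, relevant_ids, k):
    """Rank docs by similarity; score 1 if any relevant doc is in the top k."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return int(any(i in relevant_ids for i in ranked[:k]))

# Toy vectors standing in for embeddings of a tax question and guidance pages.
query = [0.9, 0.1, 0.0]
docs = [[0.1, 0.9, 0.0],    # off-topic page
        [0.95, 0.05, 0.0],  # the relevant guidance page
        [0.0, 0.0, 1.0]]
score = recall_at_k(query, docs, relevant_ids={1}, k=1)  # → 1
```

Averaging this over every query gives the kind of per-dataset retrieval number a leaderboard aggregates.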
The dataset we are most proud of is called Australian Tax Guidance Retrieval. It pairs real-life tax questions posed by Australian taxpayers with relevant Australian Government guidance and policy documents.
We constructed the dataset by sourcing questions from the Australian Taxation Office's community forum, where Australian taxpayers ask accountants and ATO officials their tax questions.
We found that, in most cases, such questions can be answered by reference to government web pages that, for whatever reason, users were unable to find themselves. Accordingly, we manually went through a stratified sample of 112 challenging forum questions and extracted relevant portions of government guidance materials linked to by tax experts that we verified to be correct.
What makes the dataset so valuable is that, unlike the vast majority of legal information retrieval evaluation sets currently available, it consists of genuinely challenging real-world user-created questions, rather than artificially constructed queries that, at times, diverge considerably from the types of tasks embedding models are actually used for.
Australian Tax Guidance Retrieval is just one of several evaluation sets that we painstakingly constructed ourselves simply because there weren't any other options.
We've contributed everything, including the code used to evaluate models on MLEB, back to the open-source community.
Our hope is that MLEB and the datasets within it will hold value long into the future so that others training legal information retrieval models won't have to detour into building their own "MTEB for law".
If you'd like to head straight to the leaderboard instead of reading our full announcement, you can find it here: https://isaacus.com/mleb
If you're interested in playing around with our model, which happens to be ranked first on MLEB as of 16 October 2025 at least, check out our docs: https://docs.isaacus.com/quickstart
VT Code – AST-aware Rust agent for terminal (Tree-sitter/AST-grep) #
Try it: cargo install vtcode; vtcode
I turned my resume into a catchy song. It's a game changer #
Stop Chasing Success: Write for Wonder Instead #
OneClickPRD – Save hours vibe coding with concise PRDs #
I built OneClickPRD because as a solo builder I often wasted hours vibe coding without clear goals. I’d start with an idea, but it was vague, so the code got messy and I had to redo things.
OneClickPRD asks you a few questions about your product and then generates a short, structured PRD. The format works well with AI tools like Replit, Lovable, or v0, so you can go from idea to working MVP much faster.
Demo: https://oneclickprd.com/
Would love your feedback: does this feel useful for your projects, and what would make it better?
PyTogether, open-source lightweight real-time Python IDE for learners #
For the past 4 months, I’ve been working on a full-stack project I’m really proud of called PyTogether: a real-time collaborative Python IDE designed with beginners in mind (think Google Docs, but for Python). It’s meant for pair programming, tutoring, or just learning Python together. It’s completely free. No subscriptions, no ads, nothing. Just create an account, make a group, and start a project. It has proper code linting, an extremely intuitive UI, autosaving, and live cursors. There are no limitations at the moment (except for code size, to prevent malicious payloads).
Why build this when Replit or VS Code Live Share already exist?
Because my goal was simplicity (and education). I wanted something lightweight for beginners who just want to write and share simple Python scripts (alone or with others), without downloads, paywalls, or extra noise. There’s also no AI/copilot built in, something many teachers and learners actually prefer. Also, it's free lol
Tech stack (frontend):
- React + TailwindCSS
- CodeMirror for linting
- Y.js for real-time syncing and live cursors
- Skulpt to execute Python in the browser (for safety - I initially wanted Docker containers, but that would eat too much memory at scale; Skulpt has a limited library, so unfortunately imports like pygame won't work)

I don’t enjoy frontend or UI design much, so I leaned on AI for some design help, but all the logic/code is mine. Deployed via Vercel.
Tech stack (backend):
- Django (channels, auth, celery/redis support made it a great fit)
- PostgreSQL via Supabase
- JWT + OAuth authentication
- Redis for channel layers + caching
- Fully Dockerized + deployed on a VPS (8GB RAM, $7/mo deal)
Data models: Users <-> Groups -> Projects -> Code
- Users can join many groups
- Groups can have multiple projects
- Each project belongs to one group and has one code file (kept simple for beginners, though I may add a file system later)
There were a lot of issues I came across when building this project, especially on the backend. My biggest was figuring out a reliable and smart autosave system. I couldn't save on every user keystroke, because that would overwhelm the database at scale. So I came up with a solution I'm really proud of: I use Redis to cache active projects, then use Celery to loop through those active projects every minute and persist their code to the db. I track a user count for each project every time someone joins or leaves, and if the count drops to 0, I remove the project from Redis (saving its code first). Redis is extremely fast, so caching the code on every keystroke is not a problem at all.

I'm essentially hitting four birds with one stone here: I reuse Redis, which I'd already integrated for my channel layers, to track active projects and to cache the code, so when a new user enters a project the code comes from Redis instead of hitting the db. I even get to use Redis as my message broker for Celery (I didn't use RabbitMQ because I wanted to conserve storage rather than dockerize an entirely new service). This also works well at scale, since Celery offloads the autosave work away from the backend. The code additionally saves when someone leaves the project.

Another issue I came across later is people trying to send a huge load of text, so I capped code size at 1 MB (will tinker with this).
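The join/keystroke/leave bookkeeping can be sketched with an in-memory stand-in for Redis and the database (names are hypothetical; the real version uses Redis commands plus a periodic Celery task):

```python
# Stand-ins for the Redis keys described above (active-project set,
# per-project user counts, cached code) and for the Postgres table.
store = {"active": set(), "users": {}, "code": {}}
db = {}

def on_join(project_id, initial_code=""):
    """First joiner activates the project and warms the code cache."""
    store["users"][project_id] = store["users"].get(project_id, 0) + 1
    if project_id not in store["active"]:
        store["active"].add(project_id)
        store["code"].setdefault(project_id, initial_code)

def on_keystroke(project_id, code):
    store["code"][project_id] = code  # cheap: cache write only, no DB hit

def flush(project_id):
    """Persist cached code to the DB (the minute-by-minute Celery job)."""
    db[project_id] = store["code"][project_id]

def autosave_tick():
    for pid in store["active"]:
        flush(pid)

def on_leave(project_id):
    """Last leaver triggers a final save and evicts the cache entry."""
    store["users"][project_id] -= 1
    if store["users"][project_id] <= 0:
        flush(project_id)
        store["active"].discard(project_id)
        store["code"].pop(project_id, None)
```

The same structure maps onto Redis sets/hashes and a Celery beat schedule; only the storage calls change.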
Deployment on a VPS was another beast. I spent ~8 hours wrangling Nginx, Certbot, Docker, and GitHub Actions to get everything up and running. It was frustrating, but I learned a lot.
If you’re curious or if you wanna see the work yourself, the source code is here. Feel free to contribute: https://github.com/SJRiz/pytogether.
I’m still learning, so any feedback would be amazing (and contributions)!
LLM In-Browser Fuzzer Finds Hidden Prompt Injection in AI Browsers #
min. – lightweight AI-native inbox for teams #
min. is:
- A simple email interface to manage both private and team inboxes.
- An AI-native inbox that sorts and prioritizes messages into custom labels and folders.
- A replacement for the patchwork of CRMs, helpdesks, and spreadsheets with just… email.
- Conversational agents (follow-ups, scheduling, nudges), deployable in two clicks.
min. is not:
- A hands-off inbox that auto-replies to everything for you (e.g., “put your support/sales on auto-pilot!”).
- Built just for speed - min. is for teams who care about delivering an A+ experience with every interaction.
- Awkward - it works exactly how you expect email to work, with two-way sync to Gmail.
2min demo video: https://www.youtube.com/watch?v=IKtzNL72P34
Live demo: https://app.getmin.ai/demo
DM me or shoot an email to [email protected] if you want to join the beta. Cheers!
Eric
ModernBERT in Pure C #
If you don't know, ModernBERT is a new encoder-only model from answer.ai. Unlike decoder-only models like Llama, encoder models process all input tokens in a single pass (no autoregression), which makes them great for tasks like token classification.
The implementation supports loading any ModernBERT checkpoint from Hugging Face. I've tested it with the base model and a token classification model for anonymizing PII. You can get >1200 tokens/s throughput on a single thread (slightly better than the PyTorch implementation), though that's not directly comparable to decoder models since there's no token-by-token generation.
I hard-coded the architecture to keep things simple and readable. The tokenizer is a from-scratch BPE implementation that handles most cases (though it's missing some edge cases). The main goal was to support a lightweight deployment of this model, without the heavy baggage of the pytorch ecosystem.
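For the curious, the core of a greedy BPE encoder is small. This is a generic Python sketch of the merge loop, not the C implementation itself; the toy merge table stands in for the ranks a real checkpoint ships in its tokenizer files:

```python
def bpe_encode(word, merges):
    """Greedy BPE: repeatedly merge the highest-priority adjacent pair.
    `merges` maps a character pair to its rank (lower = merged earlier)."""
    tokens = list(word)
    while len(tokens) > 1:
        # Find the adjacent pair with the best (lowest) merge rank.
        pairs = [(merges.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(tokens, tokens[1:]))]
        rank, i = min(pairs)
        if rank == float("inf"):
            break  # no learned merge applies anymore
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return tokens

# Toy merge table; real ranks come from the checkpoint's tokenizer data.
merges = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
bpe_encode("lower", merges)  # → ["low", "er"]
```

The tricky edge cases a from-scratch tokenizer has to handle (byte-level fallback, special tokens, pre-tokenization regexes) sit around this loop rather than inside it.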
Enjoy.
Reddit Sentiment Dashboard for Claude Code vs. Codex (OSS) #
YTVidHub – Bulk YouTube Subtitle Downloader for Research and Data Prep #
I’m incredibly excited to finally launch YTVidHub, a tool built to solve a massive time sink that many of you here shared: the agonizing manual process of downloading transcripts from large YouTube corpora for research and data analysis.
The Problem: If you need subtitles for 50, 100, or more videos, the current copy-paste-download-repeat workflow is slow and painful.
Our Solution (The Core Feature): YTVidHub is engineered for true bulk processing. You can paste dozens of YouTube URLs (or a Playlist/Channel link) into one clean interface, and the system extracts all available subtitles (including multilingual ASR) and packages them into a single, organized ZIP file for one-click download.
Architectural Insight: Our design prioritizes "research-ready" data, taking cues from the discussions here on HN. We specifically optimized the plain text (TXT) output—stripping all timestamps and formatting—to make it instantly clean for RAG systems and LLM ingestion.
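Stripping timestamps from SRT-style captions is straightforward to sketch. This generic version (not YTVidHub's code) drops cue numbers and timing lines and joins the remaining caption text:

```python
import re

def srt_to_text(srt):
    """Strip SRT cue numbers and timestamp lines, keep the caption text."""
    timestamp = re.compile(
        r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or timestamp.match(line):
            continue  # blank line, cue number, or timing line
        kept.append(line)
    return " ".join(kept)

sample = """1
00:00:01,000 --> 00:00:03,500
Hello and welcome back.

2
00:00:03,600 --> 00:00:06,000
Today we cover RAG pipelines."""
srt_to_text(sample)  # → "Hello and welcome back. Today we cover RAG pipelines."
```

Real caption files add wrinkles (VTT headers, inline styling tags, rolled-up ASR duplicates) that a production pipeline also has to scrub.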
Business Model: YTVidHub is free for single downloads. Bulk operations receive 5 free daily credits to ensure fair use and manage our processing costs. Professional plans are available for high-volume data needs.
Future Focus: We know the ASR accuracy is the next big hurdle. We're already working on a Pro AI Transcription tier to offer high-accuracy, LLM-powered transcripts to tackle niche content and solve the data quality problem.
Please give the bulk downloader a test run and put the system through its paces. Any feedback you have on the speed and the cleanliness of the TXT output is immensely valuable to our engineering roadmap!
Thanks for building with us.
Astrae, Build Animated Next.js Sites 10× Faster #
I’ve been working on Astrae, a library of beautifully animated components and templates built for Next.js, Tailwind CSS, and Framer Motion.
As a designer and creative developer, I noticed that while there are tons of UI kits out there, very few make it easy to add motion and personality without starting from scratch. Astrae aims to fix that — you can literally plug in animations that look polished out of the box.
Some highlights:
Ready-to-use templates for landing pages and portfolios
Animated UI components powered by Framer Motion
100% built for Next.js + Tailwind CSS
Focused on design quality and performance
Right now I’m slowly rolling out new components and showcasing them on socials.
I automated the "hey is this done yet?" Slack message from PMs #
So I wired up a GitHub Action that automatically pings them whenever a ticket in Jira/Linear hits testing, or a build completes, so they don't DM me, and I don't need to pretend I didn't see it until the next day (sometimes longer).
It’s tiny, installable in under a minute, and surprisingly effective at keeping everyone sane on my team.
GitHub Action: https://github.com/marketplace/actions/pm-updater
Main repo: https://github.com/peculiarism/eng-pm-mediator
What it does:
- Triggers on a push to a specific branch or main.
- Automatically notifies your PM via Slack (more integrations coming soon - should we support WhatsApp? iMessage? God, that would be so annoying)
- Includes branch name, ticket reference in Linear or Jira, and build status
- Passive-aggressive messages optional
It started as a joke that I built over an evening, but now some friends are using it so I thought I'd turn it into a public repo for others to benefit.
Would love feedback, PRs or feature requests. Maybe next version should reverse-notify engineers when PMs miss their deadlines.
OpenSCAD Studio – AI-assisted OpenSCAD editor (macOS alpha) #
Features:
- AI copilot that reads code/diagnostics and makes validated edits
- Monaco editor with tree-sitter auto-formatting
- Live 3D mesh viewer + proper 2D mode with SVG viewer
Privacy / models:
- By default uses <model/provider> via your API key; no server of mine
- You can disable AI entirely; all non-AI features run locally
Status / limits:
- Early alpha, macOS only today (Tauri app). Windows/Linux planned.
Looking for feedback on:
- Editing/formatting defaults for larger codebases
- 2D SVG workflow (is it useful / what’s missing?)
- Must-have features before Windows/Linux