How We Built an AI Agent That Actually Reads Threads

"AI agent" can mean a lot of things these days. Most of the conversation is about agents that write code, browse the web for you, or book restaurants. Ours does something simpler and, in its own way, harder: it reads threads.

Why reading is harder than scraping

Any script can hit an API and dump JSON into a database. That is not reading. Reading means loading a page, rendering it, scrolling through the comments, understanding the tone of the conversation, and deciding whether this thread is relevant to a specific product.

Each platform presents content differently. Reddit has nested comments with voting. Hacker News has a comment tree with a different threading model. Lobsters has yet another layout. Dev.to has long-form articles with discussion in a sidebar. Product Hunt is different again: a launch page with a chronological comment stream.

An API gives you structured data but loses context. A thread titled "Anyone moved off Clerk recently?" could be a mundane tech-switch discussion or a goldmine of competitive intel, depending on the comments that follow. A scraper that only reads titles misses the point.

Browser-based browsing

Our agent runs a real browser instance: headless, but rendering each page the way a normal visitor's browser would. It navigates to each community, loads the page, scrolls through comments, and captures the full text and structure. This is slower than an API call, but it is the only way to see what a human would see.

We maintain one adapter per platform. Each adapter knows how to navigate that specific site: where the comment threads are, how pagination works, how to handle login walls (for platforms that require them), and what rate limiting looks like. The browser handles the rendering; the adapter handles the logic.
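The split between browser and adapter can be sketched roughly as follows. This is a minimal illustration, not our production code: `Browser`, `PlatformAdapter`, `FakeBrowser`, `IndentedTextAdapter`, and `ingest` are hypothetical names, and the toy adapter parses an invented indented-text format rather than any real site's markup.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Comment:
    author: str
    text: str
    depth: int = 0  # nesting level; 0 means a top-level comment


@dataclass
class Thread:
    platform: str
    url: str
    title: str
    comments: list[Comment] = field(default_factory=list)


class Browser(Protocol):
    """Hypothetical stand-in for the real browser session (rendering only)."""

    def render(self, url: str) -> str: ...


class PlatformAdapter(Protocol):
    """Hypothetical per-platform logic: parsing rules and pacing."""

    platform: str
    min_seconds_between_requests: float

    def parse_thread(self, url: str, rendered: str) -> Thread: ...


class FakeBrowser:
    """Returns canned page text so the sketch runs without a real browser."""

    def __init__(self, pages: dict[str, str]):
        self.pages = pages

    def render(self, url: str) -> str:
        return self.pages[url]


class IndentedTextAdapter:
    """Toy adapter for an invented format: the first line is the title,
    every later line is 'author: text', and two leading spaces mean one
    level of nesting."""

    platform = "example"
    min_seconds_between_requests = 2.0  # assumed value, for illustration

    def parse_thread(self, url: str, rendered: str) -> Thread:
        lines = [line for line in rendered.splitlines() if line.strip()]
        thread = Thread(self.platform, url, title=lines[0].strip())
        for line in lines[1:]:
            stripped = line.lstrip(" ")
            depth = (len(line) - len(stripped)) // 2
            author, _, text = stripped.partition(": ")
            thread.comments.append(Comment(author, text, depth))
        return thread


def ingest(browser: Browser, adapter: PlatformAdapter, url: str) -> Thread:
    # The browser renders; the adapter interprets what was rendered.
    return adapter.parse_thread(url, browser.render(url))
```

Keeping `ingest` ignorant of any one site means a new platform only needs a new adapter class; the rendering path stays the same.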

Scoring: from text to signal

Once a thread is ingested, our scout agent scores it along two axes: relevance (how closely the thread matches the topics and competitors you configured) and intent (what kind of conversation it is: asking, switching, complaining, launching, or just noise). The intent classification is what makes the queue useful. A thread where someone is asking for a recommendation is urgent. A thread where someone is venting about a general industry pain point is worth a thoughtful reply, but not necessarily urgent.
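To make the two axes concrete, here is a deliberately simple sketch. The real scout uses scoring models; the keyword cues, the `URGENCY` weights, and the function names below are all assumptions chosen for illustration, with plain substring matching standing in for the model.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    ASKING = "asking"
    SWITCHING = "switching"
    COMPLAINING = "complaining"
    LAUNCHING = "launching"
    NOISE = "noise"


# Hypothetical cue phrases; a stand-in for a real intent classifier.
INTENT_CUES = {
    Intent.ASKING: ("recommend", "looking for", "any suggestions"),
    Intent.SWITCHING: ("moved off", "migrating", "switched from"),
    Intent.COMPLAINING: ("frustrated", "tired of", "keeps breaking"),
    Intent.LAUNCHING: ("show hn", "we just launched", "launching"),
}

# Assumed urgency weights: asking ranks highest, noise drops out entirely.
URGENCY = {
    Intent.ASKING: 1.0,
    Intent.SWITCHING: 0.8,
    Intent.COMPLAINING: 0.5,
    Intent.LAUNCHING: 0.4,
    Intent.NOISE: 0.0,
}


@dataclass(frozen=True)
class Score:
    relevance: float  # 0.0 to 1.0
    intent: Intent

    @property
    def priority(self) -> float:
        # Queue position combines both axes.
        return self.relevance * URGENCY[self.intent]


def classify_intent(text: str) -> Intent:
    lowered = text.lower()
    for intent, cues in INTENT_CUES.items():
        if any(cue in lowered for cue in cues):
            return intent
    return Intent.NOISE


def relevance(text: str, watched_terms: list[str]) -> float:
    """Fraction of configured topics and competitors mentioned in the text."""
    if not watched_terms:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for term in watched_terms if term.lower() in lowered)
    return hits / len(watched_terms)


def score_thread(text: str, watched_terms: list[str]) -> Score:
    return Score(relevance(text, watched_terms), classify_intent(text))
```

Multiplying the axes means a highly relevant thread classified as noise still sinks to the bottom of the queue, while a recommendation request on a watched topic rises to the top.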

Drafting: voice as a starting point

The draft feature — marked experimental for good reason — works from writing samples you provide. The model learns your cadence, your vocabulary, your level of formality, and the kind of openers and sign-offs you use. When a thread surfaces, it generates two or three angles for a reply.

We are careful to frame these as a reference, not a finished output. The draft is a starting point that saves you the blank-page problem, but you should always rewrite it in your own words before posting. Nothing leaves your workspace without you hitting "copy."

The constraints that shaped the design

Building a reading agent meant working within real constraints. Browser sessions are expensive in memory and time. Proxy management adds complexity. Each platform has different anti-bot measures. And the whole thing has to run every 30 minutes without human supervision.

We learned early that caching is critical — storing raw thread content in a full-text index means the scout can re-score without re-browsing. We also learned that not every platform needs the same cadence. Hacker News is fast-moving and merits frequent checks; Lobsters is slower and can be visited less often.
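Both lessons fit in a short sketch. Everything here is illustrative: a plain dictionary stands in for the full-text index, and the per-platform multipliers in `VISIT_EVERY_N_RUNS` are assumed values, not our actual schedule. Only the 30-minute base interval comes from the setup described above.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional

BASE_INTERVAL = timedelta(minutes=30)  # the job runs every 30 minutes

# Hypothetical cadence: visit a platform once every N runs of the job.
VISIT_EVERY_N_RUNS = {"hackernews": 1, "lobsters": 4}


class ThreadCache:
    """Keeps raw thread text so scoring can be repeated without browsing.
    A dict stands in for the full-text index."""

    def __init__(self) -> None:
        self._raw: dict[str, str] = {}
        self.browse_count = 0  # how many times the browser was actually used

    def get_or_browse(self, url: str, browse: Callable[[str], str]) -> str:
        if url not in self._raw:
            self._raw[url] = browse(url)
            self.browse_count += 1
        return self._raw[url]

    def rescore_all(self, scorer: Callable[[str], object]) -> dict[str, object]:
        # Re-run a (possibly retuned) scorer over cached text only.
        return {url: scorer(text) for url, text in self._raw.items()}


def is_due(platform: str, last_visit: Optional[datetime], now: datetime) -> bool:
    """True when enough time has passed to visit this platform again."""
    if last_visit is None:
        return True
    interval = BASE_INTERVAL * VISIT_EVERY_N_RUNS.get(platform, 1)
    return now - last_visit >= interval
```

With this shape, retuning the scorer costs one pass over cached text, and slowing a platform down is a one-line change to its multiplier.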

The system is not perfect — no agent is. But it is honest about its limits, and we improve it continuously. Every week we tune the scoring models and add platform-specific refinements.
