
The email is the app

Why SlyReply has no inbox UI: SMTP is the interface, your From address is the login, and we store zero message bodies. How the pieces fit.

Andrew, Founder & Engineer, SlyReply (Sherman Studio Ltd)
9 min read
  • architecture
  • email
  • smtp
  • privacy
  • security

Every AI product I tried before building this one wanted the same thing first: connect your inbox. Grant OAuth, hand over a token that can read every email you've ever received, trust us to be careful with it. The feature I wanted — an AI that drafts and sends a reply — needed maybe one percent of that access. The other ninety-nine percent was liability I'd be holding on someone else's behalf.

So SlyReply doesn't ask. There's no inbox connection, no OAuth scope, no dashboard you log into to do the actual work. You email an address — support@slyreply.ai, say — and a reply comes back. The product is a correspondent, not an app you visit. This post is about what that buys you, and the three decisions underneath it: SMTP is the interface, the sender address is the login, and we store none of the message bodies.

SMTP is the interface

The thing that receives your mail is our own SMTP server. The MX record for the domain points at a box we run, and an aiosmtpd handler answers on the SMTP port (port 25 in production, a high port in dev). No inbound webhook from a third party, no "connect your Gmail" handshake. A message arrives as raw RFC 5322 bytes, we parse it into one normalised object, and that object flows through a single pipeline.
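To make the shape of that receiver concrete, here is a minimal sketch, not SlyReply's actual handler. It assumes aiosmtpd's standard `handle_DATA` hook and its `Controller` class; `InboundHandler`, `pipeline`, `start_receiver` and the dev port 8025 are hypothetical names and values picked for illustration.

```python
from email import message_from_bytes, policy


class InboundHandler:
    """aiosmtpd calls handle_DATA once for each message it accepts."""

    def __init__(self, pipeline):
        # Hypothetical: an async callable that takes the parsed message
        # plus the envelope sender and recipients.
        self.pipeline = pipeline

    async def handle_DATA(self, server, session, envelope):
        # envelope.content holds the raw RFC 5322 bytes of the message.
        message = message_from_bytes(envelope.content, policy=policy.default)
        await self.pipeline(message, envelope.mail_from, envelope.rcpt_tos)
        return "250 Message accepted for delivery"


def start_receiver(handler, hostname="127.0.0.1", port=8025):
    # Imported here so the handler can be exercised without aiosmtpd installed.
    from aiosmtpd.controller import Controller

    controller = Controller(handler, hostname=hostname, port=port)
    controller.start()  # serves SMTP from a background thread
    return controller
```

Calling `start_receiver(InboundHandler(my_pipeline))` would bring up a local listener; the returned controller has a `stop()` method for shutdown.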

The reason it's our server and not a webhook provider is mundane and it's the whole point: a self-hosted inbound path is self-contained. There's no external dashboard to keep in sync, no provider-side routing config that can drift away from what the code expects. A deploy ships the receiver and its logic together. Outbound is the opposite trade — we hand replies to a dedicated sending provider, because deliverability (DKIM signing, SPF, IP reputation) is a specialist job and not one I want to relearn every time an inbox starts greylisting us.

Parsing is where "email as interface" gets real. The parser pulls the bare address out of Name <addr> framing, lower-cases it, walks the MIME tree for a text/plain part, and — when a sender's client sends HTML only, which corporate Exchange and a few programmatic senders do — derives plain text from the HTML rather than letting the message reach the AI as just its subject line. It also reads the threading headers (In-Reply-To, References) and the RFC 3834 / RFC 5230 loop markers (Auto-Submitted, List-Id, Precedence), because an interface made of email has to refuse to talk to an out-of-office autoresponder. Two robots politely replying to each other forever is a funny bug exactly once.
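The parsing steps above can be sketched with nothing but the standard library. This is an illustration under stated assumptions rather than the production parser: `ParsedEmail`, `parse_inbound`, `is_automated` and `html_to_text` are hypothetical names, the HTML fallback is deliberately crude (it just collects text nodes), and the set of `Precedence` values treated as automated is my own guess at a reasonable list.

```python
from __future__ import annotations

from dataclasses import dataclass
from email import message_from_bytes, policy
from email.utils import parseaddr
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return "\n".join(extractor.chunks)


@dataclass
class ParsedEmail:
    from_email: str
    subject: str
    text: str
    message_id: str | None
    in_reply_to: str | None
    references: list[str]
    automated: bool


def is_automated(msg) -> bool:
    # Auto-Submitted: anything other than "no" marks an automatic message.
    auto = str(msg.get("Auto-Submitted", "no")).split(";")[0].strip().lower()
    precedence = str(msg.get("Precedence", "")).strip().lower()
    return (
        auto != "no"
        or msg.get("List-Id") is not None
        or precedence in {"bulk", "junk", "list"}  # assumed value set
    )


def _header(msg, name: str) -> str | None:
    value = msg.get(name)
    return str(value).strip() if value is not None else None


def parse_inbound(raw: bytes) -> ParsedEmail:
    msg = message_from_bytes(raw, policy=policy.default)
    _, addr = parseaddr(str(msg.get("From", "")))  # strips "Name <addr>" framing
    plain = msg.get_body(preferencelist=("plain",))
    if plain is not None:
        text = plain.get_content()
    else:
        # HTML-only sender: derive text instead of passing an empty body on.
        html_part = msg.get_body(preferencelist=("html",))
        text = html_to_text(html_part.get_content()) if html_part is not None else ""
    return ParsedEmail(
        from_email=addr.lower(),
        subject=str(msg.get("Subject", "")),
        text=text.strip(),
        message_id=_header(msg, "Message-ID"),
        in_reply_to=_header(msg, "In-Reply-To"),
        references=(_header(msg, "References") or "").split(),
        automated=is_automated(msg),
    )
```

A message for which `automated` is true would be the one to stop before any reply is generated.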

Your From address is the login

There are no passwords in the email flow. There's nowhere to type one — it's email. So the authentication is the thing the protocol already carries: who the message is from.

The check is deliberately boring:

async def authenticate_sender(db, from_email: str) -> dict | None:
    """The from address IS the authentication. If it matches a
    registered_emails entry, the sender is authenticated as that user.
    A suspended account is treated as unauthenticated."""
    user = await db.users.find_one(
        {"registered_emails": from_email.lower()}
    )
    if user is not None and user.get("suspended"):
        return None
    return user

If your From address matches an address you've registered to your account, you're you. Two consequences fall straight out of that. A suspended account is just treated as unauthenticated — the user record, threads and config all survive, the pipeline simply stops replying, which makes suspension a reversible alternative to deletion. And mail from an address nobody has registered is silently dropped. No bounce, no "this address isn't recognised." An error message is a confirmation that something is here to answer; a stranger gets nothing back, which is the correct amount of information to hand an unknown sender.
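The three outcomes (known sender, suspended account, stranger) are easy to see against an in-memory stand-in for the database. `FakeUsers` and `FakeDB` below are hypothetical test doubles, not part of SlyReply; they mimic only the single array-membership query the function issues. The function body is repeated from above so the sketch runs on its own.

```python
import asyncio


class FakeUsers:
    """In-memory stand-in for the users collection (test double only)."""

    def __init__(self, docs):
        self._docs = docs

    async def find_one(self, query):
        # Mimics an array-membership match on registered_emails.
        wanted = query["registered_emails"]
        return next(
            (d for d in self._docs if wanted in d["registered_emails"]), None
        )


class FakeDB:
    def __init__(self, docs):
        self.users = FakeUsers(docs)


async def authenticate_sender(db, from_email):
    # Same logic as the function above.
    user = await db.users.find_one({"registered_emails": from_email.lower()})
    if user is not None and user.get("suspended"):
        return None
    return user


async def demo():
    db = FakeDB([
        {"_id": "u1", "registered_emails": ["ada@example.com"]},
        {"_id": "u2", "registered_emails": ["bob@example.com"], "suspended": True},
    ])
    active = await authenticate_sender(db, "Ada@Example.com")
    suspended = await authenticate_sender(db, "bob@example.com")
    stranger = await authenticate_sender(db, "eve@example.com")
    return active, suspended, stranger


def run_demo():
    return asyncio.run(demo())
```

Only the first lookup returns a user record. The suspended account and the stranger both come back as `None`, so from the outside they are indistinguishable.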

That sender-is-login model is also what lets agent slugs be per-user rather than globally unique. Two different accounts can both own support@slyreply.ai. The disambiguator is the sender: your From address authenticates you to a single user record, and that user's namespace contains exactly one support agent. The recipient local-part names the agent within your world; your address says which world.
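A small sketch of that two-step lookup, assuming agents are keyed by user first and slug second. The `AGENTS_BY_USER` mapping, the user ids and the agent descriptions are all invented for the example.

```python
from __future__ import annotations

# Hypothetical data: two accounts that each own a "support" agent.
AGENTS_BY_USER = {
    "user_a": {"support": "Agent for A's helpdesk"},
    "user_b": {"support": "Agent for B's returns desk"},
}


def local_part(recipient: str) -> str:
    # "Support@slyreply.ai" -> "support"
    return recipient.rsplit("@", 1)[0].lower()


def resolve_agent(user_id: str, recipient: str) -> str | None:
    # The authenticated sender picks the namespace; the local-part picks
    # the agent inside it.
    return AGENTS_BY_USER.get(user_id, {}).get(local_part(recipient))
```

The same recipient string resolves to a different agent depending on who sent the mail, which is why slugs need no global uniqueness.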

flowchart TD
    A[Inbound email arrives via SMTP] --> B[Parse to one normalised object]
    B --> C{From address in<br/>registered_emails?}
    C -->|no, and not a public agent| D[Silent drop]
    C -->|yes| E[Authenticated as that user]
    E --> F[Resolve recipient local-part<br/>within THIS user's namespace]
    F --> G[Generate reply, send via outbound provider]
    G --> H[Persist metadata-only thread row]

A small set of slugs is reserved globally — the RFC 2142 system addresses (postmaster, abuse, and friends) plus a few platform-transactional names. Those never fall through to a user's catch-all; they get a static, polite bounce instead, so a typo can't quietly burn cost against a system address. Everything else is yours to name.
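The ordering is what matters here: the reserved check has to run before any per-user resolution, or a catch-all would swallow it. A rough sketch, where `RESERVED_SLUGS` is an illustrative subset rather than the real list, and the `"unrouted"` outcome for an unknown slug with no catch-all is my assumption, since the post does not say what happens in that case.

```python
from __future__ import annotations

# Illustrative subset only; the real reserved list is not published here.
RESERVED_SLUGS = frozenset({"postmaster", "abuse", "hostmaster", "noc", "security"})


def route_recipient(
    recipient: str,
    user_agents: set[str],
    catch_all: str | None = None,
) -> tuple[str, str]:
    local = recipient.rsplit("@", 1)[0].lower()
    if local in RESERVED_SLUGS:
        # Checked first, so a catch-all can never claim a system address.
        return ("bounce", local)
    if local in user_agents:
        return ("agent", local)
    if catch_all is not None:
        return ("agent", catch_all)
    return ("unrouted", local)  # assumption: behaviour not specified above
```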

Why this is safe (and where the work is)

The obvious objection is the right one: From addresses can be forged. Sender-as-login is only honest if forging the sender is hard, so most of the engineering here is in the gates, not the lookup above.

The defenses are the email standards built for exactly this. DKIM lets us check that the message carries a valid cryptographic signature from the domain it claims to be from. DMARC-style alignment checks that the signing domain actually matches the From domain — a valid signature from the wrong domain doesn't count. And SPF validates the envelope sender against the IP that connected to us, which closes a different gap: a plain forwarder keeps the envelope sender intact, so the worst it produces is a soft signal, never a hard one — which makes a hard SPF failure safe to treat as "this IP is not allowed to send for that domain."

The design principle is that these gates are layered and tuned per context rather than flipped on uniformly. A hard SPF failure is an explicit "not authorised" and is dropped everywhere. A DKIM/DMARC result is recorded on every message into a forensic audit row, but how aggressively a failure is enforced depends on context — because legitimate forwarders and relays do occasionally fail strict alignment, and a false positive that drops a real customer's mail is its own kind of outage. So some surfaces strict-drop on failure while others soft-log first, the audit trail letting us watch the false-positive rate before tightening a screw. The public, no-signup demo address is the strictest of all: it has no account behind it, so it leans hardest on the cryptographic checks and on its own rate and cost caps.
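The layering can be sketched as a small decision function with no thresholds in it, only the shape. Everything named here (`AuthResult`, `Verdict`, `decide`, the `strict` flag) is hypothetical, and `is_aligned` is a simplification: real DMARC relaxed alignment compares organizational domains using the Public Suffix List, while this version only checks that one domain equals or is a subdomain of the other.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    SOFT_LOG = "soft_log"  # let through, flagged in the audit row
    DROP = "drop"


@dataclass(frozen=True)
class AuthResult:
    spf: str                 # "pass", "fail", "softfail", "neutral", "none", ...
    dkim_valid: bool         # signature verified cryptographically
    dkim_domain: str | None  # the d= domain of the signature, if any
    from_domain: str         # domain of the From header address


def is_aligned(signing_domain: str | None, from_domain: str) -> bool:
    # Simplified relaxed alignment (no Public Suffix List lookup).
    if not signing_domain:
        return False
    a = signing_domain.lower().rstrip(".")
    b = from_domain.lower().rstrip(".")
    return a == b or a.endswith("." + b) or b.endswith("." + a)


def decide(result: AuthResult, strict: bool) -> Verdict:
    # A hard SPF failure is dropped on every surface.
    if result.spf == "fail":
        return Verdict.DROP
    # A valid signature only counts when it comes from the From domain.
    if result.dkim_valid and is_aligned(result.dkim_domain, result.from_domain):
        return Verdict.ACCEPT
    # Ambiguous cases: strict surfaces drop, the rest log and watch.
    return Verdict.DROP if strict else Verdict.SOFT_LOG
```

The verdict and the inputs that produced it are what would land in the audit row, whichever branch was taken.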

I'm being deliberately vague about which surface is set how strictly, and I'm not publishing a single threshold or limit. The pattern is the useful part; the parameters are an attacker's checklist. What I'll commit to is the shape: forging your way past sender-as-login means defeating DKIM, DMARC alignment, and SPF, against an audited pipeline that drops the clear failures and is built to tighten the ambiguous ones.

We store none of the message bodies

Here's the part that makes the "we don't read your email" pitch something other than a slogan: there is no table the bodies could be in.

When a message clears the gates, we create or find a conversation row and append a turn to it. The row is metadata only — a thread id, the subject, the sender, and per turn a role, a timestamp, and the message ids needed to thread the next reply. No text_body, no html_body, no AI reply text. The append function takes a role, a from_email and a message_id, and that is the entire payload it will write. The body is in memory long enough to build the prompt and send the answer, then it's gone.
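As a sketch of what "metadata only" means in practice, the row and its append function might look like the following. The class and field names are hypothetical; the point is that the write path has no parameter a body could travel through.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Turn:
    role: str         # e.g. "user" or "assistant"
    from_email: str
    message_id: str   # kept so the next reply can be threaded
    at: datetime


@dataclass
class Conversation:
    thread_id: str
    subject: str
    sender: str
    turns: list[Turn] = field(default_factory=list)


def append_turn(conv: Conversation, role: str, from_email: str, message_id: str) -> Turn:
    # No body argument exists, so there is nothing for a caller to pass
    # by mistake and nothing for this function to persist.
    turn = Turn(
        role=role,
        from_email=from_email,
        message_id=message_id,
        at=datetime.now(timezone.utc),
    )
    conv.turns.append(turn)
    return turn
```

Making the omission structural, rather than a field that is left empty, is what lets the privacy claim be checked by reading the schema.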

This isn't a retention window or a per-user privacy toggle you could forget to enable. Storage is off for everyone, unconditionally. There's no /conversations screen, no body-search API, nothing to leak in a breach because the thing worth stealing was never written down. The audit rows that do persist are the routing decisions — who, when, which gate, what outcome — the operational trail, not the content.

Which raises the obvious question: if we keep no history, how does the AI hold a conversation across replies? The answer is that email already solved this, decades ago. Nearly every mainstream client, by default, quotes the message you're replying to when you hit reply: the On <date>, X wrote: > block. That quoted chain is the history, and it rides in-band with the new message. We feed the AI the un-stripped body, so the model reads the thread the same way you do: by scrolling down. The stored metadata exists only so a later reply still threads to the right row if a client truncates the quote.
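The threading half of that needs only the stored message ids. A minimal sketch, assuming a hypothetical `index` that maps each stored message id to its thread id, and checking `In-Reply-To` before walking `References` from newest to oldest:

```python
from __future__ import annotations


def find_thread(
    index: dict[str, str],
    in_reply_to: str | None,
    references: list[str],
) -> str | None:
    # In-Reply-To names the direct parent, so it is tried first.
    candidates = [in_reply_to] if in_reply_to else []
    # References lists ancestors oldest-first; reverse to prefer recent ones.
    candidates.extend(reversed(references))
    for message_id in candidates:
        thread_id = index.get(message_id)
        if thread_id is not None:
            return thread_id
    return None  # no known ancestor: treat as a new conversation
```

Nothing in this lookup touches message content, so it works the same whether or not the client kept the quoted chain.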

It's a constraint that turned out to be a feature. The continuity lives where it always lived — in the email — and we get to be the service that genuinely can't read your back-catalogue, because we never kept one.

Why email-shaped

Put the three together and the shape of the thing is clear. The interface is a protocol everyone already has a client for, so there's nothing to install and nothing to learn. The login is an address you already own, so there's no password and no inbox-access grant. The storage is nothing, so the privacy promise is structural rather than a policy you're asked to trust.

None of this is the path of least resistance — a webhook provider and a Postgres messages table would have been a faster afternoon. But every shortcut there is one I'd be holding on a user's behalf: an OAuth token with the run of their inbox, a body store waiting to be subpoenaed or breached, a routing config drifting out of sync with the code. Email-shaped means the product is exactly as wide as its job and not one scope wider. The email isn't the input to the app. The email is the app.