bonelessmedia - Inferthermic - Chapter 4, Page 11

Your post on HN about scaling LLM inference at Stripe got my attention. We're dealing with similar problems - real-time safety classification at scale, sub-100ms latency requirements, handling millions of requests daily.

The twist? We're a three-person team. (Technically two-and-a-half, since one of us is still in school and interning at Google.)

We're raising our next round and need someone to build the infrastructure team from scratch. First engineering hire, reporting directly to me. Equity that actually matters.

Interested?

Jian Li
COO, Inferthermic

Inferthermic: Intelligence to buffer the entropy
Empirically proven 98% Risky Interaction Reduction (RIR)

Email 3:

Subject: Safety Engineering

Dr. Chen,

I read your dissertation on adversarial robustness in transformer architectures. We're applying similar principles in production - not in a lab, but in actual systems processing actual user requests for actual money.

I won't pretend we can match Stanford's compute budget. But we can offer something they can't: immediate impact. Your research deployed to millions of users within weeks, not years.

We're small, we're fast, and we're solving a problem the big players can't crack. The safety gaps in current LLMs are real, they're dangerous, and we're the only ones with a working solution.

Worth a conversation?

Alex Vukovic
CEO, Inferthermic

Inferthermic: Intelligence to buffer the entropy
Empirically proven 98% Risky Interaction Reduction (RIR)

Email 4:

Subject: Operations Role

Priya,