bonelessmedia - Inferthermic - Chapter 4, Page 16

Worth a 15-minute call?

Jian Li
COO, Inferthermic

Inferthermic: Intelligence to buffer the entropy
Empirically proven 98% Risky Interaction Reduction (RIR)

Email 10 (from Alex):

Subject: Distributed Systems + ML Infrastructure

Dr. Park,

I read your paper on "Efficient Scheduling for Heterogeneous GPU Clusters" with genuine envy. The optimization techniques you developed for batching inference requests - that's exactly the problem keeping me up at night.

We're processing millions of LLM outputs daily for safety classification. Current latency: 47ms p95. Target: sub-20ms. Current cost: unsustainable at scale. Target: 10x reduction.

The catch: we can't just throw more hardware at it. We're a startup. We need someone who understands both the ML side (transformer architectures, attention mechanisms) and the systems side (CUDA optimization, distributed serving, memory management).

I know you're at Google now. I know the compensation is good. But if you've ever wanted to own the entire technical stack, make architectural decisions that actually ship, and solve problems that matter for AI safety...

I'd love to buy you coffee and make my case.

Alex Vukovic
CEO, Inferthermic

Inferthermic: Intelligence to buffer the entropy
Empirically proven 98% Risky Interaction Reduction (RIR)

Email 11 (from Alex):

Subject: From Lab to Production

Professor Nakamura,

I've been following your work on mechanistic interpretability since your NeurIPS keynote. The visualization techniques you developed for attention head analysis - they're elegant. They're also sitting in a repository that maybe a hundred researchers have seen.