AsianFin – AI firm DeepSeek has officially launched an account on Zhihu, where it published a technical article titled "DeepSeek-V3/R1 Inference System Overview," disclosing key details on model optimization and profitability for the first time.
The article outlines the optimization goals for the DeepSeek-V3/R1 inference system: higher throughput and lower latency. To achieve these, DeepSeek has adopted large-scale cross-node Expert Parallelism (EP), which enhances efficiency but also increases system complexity. The post explains techniques such as expanding batch sizes, minimizing data transfer latency, and balancing system loads.
In a notable first, DeepSeek also disclosed theoretical cost and profit estimates. Assuming a GPU rental cost of $2 per hour, the company estimates a total operational cost of $87,072 per day. If all tokens were billed at DeepSeek-R1's current rate, the theoretical daily revenue would reach $562,027. That implies a theoretical daily profit of $474,955, or a cost-profit margin (profit divided by cost) of 545%.
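The arithmetic behind these figures can be checked with a short sketch. Only the $2 hourly GPU rate, the daily cost, and the daily revenue come from the article; the variable names are illustrative, and the GPU-hour and average-GPU numbers are derived here on the assumption that the daily cost consists entirely of GPU rental.

```python
# Figures reported in the article.
GPU_COST_PER_HOUR = 2.0      # assumed rental rate, USD per GPU-hour
DAILY_COST = 87_072.0        # USD per day
DAILY_REVENUE = 562_027.0    # theoretical USD per day at R1 pricing

# Profit and cost-profit margin (profit divided by cost).
daily_profit = DAILY_REVENUE - DAILY_COST
margin_pct = daily_profit / DAILY_COST * 100

# Derived values, assuming the whole daily cost is GPU rental.
gpu_hours_per_day = DAILY_COST / GPU_COST_PER_HOUR
avg_gpus_in_use = gpu_hours_per_day / 24

print(f"Daily profit:       ${daily_profit:,.0f}")
print(f"Cost-profit margin: {margin_pct:.0f}%")
print(f"GPU-hours per day:  {gpu_hours_per_day:,.0f}")
print(f"Average GPUs used:  {avg_gpus_in_use:,.0f}")
```

Under that assumption, the reported cost corresponds to 43,536 GPU-hours per day, or an average of roughly 1,814 GPUs in use around the clock.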
