Offline Throughput Benchmark Summary

backend                      req/s  output tok/s  total tok/s  total latency (s)
roseinfer                   184.02      11600.55     58710.07              0.696
roseinfer (prefill auto2)   136.00       8633.58     43448.40              0.941
roseinfer (in-proc)         122.96       7714.51     39191.16              1.041
SGLang                      239.60      15334.57     76672.85              0.534
TensorRT-LLM                250.68      15939.89     80114.66              0.511
vLLM                        141.90       9048.16     45373.82              0.902
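
The figures above are easier to compare once they are normalized. The sketch below is illustrative only: it copies the rows from the table, expresses each backend's total tok/s relative to the default roseinfer row, and multiplies req/s by total latency as a rough consistency check. That check assumes the latency column is the wall-clock time for the whole offline batch, which the summary does not state. The helper names (relative_throughput, implied_requests) are hypothetical and not part of any benchmark tool.

```python
# Rows copied from the table: (backend, req/s, output tok/s, total tok/s, total latency in s).
ROWS = [
    ("roseinfer", 184.02, 11600.55, 58710.07, 0.696),
    ("roseinfer (prefill auto2)", 136.00, 8633.58, 43448.40, 0.941),
    ("roseinfer (in-proc)", 122.96, 7714.51, 39191.16, 1.041),
    ("SGLang", 239.60, 15334.57, 76672.85, 0.534),
    ("TensorRT-LLM", 250.68, 15939.89, 80114.66, 0.511),
    ("vLLM", 141.90, 9048.16, 45373.82, 0.902),
]


def relative_throughput(rows, baseline="roseinfer"):
    """Return {backend: total tok/s divided by the baseline's total tok/s}."""
    base = next(r[3] for r in rows if r[0] == baseline)
    return {r[0]: r[3] / base for r in rows}


def implied_requests(rows):
    """Return {backend: req/s * total latency}.

    Assumption: total latency is the wall-clock time of the whole batch,
    so this product estimates how many requests each run served.
    """
    return {r[0]: r[1] * r[4] for r in rows}


if __name__ == "__main__":
    rel = relative_throughput(ROWS)
    req = implied_requests(ROWS)
    # Print backends from fastest to slowest by total token throughput.
    for name in sorted(rel, key=rel.get, reverse=True):
        print(f"{name:<28}{rel[name]:5.2f}x   ~{req[name]:.1f} requests")
```

If the implied request counts agree across rows, the runs plausibly used the same batch size, and the relative-throughput column can be read as a like-for-like comparison.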