Offline Throughput Benchmark Summary

backend                     req/s  output tok/s  total tok/s  total latency (s)
roseinfer                  201.63      12904.30     64521.52              0.635
roseinfer (+fast BT sync)  199.79      12786.27     63931.33              0.641
roseinfer (in-proc)        204.11      13062.83     65314.13              0.627
SGLang                     243.20      15564.48     77822.40              0.526
TensorRT-LLM               248.69      15916.24     79581.21              0.515
vLLM                       140.44       8988.14     44940.70              0.911
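
The columns above can be cross-checked against each other with a little arithmetic. The sketch below is only an illustration: it copies the table rows verbatim and derives, per backend, the request throughput relative to the plain roseinfer row, the tokens per request implied by the tok/s and req/s columns, and the request count implied by req/s times total latency. The `ROWS` and `derive` names are hypothetical, and choosing roseinfer as the baseline is an assumption made for the comparison. The summary does not state the workload shape, so the implied values are inferences from the ratios, not reported settings.

```python
# Rows copied from the summary table:
# (backend, req/s, output tok/s, total tok/s, total latency in seconds)
ROWS = [
    ("roseinfer", 201.63, 12904.30, 64521.52, 0.635),
    ("roseinfer (+fast BT sync)", 199.79, 12786.27, 63931.33, 0.641),
    ("roseinfer (in-proc)", 204.11, 13062.83, 65314.13, 0.627),
    ("SGLang", 243.20, 15564.48, 77822.40, 0.526),
    ("TensorRT-LLM", 248.69, 15916.24, 79581.21, 0.515),
    ("vLLM", 140.44, 8988.14, 44940.70, 0.911),
]


def derive(rows, baseline="roseinfer"):
    """Return per-backend ratios derived purely from the table columns.

    The baseline name is an assumption made for this comparison.
    """
    base_req_s = next(r[1] for r in rows if r[0] == baseline)
    derived = {}
    for name, req_s, out_tok_s, total_tok_s, latency_s in rows:
        derived[name] = {
            # Request throughput relative to the baseline row.
            "speedup": req_s / base_req_s,
            # Tokens per request implied by the throughput columns.
            "out_tok_per_req": out_tok_s / req_s,
            "total_tok_per_req": total_tok_s / req_s,
            # Request count implied by throughput times wall-clock time.
            "implied_requests": req_s * latency_s,
        }
    return derived


derived = derive(ROWS)
for name, d in derived.items():
    print(
        f"{name:<26}"
        f"{d['speedup']:6.2f}x"
        f"{d['out_tok_per_req']:8.1f}"
        f"{d['total_tok_per_req']:8.1f}"
        f"{d['implied_requests']:8.1f}"
    )
```

If every row yields roughly the same implied tokens per request and request count, the backends were most likely measured on the same workload, which is what makes the req/s comparison meaningful.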