Offline Throughput Benchmark Summary

backend                    req/s  output tok/s  total tok/s  total latency (s)
roseinfer                 201.13      12872.49     64362.44              0.636
roseinfer (in-proc)       204.11      13062.83     65314.13              0.627
roseinfer (+kv256)          0.00          0.00         0.00              8.936
roseinfer (-affinity)     327.87      20983.52    104917.61              0.390
roseinfer (-batch send)   328.31      21011.93    105059.64              0.390
roseinfer (-cmd budg)     327.52      20961.08    104805.40              0.391
roseinfer (-fill tgt)     327.34      20950.03    104750.13              0.391
roseinfer (-thr cap)      327.30      20947.42    104737.08              0.391
roseinfer (+queue ipc)    326.87      20919.91    104599.53              0.392
roseinfer (-fast cnt)     326.57      20900.67    104503.36              0.392
roseinfer (+stream tok)   206.20      13196.70     65983.50              0.621
roseinfer (+warmup cg16)  202.47      12957.92     64789.61              0.632
SGLang                    238.44      15259.99     76299.94              0.537
TensorRT-LLM                0.00          0.00         0.00             23.099
vLLM                      139.63       8936.27     44681.36              0.917
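
The columns appear to be related by simple arithmetic, which can serve as a sanity check on the numbers. The sketch below is a minimal illustration, not part of the benchmark harness: it assumes that req/s times total latency gives the request count, and that dividing the token rates by req/s gives tokens per request. The workload shape it recovers is an inference from the figures, not something the summary states. The helper names parse_row and derive are hypothetical. Rows reporting 0.00 throughput are skipped because the ratios are undefined; the summary does not say why those runs produced no throughput.

```python
# A few rows copied verbatim from the summary (backend name, then four numbers).
ROWS = """\
roseinfer 201.13 12872.49 64362.44 0.636
roseinfer (-affinity) 327.87 20983.52 104917.61 0.390
SGLang 238.44 15259.99 76299.94 0.537
vLLM 139.63 8936.27 44681.36 0.917
TensorRT-LLM 0.00 0.00 0.00 23.099
"""


def parse_row(line):
    """Split a row into (name, req/s, output tok/s, total tok/s, latency s).

    Backend names may contain spaces, so split from the right: the last
    four whitespace-separated fields are always the numeric columns.
    """
    name, *nums = line.rsplit(None, 4)
    req_s, out_tok_s, total_tok_s, latency_s = map(float, nums)
    return name, req_s, out_tok_s, total_tok_s, latency_s


def derive(line):
    """Return (name, derived quantities), or (name, None) for zero-throughput rows."""
    name, req_s, out_tok_s, total_tok_s, latency_s = parse_row(line)
    if req_s == 0.0:
        return name, None  # ratios are undefined without any throughput
    return name, {
        # Assumption: the run is one batch, so requests = rate * wall time.
        "requests": round(req_s * latency_s),
        "out_tok_per_req": round(out_tok_s / req_s),
        "total_tok_per_req": round(total_tok_s / req_s),
    }


derived = dict(derive(line) for line in ROWS.splitlines())
for name, values in derived.items():
    print(name, values)
```

For the rows with non-zero throughput, the figures are consistent with 128 requests of 64 output tokens and 320 total tokens each. That would imply 256 input tokens per request if "total" means input plus output, which is again an assumption.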