Performance Testing
Qwen2.5-VL Performance Testing
Environment Setup
# vllm
Run:
bash benchmarks/scripts/run-performance-benchmarks.sh
Benchmark Results
Before (before removing any layers):
============ Serving Benchmark Result ============
Before (no optimization, direct convolution):
============ Serving Benchmark Result ============
After (replacing only the convolution):
============ Serving Benchmark Result ============
After (replacing the entire model):
============ Serving Benchmark Result ============
Error:
AttributeError: 'AscendRMSNorm' object has no attribute 'next_need_quant_fusion_linear'
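The traceback above says that some code read `next_need_quant_fusion_linear` from a norm layer that never had that attribute set. The following is only a minimal sketch of that failure mode and of one defensive way to read such a flag. `StubRMSNorm` and `needs_quant_fusion` are hypothetical names made up for illustration; this is not the real `AscendRMSNorm`, and treating a missing flag as `False` is an assumption rather than a confirmed fix.

```python
class StubRMSNorm:
    """Hypothetical stand-in for a norm layer (NOT the real AscendRMSNorm)."""

    def __init__(self, hidden_size: int) -> None:
        self.hidden_size = hidden_size
        # Deliberately no `next_need_quant_fusion_linear` attribute here.


def needs_quant_fusion(layer: object) -> bool:
    """Read the flag defensively; a missing attribute is assumed to mean False."""
    return bool(getattr(layer, "next_need_quant_fusion_linear", False))


layer = StubRMSNorm(hidden_size=8)

# Direct attribute access reproduces the same class of error as the traceback.
message = ""
try:
    layer.next_need_quant_fusion_linear
except AttributeError as exc:
    message = str(exc)

# The defensive read does not raise and falls back to False.
fallback = needs_quant_fusion(layer)
```

Whether a silent fallback is appropriate depends on why the attribute is missing. If the replaced model skips an initialization step that normally sets the flag, restoring that step is likely the better fix.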
Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit xhj的博客 as the source when reposting.