diff --git a/index.html b/index.html
index 04bf194..1b64e09 100644
--- a/index.html
+++ b/index.html
@@ -741,7 +741,7 @@

 Metric
-Final Sum Score ↓
+Final Avg Score ↓
 VideoFeedback-test
 EvalCrafter
 GenAI-Bench
@@ -750,60 +750,60 @@

-MantisScore (reg) | 278.3 | 75.7 | 51.1 | 78.5 | 73.0
+MantisScore (reg) | 69.6 | 75.7 | 51.1 | 78.5 | 73.0
-MantisScore-(gen) | 222.4 | 77.1 | 27.6 | 59.0 | 58.7
+MantisScore-(gen) | 55.6 | 77.1 | 27.6 | 59.0 | 58.7
-Gemini-1.5-Pro | 158.8 | 22.1 | 22.9 | 60.9 | 52.9
+Gemini-1.5-Pro | 39.7 | 22.1 | 22.9 | 60.9 | 52.9
-Gemini-1.5-Flash | 157.5 | 20.8 | 17.3 | 67.1 | 52.3
+Gemini-1.5-Flash | 39.4 | 20.8 | 17.3 | 67.1 | 52.3
-GPT-4o | 155.4 | 23.1 | 28.7 | 52.0 | 51.7
+GPT-4o | 38.9 | 23.1 | 28.7 | 52.0 | 51.7
-CLIP-sim | 126.8 | 8.9 | 36.2 | 34.2 | 47.4
+CLIP-sim | 31.7 | 8.9 | 36.2 | 34.2 | 47.4
-DINO-sim | 121.3 | 7.5 | 32.1 | 38.5 | 43.3
+DINO-sim | 30.3 | 7.5 | 32.1 | 38.5 | 43.3
-SSIM-sim | 118.0 | 13.4 | 26.9 | 34.1 | 43.5
+SSIM-sim | 29.5 | 13.4 | 26.9 | 34.1 | 43.5
-CLIP-Score | 114.4 | -7.2 | 21.7 | 45.0 | 54.9
+CLIP-Score | 28.6 | -7.2 | 21.7 | 45.0 | 54.9
-LLaVA-1.5-7B | 108.3 | 8.5 | 10.5 | 49.9 | 39.4
+LLaVA-1.5-7B | 27.1 | 8.5 | 10.5 | 49.9 | 39.4
-LLaVA-1.6-7B | 93.3 | -3.1 | 13.2 | 44.5 | 38.7
+LLaVA-1.6-7B | 23.3 | -3.1 | 13.2 | 44.5 | 38.7
-X-CLIP-Score | 92.9 | -1.9 | 13.3 | 41.4 | 40.1
+X-CLIP-Score | 23.2 | -1.9 | 13.3 | 41.4 | 40.1
-PIQE | 78.3 | -10.1 | -1.2 | 34.5 | 55.1
+PIQE | 19.6 | -10.1 | -1.2 | 34.5 | 55.1
-BRISQUE | 75.9 | -20.3 | 3.9 | 38.5 | 53.7
+BRISQUE | 19.0 | -20.3 | 3.9 | 38.5 | 53.7
-Idefics1 | 73.0 | 6.5 | 0.3 | 34.6 | 31.7
+Idefics1 | 18.3 | 6.5 | 0.3 | 34.6 | 31.7
-MSE-dyn | 42.5 | -5.5 | -17.0 | 28.4 | 36.5
+MSE-dyn | 10.6 | -5.5 | -17.0 | 28.4 | 36.5
-SSIM-dyn | 36.7 | -12.9 | -26.4 | 31.4 | 44.5
+SSIM-dyn | 9.2 | -12.9 | -26.4 | 31.4 | 44.5
-
+

 The best MantisScore is in bold and the best in baselines is underlined.
-"-" means the answer of MLLM is meaningless or in wrong format.
+
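
The change from "Final Sum Score" to "Final Avg Score" appears consistent with dividing each old sum by the number of benchmark scores in the row. The sketch below checks that arithmetic for a few rows. It assumes four benchmark scores per row (each row shows four numeric cells after the final score, though only three benchmark headers are visible in the hunk) and half-up rounding to one decimal. The names `sum_to_avg` and `NUM_BENCHMARKS` are hypothetical and not part of the repository.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: every row carries four per-benchmark scores.
NUM_BENCHMARKS = 4


def sum_to_avg(final_sum: str) -> Decimal:
    """Convert a displayed sum score into an average, rounded half-up to one decimal."""
    avg = Decimal(final_sum) / NUM_BENCHMARKS
    return avg.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Old "Final Sum Score" values copied from the removed lines of the diff.
old_sums = {
    "MantisScore (reg)": "278.3",
    "GPT-4o": "155.4",
    "SSIM-dyn": "36.7",
}

for name, old_sum in old_sums.items():
    print(f"{name}: {old_sum} -> {sum_to_avg(old_sum)}")
```

Under these assumptions the three rows come out as 69.6, 38.9, and 9.2, matching the added lines. `Decimal` is used rather than `float` so that ties such as 38.85 round predictably.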