PAct min evidence span2 render check

Dataset audit: True · samples: 5 · mean weighted score: 0.00

manual benchmark_index selection: 0, 4, 26, 73, 97

report.json · report.md

#00 ArtVIP / major_appliances

VLM card

score 0.0 · part MAE 7 · joint F1 0.00

skipped

#04 GAPartNet / small_appliances

VLM card

score 0.0 · part MAE 12 · joint F1 0.00

skipped

#26 GAPartNet / electronics

VLM card

score 0.0 · part MAE 21 · joint F1 0.00

skipped

#73 PartNetMobility / electronics

VLM card

score 0.0 · part MAE 31 · joint F1 0.00

skipped

#97 PartNetMobility / major_appliances

VLM card

score 0.0 · part MAE 6 · joint F1 0.00

skipped