Articulated Generation Leaderboard v0.1

Offline portal generated from leaderboard_v0.1.json.

Generated: 2026-05-09T03:21:13+00:00
Status: not_yet_reliable
Important: A/B/C/D are grid-position labels from the PhysX-Anything judge prompt, not verified method names.

Reliable Single-Pipeline Checkpoint

Existing A/B/C/D rank outputs are grid positions in the PhysX-Anything judge prompt, not verified method names. The current benchmark_light20 videos appear to be single-method renders judged with a grid-style schema.

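Metric        Value    Note
Scale Error   32.250   lower is better
Affordance     9.699   proxy score (PSNR-like, higher is better)
Material       8.994   proxy score (PSNR-like, higher is better)
Description   13.040   proxy score (PSNR-like, higher is better)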
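The checkpoint metrics mix two directions: scale is an error, while the other three are PSNR-like scores according to the limitations recorded in the embedded JSON. A minimal sketch of rendering them with an explicit direction label follows. The helper name `format_metric` and the `LOWER_IS_BETTER` set are illustrative assumptions, not part of the portal generator.

```python
# Metrics whose value is an error (lower is better). Per the "limitations"
# note in the embedded JSON, only scale falls in this group.
LOWER_IS_BETTER = {"scale"}


def format_metric(name, value):
    """Render one metric with three decimals and its direction."""
    direction = "lower is better" if name in LOWER_IS_BETTER else "higher is better"
    return f"{name}: {value:.3f} ({direction})"


# Values copied from reliable_results[0]["metrics"] in the embedded JSON.
metrics = {
    "scale": 32.25,
    "affordance": 9.698711012010476,
    "material": 8.993700337231624,
    "description": 13.040154636356394,
}

lines = [format_metric(name, value) for name, value in metrics.items()]
for line in lines:
    print(line)
```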

Grid-Position Results

Lower rank is better. These rows preserve judge output without treating grid labels as method names.
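For readers auditing the columns in this section, the sketch below shows one plausible way the mean rank and Rank-1 count are derived from per-sample ranks for a single grid label. The per-sample JSON schema is not shown in this report, so the input is a plain list of integer ranks, and both the function name `summarize_ranks` and the example values are hypothetical.

```python
from statistics import mean


def summarize_ranks(ranks):
    """Return (mean rank rounded to 3 decimals, number of rank-1 placements)."""
    if not ranks:
        raise ValueError("no valid samples for this grid label")
    rank1_count = sum(1 for r in ranks if r == 1)
    return round(mean(ranks), 3), rank1_count


# Hypothetical ranks for one grid label across eight samples (not real data).
example_ranks = [1, 1, 2, 1, 3, 1, 2, 1]
mean_rank, rank1_count = summarize_ranks(example_ranks)
print(mean_rank, rank1_count)
```

In the benchmark_light20 block the Rank-1 counts of the four labels add up to far more than the 20 samples, so the judge output evidently allows several labels to share rank 1 on the same sample.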

evaluation_kine_results

/data/250010098/PhysX-Anything/evaluation_kine_results

samples 16
JSON files: 17 | invalid/unparseable: 3.json
Grid Label   n    Mean Geometry Rank   Mean Motion Rank   Geometry Rank-1   Motion Rank-1
A            16   1.750                1.750              9                 9
B            16   1.875                1.875              8                 8
C            16   1.625                1.625              9                 9
D            16   1.938                1.938              8                 8

evaluation_kine_results_benchmark_light20

/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_light20

samples 20
JSON files: 20 | invalid/unparseable: none
Grid Label   n    Mean Geometry Rank   Mean Motion Rank   Geometry Rank-1   Motion Rank-1
A            20   1.150                1.150              19                19
B            20   1.150                1.150              18                18
C            20   1.150                1.150              18                18
D            20   1.200                1.200              17                17

evaluation_kine_results_fixed

/data/250010098/PhysX-Anything/evaluation_kine_results_fixed

samples 17
JSON files: 17 | invalid/unparseable: none
Grid Label   n    Mean Geometry Rank   Mean Motion Rank   Geometry Rank-1   Motion Rank-1
A            17   1.529                1.529              10                10
B            17   2.000                2.000              4                 4
C            17   2.529                2.529              4                 4
D            17   3.000                3.000              4                 4

evaluation_kine_results_rerun

/data/250010098/PhysX-Anything/evaluation_kine_results_rerun

samples 15
JSON files: 17 | invalid/unparseable: 5.json, 7.json
Grid Label   n    Mean Geometry Rank   Mean Motion Rank   Geometry Rank-1   Motion Rank-1
A            15   1.467                1.467              11                10
B            15   1.667                1.667              10                9
C            15   1.800                1.867              8                 7
D            15   1.933                1.933              9                 9
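Each block above reports a sample count, a JSON file count, and a list of unparseable files. A quick audit is to check that the sample count equals the file count minus the unparseable files. The sketch below assumes that relationship, which holds for all four blocks in this report but is not stated by the generator; the key names follow `grid_position_unknown_results` in the embedded JSON, and `find_count_mismatches` is a hypothetical helper.

```python
def find_count_mismatches(blocks):
    """Return paths whose sample_count != json_file_count - number of invalid files."""
    mismatches = []
    for block in blocks:
        expected = block["json_file_count"] - len(block["invalid_or_unparseable_files"])
        if block["sample_count"] != expected:
            mismatches.append(block["path"])
    return mismatches


# Counts copied from the four blocks above (directory names only, for brevity).
blocks = [
    {"path": "evaluation_kine_results", "sample_count": 16,
     "json_file_count": 17, "invalid_or_unparseable_files": ["3.json"]},
    {"path": "evaluation_kine_results_benchmark_light20", "sample_count": 20,
     "json_file_count": 20, "invalid_or_unparseable_files": []},
    {"path": "evaluation_kine_results_fixed", "sample_count": 17,
     "json_file_count": 17, "invalid_or_unparseable_files": []},
    {"path": "evaluation_kine_results_rerun", "sample_count": 15,
     "json_file_count": 17, "invalid_or_unparseable_files": ["5.json", "7.json"]},
]

mismatches = find_count_mismatches(blocks)
print(mismatches)
```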

Evaluation Directory Inventory

Status                  Directory                                              JSON  Raw TXT  TXT  MP4
rank_json_available     evaluation_kine_results                                17    0        0    0
empty_or_not_completed  evaluation_kine_results_benchmark_completed123         0     0        0    0
empty_or_not_completed  evaluation_kine_results_benchmark_full243              0     0        0    0
rank_json_available     evaluation_kine_results_benchmark_light20              20    20       20   0
empty_or_not_completed  evaluation_kine_results_benchmark_sharded              0     0        0    0
empty_or_not_completed  evaluation_kine_results_benchmark_sharded_diag         0     0        0    0
rank_json_available     evaluation_kine_results_fixed                          17    17       17   0
rank_json_available     evaluation_kine_results_rerun                          17    0        0    0
empty_or_not_completed  evaluation_phy_results_benchmark_completed123          0     0        0    0
empty_or_not_completed  evaluation_phy_results_benchmark_full243               0     0        0    0
text_only               evaluation_phy_results_benchmark_light20               0     0        1    0
empty_or_not_completed  evaluation_phy_results_benchmark_sharded               0     0        0    0
empty_or_not_completed  evaluation_phy_results_benchmark_sharded_diag          0     0        0    0
video_only              evaluation_video                                       0     0        0    1
video_only              evaluation_video_physxanything                         0     0        0    17
empty_or_not_completed  evaluation_video_physxanything_benchmark_completed123  0     0        0    0
empty_or_not_completed  evaluation_video_physxanything_benchmark_full243       0     0        0    0
video_only              evaluation_video_physxanything_benchmark_light20       0     0        0    20
empty_or_not_completed  evaluation_video_physxanything_benchmark_sharded       0     0        0    0
empty_or_not_completed  evaluation_video_physxanything_benchmark_sharded_diag  0     0        0    0
video_only              evaluation_video_physxanything_rerun                   0     0        0    17
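The status column in the inventory appears to follow directly from the four file counts. The sketch below is a rule inferred from the rows of this report, not the generator's actual logic: any JSON gives `rank_json_available`, otherwise any text file gives `text_only`, otherwise any MP4 gives `video_only`, and everything else is `empty_or_not_completed`. The function name `classify_dir` and the precedence order are assumptions.

```python
def classify_dir(json_count, raw_txt_count, txt_count, mp4_count):
    """Assign an inventory status from file counts (inferred rule, see lead-in)."""
    if json_count > 0:
        return "rank_json_available"
    if txt_count > 0 or raw_txt_count > 0:
        return "text_only"
    if mp4_count > 0:
        return "video_only"
    return "empty_or_not_completed"


# Spot checks against rows of the inventory: (JSON, Raw TXT, TXT, MP4).
rows = {
    "evaluation_kine_results_benchmark_light20": (20, 20, 20, 0),
    "evaluation_phy_results_benchmark_light20": (0, 0, 1, 0),
    "evaluation_video_physxanything_rerun": (0, 0, 0, 17),
    "evaluation_kine_results_benchmark_full243": (0, 0, 0, 0),
}
statuses = {name: classify_dir(*counts) for name, counts in rows.items()}
for name, status in statuses.items():
    print(f"{status:24}{name}")
```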

Uncertainty Notes

  • Do not describe A/B/C/D as methods.
  • Rank JSON directories are useful for auditing judge behavior, but are not method leaderboard entries.
  • Full243/sharded benchmark directories contain no completed rank JSON in this workspace snapshot.

Required Before Method Ranking

  • Recover or create a montage manifest mapping grid positions to real method names.
  • Separate native/generated/manually cleaned assets according to the v0.1 metric rules.
  • Keep failed samples in reports with failure cause and repair status.
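To make the first requirement concrete, the sketch below shows what a montage manifest could look like and how it would turn grid-position ranks into per-method ranks. Everything here is hypothetical: the manifest layout, the function `relabel_by_method`, and the placeholder names `method_x` and `method_y` are illustrations only, since no such manifest exists in this workspace snapshot. Samples without a manifest entry are reported rather than dropped, in line with the third requirement.

```python
from collections import defaultdict


def relabel_by_method(per_sample_ranks, manifest):
    """Group ranks by real method name using a grid-position manifest.

    per_sample_ranks: {sample_id: {grid_label: rank}}
    manifest:         {sample_id: {grid_label: method_name}}
    Returns (ranks_by_method, unmapped_sample_ids).
    """
    ranks_by_method = defaultdict(list)
    unmapped = []
    for sample_id, ranks in per_sample_ranks.items():
        mapping = manifest.get(sample_id)
        if mapping is None:
            unmapped.append(sample_id)  # keep for the failure report
            continue
        for label, rank in ranks.items():
            ranks_by_method[mapping[label]].append(rank)
    return dict(ranks_by_method), unmapped


# Hypothetical inputs: grid positions may be shuffled per sample.
manifest = {
    "0": {"A": "method_x", "B": "method_y"},
    "1": {"A": "method_y", "B": "method_x"},
}
per_sample_ranks = {
    "0": {"A": 1, "B": 2},
    "1": {"A": 2, "B": 1},
    "2": {"A": 1, "B": 2},  # no manifest entry, so it cannot be attributed
}

by_method, unmapped = relabel_by_method(per_sample_ranks, manifest)
print(by_method, unmapped)
```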

Embedded Source JSON

{
  "leaderboard_id": "articulated_generation_leaderboard_v0.1",
  "benchmark_id": "articulated_generation_v0.1",
  "generated_at_utc": "2026-05-09T03:21:13+00:00",
  "sources": {
    "metrics_spec": "/data/250010098/Unified_dataset/articulated_generation_benchmark/metrics/articulated_generation_metrics_v0.1.json",
    "reproduced_summary_json": "/data/250010098/Unified_dataset/articulated_generation_benchmark/experiments/reproduced_method_eval_summary.json",
    "reproduced_summary_md": "/data/250010098/Unified_dataset/articulated_generation_benchmark/experiments/reproduced_method_eval_summary.md",
    "physx_evaluation_root": "/data/250010098/PhysX-Anything"
  },
  "decision": {
    "method_leaderboard_status": "not_yet_reliable",
    "reason": "Existing A/B/C/D rank outputs are grid positions in the PhysX-Anything judge prompt, not verified method names. The current benchmark_light20 videos appear to be single-method renders judged with a grid-style schema.",
    "required_before_method_ranking": [
      "Recover or create a montage manifest mapping grid positions to real method names.",
      "Separate native/generated/manually cleaned assets according to the v0.1 metric rules.",
      "Keep failed samples in reports with failure cause and repair status."
    ]
  },
  "reliable_results": [
    {
      "entry_id": "physxanything_pipeline_benchmark_light20_physics_proxy",
      "display_name": "PhysX-Anything pipeline, benchmark_light20 physics proxy",
      "split": "benchmark_light20",
      "sample_count_inferred_from_video_dir": 20,
      "metric_source": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_light20/phy_stdout.txt",
      "metrics": {
        "scale": 32.25,
        "affordance": 9.698711012010476,
        "material": 8.993700337231624,
        "description": 13.040154636356394
      },
      "confidence": "limited",
      "limitations": [
        "The stdout contains aggregate scalar proxy metrics, not per-sample JSON.",
        "Lower scale error is better; affordance/material/description are PSNR-like scores where higher is better.",
        "This is a single-pipeline result and is not a method-vs-method leaderboard."
      ]
    }
  ],
  "grid_position_unknown_results": [
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results",
      "sample_count": 16,
      "json_file_count": 17,
      "invalid_or_unparseable_files": [
        "3.json"
      ],
      "label_status": "grid-position/unknown",
      "methods": {
        "A": {
          "n": 16,
          "mean_geometry_rank": 1.75,
          "mean_motion_rank": 1.75,
          "geometry_rank1_count": 9,
          "motion_rank1_count": 9
        },
        "B": {
          "n": 16,
          "mean_geometry_rank": 1.875,
          "mean_motion_rank": 1.875,
          "geometry_rank1_count": 8,
          "motion_rank1_count": 8
        },
        "C": {
          "n": 16,
          "mean_geometry_rank": 1.625,
          "mean_motion_rank": 1.625,
          "geometry_rank1_count": 9,
          "motion_rank1_count": 9
        },
        "D": {
          "n": 16,
          "mean_geometry_rank": 1.9375,
          "mean_motion_rank": 1.9375,
          "geometry_rank1_count": 8,
          "motion_rank1_count": 8
        }
      }
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_light20",
      "sample_count": 20,
      "json_file_count": 20,
      "invalid_or_unparseable_files": [],
      "label_status": "grid-position/unknown",
      "methods": {
        "A": {
          "n": 20,
          "mean_geometry_rank": 1.15,
          "mean_motion_rank": 1.15,
          "geometry_rank1_count": 19,
          "motion_rank1_count": 19
        },
        "B": {
          "n": 20,
          "mean_geometry_rank": 1.15,
          "mean_motion_rank": 1.15,
          "geometry_rank1_count": 18,
          "motion_rank1_count": 18
        },
        "C": {
          "n": 20,
          "mean_geometry_rank": 1.15,
          "mean_motion_rank": 1.15,
          "geometry_rank1_count": 18,
          "motion_rank1_count": 18
        },
        "D": {
          "n": 20,
          "mean_geometry_rank": 1.2,
          "mean_motion_rank": 1.2,
          "geometry_rank1_count": 17,
          "motion_rank1_count": 17
        }
      }
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_fixed",
      "sample_count": 17,
      "json_file_count": 17,
      "invalid_or_unparseable_files": [],
      "label_status": "grid-position/unknown",
      "methods": {
        "A": {
          "n": 17,
          "mean_geometry_rank": 1.5294117647058822,
          "mean_motion_rank": 1.5294117647058822,
          "geometry_rank1_count": 10,
          "motion_rank1_count": 10
        },
        "B": {
          "n": 17,
          "mean_geometry_rank": 2,
          "mean_motion_rank": 2,
          "geometry_rank1_count": 4,
          "motion_rank1_count": 4
        },
        "C": {
          "n": 17,
          "mean_geometry_rank": 2.5294117647058822,
          "mean_motion_rank": 2.5294117647058822,
          "geometry_rank1_count": 4,
          "motion_rank1_count": 4
        },
        "D": {
          "n": 17,
          "mean_geometry_rank": 3,
          "mean_motion_rank": 3,
          "geometry_rank1_count": 4,
          "motion_rank1_count": 4
        }
      }
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_rerun",
      "sample_count": 15,
      "json_file_count": 17,
      "invalid_or_unparseable_files": [
        "5.json",
        "7.json"
      ],
      "label_status": "grid-position/unknown",
      "methods": {
        "A": {
          "n": 15,
          "mean_geometry_rank": 1.4666666666666666,
          "mean_motion_rank": 1.4666666666666666,
          "geometry_rank1_count": 11,
          "motion_rank1_count": 10
        },
        "B": {
          "n": 15,
          "mean_geometry_rank": 1.6666666666666667,
          "mean_motion_rank": 1.6666666666666667,
          "geometry_rank1_count": 10,
          "motion_rank1_count": 9
        },
        "C": {
          "n": 15,
          "mean_geometry_rank": 1.8,
          "mean_motion_rank": 1.8666666666666667,
          "geometry_rank1_count": 8,
          "motion_rank1_count": 7
        },
        "D": {
          "n": 15,
          "mean_geometry_rank": 1.9333333333333333,
          "mean_motion_rank": 1.9333333333333333,
          "geometry_rank1_count": 9,
          "motion_rank1_count": 9
        }
      }
    }
  ],
  "existing_summary_snapshot": {
    "metric_source": "existing VLM/manual rank JSON files; lower rank is better",
    "method_label_warning": "A/B/C/D are grid-position labels in PhysX-Anything evaluation prompts, not verified method names. Current benchmark_light20 videos appear to be single-method renders passed through a grid-style judge, so these ranks must not be interpreted as method comparisons until a montage manifest or method-position mapping is recovered.",
    "evaluations": [
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_light20",
        "sample_count": 20,
        "methods": {
          "A": {
            "n": 20,
            "mean_geometry_rank": 1.15,
            "mean_motion_rank": 1.15,
            "geometry_rank1_count": 19,
            "motion_rank1_count": 19
          },
          "B": {
            "n": 20,
            "mean_geometry_rank": 1.15,
            "mean_motion_rank": 1.15,
            "geometry_rank1_count": 18,
            "motion_rank1_count": 18
          },
          "C": {
            "n": 20,
            "mean_geometry_rank": 1.15,
            "mean_motion_rank": 1.15,
            "geometry_rank1_count": 18,
            "motion_rank1_count": 18
          },
          "D": {
            "n": 20,
            "mean_geometry_rank": 1.2,
            "mean_motion_rank": 1.2,
            "geometry_rank1_count": 17,
            "motion_rank1_count": 17
          }
        }
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_fixed",
        "sample_count": 17,
        "methods": {
          "A": {
            "n": 17,
            "mean_geometry_rank": 1.5294117647058822,
            "mean_motion_rank": 1.5294117647058822,
            "geometry_rank1_count": 10,
            "motion_rank1_count": 10
          },
          "B": {
            "n": 17,
            "mean_geometry_rank": 2.0,
            "mean_motion_rank": 2.0,
            "geometry_rank1_count": 4,
            "motion_rank1_count": 4
          },
          "C": {
            "n": 17,
            "mean_geometry_rank": 2.5294117647058822,
            "mean_motion_rank": 2.5294117647058822,
            "geometry_rank1_count": 4,
            "motion_rank1_count": 4
          },
          "D": {
            "n": 17,
            "mean_geometry_rank": 3.0,
            "mean_motion_rank": 3.0,
            "geometry_rank1_count": 4,
            "motion_rank1_count": 4
          }
        }
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results",
        "sample_count": 15,
        "methods": {
          "A": {
            "n": 15,
            "mean_geometry_rank": 1.6,
            "mean_motion_rank": 1.6,
            "geometry_rank1_count": 8,
            "motion_rank1_count": 8
          },
          "B": {
            "n": 15,
            "mean_geometry_rank": 1.7333333333333334,
            "mean_motion_rank": 1.7333333333333334,
            "geometry_rank1_count": 7,
            "motion_rank1_count": 7
          },
          "C": {
            "n": 15,
            "mean_geometry_rank": 1.5333333333333334,
            "mean_motion_rank": 1.5333333333333334,
            "geometry_rank1_count": 8,
            "motion_rank1_count": 8
          },
          "D": {
            "n": 15,
            "mean_geometry_rank": 1.9333333333333333,
            "mean_motion_rank": 1.9333333333333333,
            "geometry_rank1_count": 6,
            "motion_rank1_count": 6
          }
        }
      }
    ]
  },
  "evaluation_dir_inventory": [
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results",
      "name": "evaluation_kine_results",
      "json_count": 17,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_completed123",
      "name": "evaluation_kine_results_benchmark_completed123",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_full243",
      "name": "evaluation_kine_results_benchmark_full243",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_light20",
      "name": "evaluation_kine_results_benchmark_light20",
      "json_count": 20,
      "raw_txt_count": 20,
      "txt_count": 20,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_sharded",
      "name": "evaluation_kine_results_benchmark_sharded",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_sharded_diag",
      "name": "evaluation_kine_results_benchmark_sharded_diag",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_fixed",
      "name": "evaluation_kine_results_fixed",
      "json_count": 17,
      "raw_txt_count": 17,
      "txt_count": 17,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_rerun",
      "name": "evaluation_kine_results_rerun",
      "json_count": 17,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_completed123",
      "name": "evaluation_phy_results_benchmark_completed123",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_full243",
      "name": "evaluation_phy_results_benchmark_full243",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_light20",
      "name": "evaluation_phy_results_benchmark_light20",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 1,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_sharded",
      "name": "evaluation_phy_results_benchmark_sharded",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_sharded_diag",
      "name": "evaluation_phy_results_benchmark_sharded_diag",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video",
      "name": "evaluation_video",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 1
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything",
      "name": "evaluation_video_physxanything",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 17
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_completed123",
      "name": "evaluation_video_physxanything_benchmark_completed123",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_full243",
      "name": "evaluation_video_physxanything_benchmark_full243",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_light20",
      "name": "evaluation_video_physxanything_benchmark_light20",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 20
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_sharded",
      "name": "evaluation_video_physxanything_benchmark_sharded",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_sharded_diag",
      "name": "evaluation_video_physxanything_benchmark_sharded_diag",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 0
    },
    {
      "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_rerun",
      "name": "evaluation_video_physxanything_rerun",
      "json_count": 0,
      "raw_txt_count": 0,
      "txt_count": 0,
      "mp4_count": 17
    }
  ],
  "evaluation_dir_status": {
    "empty_or_not_completed": [
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_completed123",
        "name": "evaluation_kine_results_benchmark_completed123",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_full243",
        "name": "evaluation_kine_results_benchmark_full243",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_sharded",
        "name": "evaluation_kine_results_benchmark_sharded",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_sharded_diag",
        "name": "evaluation_kine_results_benchmark_sharded_diag",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_completed123",
        "name": "evaluation_phy_results_benchmark_completed123",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_full243",
        "name": "evaluation_phy_results_benchmark_full243",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_sharded",
        "name": "evaluation_phy_results_benchmark_sharded",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_sharded_diag",
        "name": "evaluation_phy_results_benchmark_sharded_diag",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_completed123",
        "name": "evaluation_video_physxanything_benchmark_completed123",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_full243",
        "name": "evaluation_video_physxanything_benchmark_full243",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_sharded",
        "name": "evaluation_video_physxanything_benchmark_sharded",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_sharded_diag",
        "name": "evaluation_video_physxanything_benchmark_sharded_diag",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      }
    ],
    "rank_json_available": [
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results",
        "name": "evaluation_kine_results",
        "json_count": 17,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_benchmark_light20",
        "name": "evaluation_kine_results_benchmark_light20",
        "json_count": 20,
        "raw_txt_count": 20,
        "txt_count": 20,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_fixed",
        "name": "evaluation_kine_results_fixed",
        "json_count": 17,
        "raw_txt_count": 17,
        "txt_count": 17,
        "mp4_count": 0
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_kine_results_rerun",
        "name": "evaluation_kine_results_rerun",
        "json_count": 17,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 0
      }
    ],
    "text_only": [
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_phy_results_benchmark_light20",
        "name": "evaluation_phy_results_benchmark_light20",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 1,
        "mp4_count": 0
      }
    ],
    "video_only": [
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video",
        "name": "evaluation_video",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 1
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything",
        "name": "evaluation_video_physxanything",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 17
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_benchmark_light20",
        "name": "evaluation_video_physxanything_benchmark_light20",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 20
      },
      {
        "path": "/data/250010098/PhysX-Anything/evaluation_video_physxanything_rerun",
        "name": "evaluation_video_physxanything_rerun",
        "json_count": 0,
        "raw_txt_count": 0,
        "txt_count": 0,
        "mp4_count": 17
      }
    ]
  },
  "uncertainty_notes": [
    "Do not describe A/B/C/D as methods.",
    "Rank JSON directories are useful for auditing judge behavior, but are not method leaderboard entries.",
    "Full243/sharded benchmark directories contain no completed rank JSON in this workspace snapshot."
  ]
}