PAct · experiment detail

2026-04-24 08:24:24 UTC

Gym Real Multi-Method Comparison

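Following the six-way comparison setup used for the four real images (chair, laptop, button, lock), this experiment runs PhysX-Anything, PartPacker, OmniPart, PAct, SINGAPO, and TRELLIS2 on gym_real.jpg. The image is a real multi-machine gym scene, so it concentrates the failure modes of single-object / single-instance assumptions: misrecognition, over-segmentation, and part merging.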

Tags: gym_real, multi-method, PhysX-Anything, PartPacker, OmniPart, PAct, SINGAPO, TRELLIS2, real-image

Timestamp: 2026-04-24 08:24:24 UTC

Assets: 6

Status: active


Assets


PhysX-Anything

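The official four-stage chain processes `gym_real.jpg` directly. The VLM misidentifies the multi-machine gym scene as `Cart / WheelingDevice` and generates a base plus two wheel-like movable groups. This is a typical failure of the single-object assumption on a complex real scene.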

Downloads: sample.glb, basic_info.json, articulated_blueprint.json, basic.urdf, basic.xml

Predicted parts: 3

Predicted joints: 2

Category: WheelingDevice

Figure: Input real image

PhysX-Anything official sample.glb

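`sample.glb`, `basic.urdf`, `basic.xml`, and the blueprint are retained. Once the semantic recognition drifts to a cart, the downstream physical structure follows and becomes a wheeled device.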
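The part and joint counts reported above can be cross-checked against the exported URDF. The following is a minimal sketch, assuming `basic.urdf` uses the standard URDF layout (`<link>` and `<joint>` elements directly under `<robot>`). The `SAMPLE_URDF` string, its link names, and the `continuous` joint type are illustrative stand-ins, not the contents of the actual exported file.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_urdf(urdf_text: str) -> dict:
    """Count links and joints (grouped by joint type) in a URDF document."""
    root = ET.fromstring(urdf_text.strip())
    links = root.findall("link")
    joints = root.findall("joint")
    joint_types = Counter(j.get("type", "unknown") for j in joints)
    return {
        "links": len(links),
        "joints": len(joints),
        "joint_types": dict(joint_types),
    }


# Illustrative stand-in for basic.urdf (hypothetical names and joint types).
SAMPLE_URDF = """
<robot name="wheeling_device">
  <link name="base"/>
  <link name="wheel_0"/>
  <link name="wheel_1"/>
  <joint name="joint_0" type="continuous">
    <parent link="base"/>
    <child link="wheel_0"/>
  </joint>
  <joint name="joint_1" type="continuous">
    <parent link="base"/>
    <child link="wheel_1"/>
  </joint>
</robot>
"""

if __name__ == "__main__":
    print(summarize_urdf(SAMPLE_URDF))
```

To check the real export, read the downloaded file and pass its text to `summarize_urdf`; a result of 3 links and 2 joints would match the numbers shown on this page.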


PartPacker

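Inference runs directly on the same `gym_real.jpg` following the official flow, producing a full GLB and `27` part GLBs. The multiple machines are treated as one complex object, so the part count is clearly inflated.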

Downloads: full GLB, part0.glb through part11.glb, vol0.glb, vol1.glb. More part GLBs are in assets/partpacker/.

Predicted parts: 27

Steps / grid: 50 / 384

Flow config: big_parts_strict_pvae

PartPacker full object

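The full-object GLB is shown by default. The download area keeps the part and dual-volume results for inspecting how PartPacker splits the visible parts in the multi-machine image.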
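Since only some part files are linked from this page, the reported part count can be verified by listing the asset directory. This is a minimal sketch, assuming the part files follow the `partN.glb` naming seen in the download list and live in a flat directory such as `assets/partpacker/`. The helper name `list_part_glbs` is hypothetical.

```python
import re
from pathlib import Path

# Matches part0.glb, part1.glb, ...; deliberately ignores volN.glb and the full GLB.
PART_RE = re.compile(r"^part(\d+)\.glb$")


def list_part_glbs(asset_dir) -> list:
    """Return part GLB paths in asset_dir, sorted by numeric part index."""
    indexed = []
    for path in Path(asset_dir).iterdir():
        match = PART_RE.match(path.name)
        if match:
            indexed.append((int(match.group(1)), path))
    # Sort numerically so part10.glb comes after part2.glb.
    return [path for _, path in sorted(indexed)]


if __name__ == "__main__":
    asset_dir = Path("assets/partpacker")
    if asset_dir.is_dir():
        parts = list_part_glbs(asset_dir)
        print(f"{len(parts)} part GLBs found")
```

Sorting on the parsed integer rather than the file name avoids the lexicographic ordering in which `part10.glb` would precede `part2.glb`.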


OmniPart

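Run through the official app path with RMBG skipped locally. SAM/OmniPart retains `22` regions on the multi-machine scene, which is significant over-segmentation. The page shows three kinds of GLB (textured, segmented, exploded) and the mask visualizations.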

Downloads: textured GLB, segmented GLB, exploded_parts.glb, bboxes_vis.glb, bbox numpy, summary JSON

Predicted bboxes: 22

Condition source: SAM regions

Local mode: skip RMBG

Figure: Input real image

Figure: Input image + mask visualization

Figure: Final mask segments

OmniPart textured mesh

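The input is not a clean single object: the SAM regions pull multiple machines, background structures, and repeated parts into the conditioning, so this run is better read as a multi-instance stress test.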


PAct

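The input first goes through Appendix-D style VLM + SAM2 mask labeling and is then fed to the official PAct inference. The VLM identifies an adjustable weight bench, but the final export has only `4` nodes with joint types `fixed / prismatic`, and large regions remain merged.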

Downloads: assembled_object.glb, object.json, bbox_3d.glb, joint_motion.mp4, stage0_granularity.json, stage1_classification.json, stage2_merge.json

Predicted nodes: 4

Joint families: fixed / prismatic

Pipeline: official + Appendix-D

Figure: Input real image

Figure: Appendix-D SAM2 overlay

Figure: Merged part mask

Figure: PAct conditioning grid

Figure: Predicted 3D bounding boxes

Figure: Predicted kinematic tree

Joint motion animation rendered from object.json

PAct assembled object

The GLB assembled from object.json and the part meshes is shown by default. The SAM2 overlay, merged mask, conditioning grid, bboxes, kinematic tree, and joint motion are retained.


SINGAPO

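SINGAPO is run with a hand-made graph prior, aiming to extract one representative bench structure from the multi-machine scene. The output has `5` nodes with joint types `fixed / prismatic`. There are no mesh parts, so the page shows the bboxes and the kinematic tree.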

Downloads: object.json, pred_graph.json, manual_graph.json, bbox_3d.glb

Predicted nodes: 5

Joint families: fixed / prismatic

Prior mode: manual graph

Figure: Input real image

Figure: Predicted 3D bounding boxes

Figure: Predicted kinematic tree

SINGAPO predicted 3D bboxes

This SINGAPO run produces only the structure JSON and bboxes; the interactive view shows bbox_3d.glb. manual_graph.json and pred_graph.json are both retained.
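The node counts and joint families reported for the structure JSON can be recomputed from `object.json`. The sketch below assumes a layout in which nodes sit in a top-level `diffuse_tree` list and each node carries `id`, `parent`, and `joint.type` fields. That schema is an assumption and should be checked against the downloaded file; the `SAMPLE_OBJECT` data is made up for illustration and is not the exported result.

```python
import json
from collections import Counter


def summarize_object(obj: dict) -> dict:
    """Summarize node count, joint types, and root ids of a kinematic tree.

    Assumes (unverified) that nodes live under obj["diffuse_tree"] and that
    a parent of -1 marks the root.
    """
    nodes = obj.get("diffuse_tree", [])
    joint_types = Counter(
        node.get("joint", {}).get("type", "unknown") for node in nodes
    )
    roots = [node["id"] for node in nodes if node.get("parent", -1) == -1]
    return {
        "nodes": len(nodes),
        "joint_types": dict(joint_types),
        "roots": roots,
    }


# Made-up example in the assumed schema (not the actual export).
SAMPLE_OBJECT = {
    "diffuse_tree": [
        {"id": 0, "parent": -1, "joint": {"type": "fixed"}},
        {"id": 1, "parent": 0, "joint": {"type": "prismatic"}},
        {"id": 2, "parent": 0, "joint": {"type": "fixed"}},
    ]
}

if __name__ == "__main__":
    # json.dumps keeps the output stable for diffing between runs.
    print(json.dumps(summarize_object(SAMPLE_OBJECT), sort_keys=True))
```

If the PAct `object.json` shares this layout, the same function could be used to compare the two exports side by side; if the key names differ, only the lookups inside `summarize_object` need adjusting.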


TRELLIS2

The TRELLIS2 512 pipeline generates a whole mesh directly from `gym_real.jpg`, with no part or kinematic decomposition. The complex multi-instance scene is compressed into a single 3D asset.

Downloads: sample.glb, sample.obj

Whole mesh: 1

Pipeline type: 512

Decomposition: no parts

Figure: Input real image

TRELLIS2 direct sample.glb

This is a pure whole-object generation baseline: useful as a geometry-generation reference, but it provides no part or joint semantics.
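One way to confirm that a GLB carries no part structure is to read its embedded glTF JSON and count meshes and nodes. The sketch below uses only the standard library and follows the GLB container layout from the glTF 2.0 specification (12-byte header, then a JSON chunk). The helper names `read_glb_json`, `count_meshes`, and `make_glb` are hypothetical, and `make_glb` exists only to build a tiny synthetic file for the example.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON" read as a little-endian uint32


def read_glb_json(data: bytes) -> dict:
    """Return the glTF JSON document stored in the first chunk of a GLB."""
    if len(data) < 20:
        raise ValueError("too short to be a GLB file")
    magic, _version, _length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    chunk_len, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != CHUNK_JSON:
        raise ValueError("first chunk is not JSON")
    return json.loads(data[20:20 + chunk_len].decode("utf-8"))


def count_meshes(data: bytes) -> dict:
    """Count top-level meshes and nodes declared in the GLB's JSON chunk."""
    doc = read_glb_json(data)
    return {
        "meshes": len(doc.get("meshes", [])),
        "nodes": len(doc.get("nodes", [])),
    }


def make_glb(doc: dict) -> bytes:
    """Build a JSON-only GLB for testing (no binary buffer chunk)."""
    payload = json.dumps(doc).encode("utf-8")
    payload += b" " * (-len(payload) % 4)  # pad JSON chunk to 4 bytes with spaces
    header = struct.pack("<III", GLB_MAGIC, 2, 12 + 8 + len(payload))
    return header + struct.pack("<II", len(payload), CHUNK_JSON) + payload


if __name__ == "__main__":
    demo = make_glb({"asset": {"version": "2.0"}, "meshes": [{}], "nodes": [{}]})
    print(count_meshes(demo))
```

For a real file, pass `Path("sample.glb").read_bytes()` to `count_meshes`. Note that a single glTF mesh can still contain several primitives, so the count is a rough indicator of decomposition rather than proof of it.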