Raw-Only Open-State Pilot
Redone raw-only open-state pilot using original PM URDF/OBJ, GAPartNet raw zip URDF/OBJ, ArtVIP USD, and GRScenes raw USD/zip evidence; SceneSmith conversion paths are rejected.
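This page groups experiments on the same topic. Entries are still sorted by second-resolution timestamp in descending order, and the star and filter logic matches the home page.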
Redone raw-only open-state pilot using original PM URDF/OBJ, GAPartNet raw zip URDF/OBJ, ArtVIP USD, and GRScenes raw USD/zip evidence; SceneSmith conversion paths are rejected.
INVALIDATED 2026-05-28: this page used SceneSmith-compatible converted SDF paths under /data/share/ud4scenesmith, not the original raw dataset formats.
9-sample GT vs PAct comparison page regenerated from benchmark source stats, PAct reference_object.json, and exported PAct object.json, retaining verifiable report.json evidence and the interactive 3D viewer.
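Case-by-case diagnosis of 9 official PAct failure modes (complex objects, OOD, occlusion, small parts, joint parameters, mask dependence, topology, failures on simple cases, texture); the mask ablation shows that the mask acts as a hard binary gate rather than a fine-grained part-id signal (82.05 → 0 → 82.04).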
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
Stage 1 only: rule/LLM affordance label and query proposal results for the Unified asset library, with HSSD cached-run summary and downloads.
Unified asset library affordance label proposals connected to the official 3D-ADLLM checkpoint, producing 208 point-cloud masks over 53 assets with browser-inspectable asset/label/mask cards.
Browsable OpenAD asset point clouds with affordance labels, GT masks, and 3D-ADLLM predicted masks for 24 diverse samples.
PAct official-default hard-5 run with cardinality sentinels expanded to 3x3 mask-grid cells to test whether more upstream latent evidence keeps improving or starts polluting the condition.
PAct official-default hard-5 run with cardinality sentinels expanded to 2x2 mask-grid cells to test stronger upstream latent evidence.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
Hard-5 comparison of cardinality sentinels with larger minimum evidence area; best mean score 50.81 under official sampler defaults.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
OT gate heatmaps and Stage1-vs-export part-count collapse localization for hard sample #73.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Manual probe over low-part-count PartNetMobility samples with VLM QA and interactive 3D previews.
Manual easy-case probe over simple cabinets, doors, drawers, and one PM fixture-like sample with VLM QA and interactive 3D previews.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
Source-aware random 5-sample PAct evaluation with complete metric statuses and Gemini VLM QA.
Fixed 100-sample Dataset 2.1 benchmark covering ArtVIP, GAPartNet, GRScenes, and PartNetMobility with source/category/problem-tag sampling visualizations.
Ablation and smoke-test portal for OT-gated routing, mask/edge priors, virtual patches, and Stage2 spatial articulation adapter.
True OT-router inference variants for edge, virtual patch, and first-third injection; all 3-sample export QA runs passed.
Stage1 OT-router mask-prior checkpoint reconstructed in PAct inference, exported multi-part GLBs, 3/3 QA pass.
Official-initialized PAct pipeline QA with exported multi-part GLBs, interactive previews, and 3/3 visual QA pass.
Strict Articulate AnyMesh run on an HSSD 3.2 chest-of-drawers asset using Gemini-compatible VLM calls; one prismatic drawer link visualized with a slider.
Visual audit of paper-style DINO feature extraction and TRELLIS v1 SLAT preprocessing for PAct.
Static-only part-name/AABB kinematic hypotheses compared against hidden GAPartNet source-layout joints.
Old fragmented proxy inputs versus corrected full-object color/material inputs, with official PAct and Stage1-cache 3D previews.
Official PAct vs unchanged-architecture Stage1 SGD fine-tuned cache on bad-case inputs, with complete GLBs and articulation videos.
Original PAct vs official-equivalent Stage2 SGD fine-tuned cache on 3 historical bad cases; includes complete GLB previews, articulation videos, and GT-vs-predicted counts.
16 historical official-PAct failure cases converted from source SDF/mobility GT into PAct fine-tuning source data with full-part masks and movable 3D previews.
Audit page for repaired PAct training-format samples: RGB, cleaned masks, source geometry, and closed/half/open joint states.
Official PAct inference on texture-positive source assets rendered into PAct RGB plus source-native link masks: 3/3 complete GLB exports, strong simple cases, small-part drift on dense sample.
Supervised GAPartNet kinematic-mask segmenter driving official PAct inference: 6/6 non-empty exports, but first-pass mask quality is strong only on 2/6 samples.
Negative diagnostic: VLM+SAM2 Appendix-D style preprocessing on GAPartNet CAD renders segments render facets rather than kinematic bodies; PAct inference is therefore mask-limited.
Fixed GAPartNet RGB conditioning render, reran official baseline and a stable official-equivalent Stage2 finetune, then compared exported mesh quality.
Repaired official-equivalent Stage2 fine-tune: fixed mask protocol, supported categories only, 3/3 visual QA pass.
Negative diagnostic baseline: official-equivalent training runs, but inference collapses into tiny wheel-named fragments.
Leave-one-object-out training of a lightweight instance-axis candidate head on complex GAPartNet objects.
Closed, half-open, and open state visualization for official PAct predictions.
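A minimal sketch of how the three states could be sampled for one joint. It assumes each joint carries a lower and an upper limit (as in a URDF `<limit>` element) and that "closed" corresponds to the lower limit; both the function name `joint_states` and that convention are assumptions for illustration, not something this page confirms.

```python
def joint_states(lower: float, upper: float) -> dict:
    """Map the three named states to joint values between the limits.

    Assumption: 'closed' is the lower limit and 'open' is the upper limit.
    """
    return {
        "closed": lower,
        "half": lower + 0.5 * (upper - lower),
        "open": upper,
    }


# Example: a revolute door limited to [0, 1.5] rad.
states = joint_states(0.0, 1.5)
print(states)  # {'closed': 0.0, 'half': 0.75, 'open': 1.5}
```

The same helper applies to prismatic joints, where the values are translations along the axis instead of angles.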
Few-joint versus many-joint split evaluation and visualizations for official PAct on GAPartNet non-PM 100.
Minority-button and symmetric-pair local mechanism repair probe.
Per-instance local mechanism axis probe for same-label multi-part hard cases.
Semantic expansion plus root/body local-frame axis candidates for non-canonical hard cases.
Semantic expansion plus geometry-nearest continuous axis priors on hard GAPartNet cases.
Correct-GT hard-case rerun plus semantic mask-to-part expansion for missing joint proposals.
Official PAct diagnostic audit on segmentation dependence, joint parameters, and internal structure; mean strict F1 0.3695.
Official PAct predictions on 100 non-PM GAPartNet samples with source-material visualization; strict F1 0.3695.
Official PAct vs PAct-Transporter/Core-OT on 100 non-PM GAPartNet samples; strict F1 0.3695 -> 0.4395.
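Verification of the Dishwasher demo at official PAct commit d974309: the unmodified official source fails at the current mask-reading step; after a local compatibility patch it successfully exports articulated JSON, URDF, part GLBs, and motion videos.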
Interactive lecture notes in Chinese: from Optimal Transport basics through Sinkhorn and SceneTransporter's structure-transport idea, to using OT as a patch-part-joint constraint for PAct.
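For readers new to Sinkhorn, the following is a minimal pure-Python sketch of entropy-regularized optimal transport between two small histograms. The toy cost matrix, the "3 patches to 2 parts" framing, and the values of `eps` and `iters` are illustrative assumptions; this is not the routing code used in PAct or SceneTransporter.

```python
import math


def sinkhorn(cost, a, b, eps=0.1, iters=200):
    """Return an approximate transport plan with row sums a and column sums b."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: 3 source bins ("patches") transported to 2 target bins ("parts").
cost = [[0.0, 1.0], [0.2, 0.8], [1.0, 0.0]]
a = [1 / 3, 1 / 3, 1 / 3]
b = [0.5, 0.5]
plan = sinkhorn(cost, a, b)
for row in plan:
    print([round(x, 4) for x in row])
```

In this toy case the first and last rows send almost all of their mass to their cheap column, while the middle row has to split its mass so that both column totals reach 0.5. A smaller `eps` makes the plan sharper but needs more iterations and can underflow in this naive form.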
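Following the six-way comparison format used for the four real images (chair/laptop/button/lock), ran PhysX-Anything, PartPacker, OmniPart, PAct, SINGAPO, and TRELLIS2 on gym_real.jpg. The image is a real scene containing multiple pieces of gym equipment, and it concentrates the misrecognition, over-segmentation, and part-merging problems that arise under single-object/single-instance assumptions.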
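Expanded the PartNeXt probe to 9 valid categories: `Knife / Toilet / Monitor / Guitar / Teapot / Laptop / Chair / Microwave / Mug`. Three further categories, `Handbag / Lamp / Sofa`, were explicitly excluded because the current automatic viewpoint rendered an empty alpha, so they are not conflated with model quality. Results on the 9 valid categories are stable: `mean_gt_coarse_part_count = 2.67`, `mean_pred_num_nodes = 2.56`, `mean_part_count_abs_error = 0.11`. In other words, PAct largely preserves the coarse slot count on PartNeXt, but its semantics drift almost systematically toward appliance templates such as `door/base/drawer`.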
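A small sketch of how the three summary statistics relate. The per-category counts below are hypothetical placeholders, not the measured values; they are chosen only to show that the reported means (2.67, 2.56, 0.11) are consistent with 9 integer counts in which a single category is off by one part, assuming each category contributes one sample.

```python
def summarize(gt_counts, pred_counts):
    """Return (mean GT parts, mean predicted nodes, mean absolute count error)."""
    n = len(gt_counts)
    mean_gt = sum(gt_counts) / n
    mean_pred = sum(pred_counts) / n
    mean_abs_err = sum(abs(g - p) for g, p in zip(gt_counts, pred_counts)) / n
    return round(mean_gt, 2), round(mean_pred, 2), round(mean_abs_err, 2)


# Hypothetical counts for 9 categories: totals of 24 (GT) and 23 (predicted).
gt = [2, 3, 2, 3, 3, 2, 4, 3, 2]
pred = [2, 3, 2, 3, 3, 2, 3, 3, 2]  # one category under-predicted by one part
print(summarize(gt, pred))  # (2.67, 2.56, 0.11)
```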
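Picked 3 samples from the downloaded PartNeXt subset whose raw data and mesh are already aligned: `Chair`, `Monitor`, `Microwave Oven`. We rendered an exact `mask.exr` and a same-view input image directly from the PartNeXt top-level semantic tree, then ran them through the official PAct inference pipeline as a small sanity eval. The results are representative: `part count` matched in all three cases, but the semantic labels drifted noticeably, for example `Chair -> base/base/door`, `Monitor -> door/base`, `Microwave -> door/base/drawer`. In other words, PAct can already maintain the coarse structural slots on PartNeXt but has not yet learned this cross-domain semantics.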
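Picked 5 categories from `raw_datasets/partnet-mobility-v0` that are **not in the official demo**, rendered synthetic RGBA inputs and true part masks ourselves directly from the original PM meshes, then ran them through the official PAct inference pipeline. The current conclusion is clear: `Door / Safe` are noticeably closer to usable, `Bottle / Display` essentially collapse into a fixed base, and `TrashCan` recovers motion but with semantic drift.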
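Fed the `VLM+SAM2` masks we reconstructed from the paper's Appendix D back into the official PAct inference pipeline to verify whether `Appendix-D preprocessing -> articulated 3D tree` truly closes the loop. The current two samples, `Dishwasher_001 + StorageFurniture_004`, run end to end successfully.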
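Extended the paper's Appendix D `GPT/VLM + SAM2 + VLM merge` preprocessing pipeline to more categories. This page focuses on 4 real samples: `Dishwasher` and `StorageFurniture` are successes, while `Refrigerator` and `Table_door` are kept as hard cases, allowing a direct comparison of which object classes are already covered and which remain in failure modes.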
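Turned the `VLM-guided prompting pipeline followed by SAM2 refinement` described in Appendix D of the PAct paper into a pipeline that actually runs locally. This page shows, for `Dishwasher_001`, the Stage 0 granularity, the SAM2 candidates, the Stage 1 articulated/fixed classification, the Stage 2 semantic merge, and the final exported PAct `mask.exr`.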
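After fixing the CLI's multi-label reading of `*_mask.exr`, we reran all `22/22` real-world examples exactly as the official README describes. The final `single_fixed_base_ratio = 0.0`, and `revolute / prismatic` articulations were recovered across categories. This page presents the full statistics and 6 representative samples.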
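One plausible reading of `single_fixed_base_ratio`, sketched under stated assumptions: the fraction of examples whose exported tree is a single node with joint type `fixed`. The metric's exact definition and the `object.json` schema are not given here, so the input format below (one list of joint-type strings per example) is hypothetical.

```python
def single_fixed_base_ratio(trees):
    """Fraction of examples that collapsed to a single fixed node.

    trees: one list of joint-type strings per example (hypothetical format).
    """
    if not trees:
        raise ValueError("no examples given")
    collapsed = sum(1 for joints in trees if joints == ["fixed"])
    return collapsed / len(trees)


# A collapsed export next to a healthy 4-node export.
before_fix = [["fixed"], ["fixed"]]
after_fix = [["fixed", "revolute", "prismatic", "prismatic"], ["fixed", "revolute"]]
print(single_fixed_base_ratio(before_fix))  # 1.0
print(single_fixed_base_ratio(after_fix))   # 0.0
```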
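Connected existing rendered images under `/data` back to the official PAct inference pipeline. We first fit the camera to the PNG alpha silhouette, then rendered per-part labels from the known part meshes, exported them as the `mask.exr` that PAct requires, and finally fed them back into the official inference. Two samples, Dishwasher and StorageFurniture, currently work end to end.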
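Tracing further into why every export came out fixed, we located the real root cause: when the CLI dataset read `*_mask.exr` with `imageio`, the multi-label part-id mask was read as a binary image. After switching to `cv2.imread(..., IMREAD_UNCHANGED)`, the official README route on `Table_door_002` again exports a 4-node structure tree of `base + door + 2 drawers`, and the joint types are again `revolute / prismatic`.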
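A hedged sketch of the kind of check that exposes this failure: read the mask without value conversion, then count distinct labels. The helper names are hypothetical and this is not the actual patch. `cv2.imread` with `cv2.IMREAD_UNCHANGED` is a real OpenCV call; some OpenCV builds additionally require the `OPENCV_IO_ENABLE_OPENEXR` environment variable before EXR files can be read. The toy mask stands in for a real `*_mask.exr`.

```python
import os

# Some OpenCV builds gate EXR support behind this variable; set it before importing cv2.
os.environ.setdefault("OPENCV_IO_ENABLE_OPENEXR", "1")


def load_part_id_mask(path):
    """Read an EXR mask without converting its values (needs opencv-python)."""
    import cv2  # imported lazily so the helpers below work without OpenCV

    mask = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    if mask is None:
        raise FileNotFoundError(f"could not read mask: {path}")
    return mask


def distinct_labels(mask_rows):
    """Sorted distinct label values of a single-channel mask given as rows."""
    return sorted({int(v) for row in mask_rows for v in row})


def binarize(mask_rows):
    """Simulate the collapse: every non-zero part id becomes 1."""
    return [[1 if v > 0 else 0 for v in row] for row in mask_rows]


# Toy mask with background 0 and part ids 1..3.
toy = [[0, 0, 1, 1], [0, 2, 2, 1], [3, 3, 2, 0]]
print(distinct_labels(toy))            # [0, 1, 2, 3]
print(distinct_labels(binarize(toy)))  # [0, 1]
```

If a loaded mask reports only two distinct values for an object known to have several parts, the reader has most likely binarized it. For a NumPy array returned by `load_part_id_mask`, `numpy.unique` performs the same count.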
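This page preserves the failure symptom we saw when we first checked PAct strictly by the README: the `object.json` exported for each of the first 4 samples contained only a single `fixed base`. Further investigation confirmed that this is not a final verdict on PAct's capability but a binarization collapse that occurred when the CLI read the multi-label `EXR` mask. Read it together with the later `PAct README EXR Fix Probe` page.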
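After verifying the PAct reproduction results from another conversation, we confirmed that they satisfy the requirement that the official released inference path runs in the trellis2 environment: all 22/22 official real-world examples completed successfully, producing 44 mp4 files, 22 png files, and 1 run_command.txt. This page shows 8 representative samples, with reproduction notes and commands. Note: this page verifies the official inference path and its animation/visualization outputs, not training, GLB export, or the full URDF pipeline.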
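The artifact counts can be re-checked with a short script. This is a sketch under assumptions: the output directory layout is unknown, so the script counts files recursively by suffix, and it assumes the 44 mp4 files correspond to two videos per example (44 / 22), which the page does not state explicitly. The function names are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_artifacts(run_dir):
    """Count files under run_dir by lowercase suffix."""
    return Counter(p.suffix.lower() for p in Path(run_dir).rglob("*") if p.is_file())


def check_run(run_dir, n_examples=22, videos_per_example=2):
    """True if the run has the expected mp4/png counts and one run_command.txt."""
    root = Path(run_dir)
    counts = count_artifacts(root)
    has_command = len(list(root.rglob("run_command.txt"))) == 1
    return (
        counts[".mp4"] == videos_per_example * n_examples
        and counts[".png"] == n_examples
        and has_command
    )
```

Calling `check_run("path/to/run_output")` on the actual output directory returns `True` only when all three counts match.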