ArtiPhys: merging semantic parts into kinematic parts + joint readout + articulated MJCF
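Freeze a labmate's segmentation backbone and train a single pairwise head that jointly learns merging (semantic parts → kinematic parts; ARI 0.616, median 1.0, about 2× the baseline) and joint readout (axis error 11.1° vs. 90° for idea1), assembled end to end into an articulated MJCF with real <joint> elements. 8 objects verified by eye in MuJoCo renders.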
Version to review now: 6352 SceneSmith-format articulated assets; the official loader has been verified to read them directly, and the page includes a cross-dataset joint-slider spot check.
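A single portal for the visualization results I had Codex generate. The home page lists experiments in reverse chronological order; open any experiment to view its interactive 3D assets, renders, and downloadable files.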
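Reproduction of VIPL-VSU/GEAR (2DGS articulated-object modeling). box_100154 ran end to end through all 5 stages (SAM masks → coarse GS → voxelized joint extraction → alternating geometry-motion refinement → render evaluation): it correctly separates 5 parts (base + 4 flaps) and estimates 4 hinge axes; official joint metrics: axis angle error 0.0266 / axis distance 0.0061 / motion-amount error 0.064 / moving-part Chamfer 0.155. Built a new conda env gear (torch 2.4.1 + cu124) with 3 CUDA extensions, pytorch3d 0.7.8, and tinycudann compiled from source. Fixed 4 bugs shipped in the repo: ① committed unresolved git merge conflicts (train_coarse trivial; other_utils non-trivial: the main side references a variable that no longer exists, so the HEAD side was taken); ② Pillow np.byte → uint8; ③ a fallback for args.iterations at artgs render time; ④ added tensorboard and relaxed dependency pins that require py3.11 and contradict the README. PSNR = -1 because the dataset has no test split. bucket/door are running in the background; all 16 scenes can be reproduced after unpacking and renaming.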
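PNM Lamp 13928, a folding desk lamp (5 parts, chain of 4 revolute joints). **Root cause (causally verified)** of the thin upper-arm rod being generated rotated by 90°: FullPart conditions and places each part in a 16³ cubic grid whose side equals the box's longest edge, and the box's thinness or thickness reaches the model only through the in/out mask. The part0 box has aspect = 41, so its thinnest (x) dimension spans only 0.39 voxel in that grid (< 1 cell) → orientation information is erased by the resolution → the model guesses blindly → wrong. Aspect ≳ 16 (thin dimension < 1 cell) is the danger threshold. **Fix**: during prep, thicken degenerate thin dimensions until aspect ≤ 8 (thin dimension ≥ 2 cells). Verification: controlled regeneration of the same image and part; after aspect 41 → 8 the long axis is corrected from the wrong x to the correct z, and the whole lamp connects when loaded as-is with zero alignment (gaps on the two sides of part0: 0.261/0.399 → 0.046/0.002). The guardrail is now in prep_pnm.py (BOX_MAX_ASPECT=8). Process: the user had to correct me 3 times before I got to the bottom of it ("it just doesn't mesh" / connected via pivot-snap [a circular metric] / complex chains don't connect); lesson: check the most direct low-level quantity yourself before drawing a conclusion.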
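The voxel arithmetic behind that root cause can be sketched as follows. This is a minimal illustration, not the code in prep_pnm.py: the function names and the example extents are hypothetical, and only the 16-cell grid and `BOX_MAX_ASPECT = 8` come from the entry above.

```python
import numpy as np

GRID_RES = 16        # cells per side of the conditioning cube (from the entry above)
BOX_MAX_ASPECT = 8   # guardrail value named in the entry above


def thin_dim_voxels(extents, grid_res=GRID_RES):
    """Cells spanned by the thinnest box dimension when the cube side equals the longest edge."""
    extents = np.asarray(extents, dtype=float)
    return grid_res * extents.min() / extents.max()


def thicken_box(extents, max_aspect=BOX_MAX_ASPECT):
    """Raise every dimension to at least longest / max_aspect; the longest edge is unchanged."""
    extents = np.asarray(extents, dtype=float)
    return np.maximum(extents, extents.max() / max_aspect)


# Hypothetical box extents with aspect 41 (thin x, long z).
rod = np.array([1.0, 3.0, 41.0])
fixed = thicken_box(rod)

print(thin_dim_voxels(rod))    # below one cell: orientation is lost
print(thin_dim_voxels(fixed))  # two cells after thickening
```

With aspect 41 the thin dimension covers 16/41 of a cell, and clamping the aspect to 8 guarantees 16/8 = 2 cells, which matches the thresholds quoted in the entry.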
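Each of 15 "graveyard" ideas validated with one clean comparison: 14 effective / 1 not significant (#11). Core: side-by-side renders of 3D objects that make each fix or falsification visible (wrong axis sends the door through the body / no door can be cut from a fused shell / P3-SAM shreds the door / imagined-camera drift / two states recover the door / physics preserves the range / VLM detects toilet seat + lid and cart wheels), plus per-idea quantification and reproduction checks. Self-audit corrected 2 old figures whose metric definitions were off and 4 rendering-metric bugs.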
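Testing FullPart specifically on non-boxy geometry: PNM Toilet 102621 (5 parts: flush lever / lid / seat ring / tank lid / body). Everything is curved, organic, or non-convex: seat ring = elliptical ring with a hole (non-convex), lid = elliptical dome, body = organic surface with an S-shaped trap and inner bowl cavity. Every curved part is generated cleanly as an independent full-resolution part => FullPart can do more than boxes. Same pipeline, zero changes, only the input swapped. Minor observation: the URDF's lid and seat bounding boxes nearly coincide (they stack when the lid is closed) -> the model generates the two as similar ring-shaped parts (adjacent parts come out alike when their layout boxes overlap heavily). Honesty note unchanged: the part boxes are an input (GT decomposition), not a prediction; the single-image -> box vecset predictor is not open-sourced.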
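Rerunning FullPart on another example (PNM StorageFurniture 40453, 3 drawers + cabinet body). The microwave had a single revolute door; this run validates prismatic drawers: each of the 3 drawers is generated as an independent, complete open box (front face + four walls + bottom, open top) that can be pulled out of the cabinet one by one = real drawers, not pasted-on fronts. Same pipeline, zero changes, only the input swapped. Honesty note unchanged: the part boxes are an input (from the GT part decomposition), not a FullPart prediction; the single-image -> box vecset predictor is not open-sourced.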
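Got the open-source FullPart (hkdsc/fullpart, MIT, built on TRELLIS) running: given the door's bounding box, the microwave door is generated as an independent, complete, full-resolution part with real thickness (thickness z ≈ 0.29, with window and handle), and the body comes out as a separate box with a real inner cavity; the two parts are separate, can be assembled, and the door can swing open about its hinge to reveal the cavity → this sidesteps the TRELLIS2 blocker of 'the door is welded into the shell, so no solid door can be cut out'. ★ Honesty note: the inference input FullPart released = image + a 3D bounding box per part (.npy), so the part layout (where the door is and how big) is an input, not a prediction; the door box comes from the PNM 7167 GT part decomposition; per the README, the single-image → part-box vecset predictor is not yet open-sourced = the next decision point. The official toy_gun (12 independent parts) was run to green first. 3 pitfalls fixed (decord/deepspeed segfault, which requires importing deepspeed first / the author's private paths / missing datasets, tyro, utils3d; torch untouched).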
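Splitting a microwave into body + door. After two user corrections and deeper digging, overclaims were retracted one by one: the real PNM 7119 GT mesh (door truly separable; the two states correspond perfectly under the true GT joint rotation) replaces TRELLIS2 geometry as pipeline input; P3-SAM, which had been mistaken for the primary signal, has been removed (the docs list it only as a weak prior; the original GEAR method uses 2D SAM). Occupancy-difference precision is in fact a perfect 1.0 (with the body correctly aligned, the 96% voxel overlap cancels out). Three real problems: dilation 2 in my pipeline is harmful (door recall 0.42 -> 0.17); a thin flush door has a recall ceiling of ~0.42 (only the front face vacated by the door is caught); and the occupancy difference is fragile to alignment (when ICP fails to converge, precision collapses to 0.02). Robust signals = open-state-only IoU 0.92 at the door's new position / continuous per-face displacement 10.7x. Step 5 (P3-SAM region voting) holds: the door region is recoverable on the TRELLIS2 fused shell.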
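Per-point part masks from the backbone's native segmentation, with interactive 3D previews of 14 Objaverse OOD objects (three.js point clouds, pre-screened for mesh quality). Chair → Base/Seat/Backrest, toilet → Seat/Lid/Base/Tank, round table → Tabletop/Base, etc. Includes a diagnosis of mask-decoder collapse at high part counts.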
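Frozen backbone + lightweight joint head + geometric angle loss: axis-flip rate 0.65 → 0.18, median axis error 90° → 10.1°, share within 15° 24% → 53%. Two birds with one stone (prevents catastrophic forgetting and cures axis flips). Includes a backbone vs. full-parameter SFT segmentation comparison, clean success cases, and honest tail cases.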
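For readers unfamiliar with the axis metrics quoted above, here is a minimal sketch of how a median axis error and a within-15° share could be computed. It assumes the error is sign-invariant (an axis and its negation count as the same line); the entry does not state the exact convention, and the function names are hypothetical.

```python
import numpy as np


def axis_angle_error_deg(pred, gt):
    """Sign-invariant angle in degrees between predicted and ground-truth axes, row-wise."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    cos = np.abs(np.sum(pred * gt, axis=-1))  # |cos| ignores the axis sign (assumption)
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))


def summarize(errors_deg, threshold_deg=15.0):
    """Median error and the share of samples at or below the threshold."""
    errors_deg = np.asarray(errors_deg, dtype=float)
    return {
        "median_deg": float(np.median(errors_deg)),
        "within_threshold": float(np.mean(errors_deg <= threshold_deg)),
    }


# Toy example: one exact hit, one sign-reversed hit, one wrong cardinal axis.
pred = [[0, 0, 1], [0, 0, -1], [1, 0, 0]]
gt = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
errors = axis_angle_error_deg(pred, gt)
stats = summarize(errors)
print(errors, stats)
```

Under this convention a prediction that lands on the wrong cardinal axis scores 90°, which is why a high flip rate pushes the median to 90°.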
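Second half of the stable obj_to_mjcf line: P3-SAM bbox → XPart generative part decomposition. Interactive in three.js (rotate / zoom / explode slider / per-part toggle / click to highlight). 6 examples: washing machine / microwave / oven / refrigerator / storage cabinet (dense meshes, 8-16 parts) + dishwasher (low-poly comparison). Outputs are static rigid bodies (zero joints).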
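Single static appliance photo → textured articulated 3D asset, end to end. Key moves: gpt-image-2 (micu) edits the closed-state image into an open-state image (it is good at editing, not segmentation) → the joint is inferred from the two-state comparison (VLM correct 5/5) → TRELLIS2 produces a mesh with real UV textures (#1) → VLM bbox + front-shell heuristic cuts out the door, and the joint drives textured opening and closing (4/4) → closed-loop visual comparison against the gpt-image open/closed states (4/4 consistent) + the #47 physics guardrail (Microwave falsification test: outward opening OK / reverse direction BLOCKED). This is our own end-to-end generation, with #47 demoted to a guardrail. Interactive two-state GLBs. Honest limits: the door is cut out only as a thin slab, which is the biggest bottleneck; OSH is not wired in; no pixel loss is used.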
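One pipeline, two outputs: the data pipeline produces articulated assets with parts + joints (Demo A, an 8-object PNM GT gallery: part coloring + joint axes + rest|articulated) → physics method #47 makes them sim-ready (Demo B hero shot: the baseline opens to the predicted upper limit and over-opens / interpenetrates, vs. #47, which sets the feasible range by collision sweep; solid-mesh before/after). Hard metric, over-opening rate: baseline 16.6% → #47 2.6% (an 84% reduction, actually computed over all 154 objects). This is exactly the point IP concedes in its Limitations (not sim-ready). Samples are curated; honest boundaries are marked.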
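Reproduction of Tencent Hunyuan3D-Part / XPart: generative part decomposition (P3-SAM outputs AABBs → PartFormer DiT generates a mesh per part). The demo mesh yields 23 parts; interactive GLB (drag to rotate) + exploded view + verification via static pyrender EGL renders. = a local, rate-limit-free replacement for INSTRUCT-PARTICULATE's proprietary image-generation segmentation front end. The real bug = the system lacked libOpenGL.so.0, which broke the PyMeshLab ply plugin (disguised as a trimesh error); fixed by installing libopengl0 via apt. Environment: trellis2 + torchdiffeq.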
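Reproduction base for INSTRUCT-PARTICULATE's first step, 'image → 3D first': the official TRELLIS2 (weights from zone g, DiT slat_flow + DINOv3) generates 3D meshes end to end from real appliance photos, as input to the later conditioned-B segmentation + joint annotation. 9 objects from the 2nd/3rd photo of each category (16/22 cumulative), verified by eye with pyrender EGL shaded renders + three-view point clouds. Honest note: most results are blocky boxes; Table_2 is the exception, rendering a real tabletop + legs. Environment pitfall: the real llmenv lives at the top level (tf5.12 + DINOv3); the trellis2 env lacks DINOv3ViTModel.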
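The whole main line on one page: ① the INSTRUCT-PARTICULATE paper ② LLM adds joints to GT meshes (Qwen3.5 fine-tuning: axis 90° → 0° / F1 0.90 / cross-category 0.57 → 0.80 / limits are the ceiling / negative latent result) ③ pivot to reproduction ④ reproduction (MVR B > A; from-scratch training reproduces the collapse without conditioning: mIoU 0.71 vs. 0.63, axis angle 10.7° vs. 21.7°; hard-case diagnosis finds the VLM bottleneck + a real-vs-synthetic reversal; photo → articulated 7/7). Includes training curves + result bar charts + representative visualizations, and links to 4 detail pages.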
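2120 articulable objects make up 19.3% of HSSD: T1 official 1321 (62.3%) / T2-PM retrieval 611 (28.8%) / no retrieval source 187 (8.8%); samples from each source are annotated with clearance, and T2 gets an original vs. stand-in comparison.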
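Reproduction of the paper's signature Fig. 1 capability: 7 real appliance photos → TRELLIS2 (official weights in zone g) image → 3D → VLM kinematic condition → conditioned B → open to the upper limit. 7/7 ran end to end (3 patches for cross-version transformers compatibility + llmenv). Honest notes: generated meshes are blocky / B segments weakly on generated meshes (consistent with the hard-case diagnosis) / joint opening is mild; less polished than the paper's Fig. 1, but the capability is reproduced.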
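Anthropometrically anchored clearance annotations for all 6092/6096 non-articulated HSSD objects that need clearance: 5 clearance types (sitting / approach / operation / standing above / passage) × human-body constants (depth 0.40-0.60 m / standing height 1.90 m / passage width 0.70 m), with the lateral extent clamped per instance to shoulder width / footprint. Every record carries basis + confidence + method_version. high 3091 / med 1604 / low 1397; 535 small items gated by inheritance. 14 representative samples rendered for real in Blender 4.2.9 and checked by eye one by one.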
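A bedroom generated from scratch from a prompt by the real SceneSmith (bed / two nightstands / wardrobe / desk + chair / floor lamp / rug, all 7 elements present), with the full chain working: SceneSmith generation → official converter → scene.json (8 objects) → Blender 4.2.9 render. The env is the working venv, gpt-5.5 goes through the Responses API, and all 4 server-timeout pitfalls are fixed.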
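Non-articulated HSSD objects (263 categories / 6096 objects) also need functional clearance: 5 types (sitting / approach / operation / standing above / passage); 13 representative categories annotated with directional clearance boxes + illustrative Blender renders.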
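A labmate's week/25 VLM critic (gpt-5.5, textured) run on the real bedroom (22 objects) from the official converter: 14 pass / 4 degraded / 0 fail, from 16 rule checks + 2 real VLM verdicts; also rendered whole-room overviews from three pulled-back camera angles (ceiling / near walls hidden) so the layout is visible.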
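Completeness check of clearance annotations across four datasets (PM 2330 / Articraft 270 / HSSD voxel 2108 / HSSD official 1480, zero retrieval gaps) + swept-voxel clearance illustrations for 3 objects in each of 10 representative articulated categories (orange moving part + translucent blue swept volume + gray static part; each Blender 4.2.9 render checked by eye to be non-empty).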
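Three-tier diagnosis on 20 real hard cases (coffee machine / oven / washing machine / multi-button remote): (a) on hard cases B does not beat A (A wins segmentation 0.666 vs. 0.46; B wins only on axis, 4.7° vs. 7.5°; the opposite of B > A on the full val set); (b) the first bottleneck = VLM structure at 50% (GPT5.5 misses small parts, coffee machine 16 → 6). Fig. 4-style four-column comparison × rest/articulated.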
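Re-inspection of the articulated replacements for 2120 static HSSD assets: T1 (1321 objects) official URDF stand-ins are high quality; T2-PM (787 objects) retrievals are mostly semantically correct (washer / laptop / ashcan / refrigerator), with a few systematic mismatches (range_hood → a whole Oven range, toilet_brush → a whole toilet). 2-3 stand-in instances rendered for each of 6-8 categories (Blender 4.2.9, textured), each checked by eye.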
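A labmate's critic actually run on the fully furnished bedroom (22 objects): 18 checks → 14 pass / 4 degraded / 0 fail, with 2 real VLM (gpt-5.5) verdicts. Texture fix = switching to Blender 4.2.9 (3.0.1 cannot render glTF textures) → HSSD's PBR textures render correctly; the clay and textured versions get identical verdicts item by item (robustness). Our Y-up-corrected Articraft washing machine was run through the critic, and its violation was judged fail (its GLB itself has no textures).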
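Minimal conditioning added on top of off-the-shelf Particulate, trained on the full 12945 objects (PNM + GAPartNet + all 9854 ArtiCraft) for 20k steps on 8 GPUs: B (instruct) beats A (unconditional) on all 4 metrics: mIoU 0.730 vs. 0.700 / axis angle AE 7.38° vs. 10.10° / share within 15° 89.2% vs. 85.1%; the gap at 20k exceeds the gap at 8k. Per-object visualization of 6 objects (GT/A/B segmentation + axes).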
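Root cause: our Articraft meshes are Z-up while the payload convention is glTF Y-up → the Blender import applies an extra 90° rotation → the object lies on its side with its bottom sunk 0.30 m. Fix = normalize with Rx(-90°) at the source, giving a lossless import that matches official HSSD item by item. Includes before/after renders + a three-way parity table + the root-cause chain. Integrated into yz-week25 and pushed to origin/yz.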
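The Rx(-90°) normalization can be illustrated with a small NumPy sketch. It only shows the rotation applied to raw vertex arrays; the function names are hypothetical, and the actual fix in the repository may operate on GLB scene nodes rather than vertices.

```python
import numpy as np


def rx(deg):
    """Rotation matrix about the x axis by `deg` degrees (right-handed)."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])


def z_up_to_y_up(vertices):
    """Apply Rx(-90 deg) to an (n, 3) vertex array so that +Z (up) maps to +Y (up)."""
    return np.asarray(vertices, dtype=float) @ rx(-90.0).T


up_z = np.array([[0.0, 0.0, 1.0]])
print(np.round(z_up_to_y_up(up_z), 6))  # the Z-up vector now points along +Y
```

Because the transform is a pure rotation, lengths and volumes are unchanged, which is consistent with the entry's description of the import as lossless.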
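Spot check of two groups under fine-grained semantic matching: the 29% true matches are ~85% good quality (the remaining bookcase cases vary in geometry → VLM); the 4% with no source (84 objects) currently have absurd mismatches and must be regenerated.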
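Joints predicted by three routes (GT / bbox-LLM / latent-LLM) are bound to the same part meshes and driven in sync; GIFs of 4 objects show the differences one by one: Dishwasher, latent flips the axis wrongly / Table, latent overshoots the travel / WashingMachine, both routes collapse the tilted drum axis to an orthogonal one / range is a shared weakness. On cardinal-axis data, bbox ≥ latent.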
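The official PArtArt-Gen pipeline running in zone e: SLAT part latents produced on the spot for 5 diverse, never-preprocessed objects (Globe / Eyeglasses / Scissors + ArtiCraft ceiling fan / telescope), with a format matching the official data_stage4 field by field; fixed the singpo hang (latents do not need singpo). Each object shows an EEVEE render + per-part latent → RGB.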
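18 cases sampled from the 611 true matches after LLM semantic/affordance correction, judged manually for plausibility, with the pass rate compared against the initial substring-matched sample.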
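Cross-library PM articulated replacements for 18 random HSSD objects; the user interactively judges each replacement (reasonable / unreasonable / unsure), with results written to disk in real time.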
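All 10968 HSSD objects: the 81% non-articulated are filtered out; the 2120 articulable objects are articulated via official exact matches (62%) + PM retrieval (37%); exact voxel-swept clearance has a median bloat of 2.05; 99% require zero generation.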
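Fine-tuning Qwen3.5-9B: pure geometry → kinematics (axis direction 90° → 0°) + GAPart affordance (F1 0.90), with cross-category generalization (unseen categories 0.57 → 0.80); non-geometric information such as limits is the ceiling. Positive validation of EPPUR A7.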
(Auto-recovered entry: the directory exists but was not previously in the manifest; see the detail page.)
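Faithful reproduction of PhysX-Omni's official three-step inference (VLM structure prediction → TRELLIS geometry decoding → URDF/MJCF assembly); 5 demo images run end to end and produce textured articulated objects. An independent FK + Blender pipeline renders closed/open states for visual checking: 5/5 have good geometry + valid joints + correct FK, and 2/5 (cabinet door / Rubik's cube) show textbook-clear hinges; the tractor and sports car are given semantically dubious joints by the VLM (confirming that joint supervision lacks a movability prior). Standardized joint videos are also produced with the official PhysX-Bench MuJoCo renderer (nested mesh paths fixed). Biggest reproduction pitfall: the VLM weights were silently corrupted via HF Xet, causing degraded output; fixed with curl + sha256.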
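A checksum check of the kind mentioned above can be sketched with the standard library alone. The helper names are hypothetical and the expected digest must come from the model host's published file metadata; nothing here is specific to HF Xet.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so multi-gigabyte weight files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA-256 matches the published digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Running `verify` on each downloaded shard before loading turns a silent corruption into an explicit failure, at the cost of one extra read of the file.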
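Single-image → articulated 3D predictions (diffuse_tree → URDF) from the PAct reproduction on 7 in-domain PNM objects, evaluated with the same PhysX-Bench protocol as PhysX-Omni: FK + Blender closed/open states + MuJoCo joint videos. Honest conclusion: 4/7 predict plausible joints (dishwasher / oven doors and storage-cabinet / table drawers in the correct direction), while 3/7 (microwave / refrigerator / washing machine) predict objects that should have a door as all-fixed = missed joints. Full KPS numbers require a 122B judge (not available locally), so this page gives standardized videos + a three-dimension visual assessment; 06-05 ckpt + automatic segmentation, gray untextured models. Confirms the necessity of v0.3 changes 1/2.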
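Full process, 06-04 → 06-09: inventory of 1519 official objects → articulation attempts with Particulate / AnyMesh / SINGAPO / S2O (a museum of failures) → clearance routes falsified one by one (retrieval / raw-vector VLM / functional VLM) → convergence on AA-style programmatic question generation + conservative union (94-97% of axes ≤ 15°, containment 0.97). Interactive three.js point-cloud + clearance-box previews of 20 objects, with a roadmap for improving accuracy.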
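Task 3.2 clearance pilot converged: AA-style discrete programmatic question generation raises axis ≤ 15° to 94-97% (axis flips 39% → 3-6%); the swept union over {4 hinge edges × 2 rotation directions} → containment 0.97, with c ≥ 0.9 for 85% of objects, bloat ≈ 2. Overlay visualization of 20 objects: green box = GT swept volume, blue box = U3's predicted conservative box.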
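The conservative-union idea and the two metrics can be sketched with axis-aligned boxes. The definitions below are assumptions made for illustration (containment = share of the GT swept box covered by the prediction, bloat = predicted volume over GT volume); the entry does not spell out the pilot's exact formulas, and all names are hypothetical.

```python
import numpy as np


def aabb_union(boxes):
    """Smallest AABB enclosing all candidates; `boxes` has shape (n, 2, 3) as (min, max)."""
    boxes = np.asarray(boxes, dtype=float)
    return np.stack([boxes[:, 0].min(axis=0), boxes[:, 1].max(axis=0)])


def volume(box):
    """Volume of a (2, 3) box; empty intersections clamp to zero."""
    return float(np.prod(np.clip(box[1] - box[0], 0.0, None)))


def containment(gt, pred):
    """Assumed definition: fraction of the GT box volume that lies inside the prediction."""
    inter = np.stack([np.maximum(gt[0], pred[0]), np.minimum(gt[1], pred[1])])
    return volume(inter) / volume(gt)


def bloat(gt, pred):
    """Assumed definition: predicted volume divided by GT volume (1.0 means tight)."""
    return volume(pred) / volume(gt)


# Two hypothetical swept candidates (e.g. two hinge hypotheses) and one GT swept box.
candidates = np.array([[[0, 0, 0], [1, 1, 1]],
                       [[0, 0, 0], [2, 1, 1]]], dtype=float)
gt_box = np.array([[0, 0, 0], [1, 1, 1]], dtype=float)
pred_box = aabb_union(candidates)
print(containment(gt_box, pred_box), bloat(gt_box, pred_box))
```

Taking the union over every hinge hypothesis trades tightness for safety: containment rises because the true sweep is almost always covered, while bloat grows with each extra candidate.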
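Final verdict on the three task packages of refined plan W1-W2: T-A physics collision filtering falsified (hollow-shell geometry); T-B VLM keyframe disambiguation falsified (consistent ≠ correct: 1/14, 2/12); T-C voxel+s representation meets the target and becomes the new deliverable U3vox (under the new protocol, containment 0.989 / bloat 3.30; 35% less volume than AABB under the same protocol). Overlay renders of 20 objects (GT green box + compact blue volume) + failure-attribution montage.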
Redone raw-only open-state pilot using original PM URDF/OBJ, GAPartNet raw zip URDF/OBJ, ArtVIP USD, and GRScenes raw USD/zip evidence; SceneSmith conversion paths are rejected.
INVALIDATED 2026-05-28: this page used SceneSmith-compatible converted SDF paths under /data/share/ud4scenesmith, not the original raw dataset formats.
A 9-sample GT vs. PAct comparison page regenerated from benchmark source stats, the PAct reference_object.json, and the exported PAct object.json, keeping report.json as verifiable evidence along with the interactive 3D viewer.
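Case-by-case diagnosis of 9 official PAct failure modes (complex objects / OOD / occlusion / small parts / joint parameters / mask dependence / topology / failures on simple cases / texture); the mask ablation reveals that the mask acts as a hard binary gate rather than a fine-grained part-id signal (82.05 → 0 → 82.04).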
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
Raw Articraft-10K download overview for 2.1 articulated-data exploration: URDF integrity, joint/link/mesh distributions, prompt-family grouping, and local source manifest. SceneSmith adaptation is intentionally scoped to a future 3.2 copy.
Stage 1 only: rule/LLM affordance label and query proposal results for the Unified asset library, with HSSD cached-run summary and downloads.
Unified asset library affordance label proposals connected to the official 3D-ADLLM checkpoint, producing 208 point-cloud masks over 53 assets with browser-inspectable asset/label/mask cards.
Browsable OpenAD asset point clouds with affordance labels, GT masks, and 3D-ADLLM predicted masks for 24 diverse samples.
SceneSmith official pipeline on the 3.2 asset library using a dataset-side HSSD adapter for generated furniture retrieval; completed compact kitchen/dining furniture stage with zero final collisions.
SceneSmith official pipeline smoke run on the 3.2 asset library with GPT-5.5, completed through furniture placement and preserved as a browser-inspectable portal page.
PAct official-default hard-5 run with cardinality sentinels expanded to 3x3 mask-grid cells to test whether more upstream latent evidence keeps improving or starts polluting the condition.
PAct official-default hard-5 run with cardinality sentinels expanded to 2x2 mask-grid cells to test stronger upstream latent evidence.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
Hard-5 comparison of cardinality sentinels with larger minimum evidence area; best mean score 50.81 under official sampler defaults.
The official Articraft React/Three.js viewer is mounted behind the unified portal's reverse proxy; it browses 10,787 index records across 246 categories and hydrates LFS record payloads on demand.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
OT gate heatmaps and Stage1-vs-export part-count collapse localization for hard sample #73.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Official-default 25/25 generation comparison for Stage1 OT-router variants on a fixed Eval100 subset.
Manual probe over low-part-count PartNetMobility samples with VLM QA and interactive 3D previews.
Manual easy-case probe over simple cabinets, doors, drawers, and one PM fixture-like sample with VLM QA and interactive 3D previews.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
PAct sample evaluation with complete metric statuses and Gemini VLM QA.
Source-aware random 5-sample PAct evaluation with complete metric statuses and Gemini VLM QA.
Fixed 100-sample Dataset 2.1 benchmark covering ArtVIP, GAPartNet, GRScenes, and PartNetMobility with source/category/problem-tag sampling visualizations.
Ablation and smoke-test portal for OT-gated routing, mask/edge priors, virtual patches, and Stage2 spatial articulation adapter.
True OT-router inference variants for edge, virtual patch, and first-third injection; all 3-sample export QA runs passed.
Stage1 OT-router mask-prior checkpoint reconstructed in PAct inference, exported multi-part GLBs, 3/3 QA pass.
Official-initialized PAct pipeline QA with exported multi-part GLBs, interactive previews, and 3/3 visual QA pass.
Strict Articulate AnyMesh run on an HSSD 3.2 chest-of-drawers asset using Gemini-compatible VLM calls; one prismatic drawer link visualized with a slider.
Visual audit of paper-style DINO feature extraction and TRELLIS v1 SLAT preprocessing for PAct.
Static-only part-name/AABB kinematic hypotheses compared against hidden GAPartNet source-layout joints.
Old fragmented proxy inputs versus corrected full-object color/material inputs, with official PAct and Stage1-cache 3D previews.
Official PAct vs unchanged-architecture Stage1 SGD fine-tuned cache on bad-case inputs, with complete GLBs and articulation videos.
Original PAct vs official-equivalent Stage2 SGD fine-tuned cache on 3 historical bad cases; includes complete GLB previews, articulation videos, and GT-vs-predicted counts.
16 historical official-PAct failure cases converted from source SDF/mobility GT into PAct fine-tuning source data with full-part masks and movable 3D previews.
Audit page for repaired PAct training-format samples: RGB, cleaned masks, source geometry, and closed/half/open joint states.
Official PAct inference on texture-positive source assets rendered into PAct RGB plus source-native link masks: 3/3 complete GLB exports, strong simple cases, small-part drift on dense sample.
Supervised GAPartNet kinematic-mask segmenter driving official PAct inference: 6/6 non-empty exports, but first-pass mask quality is strong only on 2/6 samples.
Negative diagnostic: VLM+SAM2 Appendix-D style preprocessing on GAPartNet CAD renders segments render facets rather than kinematic bodies; PAct inference is therefore mask-limited.
Fixed GAPartNet RGB conditioning render, reran official baseline and a stable official-equivalent Stage2 finetune, then compared exported mesh quality.
Repaired official-equivalent Stage2 fine-tune: fixed mask protocol, supported categories only, 3/3 visual QA pass.
Negative diagnostic baseline: official-equivalent training runs, but inference collapses into tiny wheel-named fragments.
Leave-one-object-out training of a lightweight instance-axis candidate head on complex GAPartNet objects.
Closed, half-open, and open state visualization for official PAct predictions.
Few-joint versus many-joint split evaluation and visualizations for official PAct on GAPartNet non-PM 100.
Minority-button and symmetric-pair local mechanism repair probe.
Per-instance local mechanism axis probe for same-label multi-part hard cases.
Semantic expansion plus root/body local-frame axis candidates for non-canonical hard cases.
Semantic expansion plus geometry-nearest continuous axis priors on hard GAPartNet cases.
Correct-GT hard-case rerun plus semantic mask-to-part expansion for missing joint proposals.
Six OT/VLM variants on the hardest official non-PM samples, with strict scoring and 3D comparison.
Official PAct diagnostic audit on segmentation dependence, joint parameters, and internal structure; mean strict F1 0.3695.
Official PAct predictions on 100 non-PM GAPartNet samples with source-material visualization; strict F1 0.3695.
Official PAct vs PAct-Transporter/Core-OT on 100 non-PM GAPartNet samples; strict F1 0.3695 -> 0.4395.
Five Wikimedia Commons real-image examples processed by PAct; includes source images, coarse masks, object.json, FK-composed GLBs, and articulation videos.
Conservative PAct-Transporter postprocess artifacts with object_transporter_v01.json, diagnostics, simplified GLB comparisons, and raw-vs-v0.1 metrics.
Full substrate evaluation over the 6352-asset v1.0 final release: SDF parseability, mesh URI integrity, movable joints, sidecars, quality flags, category coverage, and PAct-output coverage annex.
Building on the five-innovation evaluation, this adds bbox contact, pivot-to-surface, axis geometry, and same-category PartNetMobility template retrieval.
PAct evaluation over 27 visualized samples with closed/mid/open GLBs, plus a 55-object.json structure-only scan across existing PAct outputs; strict GT-complete method-quality metrics are merged for the 4 fully scoreable samples.
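Verification of the Dishwasher demo at official PAct commit d974309: the unmodified official source fails at the current mask-reading step; after a local compatibility patch it exports the articulated JSON, URDF, part GLBs, and motion video.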
Offline implementation of the patch-to-part OT gate, edge/mask cost, part-node assignment, part-pair-to-joint transport, and GW graph-template prior, with metrics and SVG/GLB visualizations on 4 GT-complete PAct samples.
PAct method-quality evaluation on GT-complete PartNetMobility synthetic outputs; geometry/part metrics recomputed from existing files and joint metrics reused from the verified v0.3 report.
PartNeXt static queries retrieved against the articulated template bank, with coarse geometry-axis overlays; no A-grade export yet.
Direct native articulated GT frame GLBs loaded through the same portal model-viewer chain for rendering sanity check.
Cached Particulate-generated articulated probes from HSSD static GLBs with animated playback controls.
Baseline PAct vs OT-gated post-process with JSON metrics and 3D GLB comparisons.
Cached Particulate animated GLB results with explicit play/time controls; AnyMesh part-conditioned configs retained.
PAct batch benchmark over 27 successful outputs; 5 GT-aligned samples; missing metrics marked not_computed.
Original PartNeXt/PartVerse face masks seed AnyMesh postprocess/joint estimation for static part assets.
PAct evaluated against all discovered PartNetMobility-ID-aligned outputs using expanded v0.3 articulated-object generation metrics.
Semantic articulatability triage for static HSSD assets plus Particulate evidence.
Particulate inference on HSSD GLBs plus readiness/blocker records for Articulate AnyMesh, Articulate Anything, URDF-Anything, and URDF-Anything+.
PAct evaluated against PartNetMobility aligned samples using expanded v0.3 articulated-object generation metrics.
From the mathematical foundations of Optimal Transport to SceneTransporter's idea of structure transport, and then down to the OT/GW loss, post-processing, and benchmark design for PAct.
Full HSSD official model library and preprocessed index tested through SceneSmith HssdRetrievalServer/HssdRetriever with five returned GLB candidates.
PartNeXt/PartVerse pre-segmented assets converted to GEOPARD-style per-part point-cloud inputs; official GEOPARD inference is blocked by missing public code/checkpoints.
Live-ish progress snapshot for the resumable SceneSmith-preprocessed HSSD model download.
Official Particulate inference on selected PartNeXt and PartVerse static part-aware GLBs, with source, segmented-axis, animated GLB, and URDF downloads.
Expanded articulated-object generation benchmark based on availability, geometry, parts, kinematics, motion consistency, physics, scene usability, texture, semantics, and reproducibility.
Consolidated official-output dashboard for PhysX-Anything and PAct with interactive 3D artifacts and artifact-readiness coverage.
SceneSmith official Objaverse retrieval server/client smoke on UD4 PartNeXt/PartVerse static assets; 4 queries x 3 returned GLBs.
Official SceneSmith HSSD retrieval server/client smoke test with three returned GLB candidates and metric size metadata.
Interactive lecture notes in Chinese: from Optimal Transport with no prerequisites to Sinkhorn, SceneTransporter's idea of structure transport, and how to use OT as a patch-part-joint constraint for PAct.
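As a pointer to what the Sinkhorn part of such notes covers, here is a minimal entropic-OT sketch in NumPy. It is a generic textbook iteration, not the PAct or SceneTransporter implementation; the patch/part framing of the toy cost matrix and the parameter values are illustrative assumptions.

```python
import numpy as np


def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Entropic OT plan between histograms a (rows) and b (columns) for a cost matrix."""
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows to match a
        v = b / (K.T @ u)            # rescale columns to match b
    return u[:, None] * K * v[None, :]


# Toy setup: 3 hypothetical image patches assigned softly to 2 hypothetical parts.
cost = np.array([[0.1, 0.9],
                 [0.8, 0.2],
                 [0.5, 0.5]])
a = np.full(3, 1.0 / 3.0)            # uniform mass over patches
b = np.array([0.5, 0.5])             # uniform mass over parts
plan = sinkhorn(cost, a, b)
print(np.round(plan, 4))
```

Each row of the plan is a soft assignment of one patch to the parts; lowering `eps` sharpens the assignment toward a hard matching but slows convergence.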
PAct official rerun2 on two Appendix-D inputs; exported object.json, part GLBs, FK-composed closed/mid/open GLBs, and videos.
HSSD preprocessed index/support-surfaces checked; full HSSD objects are gated/missing; PartNeXt static samples mounted; Particulate official foldingchair inference now runs and exports articulated GLB/URDF.
Five clean-official PAct examples with exported object.json, part GLBs, combined closed/mid/open GLBs, and official animation videos.
Clean official PAct rerun on Dishwasher_001 with exported object.json, part GLBs, combined closed/mid/open GLBs, and official animation videos.
Clean official PhysX-Anything demo/0 rerun with VLM, decoder, split, and simready outputs; no stale local outputs included.
Formal route-3 implementation has been switched to clean official PhysX-Anything and PAct repositories; stale local-modification outputs are deprecated.
PartNeXt root semantic classification: 19819 household-eligible mesh-backed records from 23229 local GLBs.
PartNeXt local full set has 23,229 GLBs; this page samples 12 different category folders as interactive raw static GLB previews.
3074 PartNeXt + PartVerse household-ish part-aware static assets in SceneSmith Objaverse-style format with real ViT-L text CLIP embeddings.
Official HSSD and ObjectThor/Objaverse setup audit: size threshold, /data/share search, small-sample direct loader check, and download decision.
One sample per GAPartNet category; GLBs are exported from the original URDF kinematic tree and textured OBJ/MTL, with a slider stepping through 9 frames.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
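One sample per PartNetMobility category for which the original mobility.urdf can be found; exported source-faithfully from the raw URDF, bypassing the PM SDF diagnostic preview chain, which is known to produce garbled results.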
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Official SceneSmith articulated SDF render chain from compute_articulated_embeddings.py --keep-renders: zero-joint mesh combine plus Blender CLIP-view renders.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Deprecated historical route-3 output generated before the clean official implementation switch. Do not use as formal result. Interactive 3D result models for route-3 method evaluation: method sample.glb with matched source GT state frames where available.
Deprecated historical route-3 output generated before the clean official implementation switch. Do not use as formal result. PhysX-Anything finaljson outputs matched to PM/GAPartNet original raw URDF annotations for more complete structure and motion metrics.
Deprecated historical route-3 output generated before the clean official implementation switch. Do not use as formal result. Corrected benchmark entry: unified native assets as substrate, reproduced methods as evaluated systems, with in-the-wild outputs separated from leaderboard scores.
Deprecated historical route-3 output generated before the clean official implementation switch. Do not use as formal result. 6352 native/manually cleaned articulated assets evaluated through the benchmark entry for SDF parse, mesh URI integrity, movable joints, bbox and sidecar signals.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
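Official SceneSmith furniture flow in the second test environment: a microwave was requested and the official articulated retrieval queried the unified asset library v0.4 directly, but it returned a GRScenes cabinet; the page keeps the interactive 3D scene and a note on the retrieval mismatch.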
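Official SceneSmith furniture flow run in the test environment: the official articulated retrieval searches the unified asset library v0.4 directly, registers 1 asset, and generates scene_after_furniture; the official render chain then outputs 5 scene views.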
SceneSmith v0.4 unified articulated asset library direct-call QC: 6352 assets, official loader validated, cross-dataset joint-slider visual check.
Cross-dataset SceneSmith/Drake joint-state QC page with corrected glTF/SDF coordinate handling and textured GLB previews.
Cross-dataset SceneSmith/Drake joint-state QC page with corrected glTF/SDF coordinate handling and textured GLB previews.
Cross-dataset SceneSmith/Drake joint-state QC page with corrected glTF/SDF coordinate handling and textured GLB previews.
Fixed rerun of the exact 9 diagnostic failure samples. Basket_20 uses coupled ArtVIP route; EKET uses direct SDF-frame export; GRScenes samples with raw USD physics:jointEnabled=False are shown closed-only; PM uses raw URDF-tree export. Original failure page remains visible.
Diagnostic page, not a passed release preview. These are the exact cross-dataset samples where the SDF/FK preview exposed detached parts or wrong joints; keep them visible as regression cases until the root conversion/display bug is fixed.
Curated cross-dataset preview using only previously validated/fixed exports: ArtVIP repaired joint pages, GRScenes mesh-center reframe OK samples, and PM source-faithful raw URDF states.
Correct GAPartNet preview using raw URDF kinematic tree. Portal-native item list is populated; Assets count should be 8.
Deprecated debugging page. Do not use for asset quality judgment; SDF-FK export path mishandles GAPartNet/PM link/visual transforms.
Three-way comparison: raw PM official VLM renders, current official converted SDF keep-renders, and old self-converted visualization frames.
Official SceneSmith articulated SDF render chain from compute_articulated_embeddings.py --keep-renders: zero-joint mesh combine plus Blender CLIP-view renders.
SceneSmith official PM conversion outputs, displayed with a custom Blender-generated joint-slider viewer.
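Slider visualizations for 5 PartNet-Mobility samples, generated after conversion with the official SceneSmith functions split into separate processes; Qwen-VL was moved to GPU1 so that Blender/EGL and the vision model do not compete for GPU0.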
Smoke test of the official SceneSmith PartNet-Mobility conversion function: standard portal 3D preview + a custom slider page showing 9 frames of joint states for 100015.
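Real run of the official SceneSmith chain: the floor plan completed, and the furniture stage called the full 6343-asset package directly; the page shows the official registry, the placements before the run stopped, and all generated glTFs.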
Cross-dataset SceneSmith/Drake joint-state QC page with corrected glTF/SDF coordinate handling and textured GLB previews.
Cross-dataset SceneSmith/Drake joint-state QC page with corrected glTF/SDF coordinate handling and textured GLB previews.
Cross-dataset SceneSmith/Drake joint-state QC page with corrected glTF/SDF coordinate handling and textured GLB previews.
Nine slider samples from the final v0.3 publish index after strict geometry, texture, motion, SceneSmith loader, Drake parse, and assembly-audit gates.
A/B check for EKET cabinet: source-faithful conversion versus filtered visual/collision that removes exposed mounting/assembly hardware meshes.
Textured multi-frame SceneSmith/Drake FK viewer for Raw ArtVIP assets; EKET exposed mounting hardware is filtered in the converted visual/collision asset while raw USD remains unchanged.
Compare current USD t-flip conversion with no additional V flip for HAUGA texture debugging.
Source-faithful PM joint-state viewer. No joint limits are modified.
GRScenes mesh-center reframe fix validation. 10 samples converted with max joint pose <2m (previously 38m+).
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Textured multi-frame SceneSmith/Drake FK viewer for loss-minimized GRScenes articulated assets with slider control.
Mesh-center link reframe validation batch for GRScenes, composed with the official SceneSmith SDF mesh path. Shows default and max-joint states for manual assembly debugging.
Automatically flagged assembly failures from the full SceneSmith/Drake asset audit. Most early flags are GRScenes door/window assets whose max-joint state explodes to tens of meters, indicating conversion frame/limit/unit issues.
Debug page for separating asset conversion errors from temporary renderer errors. GLBs are exported with official SceneSmith SDF mesh composition and Drake forward kinematics at default/max joint states.
Three real SceneSmith floor-plan examples using DeepSeek for text/tool calls and local Qwen2.5-VL for observe_scene image analysis; includes rendered layouts and generated room geometry.
Closed/mid/open visual validation frames for newly completed GRScenes articulated assets, derived from source USD Physics joint relationships and limits without modifying raw data.
Random visual smoke sample from newly completed GRScenes official articulated assets, converted from USD mesh prims to portal-viewable GLB while preserving source paths and joint-prim counts.
Natural SceneSmith furniture-stage run with no forced asset request or forced placement. SceneSmith retrieved GAPartNet/45267 from our converted PM/GAPartNet library and placed storage_cabinet_0 at (-1.499, 0.967); the run exported full scene and house Blend files, while physics validation found one west-wall penetration of about 3.81cm.
Forced smoke test using SceneSmith infrastructure plus explicit asset/placement steering. It verifies that PM/GAPartNet SceneSmith-ready v3 articulated assets can be loaded, placed, rendered, exported, and pass a 0-collision physics check, but it is not a natural SceneSmith planning result.
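4 new samples each from PartNetMobility, GAPartNet, PartVerse, and PartNeXt, used to build a SceneSmith-compatible derived layer. 16/16 SDFs pass the Drake parse; the page now uses 16 single-asset interactive joint-state cards, so that open states of multiple objects no longer overlap on one screen, and it states explicitly that the PartVerse/PartNeXt assets are still static part-aware mesh placeholders.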
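Fix for the SceneSmith placement interface of articulated assets: the conversion side adds an empty placement_base root link so that stock SceneSmith welds to the placement base, while the cabinet body and the prismatic joints of the two drawers are preserved. The current run 18-11-19 places a single storage_cabinet_0 near the back wall with 0 collisions in the physics check; the page includes the full scene image, the full scene GLB, closed/mid/open joint-state renders, and an interactive drawer slider.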
The real SceneSmith floor-plan chain run end to end with MiniMax-M2.7 through an OpenAI-compatible Chat Completions adapter: planner/designer/critic tool calls, local parsing of text scores, Drake-to-Blender rendering, and the final .blend export all completed.
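The ArticulatedRetrievalServer and ArticulatedRetrievalClient from the official SceneSmith repository query our Raw ArtVIP batch10 SceneSmith-compatible asset library directly. Three queries complete the official loader, retriever, Drake FK mesh composition, and GLB export; the results are mounted on the portal for interactive inspection.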
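Small-batch standard-conversion gate for Raw ArtVIP: all 10 candidate assets were exported to the regular SceneSmith/Drake SDF representation, keeping rigid link poses, with link-local visual/collision meshes and joint pose/axis mapped from the USD Physics body0 joint frame; all pass structural validation, the Drake parser, and a 0.05s dynamic smoke test.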
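Parsed cabinet_1.usd from the local raw ArtVIP into a SceneSmith-compatible asset package: the SDF describes 3 links / 2 prismatic joints, visual meshes are exported as per-link glTF with PBR/texture, collision uses 47 convex pieces generated by CoACD, properties/provenance/embedding index are complete, and the package passes structural validation, the Drake parser, and a short dynamic-simulation smoke test.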
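Using the SceneSmith ArtVIP VHACD package as the gold format, extracted three articulated assets (cabinet, fridge, microwave) to build the SceneSmith-compatible delivery layer of the unified asset library. This page shows the smoke validation results for SDF + VHACD + properties + embeddings; the current trellis2 environment lacks pydrake, so the Drake load is left to a SceneSmith/Drake environment.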
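Following the six-way comparison used for the four real images chair/laptop/button/lock, ran PhysX-Anything, PartPacker, OmniPart, PAct, SINGAPO, and TRELLIS2 on gym_real.jpg. The image is a real scene with multiple pieces of gym equipment, so it concentrates the misrecognition, over-segmentation, and part-merging failures that arise under single-object/single-instance assumptions.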
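Fed the same real door-lock image to `PhysX-Anything / PartPacker / OmniPart / PAct / SINGAPO / TRELLIS2`. This page is especially useful for seeing how the methods differ out of domain: `PhysX-Anything` clearly fails by misrecognition, while the other chains all produce visualizable results after the fixes.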
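Fed the same real switch image to `PhysX-Anything / PartPacker / OmniPart / PAct / SINGAPO / TRELLIS2`. This page focuses on real run results after the bug fixes, including the plain-mesh export fix for `TRELLIS2`, the NaN-safe articulation fix for `PAct`, and the local dependency compatibility fix for `OmniPart`.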
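Fed the same real laptop image to `PhysX-Anything / PartPacker / OmniPart / PAct / SINGAPO / TRELLIS2`. This page shows each method's real output side by side and states the bridge conditions, the graph prior, or the specific failure stage.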
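Fed the same real chair image to `PhysX-Anything / PartPacker / OmniPart / PAct / SINGAPO / TRELLIS2`. This page puts each method's real output side by side and states whether each chain needs an extra mask or graph prior, or at which stage it fails.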
Fed `/data/250010098/Unified_dataset/test/chair_real.jpg` directly into the official four-stage PhysX-Anything chain. It was recognized as `Swivel Chair`, split into three static parts `Backrest / Seat / Pedestal`, and exported as `sample.glb / basic.urdf / basic.xml`.
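Picked four representative samples from ArtVIP (dishwasher, refrigerator, laptop, and anesthesia machine) and extracted geometry statistics, the semantic structure tree, the kinematic tree, world-space bboxes, and a texture collage directly from USD, to help judge what this dataset is best suited for.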
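Applied the high configuration `100 steps + 768 grid` to 4 appliance-like inputs closer to the training distribution. The results show that not every appliance becomes more stable: `dishwasher=2`, `microwave=3`, `storagefurniture=3`, but `refrigerator` over-splits into `26` parts.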
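On the two `GRScenes` bridge samples, doubled the configuration again to `100 steps + 768 grid`. The part count does not increase further, but mesh detail and file size rise noticeably; compare directly with the formal `50/384` page.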
Instead of the smoke configuration, reran `left_hand` and `table_white_big` from `GRScenes` with the formal setting `50 steps + 384 grid`. The part count converges clearly: `left_hand=7`, `table_white_big=2`.
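Using `GRScenes` object USD assets downloaded from ModelScope, rendered them to RGBA images and fed them into the official PartPacker flow main chain. It has now run for real on two objects, `left_hand` and `table_white_big`.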
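Using our own `/data` renders, ran the PartPacker flow main chain directly without adding any extra segmentation prior. All five appliance-like samples successfully exported the whole object and multiple part-level GLBs.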
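Batch-ran the 9 single-image inputs in the official `assets/images` under one unified configuration, fully exporting the whole object, dual volumes, and part-level GLBs; this can serve as the formal official-sample baseline for PartPacker in the `trellis2` environment.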
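Setting PAct aside for now, built a single-image part-generation prototype directly on the core idea of SceneTransporter: first do part assignment with `SAM / KMeans / OT+edge`, then lift the assignment into independent 3D part meshes. On the visible-surface benchmark over 4 PartNeXt samples, `SAM` is actually the best: `mean face IoU = 0.389`, above `KMeans = 0.337` and `OT+edge = 0.332`. This suggests that once the goal becomes "directly producing 3D parts", boundary quality matters more than the global constraints of patch-level assignment.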
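Connected the core idea of SceneTransporter to downstream part generation: first do patch-to-part assignment with `KMeans` or `OT+edge`, export `mask.exr`, then feed it into the official PAct inference chain. On 4 PartNeXt samples, both front ends hold the `part count` steady, with `mean_part_count_abs_error = 0.00`; but `OT+edge` so far only brings local semantic changes and does not systematically beat KMeans. This indicates that the paper's assignment idea can bridge to a part-generation front end, but without a compositional latent and in-loop routing during generation, rewriting the front-end mask alone is not enough to fully pull back PAct's cross-domain semantic drift.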
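Expanded the PartNeXt probe to 9 valid categories: `Knife / Toilet / Monitor / Guitar / Teapot / Laptop / Chair / Microwave / Mug`. Three more categories, `Handbag / Lamp / Sofa`, were explicitly excluded because the current automatic viewpoint rendered an empty alpha, so they are not mixed in with model quality. Results on the 9 valid categories are very stable: `mean_gt_coarse_part_count = 2.67`, `mean_pred_num_nodes = 2.56`, `mean_part_count_abs_error = 0.11`. In other words, PAct largely preserves the coarse slot count on PartNeXt, but its semantics drift almost systematically toward appliance templates such as `door/base/drawer`.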
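Picked 3 samples with aligned raw+mesh data from the downloaded PartNeXt subset: `Chair`, `Monitor`, `Microwave Oven`. We rendered exact `mask.exr` files and same-view input images directly from the PartNeXt top-level semantic tree, then fed them to the official PAct inference chain as a small sanity eval. The result is representative: `part count` matches in all three cases, but the semantic labels drift visibly, e.g. `Chair -> base/base/door`, `Monitor -> door/base`, `Microwave -> door/base/drawer`. That is, PAct can already maintain the coarse structural slots on PartNeXt, but has not learned these cross-domain semantics.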
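Reproduced the official `PartPacker` CLI inference inside the existing `trellis2` conda environment. After installing a few missing dependencies and downloading the official `flow.pt / vae.pt`, both `flow/scripts/infer.py` and `vae/scripts/infer.py` ran successfully. Representative results include a low-step barrel smoke run, a single teapot sample closer to the formal configuration, and a VAE reconstruction smoke run.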
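Built a minimal probe strictly following the core idea of SceneTransporter: it does not touch the 3D generator and compares only patch-to-part assignment. On 6 PartNeXt rendered inputs it compares `SAM`, `DINO patch KMeans`, `cosine routing`, `OT-noedge`, and `SceneTransporter-style OT+edge`. The results do not automatically favor OT: on average `KMeans/cosine = 0.760`, `OT+edge = 0.708`, `OT-noedge = 0.702`, `SAM = 0.619`. The paper's assignment-constraint idea is strong, but in our current simplified setting ("no compositional latent, only image patch features") its advantage cannot be reproduced directly.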
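Picked 5 categories from `raw_datasets/partnet-mobility-v0` that are **not in the official demo**, rendered synthetic RGBA inputs and true part masks ourselves from the original PM meshes, and fed them to the official PAct inference chain. The current conclusion is clear: `Door / Safe` are noticeably closer to usable, `Bottle / Display` largely collapse to a fixed base, and `TrashCan` recovers motion but with semantic drift.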
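Fed the paper's Appendix D `VLM+SAM2` masks that we reconstructed back into the official PAct inference chain to verify that `Appendix-D preprocessing -> articulated 3D tree` really closes the loop. The two samples `Dishwasher_001 + StorageFurniture_004` now run successfully.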
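Extended the paper's Appendix D `GPT/VLM + SAM2 + VLM merge` preprocessing chain to more categories. This page shows 4 real samples: `Dishwasher` and `StorageFurniture` are successes, while `Refrigerator` and `Table_door` are kept as hard cases, so one can directly compare which object classes are covered and which are still in a failure mode.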
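Turned the `VLM-guided prompting pipeline followed by SAM2 refinement` described in Appendix D of the PAct paper into a locally runnable chain. This page shows, for `Dishwasher_001`, the Stage 0 granularity, the SAM2 candidates, the Stage 1 articulated/fixed classification, the Stage 2 semantic merge, and the final exported PAct `mask.exr`.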
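After fixing the CLI's multi-label reading of `*_mask.exr`, we reran all `22/22` real-world examples exactly per the official README. The final `single_fixed_base_ratio = 0.0`, and `revolute / prismatic` articulation is recovered across categories. This page shows the full statistics and 6 representative samples.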
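Connected the existing renders in `/data` back to the formal PAct inference chain. We first fit the camera to the PNG alpha silhouette, then render per-part labels from the known part meshes, export them as the `mask.exr` that PAct requires, and feed that to the official inference. Two samples, Dishwasher and StorageFurniture, now work end to end.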
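Continuing to trace "why is everything fixed", we located the actual root cause: when reading `*_mask.exr` with `imageio`, the CLI dataset read the multi-label part-id mask as a binary image. After switching to `cv2.imread(..., IMREAD_UNCHANGED)`, the official README route on `Table_door_002` again exports a 4-node structure tree of `base + door + 2 drawers`, and the joint types are again `revolute / prismatic`.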
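A cheap guard against this kind of regression is to count the distinct part IDs right after the mask is loaded. The sketch below is loader-agnostic and works on nested lists (a NumPy array could be passed through `.tolist()`); the helper names `distinct_part_ids` and `looks_binarized` are hypothetical and not part of PAct.

```python
def distinct_part_ids(mask_rows, background=0):
    """Collect the non-background integer labels present in a 2D mask."""
    ids = set()
    for row in mask_rows:
        for value in row:
            ids.add(int(round(value)))  # EXR masks are often stored as floats
    ids.discard(background)
    return ids

def looks_binarized(mask_rows, expected_parts, background=0):
    """True when several parts were expected but at most one label survived."""
    if expected_parts <= 1:
        return False
    return len(distinct_part_ids(mask_rows, background)) <= 1

# Hypothetical 2x3 masks: one multi-label, one collapsed to a single label.
multi_label = [[0.0, 1.0, 1.0], [2.0, 3.0, 0.0]]
collapsed = [[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
```

Running such a check on the first sample of a batch would have flagged the collapse before inference, since a mask that should carry `base + door + 2 drawers` cannot have a single foreground label.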
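This page keeps the failure symptom we saw when first checking PAct strictly per the README: the `object.json` exported for the first 4 samples contained only a single `fixed base`. Further investigation confirmed that this is not a verdict on PAct's actual capability but a binarization collapse when the CLI reads the multi-label `EXR` mask. Read it together with the later `PAct README EXR Fix Probe` page.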
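After verifying the PAct reproduction from another conversation, confirmed that it meets the requirement "the official released inference path runs in the trellis2 environment": 22/22 official real-world examples completed, producing 44 mp4s, 22 pngs, and 1 run_command.txt. 8 representative samples are shown here, with reproduction notes and commands. Note: this page validates the official inference path and the animation/visualization outputs, not training, GLB export, or the full URDF chain.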
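Continuing strictly along the official TRELLIS.2 latent / subs route, but upgrading single-node matching to coarse occupancy component matching. Core conclusion: after aggregating the official subs into connected components, GT mean IoU rises from 8.1e-05 to 0.1553, which suggests they behave more like a "part-support field" than semantic part nodes that can be supervised directly.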
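Why aggregation lifts the IoU so sharply can be seen in a toy sketch. It assumes 6-connectivity on integer voxel coordinates and set-based voxel IoU, neither of which is specified in the entry; the function names are hypothetical.

```python
from collections import deque

# Face-adjacent neighbours (6-connectivity), an assumption of this sketch.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def connected_components(voxels):
    """Group occupied voxels into face-connected components via BFS."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components

def voxel_iou(a, b):
    """Intersection over union of two voxel sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def best_component_iou(voxels, gt_part):
    """Best IoU between any connected component and one GT part."""
    return max((voxel_iou(c, gt_part) for c in connected_components(voxels)), default=0.0)

# Toy data: three touching voxels plus one stray voxel; the GT part has four voxels.
occupied = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (5, 5, 5)}
gt_part = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0)}
```

A single voxel can only ever cover a tiny fraction of a part, so node-level IoU stays near zero, while a component that spans most of the part scores far higher.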
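Strictly following the official TRELLIS.2 latent / subs route, exported the subdivision tensors of the shape_slat decoder and rebuilt them into an octree-style bbox/tree proxy. This page shows the current lightweight visualization, the comparison against GT boxes, and a minimal part-aware adapter smoke test. Core finding: the official subs carry plenty of information, but a single node does not directly correspond to a semantic part, so node aggregation or componentization is required downstream.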
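For the hardest second part in the box-like sample, systematically compared 5 crop/context configurations. The goal is not to eyeball images but to use first-pass success, mesh/target IoU, and effective canvas scale to find the input strategy that actually recovers difficult small parts.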
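A condition-cleaning experiment on the trickiest box-like sample. We split the heavily overlapping 3D boxes apart with the split_midpoint rule, then run the same formal TRELLIS2 per-part generation chain to check whether overlap risk, retry burden, and final assembly improve together.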
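Extended the formal soft_anisotropic per-part pipeline to three representative sample classes and evaluated them with a joint criterion: mean mesh/target bbox IoU, condition overlap risk, and decode stability. This page is closer to what we actually need to watch: which results truly fit the box geometrically, and which only have their problems masked by overlapping bboxes or recovery retries.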
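Real-inference validation after merging the soft_anisotropic fit from the audit script into the formal per-part TRELLIS2 pipeline. The default formula is u=mean(log(e_b/e_s))+(log(e_b/e_s)-mean(log(e_b/e_s)))/(1+lambda), with lambda=0.25; this page shows the formal outputs, fit_info, and diagnostic visualizations for 2 parts.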
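The formula can be read as shrinking each axis's log-scale toward the mean log-scale. The sketch below assumes `e_b` is the per-axis extent of the target bbox, `e_s` is the per-axis extent of the generated mesh, and the applied per-axis scale is `exp(u)`; these readings and the function names are assumptions, not taken from the pipeline code.

```python
import math

def soft_anisotropic_log_scale(e_b, e_s, lam=0.25):
    """Per-axis u = mean(l) + (l - mean(l)) / (1 + lam), with l = log(e_b / e_s)."""
    logs = [math.log(b / s) for b, s in zip(e_b, e_s)]
    mean_log = sum(logs) / len(logs)
    return [mean_log + (value - mean_log) / (1.0 + lam) for value in logs]

def soft_anisotropic_scale(e_b, e_s, lam=0.25):
    """Per-axis multiplicative scale (assumed to be exp(u))."""
    return [math.exp(u) for u in soft_anisotropic_log_scale(e_b, e_s, lam)]

# Hypothetical extents: the target box is twice as long as the mesh on x only.
scales = soft_anisotropic_scale(e_b=(2.0, 1.0, 1.0), e_s=(1.0, 1.0, 1.0))
```

With `lam=0` the fit is fully anisotropic (each axis matches the box exactly), and as `lam` grows every axis approaches the same geometric-mean scale; `lam=0.25` keeps 80% of each axis's deviation from that mean.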
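Reused existing TRELLIS2 raw part meshes without re-running inference, specifically to compare isotropic bbox fitting against anisotropic bbox fitting. The mathematical criterion is Delta_IoU, the gain in mean mesh/target bbox IoU, used to judge how much of the current assembly error comes from scale fitting rather than from generation itself.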
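A toy version of the Delta_IoU criterion is sketched below. It assumes both boxes are axis-aligned and share a center, that the isotropic fit uses the geometric mean of the per-axis ratios, and that the anisotropic fit scales each axis independently; the audit script may define any of these differently, and the function names are hypothetical.

```python
import math

def centered_box_iou(ext_a, ext_b):
    """IoU of two axis-aligned boxes sharing a center, given per-axis extents."""
    inter = math.prod(min(a, b) for a, b in zip(ext_a, ext_b))
    union = math.prod(ext_a) + math.prod(ext_b) - inter
    return inter / union if union > 0 else 0.0

def isotropic_fit(mesh_ext, target_ext):
    """One uniform scale: the geometric mean of per-axis ratios (assumption)."""
    mean_log = sum(math.log(t / m) for m, t in zip(mesh_ext, target_ext)) / len(mesh_ext)
    scale = math.exp(mean_log)
    return [m * scale for m in mesh_ext]

def anisotropic_fit(mesh_ext, target_ext):
    """Independent per-axis scales, so the fitted extents match the target."""
    return [m * (t / m) for m, t in zip(mesh_ext, target_ext)]

def delta_iou(mesh_ext, target_ext):
    """IoU gain of anisotropic over isotropic fitting for one part."""
    iou_aniso = centered_box_iou(anisotropic_fit(mesh_ext, target_ext), target_ext)
    iou_iso = centered_box_iou(isotropic_fit(mesh_ext, target_ext), target_ext)
    return iou_aniso - iou_iso

# Hypothetical part: a cubic mesh that should fill a box elongated along x.
gain = delta_iou(mesh_ext=(1.0, 1.0, 1.0), target_ext=(2.0, 1.0, 1.0))
```

In this idealized setting the anisotropic IoU is 1 by construction, so Delta_IoU only measures how badly a uniform scale misses the target proportions; on real meshes both terms come from the measured bboxes.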
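To address assembly interpenetration when OmniPart bboxes are nested parent/child or strongly overlapping, added an AABB carve post-process: if a child bbox lies almost entirely inside a parent bbox, the faces falling inside the child bbox are deleted from the parent mesh. This experiment shows before/after carve diagnostics on a mechanical sample and a box-like furniture sample.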
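The carve rule can be sketched in a few lines. This version assumes "almost entirely inside" means at least 90% of the child bbox volume lies in the parent bbox, and that a face "falls inside" when its centroid does; both thresholds and the function names are illustrative assumptions.

```python
def aabb_volume(lo, hi):
    """Volume of an axis-aligned box, zero if it is degenerate or inverted."""
    volume = 1.0
    for a, b in zip(lo, hi):
        volume *= max(0.0, b - a)
    return volume

def containment_ratio(child_box, parent_box):
    """Fraction of the child bbox volume that lies inside the parent bbox."""
    (c_lo, c_hi), (p_lo, p_hi) = child_box, parent_box
    i_lo = [max(a, b) for a, b in zip(c_lo, p_lo)]
    i_hi = [min(a, b) for a, b in zip(c_hi, p_hi)]
    child_volume = aabb_volume(c_lo, c_hi)
    return aabb_volume(i_lo, i_hi) / child_volume if child_volume > 0 else 0.0

def carve_parent(vertices, faces, parent_box, child_box, min_containment=0.9):
    """Drop parent faces whose centroid lies in the child bbox (if nested)."""
    if containment_ratio(child_box, parent_box) < min_containment:
        return list(faces)  # not nested enough: leave the parent untouched
    c_lo, c_hi = child_box
    kept = []
    for face in faces:
        centroid = [sum(vertices[i][k] for i in face) / len(face) for k in range(3)]
        if not all(c_lo[k] <= centroid[k] <= c_hi[k] for k in range(3)):
            kept.append(face)
    return kept

# Toy parent mesh: one triangle near the origin, one inside the child box.
verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (4, 4, 4), (5, 4, 4), (4, 5, 4)]
tris = [(0, 1, 2), (3, 4, 5)]
carved = carve_parent(verts, tris, ((0, 0, 0), (10, 10, 10)), ((3, 3, 3), (6, 6, 6)))
```

A centroid test is cheap but can leave faces that straddle the child bbox boundary; a stricter variant would require all vertices to be inside.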
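To address the missing head and the elongated lower-body contamination on the previous round's humanoid sample, inspected the official TRELLIS2 preprocessing and found that alpha>0.8*255 defines the subject crop region. The non-target context alpha in dim_context is therefore lowered to 96, so the context acts only as a weak semantic hint and is not generated as subject geometry.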
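The effect of the alpha change follows from the threshold arithmetic: 0.8*255 = 204, and 96 is below it, so dimmed context pixels no longer count toward the subject crop. The sketch below reproduces that logic on a nested-list alpha map; the function names and the `target_mask` representation are hypothetical, and the real preprocessing operates on image arrays.

```python
ALPHA_THRESHOLD = 0.8 * 255  # pixels above this define the subject crop
CONTEXT_ALPHA = 96           # dimmed context stays below the threshold

def subject_bbox(alpha):
    """Bounding box (x_min, y_min, x_max, y_max) of pixels above the threshold."""
    rows = [r for r, row in enumerate(alpha) if any(a > ALPHA_THRESHOLD for a in row)]
    cols = [c for row in alpha for c, a in enumerate(row) if a > ALPHA_THRESHOLD]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))

def dim_context(alpha, target_mask, context_alpha=CONTEXT_ALPHA):
    """Cap alpha at `context_alpha` everywhere outside the target part."""
    return [
        [a if is_target else min(a, context_alpha) for a, is_target in zip(a_row, m_row)]
        for a_row, m_row in zip(alpha, target_mask)
    ]

# Toy 4x4 image: fully opaque object, target part is the central 2x2 block.
opaque = [[255] * 4 for _ in range(4)]
target = [[r in (1, 2) and c in (1, 2) for c in range(4)] for r in range(4)]
before = subject_bbox(opaque)
after = subject_bbox(dim_context(opaque, target))
```

Before dimming, the crop covers the whole object; after dimming, it tightens to the target part while the context remains faintly visible in the image.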
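After fixing the crop strategy, picked three sample classes (mechanical/weapon-shaped, fine floral structures, box-like furniture) for an extended per-part TRELLIS2 test. Each sample first runs its 2 main parts to verify that the dim_context input, the flash-attn backend, bbox fitting, and the lightweight GLB preview are stable.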
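Formal validation of the new main line: first use OmniPart to provide part-level 3D bboxes and 2D regions, then feed each part separately into TRELLIS2 to generate geometry and assemble by bbox. Key finding: if a small part is cropped into an isolated image on a fully transparent background, TRELLIS2 tends to produce empty sparse coords; after keeping local context, both main parts of the same sample generate stably.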
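Decimated the mechanical meshes generated or cleaned up by TRELLIS2, then fed them to Particulate for part segmentation and joint inference, and exported downloadable URDF / MJCF. The 3 most representative mechanical samples are shown here first.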
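Still using the official PM detail renders as input, this time specifically contrasting two opposite object types, the thin-rod-structured Chair and the box-shelled Dishwasher, to observe TRELLIS2's generation stability and ambiguity under different geometric priors.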
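Re-rendered more complete input images from the official PartNet-Mobility textured OBJ/MTL, fed a single detail image to TRELLIS2, exported an interactively previewable mesh, and kept the input image, output renders, and video downloads.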