OSH in-distribution gain (Stage-2, PNM)

How much does the OpenStateHead conditioning improve in-distribution articulated reconstruction? Matched checkpoints, identical eval, identical PNM held-out samples — the only difference is the OSH mask conditioning.

2.3× lower SLAT flow-MSE @40k: OSH 0.035 vs vanilla 0.082
4.2× lower articulation-L2 @40k: OSH 5.8e-6 vs vanilla 2.4e-5
Whole run: OSH stays below vanilla for the entire 40k steps, so the gap is not a late-training artifact
checkpoint    flow OSH    flow van    gain     art-L2 OSH    art-L2 van    gain
step 40000    0.0354      0.0815      2.30×    5.84e-06      2.44e-05      4.17×
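
The gain columns appear to be the plain ratio of the vanilla metric to the OSH metric. The snippet below is a minimal sketch under that assumption; the helper name `gain` is hypothetical and not taken from the evaluation code. It recomputes both factors from the step-40000 values reported above. Because those values are rounded, the last digit of a recomputed ratio can differ slightly from the reported one.

```python
def gain(vanilla: float, osh: float) -> float:
    """Improvement factor: how many times lower the OSH metric is than vanilla.

    Assumes gain is defined as vanilla / OSH (lower metric is better).
    """
    if osh <= 0:
        raise ValueError("OSH metric must be positive")
    return vanilla / osh


# Rounded metric values from the step-40000 checkpoint row.
flow_gain = gain(vanilla=0.0815, osh=0.0354)
art_gain = gain(vanilla=2.44e-05, osh=5.84e-06)

print(f"flow-MSE gain: {flow_gain:.2f}x")
print(f"art-L2 gain:   {art_gain:.2f}x")
```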
Figure (flow): Flow-MSE over training. OSH (blue) is consistently below vanilla (orange).
Figure (art): Articulation-L2 (24-D joint regression). The OSH mask prior helps joint-parameter recovery most.