# Official PAct Diagnostic Evaluation

Generated: `2026-05-14T20:57:01+00:00`

Portal: `http://106.14.105.96:28080/experiments/pact-official-diagnostics-20260514/index.html`

This report audits the official PAct raw outputs from the GAPartNet non-PM 100-sample run.
The goal is not to score PAct-Transporter here, but to isolate where the official PAct pipeline is brittle.

## Evidence From The Official Implementation

- PAct's UI and CLI require a 2D image plus a 2D part mask; the application text explicitly describes an uploaded `Image` and `2D Mask` pair.
- In the official model code, the image-conditioned latent flows convert `masks` into a `mask_group_emb` and add that embedding into the image token conditioning path.
- Therefore mask quality is not a post-hoc convenience: it is part of the generative condition.

Key source locations:

- `/data/250010098/official_clean_repos/PAct_official/app.py`
- `/data/250010098/official_clean_repos/PAct_official/modules/pact/models/structured_latent_flow.py`
- `/data/250010098/official_clean_repos/PAct_official/modules/pact/models/sparse_structure_flow.py`
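
The conditioning path described above can be illustrated with a minimal pure-Python sketch. This is not the official code: the function `add_mask_group_embedding`, the embedding table, and the per-token group ids are hypothetical stand-ins, and only the names `masks` and `mask_group_emb` come from the implementation notes in this section.

```python
from typing import List


def add_mask_group_embedding(
    image_tokens: List[List[float]],
    token_group_ids: List[int],
    group_embedding_table: List[List[float]],
) -> List[List[float]]:
    """Add a per-mask-group embedding to every image token (hypothetical sketch)."""
    if len(image_tokens) != len(token_group_ids):
        raise ValueError("one mask group id is required per image token")
    conditioned = []
    for token, group_id in zip(image_tokens, token_group_ids):
        # Look up the embedding of the mask group this token falls in.
        mask_group_emb = group_embedding_table[group_id]
        conditioned.append([t + e for t, e in zip(token, mask_group_emb)])
    return conditioned


# Illustrative values: two tokens fall inside part 1, one in background (group 0).
tokens = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
group_ids = [1, 1, 0]
table = [[0.0, 0.0], [1.0, -1.0]]
conditioned = add_mask_group_embedding(tokens, group_ids, table)
```

Under this reading, two parts that the mask merges into one group receive the same embedding, so the conditioning signal alone cannot tell them apart.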

## Summary

| area | metric | value |
|---|---|---:|
| 2D segmentation | mask parts -> generated part nodes corr | `0.5245` |
| 2D segmentation | visible movable fraction -> joint F1 corr | `0.3234` |
| 2D segmentation | mean absolute (generated - mask) part count gap | `1.6700` |
| joint quality | strict joint F1 mean / median | `0.3695` / `0.0981` |
| joint quality | strict recall / precision mean | `0.3450` / `0.4242` |
| joint quality | accepted-match axis error mean | `44.8045` deg |
| joint quality | joint count abs error mean | `2.0000` |
| internal structure | tree valid proxy rate | `0.9500` |
| internal structure | child mismatch sample rate | `0.0000` |
| internal structure | high AABB-overlap sample rate | `0.7300` |

## 1. 2D Segmentation Dependence

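This run uses synthetic GT-derived masks, so it is a favorable mask setting. Even under that favorable setup, the raw PAct output remains coupled to what is visible and separated in the 2D mask: mask part count correlates moderately with both the generated part node count and the generated joint count.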

| diagnostic | value |
|---|---:|
| mask part count -> generated part node count corr | `0.5245` |
| mask part count -> generated joint count corr | `0.5059` |
| occluded movable count -> recall corr | `-0.2678` |
| mean signed (generated - mask) part count gap | `-1.6700` |
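
The correlation and gap diagnostics in this table can be reproduced with a few lines of standard-library Python. The sketch below assumes the `corr` values are Pearson correlations over per-sample counts, which the report does not state; the function name and the sample counts are illustrative only.

```python
import math
from typing import Sequence


def pearson_corr(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns NaN when either input has zero variance."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Illustrative per-sample counts, not values from this run.
mask_part_counts = [2, 3, 5, 8]
generated_node_counts = [2, 2, 4, 5]

corr = pearson_corr(mask_part_counts, generated_node_counts)
gaps = [g - m for g, m in zip(generated_node_counts, mask_part_counts)]
signed_gap = sum(gaps) / len(gaps)
abs_gap = sum(abs(d) for d in gaps) / len(gaps)
```

When the signed and absolute gaps have the same magnitude, as in the illustrative data here, no sample generates more nodes than its mask has parts.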

Visible movable bucket means:

| bucket | F1 | recall |
|---|---:|---:|
| `complete` | 0.4776 | 0.4597 |
| `partial` | 0.2702 | 0.2190 |
| `none` | 0.0769 | 0.0769 |
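
A minimal sketch of how such bucket means could be computed follows. The bucket rule (all movable parts visible, some visible, none visible) is an assumption about what `complete`, `partial`, and `none` mean; the function names and the row layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def visibility_bucket(visible_movable: int, total_movable: int) -> str:
    """Assumed rule: compare visible movable parts against all movable parts."""
    if total_movable <= 0:
        raise ValueError("sample has no movable parts")
    if visible_movable == total_movable:
        return "complete"
    if visible_movable == 0:
        return "none"
    return "partial"


def bucket_means(
    rows: Iterable[Tuple[int, int, float, float]],
) -> Dict[str, Tuple[float, float]]:
    """rows are (visible_movable, total_movable, f1, recall) per sample."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for visible, total, f1, recall in rows:
        acc = sums[visibility_bucket(visible, total)]
        acc[0] += f1
        acc[1] += recall
        acc[2] += 1
    return {name: (a[0] / a[2], a[1] / a[2]) for name, a in sums.items()}


# Illustrative rows, not values from this run.
rows = [(2, 2, 1.0, 1.0), (2, 2, 0.5, 0.5), (0, 3, 0.0, 0.0), (1, 3, 0.25, 0.5)]
means = bucket_means(rows)
```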

![mask-parts](visualizations/mask_parts_vs_pred_nodes.png)

![visible-f1](visualizations/visible_movable_fraction_vs_f1.png)

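Interpretation: PAct is not robustly inferring hidden or missing part structure from object priors. Mean F1 drops from `0.4776` in the `complete` bucket to `0.0769` in the `none` bucket. The signed and absolute part-count gaps also have the same magnitude (`1.6700`), which, if both are computed over the same samples, implies that PAct never generated more part nodes than mask parts in this run. The 2D segmentation acts as a strong bottleneck for both part count and joint count.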

## 2. Joint Parameter Generation Quality

| metric | value |
|---|---:|
| mean strict F1 | `0.3695` |
| mean type match rate | `0.5800` |
| mean accepted-match axis error | `44.8045` deg |
| mean accepted-match origin error | `0.2305` |
| under-predict joint count rate | `0.4900` |
| exact joint count rate | `0.4900` |
| over-predict joint count rate | `0.0200` |
| degenerate-axis sample rate | `0.0000` |
| reversed-range sample rate | `0.5900` |
| zero-span non-fixed sample rate | `0.0000` |
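
The per-sample sanity rates at the bottom of this table can be sketched as simple flags on each predicted joint. The joint dictionary layout, the flag definitions, and the "at least one flagged joint per sample" rule are all assumptions for illustration; the report does not define them.

```python
import math
from typing import Any, Dict, List

Joint = Dict[str, Any]


def joint_flags(joint: Joint, eps: float = 1e-6) -> Dict[str, bool]:
    """Assumed definitions of the three sanity flags for one joint."""
    norm = math.sqrt(sum(c * c for c in joint["axis"]))
    lower, upper = joint["lower"], joint["upper"]
    return {
        "degenerate_axis": norm < eps,
        "reversed_range": lower > upper,
        "zero_span_non_fixed": joint["type"] != "fixed" and abs(upper - lower) < eps,
    }


def sample_rate(samples: List[List[Joint]], flag: str) -> float:
    """Fraction of samples with at least one joint raising the flag."""
    if not samples:
        raise ValueError("no samples")
    hits = sum(1 for joints in samples if any(joint_flags(j)[flag] for j in joints))
    return hits / len(samples)


# Illustrative samples, not values from this run.
samples = [
    [{"type": "revolute", "axis": [0, 0, 1], "lower": 1.57, "upper": 0.0}],
    [{"type": "prismatic", "axis": [1, 0, 0], "lower": 0.0, "upper": 0.3}],
]
reversed_rate = sample_rate(samples, "reversed_range")
```

A reversed range may reflect a sign or axis-direction convention rather than a broken joint, so this flag is best read together with the axis error.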

![joint-f1](visualizations/joint_f1_hist.png)

![axis-error](visualizations/axis_error_hist.png)

Interpretation: the main failure is not just noisy axis regression. PAct under-predicts the joint count on 49% of samples and over-predicts on only 2%, so it frequently chooses the wrong number of joints or misses joints entirely. Axis and origin quality are therefore only meaningful on the subset of joints that survives matching.
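
To make the link between joint count and F1 concrete, here is a minimal sketch of a strict matching score. The matching criteria (same type, axis within `max_axis_deg`, origin within `max_origin_dist`), the greedy one-to-one assignment, and the thresholds are assumptions; the report does not specify how its strict F1 is computed.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple

Joint = Dict[str, Any]


def axis_error_deg(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle between two axes in degrees; inf if either axis is degenerate."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return float("inf")
    cos = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def strict_joint_prf(
    pred: List[Joint],
    gt: List[Joint],
    max_axis_deg: float = 15.0,      # hypothetical threshold
    max_origin_dist: float = 0.1,    # hypothetical threshold
) -> Tuple[float, float, float]:
    """Greedy one-to-one matching, then precision, recall, and F1."""
    unmatched_gt = list(range(len(gt)))
    matches = 0
    for p in pred:
        for gi in unmatched_gt:
            g = gt[gi]
            if (
                p["type"] == g["type"]
                and axis_error_deg(p["axis"], g["axis"]) <= max_axis_deg
                and math.dist(p["origin"], g["origin"]) <= max_origin_dist
            ):
                unmatched_gt.remove(gi)
                matches += 1
                break
    precision = matches / len(pred) if pred else 0.0
    recall = matches / len(gt) if gt else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Illustrative case: one correct prediction against four GT joints.
gt = [
    {"type": "revolute", "axis": [0, 0, 1], "origin": [0.0, 0.0, 0.0]},
    {"type": "revolute", "axis": [0, 0, 1], "origin": [0.5, 0.0, 0.0]},
    {"type": "prismatic", "axis": [1, 0, 0], "origin": [0.0, 0.5, 0.0]},
    {"type": "prismatic", "axis": [1, 0, 0], "origin": [0.0, -0.5, 0.0]},
]
pred = [{"type": "revolute", "axis": [0, 0, 1], "origin": [0.02, 0.0, 0.0]}]
precision, recall, f1 = strict_joint_prf(pred, gt)
```

Under any one-to-one matching, under-prediction caps recall and therefore F1: with 4 predicted joints against 27 GT joints, even four perfect matches give at most F1 = 8/31, about 0.258.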

## 3. Internal Structure Generation Quality

| metric | value |
|---|---:|
| tree valid proxy rate | `0.9500` |
| mean root count | `1.0500` |
| multi-root sample rate | `0.0500` |
| cycle sample rate | `0.0000` |
| dangling-parent sample rate | `0.0000` |
| child mismatch sample rate | `0.0000` |
| thin-part sample rate | `1.0000` |
| high AABB-overlap sample rate | `0.7300` |
| mean parent-child AABB gap | `0.0004` |
| mean axis-origin to child-AABB distance | `0.0505` |
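
The symbolic checks in this table (root count, cycles, dangling parents) can be sketched on a parent map. The data layout and the definition of `tree_valid` as "exactly one root, no cycle, no dangling parent" are assumptions; the report does not define its tree valid proxy.

```python
from typing import Any, Dict, Optional


def tree_diagnostics(parent_of: Dict[str, Optional[str]]) -> Dict[str, Any]:
    """parent_of maps each node to its parent name, or None for a root."""
    roots = [n for n, p in parent_of.items() if p is None]
    dangling = [
        n for n, p in parent_of.items() if p is not None and p not in parent_of
    ]
    has_cycle = False
    for start in parent_of:
        seen = set()
        node: Optional[str] = start
        # Walk up the parent chain until a root, a dangling parent, or a repeat.
        while node is not None and node in parent_of:
            if node in seen:
                has_cycle = True
                break
            seen.add(node)
            node = parent_of[node]
        if has_cycle:
            break
    return {
        "root_count": len(roots),
        "dangling_parent": bool(dangling),
        "cycle": has_cycle,
        "tree_valid": len(roots) == 1 and not dangling and not has_cycle,
    }


# Illustrative trees, not samples from this run.
good = tree_diagnostics({"base": None, "door": "base", "handle": "door"})
multi_root = tree_diagnostics({"base": None, "lid": None})
```

Under this definition a multi-root sample is invalid, which would be consistent with a `0.0500` multi-root rate next to a `0.9500` valid rate.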

![structure](visualizations/internal_issue_rates.png)

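Interpretation: official PAct usually emits a well-formed symbolic tree (valid proxy rate `0.9500`, with no cycles, dangling parents, or child mismatches; the 5% shortfall matches the multi-root rate). The geometric side is weaker: 73% of samples are flagged for high AABB overlap and every sample is flagged for a thin part, so parts that look visually plausible and separated can still be geometrically inconsistent with the articulation tree. This is exactly the space where transport-style global assignment and structural constraints should operate inside the architecture rather than as cosmetic post-processing.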

## Worst Cases

Lowest joint F1:

| sample | object | F1 | pred joints | GT joints |
|---|---|---:|---:|---:|
| `electronics_104011` | `Printer` | 0.0000 | 4 | 27 |
| `electronics_103972` | `Printer` | 0.0000 | 1 | 22 |
| `electronics_103867` | `Printer` | 0.0000 | 2 | 17 |
| `electronics_103978` | `Printer` | 0.0000 | 0 | 16 |
| `small_appliances_103043` | `CoffeeMachine` | 0.0000 | 1 | 12 |
| `electronics_104020` | `Printer` | 0.0000 | 0 | 9 |
| `electronics_103988` | `Printer` | 0.0000 | 0 | 8 |
| `electronics_103878` | `Printer` | 0.0000 | 1 | 8 |
| `electronics_104030` | `Printer` | 0.0000 | 3 | 8 |
| `small_appliances_103016` | `CoffeeMachine` | 0.0000 | 4 | 6 |

Largest joint count error:

| sample | object | count abs error | pred joints | GT joints |
|---|---|---:|---:|---:|
| `electronics_104011` | `Printer` | 23 | 4 | 27 |
| `electronics_103972` | `Printer` | 21 | 1 | 22 |
| `electronics_103978` | `Printer` | 16 | 0 | 16 |
| `electronics_103867` | `Printer` | 15 | 2 | 17 |
| `small_appliances_103043` | `CoffeeMachine` | 11 | 1 | 12 |
| `electronics_104020` | `Printer` | 9 | 0 | 9 |
| `electronics_103988` | `Printer` | 8 | 0 | 8 |
| `major_appliances_103351` | `WashingMachine` | 8 | 2 | 10 |
| `major_appliances_103452` | `WashingMachine` | 8 | 5 | 13 |
| `electronics_103878` | `Printer` | 7 | 1 | 8 |

Largest AABB overlap:

| sample | object | max overlap | generated nodes |
|---|---|---:|---:|
| `small_appliances_103466` | `Toaster` | 1.0000 | 4 |
| `household_fixtures_102708` | `Toilet` | 1.0000 | 4 |
| `storage_100162` | `Box` | 1.0000 | 5 |
| `large_furniture_46955` | `StorageFurniture` | 1.0000 | 4 |
| `household_items_100056` | `KitchenPot` | 1.0000 | 2 |
| `small_furniture_20985` | `Table` | 1.0000 | 2 |
| `small_appliances_7265` | `Microwave` | 1.0000 | 15 |
| `storage_101583` | `Safe` | 1.0000 | 10 |
| `household_items_102181` | `TrashCan` | 1.0000 | 3 |
| `small_furniture_22692` | `Table` | 1.0000 | 3 |
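
One way to read a max overlap of `1.0000` is through a pairwise AABB overlap ratio. The sketch below normalizes the intersection volume by the smaller box, which is an assumption about the metric rather than the report's definition; the box layout and function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Box = Tuple[Sequence[float], Sequence[float]]  # (min_xyz, max_xyz)


def volume(box: Box) -> float:
    lo, hi = box
    vol = 1.0
    for a, b in zip(lo, hi):
        vol *= max(0.0, b - a)
    return vol


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection volume divided by the smaller box volume (assumed metric)."""
    inter = 1.0
    for a_lo, a_hi, b_lo, b_hi in zip(a[0], a[1], b[0], b[1]):
        inter *= max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    smaller = min(volume(a), volume(b))
    # Zero-volume (perfectly flat) boxes are reported as no overlap.
    return inter / smaller if smaller > 0 else 0.0


def max_pairwise_overlap(boxes: List[Box]) -> float:
    return max((overlap_ratio(a, b) for a, b in combinations(boxes, 2)), default=0.0)


# Illustrative boxes: a drawer fully contained in a cabinet body, plus a far-away knob.
body = ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
drawer = ([0.25, 0.25, 0.25], [0.75, 0.75, 0.75])
knob = ([2.0, 2.0, 2.0], [2.5, 2.5, 2.5])
worst = max_pairwise_overlap([body, drawer, knob])
```

If the metric is normalized this way, a value of 1.0 means one box is fully contained in another. That can be legitimate for a drawer inside a cabinet, so these samples are worth inspecting visually before treating them as errors.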

## PAct Strength Cases

These are the low-joint or PM-like samples where the official PAct pipeline is comparatively stable. Five of the six have a single GT joint.

| sample | object | F1 | pred joints | GT joints |
|---|---|---:|---:|---:|
| `architectural_fixtures_8983` | `Door` | 1.0000 | 1 | 1 |
| `large_furniture_45606` | `StorageFurniture` | 1.0000 | 1 | 1 |
| `major_appliances_10905` | `Refrigerator` | 1.0000 | 1 | 1 |
| `small_appliances_7236` | `Microwave` | 1.0000 | 1 | 1 |
| `small_furniture_20985` | `Table` | 1.0000 | 1 | 1 |
| `small_appliances_103074` | `CoffeeMachine` | 1.0000 | 3 | 3 |

## Architectural Readout For PAct-Transporter

The official PAct weaknesses point to three core architectural requirements:

1. Replace mask-as-token-ID conditioning with mask-aware latent transport, so missing or over-split 2D regions are reconciled against 3D part hypotheses.
2. Generate joints as a globally matched structure, not as independent local attributes attached to whatever parts the 2D mask exposed.
3. Make the articulation tree, parent-child contact, and joint-axis feasibility differentiable or at least first-class constraints during generation.

These are core architecture issues, not merely renderer or prompt issues.
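
As a purely illustrative reading of "transport-style global assignment", the sketch below runs entropic optimal transport (Sinkhorn iterations) between 3D part hypotheses and 2D mask regions. It is not the PAct-Transporter design: the cost matrix, uniform marginals, `epsilon`, and iteration count are all hypothetical choices.

```python
import math
from typing import List


def sinkhorn(
    cost: List[List[float]], epsilon: float = 0.5, iterations: int = 100
) -> List[List[float]]:
    """Entropic OT plan between uniform row and column marginals."""
    n, m = len(cost), len(cost[0])
    row_mass = [1.0 / n] * n
    col_mass = [1.0 / m] * m
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the target marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Rows: three 3D part hypotheses. Columns: two visible mask regions.
# The third hypothesis (for example an occluded part) fits neither region well.
cost = [[0.1, 0.9], [0.9, 0.1], [0.5, 0.5]]
plan = sinkhorn(cost)
```

In this toy case the first two hypotheses concentrate their mass on one region each, while the ambiguous third hypothesis splits its mass evenly instead of being dropped. That is the kind of soft, global reconciliation the first requirement describes.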
