
scenesmith-example-data_scene_064

A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.

score 0.500
coverage 66.67%
effective pass 50.00%
pass 4 fail 4 unknown 4

Overview Gallery (4 images)

Scene Summary

Total 12
Pass 4
Degraded 0
Fail 4
Unknown 4
Coverage 66.67%
Eff. Pass 50.00%
Score 0.500
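
The summary figures above are consistent with one simple reading of the counts, sketched here. The formulas are an assumption inferred from the numbers on this page (coverage as decided checks over total, effective pass as passes over decided checks), not a documented definition. The function name `summarize` is hypothetical, and how `degraded` checks are weighted cannot be recovered from this scene because the degraded count is 0.

```python
def summarize(passed, degraded, failed, unknown):
    """Recompute summary percentages from per-label check counts.

    Assumption: 'decided' checks are all checks that are not 'unknown'.
    """
    total = passed + degraded + failed + unknown
    decided = total - unknown
    coverage = decided / total if total else 0.0
    effective_pass = passed / decided if decided else 0.0
    return {
        "total": total,
        "coverage": round(coverage * 100, 2),
        "effective_pass": round(effective_pass * 100, 2),
    }


# Counts taken from the Scene Summary: Pass 4, Degraded 0, Fail 4, Unknown 4.
summary = summarize(4, 0, 4, 4)
print(summary)
```

Under these assumed formulas the counts reproduce Total 12, Coverage 66.67% and Eff. Pass 50.00%.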

Judgement
chair · sittable, supportable, pushable, liftable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `interaction_clearance` metric for `kitchen_dining_chair_0` with affordance `sittable`. Relevant context: the surrounding scene. Check whether the immediate sit/stand zone at the usable sitting side has enough free space for a normal sit/stand motion. Consider the combined effect of nearby furniture around the seat edge; multiple pieces that collectively pinch the sit/stand zone should count as blocked even if no single item fully closes it. Do not require 360-degree circulation around the furniture. If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.
Reason: In the local render and the highlighted local_context render, kitchen_dining_chair_0 is visibly entangled with the nearby bistro table base/pedestal and pressed together with kitchen_dining_chair_1, so the normal sit/stand zone at the usable seating side is not open. The overview renders also show this chair pair crammed into the corner beside the table, with no practical free edge for pulling out and sitting down now. This is a direct immediate-clearance failure, not just a loose nearby adjacency in perspective view alone. Note: the table is a visible blocker in the local views, but its object id was not provided in the scene hints, so only the identified chair blocker is listed below by id.
Blocking Objects: kitchen_dining_chair_1
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: interaction_clearance
Subject object: kitchen_dining_chair_0
Target objects: none

Verified asset facts:
- subject `kitchen_dining_chair_0`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`
- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dining_chair_1`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_eucalyptus_sprig_0_f0_1`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `interaction_clearance` metric for `kitchen_dining_chair_0` with affordance `sittable`.
Relevant context: the surrounding scene.
Check whether the immediate sit/stand zone at the usable sitting side has enough free space for a normal sit/stand motion. Consider the combined effect of nearby furniture around the seat edge; multiple pieces that collectively pinch the sit/stand zone should count as blocked even if no single item fully closes it. Do not require 360-degree circulation around the furniture.
If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "interaction_clearance__kitchen_dining_chair_0",
  "metric": "interaction_clearance",
  "subject_id": "kitchen_dining_chair_0",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: interaction_clearance\nSubject object: kitchen_dining_chair_0\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_dining_chair_0`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`\n- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dining_chair_1`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_eucalyptus_sprig_0_f0_1`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\nDo not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `interaction_clearance` metric for `kitchen_dining_chair_0` with affordance `sittable`.\nRelevant context: the surrounding scene.\nCheck whether the immediate sit/stand zone at the usable sitting side has enough free space for a normal sit/stand motion. Consider the combined effect of nearby furniture around the seat edge; multiple pieces that collectively pinch the sit/stand zone should count as blocked even if no single item fully closes it. Do not require 360-degree circulation around the furniture.\nIf you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_dining_chair_0_local.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_dining_chair_0_local_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
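
The prompt states that local check renders always come first, followed by the overview renders. A minimal sketch of how the `files` mapping in a request like the one above could be turned into that ordered list is shown here. The key prefixes (`render_check_image`, `render_overview_image`) and the `_NN` suffix convention are read off this request; the function name `ordered_images` is hypothetical, and whether the actual pipeline orders images this way is an assumption.

```python
def ordered_images(files):
    """Return image paths with local check renders first, then overviews."""

    def rank(key):
        if key.startswith("render_check_image"):
            group = 0  # local check renders come first
        elif key.startswith("render_overview_image"):
            group = 1  # overview renders follow
        else:
            group = 2  # anything unrecognised goes last
        # Keys look like "render_check_image" or "render_check_image_01";
        # a missing numeric suffix is treated as index 0.
        suffix = key.rsplit("_", 1)[-1]
        index = int(suffix) if suffix.isdigit() else 0
        return (group, index)

    return [files[key] for key in sorted(files, key=rank)]


# Shortened, illustrative paths (not the real ones from the request).
example = {
    "render_overview_image_01": "view_01_rgb.png",
    "render_check_image_01": "chair_0_local_context.png",
    "render_overview_image": "view_00_rgb.png",
    "render_check_image": "chair_0_local.png",
}
print(ordered_images(example))
```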
Result JSON
{
  "check_id": "interaction_clearance__kitchen_dining_chair_0",
  "metric": "interaction_clearance",
  "label": "fail",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.67,
  "evidence_conflict": false,
  "reason": "In the local render and the highlighted local_context render, kitchen_dining_chair_0 is visibly entangled with the nearby bistro table base/pedestal and pressed together with kitchen_dining_chair_1, so the normal sit/stand zone at the usable seating side is not open. The overview renders also show this chair pair crammed into the corner beside the table, with no practical free edge for pulling out and sitting down now. This is a direct immediate-clearance failure, not just a loose nearby adjacency in perspective view alone. Note: the table is a visible blocker in the local views, but its object id was not provided in the scene hints, so only the identified chair blocker is listed below by id.",
  "blocking_objects": [
    "kitchen_dining_chair_1"
  ],
  "confidence": 0.92
}
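
The prompt lists the required output keys and the allowed labels, so a result like the one above can be checked mechanically. The sketch below is illustrative only: `validate_result` is a hypothetical helper, and the non-empty `reason` test is a weak stand-in for the prompt's rule that a `degraded` or `fail` reason must name a blocker or constrained region, which code cannot verify from text alone.

```python
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used", "asset_fact_confidence",
    "evidence_conflict", "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(result):
    """Return a list of problems found in a result dict (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append(f"label not allowed: {label!r}")
    # Weak proxy: degraded/fail must at least carry a non-empty reason.
    if label in {"degraded", "fail"} and not str(result.get("reason", "")).strip():
        problems.append("degraded/fail requires a reason naming the blocker")
    return problems


# Fields copied from the result above, with the reason shortened.
result = {
    "check_id": "interaction_clearance__kitchen_dining_chair_0",
    "metric": "interaction_clearance",
    "label": "fail",
    "asset_fact_used": True,
    "asset_fact_confidence": 0.67,
    "evidence_conflict": False,
    "reason": "kitchen_dining_chair_0 is pressed together with kitchen_dining_chair_1.",
    "blocking_objects": ["kitchen_dining_chair_1"],
    "confidence": 0.92,
}
print(validate_result(result))
```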
Evidence
interaction_clearance__kitchen_dining_chair_0

Judgement
chair · sittable, supportable, pushable, liftable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `interaction_clearance` metric for `kitchen_dining_chair_1` with affordance `sittable`. Relevant context: the surrounding scene. Check whether the immediate sit/stand zone at the usable sitting side has enough free space for a normal sit/stand motion. Consider the combined effect of nearby furniture around the seat edge; multiple pieces that collectively pinch the sit/stand zone should count as blocked even if no single item fully closes it. Do not require 360-degree circulation around the furniture. If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.
Reason: In the local render, the subject chair appears overturned/on its side and interpenetrated with the nearby round bistro table base rather than standing upright in a normal usable orientation. The overview renders show the same corner arrangement, with the chair and table tangled together. Because this clearly conflicts with the asset fact that the subject is a normally sittable chair with an identifiable sitting side, the immediate sit/stand zone cannot be reliably evaluated from the current scene state, so the interaction_clearance judgment is unknown rather than guessed.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: interaction_clearance
Subject object: kitchen_dining_chair_1
Target objects: none

Verified asset facts:
- subject `kitchen_dining_chair_1`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`
- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_dining_chair_0`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_eucalyptus_sprig_0_f0_0`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `interaction_clearance` metric for `kitchen_dining_chair_1` with affordance `sittable`.
Relevant context: the surrounding scene.
Check whether the immediate sit/stand zone at the usable sitting side has enough free space for a normal sit/stand motion. Consider the combined effect of nearby furniture around the seat edge; multiple pieces that collectively pinch the sit/stand zone should count as blocked even if no single item fully closes it. Do not require 360-degree circulation around the furniture.
If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "interaction_clearance__kitchen_dining_chair_1",
  "metric": "interaction_clearance",
  "subject_id": "kitchen_dining_chair_1",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: interaction_clearance\nSubject object: kitchen_dining_chair_1\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_dining_chair_1`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`\n- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_dining_chair_0`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_eucalyptus_sprig_0_f0_0`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\nDo not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `interaction_clearance` metric for `kitchen_dining_chair_1` with affordance `sittable`.\nRelevant context: the surrounding scene.\nCheck whether the immediate sit/stand zone at the usable sitting side has enough free space for a normal sit/stand motion. Consider the combined effect of nearby furniture around the seat edge; multiple pieces that collectively pinch the sit/stand zone should count as blocked even if no single item fully closes it. Do not require 360-degree circulation around the furniture.\nIf you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_dining_chair_1_local.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "interaction_clearance__kitchen_dining_chair_1",
  "metric": "interaction_clearance",
  "label": "unknown",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.67,
  "evidence_conflict": true,
  "reason": "In the local render, the subject chair appears overturned/on its side and interpenetrated with the nearby round bistro table base rather than standing upright in a normal usable orientation. The overview renders show the same corner arrangement, with the chair and table tangled together. Because this clearly conflicts with the asset fact that the subject is a normally sittable chair with an identifiable sitting side, the immediate sit/stand zone cannot be reliably evaluated from the current scene state, so the interaction_clearance judgment is unknown rather than guessed.",
  "blocking_objects": [],
  "confidence": 0.92
}
Evidence
interaction_clearance__kitchen_dining_chair_1
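
An `unknown` label such as the one above lowers coverage without counting as a failure. The Scene Summary counts (Total 12, Pass 4, Degraded 0, Fail 4, Unknown 4) are consistent with the formulas sketched below, but these formulas are inferred from the reported numbers and are an assumption, not the benchmark's confirmed implementation. The `summarize` function is hypothetical, and the separate `Score` value is not modeled here.

```python
from collections import Counter


def summarize(labels):
    """Aggregate per-check labels into counts and ratios.

    Assumed formulas (inferred from the reported summary, not confirmed):
      coverage       = decided checks / all checks
      effective pass = passed checks / decided checks
    where "decided" means any label other than `unknown`.
    """
    counts = Counter(labels)
    total = len(labels)
    decided = counts["pass"] + counts["degraded"] + counts["fail"]
    coverage = decided / total if total else 0.0
    effective_pass = counts["pass"] / decided if decided else 0.0
    return {
        "total": total,
        "pass": counts["pass"],
        "degraded": counts["degraded"],
        "fail": counts["fail"],
        "unknown": counts["unknown"],
        "coverage": coverage,
        "effective_pass": effective_pass,
    }


# Label counts taken from the Scene Summary of this report.
labels = ["pass"] * 4 + ["fail"] * 4 + ["unknown"] * 4
summary = summarize(labels)
print(f"coverage {summary['coverage']:.2%}, "
      f"effective pass {summary['effective_pass']:.2%}")
```

With the counts from this scene, the sketch reproduces the reported coverage of 66.67% and effective pass of 50.00%.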

Judgement
refrigerator · openable, containable
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `interaction_clearance` metric for `kitchen_fridge_0` with affordance `openable`. Relevant context: the surrounding scene. First infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space. If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.
Reason: In the local operating zone render and the highlighted local operating zone/context render, the refrigerator’s usable front face is the side facing into the room, and the floor area directly in front of that operating side remains open enough for a practical standing or slightly offset reach position. In the local plan context view, the nearby bistro table and chairs sit off to the side rather than occupying the immediate front operating zone, and no direct floor-level blocker is shown pinching the door/handle area. The overview renders are consistent with this: nearby furniture is adjacent but does not clearly intrude into the refrigerator’s immediate opening/standing space on the front side.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: interaction_clearance
Subject object: kitchen_fridge_0
Target objects: none

Verified asset facts:
- subject `kitchen_fridge_0`: no offline asset YAML loaded; heuristic category `refrigerator`, affordances [openable, containable], usable face `front`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_fridge_0`: category `refrigerator`, group `appliance_storage`, keywords [refrigerator, fridge, cooling cabinet], affordances [openable, containable], task targets [wall, clear_space]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_fridge_0`: category `refrigerator`, room `kitchen`, placement `floor`, footprint `large`, height `mid`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_semi_flush_ceiling_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `mid`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_canister_jar_0`: category `jar`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`
- `kitchen_kitchen_towel_0`: category `towel`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`
- `kitchen_teaspoon_0`: category `teaspoon_teaspoon`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`
- `kitchen_chefs_knife_0_s1_1`: category `chefs_knife_s1`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `small`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `interaction_clearance` metric for `kitchen_fridge_0` with affordance `openable`.
Relevant context: the surrounding scene.
First infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space.
If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "interaction_clearance__kitchen_fridge_0",
  "metric": "interaction_clearance",
  "subject_id": "kitchen_fridge_0",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: interaction_clearance\nSubject object: kitchen_fridge_0\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_fridge_0`: no offline asset YAML loaded; heuristic category `refrigerator`, affordances [openable, containable], usable face `front`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_fridge_0`: category `refrigerator`, group `appliance_storage`, keywords [refrigerator, fridge, cooling cabinet], affordances [openable, containable], task targets [wall, clear_space]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_fridge_0`: category `refrigerator`, room `kitchen`, placement `floor`, footprint `large`, height `mid`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_semi_flush_ceiling_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `mid`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_canister_jar_0`: category `jar`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`\n- `kitchen_kitchen_towel_0`: category `towel`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`\n- `kitchen_teaspoon_0`: category `teaspoon_teaspoon`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`\n- `kitchen_chefs_knife_0_s1_1`: category `chefs_knife_s1`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `small`\nDo not estimate exact centimeter clearance from these coarse cues. 
They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `interaction_clearance` metric for `kitchen_fridge_0` with affordance `openable`.\nRelevant context: the surrounding scene.\nFirst infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space.\nIf you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_fridge_0_local_operating_zone.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_fridge_0_local_operating_zone_context.png",
    "render_check_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_fridge_0_local_depth_probe.png",
    "render_check_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_fridge_0_local_plan_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
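
The prompt states that image order is fixed: local check renders first, then overview renders. A minimal sketch of how the `files` map in a request like the one above could be turned into that order is shown here. The function name `ordered_images` is hypothetical, the short paths in the example are placeholders rather than the real bundle paths, and the numeric-suffix ordering (`_01`, `_02`, ...) is an assumption based on the key names in this report.

```python
import re


def ordered_images(files):
    """Return image paths with local check renders first, then overviews.

    Keys are assumed to look like `render_check_image`, `render_check_image_01`,
    `render_overview_image`, `render_overview_image_01`, and so on. A key
    without a numeric suffix is treated as index 0 within its group.
    """
    def sort_key(name):
        group = 0 if name.startswith("render_check_image") else 1
        match = re.search(r"_(\d+)$", name)
        index = int(match.group(1)) if match else 0
        return (group, index)

    return [files[name] for name in sorted(files, key=sort_key)]


# Placeholder paths for illustration only.
example_files = {
    "render_overview_image_01": "views/view_01_rgb.png",
    "render_check_image_01": "checks/local_operating_zone_context.png",
    "render_overview_image": "views/view_00_rgb.png",
    "render_check_image": "checks/local_operating_zone.png",
}
for path in ordered_images(example_files):
    print(path)
```

Sorting on a `(group, index)` tuple keeps the rule explicit and does not depend on the key order in the JSON object.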
Result JSON
{
  "check_id": "interaction_clearance__kitchen_fridge_0",
  "metric": "interaction_clearance",
  "label": "pass",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.69,
  "evidence_conflict": false,
  "reason": "In the local operating zone render and the highlighted local operating zone/context render, the refrigerator’s usable front face is the side facing into the room, and the floor area directly in front of that operating side remains open enough for a practical standing or slightly offset reach position. In the local plan context view, the nearby bistro table and chairs sit off to the side rather than occupying the immediate front operating zone, and no direct floor-level blocker is shown pinching the door/handle area. The overview renders are consistent with this: nearby furniture is adjacent but does not clearly intrude into the refrigerator’s immediate opening/standing space on the front side.",
  "blocking_objects": [],
  "confidence": 0.73
}
Evidence
interaction_clearance__kitchen_fridge_0
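
The prompt fixes the output keys and the allowed labels, and requires a stated reason for `degraded` or `fail`. A result object like the one above could be checked against those rules with a sketch along these lines. The `validate_result` helper is hypothetical and not part of the benchmark, and the `[0, 1]` range check on `confidence` is an assumption based on the values that appear in this report.

```python
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used", "asset_fact_confidence",
    "evidence_conflict", "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(result):
    """Return a list of problems found in a result object (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - set(result)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append(f"label {label!r} is not an allowed label")
    # The prompt requires degraded/fail results to name a blocker or a
    # constrained region in the reason; here we only check it is non-empty.
    if label in {"degraded", "fail"} and not str(result.get("reason", "")).strip():
        problems.append("degraded/fail needs a reason naming the blocker or zone")
    # Assumption: confidence is a number in [0, 1].
    confidence = result.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 1:
        problems.append("confidence should be a number in [0, 1]")
    return problems


# Values copied from the refrigerator result; the reason is shortened here.
fridge_result = {
    "check_id": "interaction_clearance__kitchen_fridge_0",
    "metric": "interaction_clearance",
    "label": "pass",
    "asset_fact_used": True,
    "asset_fact_confidence": 0.69,
    "evidence_conflict": False,
    "reason": "(shortened) floor area in front of the operating side remains open",
    "blocking_objects": [],
    "confidence": 0.73,
}
print(validate_result(fridge_result))
```

A check like this only confirms the shape of the output. It cannot tell whether the reason actually names a real blocker, which still needs a human or a second model pass.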

Judgement
cabinet · openable, containable, supportable
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `interaction_clearance` metric for `kitchen_base_cabinet_run_0` with affordance `openable`. Relevant context: the surrounding scene. First infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space. If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.
Reason: In the local operating zone render and the highlighted local operating zone context render, the most likely operating side of kitchen_base_cabinet_run_0 is the long front edge facing into the room, but that side is immediately pinched by an opposing blue cabinet run placed directly in front of it. The local depth probe and especially the local plan context view show this opposing cabinet spanning nearly the full length of the subject’s front, leaving no normal standing/opening zone directly before the doors/handles. This is a direct blockage of the immediate operating space, not just perspective overlap, so the cabinet is not practically openable now from its usable side.
Blocking Objects: opposing blue cabinet run directly in front of kitchen_base_cabinet_run_0
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: interaction_clearance
Subject object: kitchen_base_cabinet_run_0
Target objects: none

Verified asset facts:
- subject `kitchen_base_cabinet_run_0`: no offline asset YAML loaded; heuristic category `cabinet`, affordances [openable, containable, supportable], usable face `top`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], affordances [openable, containable, supportable], task targets [wall, clear_space]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_hand_soap_dispenser_0`: category `soap dispenser`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_dish_drying_mat_0`: category `place mat`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_dish_soap_pump_bottle_0`: category `bottle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_kitchen_sponge_scrubber_0`: category `sponge_scrubber_sponge`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_electric_kettle_1`: category `kettle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `interaction_clearance` metric for `kitchen_base_cabinet_run_0` with affordance `openable`.
Relevant context: the surrounding scene.
First infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space.
If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "interaction_clearance__kitchen_base_cabinet_run_0",
  "metric": "interaction_clearance",
  "subject_id": "kitchen_base_cabinet_run_0",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: interaction_clearance\nSubject object: kitchen_base_cabinet_run_0\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_base_cabinet_run_0`: no offline asset YAML loaded; heuristic category `cabinet`, affordances [openable, containable, supportable], usable face `top`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], affordances [openable, containable, supportable], task targets [wall, clear_space]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_hand_soap_dispenser_0`: category `soap dispenser`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_dish_drying_mat_0`: category `place mat`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_dish_soap_pump_bottle_0`: category `bottle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_kitchen_sponge_scrubber_0`: category `sponge_scrubber_sponge`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_electric_kettle_1`: category `kettle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\nDo not estimate exact centimeter clearance from these coarse cues. 
They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `interaction_clearance` metric for `kitchen_base_cabinet_run_0` with affordance `openable`.\nRelevant context: the surrounding scene.\nFirst infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space.\nIf you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_base_cabinet_run_0_local_operating_zone.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_base_cabinet_run_0_local_operating_zone_context.png",
    "render_check_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_base_cabinet_run_0_local_depth_probe.png",
    "render_check_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_base_cabinet_run_0_local_plan_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "interaction_clearance__kitchen_base_cabinet_run_0",
  "metric": "interaction_clearance",
  "label": "fail",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.78,
  "evidence_conflict": false,
  "reason": "In the local operating zone render and the highlighted local operating zone context render, the most likely operating side of kitchen_base_cabinet_run_0 is the long front edge facing into the room, but that side is immediately pinched by an opposing blue cabinet run placed directly in front of it. The local depth probe and especially the local plan context view show this opposing cabinet spanning nearly the full length of the subject’s front, leaving no normal standing/opening zone directly before the doors/handles. This is a direct blockage of the immediate operating space, not just perspective overlap, so the cabinet is not practically openable now from its usable side.",
  "blocking_objects": [
    "opposing blue cabinet run directly in front of kitchen_base_cabinet_run_0"
  ],
  "confidence": 0.9
}
Evidence
interaction_clearance__kitchen_base_cabinet_run_0
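
The Result JSON above follows the output contract stated in the prompt (nine named keys and four allowed labels). A minimal sketch of how such a record could be checked before it is aggregated is shown below. The function name `validate_result` and the assumption that `confidence` lies in [0, 1] are illustrative and are not taken from the benchmark's own code.

```python
import json

# Keys and labels copied from the prompt's "Output JSON with keys" and
# "Allowed labels" lines.
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used",
    "asset_fact_confidence", "evidence_conflict",
    "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(raw: str) -> list[str]:
    """Return a list of problems found in one Result JSON string (empty if none)."""
    problems = []
    result = json.loads(raw)

    missing = REQUIRED_KEYS - result.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append(f"label not allowed: {label!r}")

    # Assumption: confidence is a number in [0, 1]; the prompt does not say so.
    conf = result.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        problems.append("confidence is not a number in [0, 1]")

    # The prompt requires a degraded/fail verdict to name a blocker or a
    # constrained region in the reason, so an empty reason is flagged.
    if label in {"degraded", "fail"} and not str(result.get("reason", "")).strip():
        problems.append("degraded/fail without a reason")

    return problems
```

Calling `validate_result` on the cabinet record above would return an empty list, since all nine keys are present, the label is `fail`, and the reason names the opposing cabinet run.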

Judgement
cocktail shaker · graspable, containable, openable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `interaction_clearance` metric for `kitchen_pepper_shaker_0` with affordance `openable`. Relevant context: kitchen_base_cabinet_run_0. First infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space. If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.
Reason: In the local operating zone render and the highlighted local operating zone context render, the red subject shaker sits on the wall shelf with its front/usable side exposed and a reachable hand position available from the room side. The nearby mug, salt shaker, and frame are adjacent on the shelf but do not visibly occupy the immediate operating space needed to grasp and open the shaker. The local depth-probe and local plan-context/overview views also do not show any object directly blocking the shelf edge in front of the shaker, so a normal slightly offset hand access position remains usable now.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: interaction_clearance
Subject object: kitchen_pepper_shaker_0
Target objects: kitchen_base_cabinet_run_0

Verified asset facts:
- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, affordances [graspable, containable, openable], usable face `front`, benchmark relevance `functional`, confidence `0.58`, source `asset_annotation`
- interaction facts: placement class `unknown`, access type `none`, interaction height `none`, surface map `front: [grasp, access_contents, open]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, group `unknown`, keywords [cocktail shaker], affordances [graspable, containable, openable], task targets [table, desk, nightstand, coffee_table, shelf, cabinet]
- target `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], task targets [wall, clear_space]

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, room `kitchen`, placement `floor`, footprint `tiny`, height `very_low`, elevation `high`
Target cues:
- target `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`, subject_proximity `far`
Nearby same-room context objects that may affect access or clearance:
- `kitchen_salt_shaker_0`: category `cocktail shaker`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_print_3`: category `print_print`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `small`
- `kitchen_floating_wall_shelf_0`: category `wall shelf`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `medium`
- `kitchen_picture_frame_0`: category `picture frame`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `small`
- `kitchen_coffee_mug_3`: category `mug`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `tiny`
- `kitchen_potted_basil_herb_plant_0`: category `potted plant`, placement `floor`, proximity `near`, bbox_relation `nearby`, footprint `tiny`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `interaction_clearance` metric for `kitchen_pepper_shaker_0` with affordance `openable`.
Relevant context: kitchen_base_cabinet_run_0.
First infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space.
If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0",
  "metric": "interaction_clearance",
  "subject_id": "kitchen_pepper_shaker_0",
  "target_ids": [
    "kitchen_base_cabinet_run_0"
  ],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. 
Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: interaction_clearance\nSubject object: kitchen_pepper_shaker_0\nTarget objects: kitchen_base_cabinet_run_0\n\nVerified asset facts:\n- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, affordances [graspable, containable, openable], usable face `front`, benchmark relevance `functional`, confidence `0.58`, source `asset_annotation`\n- interaction facts: placement class `unknown`, access type `none`, interaction height `none`, surface map `front: [grasp, access_contents, open]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, group `unknown`, keywords [cocktail shaker], affordances [graspable, containable, openable], task targets [table, desk, nightstand, coffee_table, shelf, cabinet]\n- target `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], task targets [wall, clear_space]\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, room `kitchen`, placement `floor`, footprint `tiny`, height `very_low`, elevation `high`\nTarget cues:\n- target `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`, subject_proximity `far`\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_salt_shaker_0`: category `cocktail shaker`, placement `floor`, proximity `touching`, bbox_relation 
`overlapping_or_touching`, footprint `tiny`\n- `kitchen_print_3`: category `print_print`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `small`\n- `kitchen_floating_wall_shelf_0`: category `wall shelf`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `medium`\n- `kitchen_picture_frame_0`: category `picture frame`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `small`\n- `kitchen_coffee_mug_3`: category `mug`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `tiny`\n- `kitchen_potted_basil_herb_plant_0`: category `potted plant`, placement `floor`, proximity `near`, bbox_relation `nearby`, footprint `tiny`\nDo not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `interaction_clearance` metric for `kitchen_pepper_shaker_0` with affordance `openable`.\nRelevant context: kitchen_base_cabinet_run_0.\nFirst infer the most likely operating side from door panels, handles, hinges, or the side that faces into the room. Then check only that side for a normal standing zone and immediate hand/door operating clearance. Do not assume a large hinged swing arc unless the render clearly shows hinged doors needing that arc. Do not require the entire front face to be empty; a nearby bench or seat off to one side is acceptable when the floor directly before the door/handle area remains usable. A narrow diagonal side gap or a squeeze-by path does not count as clear operating space.\nIf you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. 
If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. 
Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_operating_zone.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_operating_zone_context.png",
    "render_check_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_depth_probe.png",
    "render_check_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_plan_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0",
  "metric": "interaction_clearance",
  "label": "pass",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.58,
  "evidence_conflict": false,
  "reason": "In the local operating zone render and the highlighted local operating zone context render, the red subject shaker sits on the wall shelf with its front/usable side exposed and a reachable hand position available from the room side. The nearby mug, salt shaker, and frame are adjacent on the shelf but do not visibly occupy the immediate operating space needed to grasp and open the shaker. The local depth-probe and local plan-context/overview views also do not show any object directly blocking the shelf edge in front of the shaker, so a normal slightly offset hand access position remains usable now.",
  "blocking_objects": [],
  "confidence": 0.74
}
Evidence
interaction_clearance__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0
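
The per-check labels roll up into the scene summary figures reported for this scene (12 checks, 4 pass, 4 fail, 4 unknown, coverage 66.67%, effective pass 50.00%). The sketch below shows one reading of those figures that reproduces the numbers. The formulas are inferred from the reported counts rather than taken from the benchmark's code, and the function name `summarize` is hypothetical. Treating `degraded` as judged but not passing is also an assumption. The overall score is left out because its formula cannot be determined from the page.

```python
from collections import Counter


def summarize(labels):
    """Aggregate per-check labels into coverage and effective pass rate.

    Assumed definitions (inferred from the reported summary, not confirmed):
      coverage       = checks with a label other than "unknown" / all checks
      effective_pass = "pass" checks / checks with a label other than "unknown"
    """
    counts = Counter(labels)
    total = len(labels)
    judged = total - counts["unknown"]
    coverage = judged / total if total else 0.0
    effective_pass = counts["pass"] / judged if judged else 0.0
    return {
        "total": total,
        "pass": counts["pass"],
        "degraded": counts["degraded"],
        "fail": counts["fail"],
        "unknown": counts["unknown"],
        "coverage": coverage,
        "effective_pass": effective_pass,
    }


# Label counts as reported in the scene summary for this scene.
scene_labels = ["pass"] * 4 + ["fail"] * 4 + ["unknown"] * 4
summary = summarize(scene_labels)
print(f"coverage {summary['coverage']:.2%}, effective pass {summary['effective_pass']:.2%}")
```

Under these assumed definitions, an `unknown` verdict such as the bistro table check lowers coverage but does not count against the effective pass rate.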

Judgement
coffee table · supportable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `affordance_exposure` metric for `kitchen_bistro_table_0` with affordance `supportable`. Relevant context: kitchen_dining_chair_0, kitchen_dining_chair_1. Check whether the usable top surface is visible enough to judge placement or reach. Do not require all table edges to be exposed. If you choose `degraded` or `fail`, name at least one direct blocker or the directly constrained functional region. If no direct blocker is visible, prefer `pass` or `unknown`.
Reason: In the local render and the highlighted local_context render, the subject bistro table’s round top is visually visible, but the table appears tipped onto its side rather than upright. That directly conflicts with the provided asset fact that the usable affordance is the table’s supportable top surface, so I cannot reliably judge the intended top-surface exposure as a normal usable support surface from this evidence alone; returning unknown due to the asset-fact conflict rather than a visibility blocker.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: affordance_exposure
Subject object: kitchen_bistro_table_0
Target objects: kitchen_dining_chair_0, kitchen_dining_chair_1

Verified asset facts:
- subject `kitchen_bistro_table_0`: category `coffee table`, affordances [supportable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`
- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `top: [place_or_reach_objects]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_bistro_table_0`: category `coffee table`, group `unknown`, keywords [coffee table], affordances [supportable], task targets [chair, refrigerator, room, graspable_object]
- target `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]
- target `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_bistro_table_0`: category `coffee table`, room `kitchen`, placement `floor`, footprint `medium`, height `mid`, elevation `floor_level`
Target cues:
- target `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`
- target `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`
Nearby same-room context objects that may affect access or clearance:
- `kitchen_coffee_mug_0`: category `mug`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_coffee_mug_1`: category `mug`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_pendant_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `affordance_exposure` metric for `kitchen_bistro_table_0` with affordance `supportable`.
Relevant context: kitchen_dining_chair_0, kitchen_dining_chair_1.
Check whether the usable top surface is visible enough to judge placement or reach. Do not require all table edges to be exposed.
If you choose `degraded` or `fail`, name at least one direct blocker or the directly constrained functional region. If no direct blocker is visible, prefer `pass` or `unknown`.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "affordance_exposure__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1",
  "metric": "affordance_exposure",
  "subject_id": "kitchen_bistro_table_0",
  "target_ids": [
    "kitchen_dining_chair_0",
    "kitchen_dining_chair_1"
  ],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: affordance_exposure\nSubject object: kitchen_bistro_table_0\nTarget objects: kitchen_dining_chair_0, kitchen_dining_chair_1\n\nVerified asset facts:\n- subject `kitchen_bistro_table_0`: category `coffee table`, affordances [supportable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`\n- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `top: [place_or_reach_objects]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_bistro_table_0`: category `coffee table`, group `unknown`, keywords [coffee table], affordances [supportable], task targets [chair, refrigerator, room, graspable_object]\n- target `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]\n- target `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_bistro_table_0`: category `coffee table`, room `kitchen`, placement `floor`, footprint `medium`, height `mid`, elevation `floor_level`\nTarget cues:\n- target `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`\n- target `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_coffee_mug_0`: category `mug`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_coffee_mug_1`: category `mug`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `near`, 
bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_pendant_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`\nDo not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `affordance_exposure` metric for `kitchen_bistro_table_0` with affordance `supportable`.\nRelevant context: kitchen_dining_chair_0, kitchen_dining_chair_1.\nCheck whether the usable top surface is visible enough to judge placement or reach. Do not require all table edges to be exposed.\nIf you choose `degraded` or `fail`, name at least one direct blocker or the directly constrained functional region. If no direct blocker is visible, prefer `pass` or `unknown`.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/affordance_exposure__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1_local.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/affordance_exposure__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1_local_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "affordance_exposure__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1",
  "metric": "affordance_exposure",
  "label": "unknown",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.67,
  "evidence_conflict": true,
  "reason": "In the local render and the highlighted local_context render, the subject bistro table’s round top is visually visible, but the table appears tipped onto its side rather than upright. That directly conflicts with the provided asset fact that the usable affordance is the table’s supportable top surface, so I cannot reliably judge the intended top-surface exposure as a normal usable support surface from this evidence alone; returning unknown due to the asset-fact conflict rather than a visibility blocker.",
  "blocking_objects": [],
  "confidence": 0.88
}
Evidence
affordance_exposure__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1
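
The prompt above fixes the output keys and the four allowed labels, so a result can be checked mechanically before it is counted. Below is a minimal sketch of such a check, not the benchmark's own code: the function name `validate_result` is hypothetical, and the `reason` string in the example is abbreviated from the Result JSON above.

```python
# Hypothetical validator for a single check result; the key list and the
# label set are copied from the prompt's "Output JSON with keys" and
# "Allowed labels" lines. Everything else is an illustrative assumption.
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used", "asset_fact_confidence",
    "evidence_conflict", "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(result: dict) -> list:
    """Return a list of problems; an empty list means the result is well-formed."""
    problems = []
    missing = REQUIRED_KEYS - set(result.keys())
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append("label not allowed: " + repr(label))
    # The prompt requires degraded/fail verdicts to name a blocker or a
    # constrained region in the reason; only non-emptiness is checkable here.
    if label in {"degraded", "fail"} and not str(result.get("reason", "")).strip():
        problems.append("degraded/fail result has an empty reason")
    return problems


example = {
    "check_id": "affordance_exposure__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1",
    "metric": "affordance_exposure",
    "label": "unknown",
    "asset_fact_used": True,
    "asset_fact_confidence": 0.67,
    "evidence_conflict": True,
    "reason": "table appears tipped onto its side (abbreviated)",
    "blocking_objects": [],
    "confidence": 0.88,
}
print(validate_result(example))
```

A check like this only confirms the shape of the answer. Whether the named blocker is really visible in the local renders still has to be judged from the images.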

Judgement
chair · sittable, supportable, pushable, liftable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `spatial_accessibility` metric for `kitchen_dining_chair_0` with affordance `sittable`. Relevant context: the surrounding scene. Check whether a person can physically walk up to the seating edge and sit from the front or from one genuinely open side. Require one clearly usable approach corridor from open room space all the way to the seat edge itself, not merely to the furniture cluster or to an outer corner of the seat. A standard coffee table in front of a sofa or loveseat is normal unless the local renders show that it closes off the usable seating edge. Consider the combined effect of nearby furniture on the front and side approach zones only when that pinch is visually evident in the local renders. Close adjacency to a bed or wall is acceptable only if one normal sitting approach remains. If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.
Reason: In the local render and the highlighted local_context render, kitchen_dining_chair_0 appears tipped/rotated onto its side and interpenetrating the nearby bistro table base rather than standing as a normal usable chair. The overview renders show the same abnormal pose in the corner. This directly conflicts with the converter-side asset fact that the subject remains a sittable chair with a usable seating edge, so I cannot reliably judge a normal walk-up seating approach for the intended affordance from the current evidence without guessing the functional state of the object itself.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: spatial_accessibility
Subject object: kitchen_dining_chair_0
Target objects: none

Verified asset facts:
- subject `kitchen_dining_chair_0`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`
- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dining_chair_1`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_eucalyptus_sprig_0_f0_1`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `spatial_accessibility` metric for `kitchen_dining_chair_0` with affordance `sittable`.
Relevant context: the surrounding scene.
Check whether a person can physically walk up to the seating edge and sit from the front or from one genuinely open side. Require one clearly usable approach corridor from open room space all the way to the seat edge itself, not merely to the furniture cluster or to an outer corner of the seat. A standard coffee table in front of a sofa or loveseat is normal unless the local renders show that it closes off the usable seating edge. Consider the combined effect of nearby furniture on the front and side approach zones only when that pinch is visually evident in the local renders. Close adjacency to a bed or wall is acceptable only if one normal sitting approach remains.
If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "spatial_accessibility__kitchen_dining_chair_0",
  "metric": "spatial_accessibility",
  "subject_id": "kitchen_dining_chair_0",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: spatial_accessibility\nSubject object: kitchen_dining_chair_0\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_dining_chair_0`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`\n- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dining_chair_1`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_eucalyptus_sprig_0_f0_1`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\nDo not estimate exact centimeter clearance from these coarse cues. 
They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `spatial_accessibility` metric for `kitchen_dining_chair_0` with affordance `sittable`.\nRelevant context: the surrounding scene.\nCheck whether a person can physically walk up to the seating edge and sit from the front or from one genuinely open side. Require one clearly usable approach corridor from open room space all the way to the seat edge itself, not merely to the furniture cluster or to an outer corner of the seat. A standard coffee table in front of a sofa or loveseat is normal unless the local renders show that it closes off the usable seating edge. Consider the combined effect of nearby furniture on the front and side approach zones only when that pinch is visually evident in the local renders. Close adjacency to a bed or wall is acceptable only if one normal sitting approach remains.\nIf you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_dining_chair_0_local.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_dining_chair_0_local_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "spatial_accessibility__kitchen_dining_chair_0",
  "metric": "spatial_accessibility",
  "label": "unknown",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.67,
  "evidence_conflict": true,
  "reason": "In the local render and the highlighted local_context render, kitchen_dining_chair_0 appears tipped/rotated onto its side and interpenetrating the nearby bistro table base rather than standing as a normal usable chair. The overview renders show the same abnormal pose in the corner. This directly conflicts with the converter-side asset fact that the subject remains a sittable chair with a usable seating edge, so I cannot reliably judge a normal walk-up seating approach for the intended affordance from the current evidence without guessing the functional state of the object itself.",
  "blocking_objects": [],
  "confidence": 0.93
}
Evidence
spatial_accessibility__kitchen_dining_chair_0
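
The check ids and local render file names in the two requests above appear to follow one naming pattern: the metric, then the subject id, then an optional `__to__` segment listing the targets. The sketch below reproduces that pattern as inferred from these two examples only; the helper names `build_check_id` and `local_render_names` are hypothetical, and the actual generator may differ.

```python
# Naming pattern inferred from the request JSON shown on this page;
# this is an assumption, not the benchmark's documented behaviour.
def build_check_id(metric, subject_id, target_ids=()):
    """Join metric and subject with '__'; append '__to__<targets>' if any."""
    check_id = metric + "__" + subject_id
    if target_ids:
        check_id += "__to__" + "__".join(target_ids)
    return check_id


def local_render_names(check_id, with_context=True):
    """File names of the local check renders, as seen under render_bundle/checks/."""
    names = [check_id + "_local.png"]
    if with_context:
        names.append(check_id + "_local_context.png")
    return names


chair_check = build_check_id("spatial_accessibility", "kitchen_dining_chair_0")
table_check = build_check_id(
    "affordance_exposure",
    "kitchen_bistro_table_0",
    ["kitchen_dining_chair_0", "kitchen_dining_chair_1"],
)
print(chair_check)
print(local_render_names(chair_check))
print(table_check)
```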

Judgement
chair · sittable, supportable, pushable, liftable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `spatial_accessibility` metric for `kitchen_dining_chair_1` with affordance `sittable`. Relevant context: the surrounding scene. Check whether a person can physically walk up to the seating edge and sit from the front or from one genuinely open side. Require one clearly usable approach corridor from open room space all the way to the seat edge itself, not merely to the furniture cluster or to an outer corner of the seat. A standard coffee table in front of a sofa or loveseat is normal unless the local renders show that it closes off the usable seating edge. Consider the combined effect of nearby furniture on the front and side approach zones only when that pinch is visually evident in the local renders. Close adjacency to a bed or wall is acceptable only if one normal sitting approach remains. If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.
Reason: In the local render, kitchen_dining_chair_1 is overturned and trapped under/against the nearby bistro table, with the round tabletop/base and table support crossing over the chair’s seating area and approach side. This means there is no normal walk-up corridor to the seat edge and no practical front or side sitting approach. The overview renders also show this furniture cluster jammed into the corner by the walls, confirming the chair is not realistically accessible for sitting now. The direct blocker visible in the local render is the overturned bistro table assembly around the chair, with the corner walls further eliminating alternate approach space around the cluster. This judges only current spatial accessibility, not the chair’s semantic affordance class itself.
Blocking Objects: bistro_table, wall/corner enclosure
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: spatial_accessibility
Subject object: kitchen_dining_chair_1
Target objects: none

Verified asset facts:
- subject `kitchen_dining_chair_1`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`
- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_dining_chair_0`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_eucalyptus_sprig_0_f0_0`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `spatial_accessibility` metric for `kitchen_dining_chair_1` with affordance `sittable`.
Relevant context: the surrounding scene.
Check whether a person can physically walk up to the seating edge and sit from the front or from one genuinely open side. Require one clearly usable approach corridor from open room space all the way to the seat edge itself, not merely to the furniture cluster or to an outer corner of the seat. A standard coffee table in front of a sofa or loveseat is normal unless the local renders show that it closes off the usable seating edge. Consider the combined effect of nearby furniture on the front and side approach zones only when that pinch is visually evident in the local renders. Close adjacency to a bed or wall is acceptable only if one normal sitting approach remains.
If you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "spatial_accessibility__kitchen_dining_chair_1",
  "metric": "spatial_accessibility",
  "subject_id": "kitchen_dining_chair_1",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: spatial_accessibility\nSubject object: kitchen_dining_chair_1\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_dining_chair_1`: category `chair`, affordances [sittable, supportable, pushable, liftable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`\n- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `front: [sit]; top: [place_or_reach_objects]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], affordances [sittable, supportable, pushable, liftable], task targets [desk, table]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_dining_chair_0`: category `chair`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_eucalyptus_sprig_0_f0_0`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_eucalyptus_sprig_0_f0_2`: category `eucalyptus_sprig_f0`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\nDo not estimate exact centimeter clearance from these coarse cues. 
They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `spatial_accessibility` metric for `kitchen_dining_chair_1` with affordance `sittable`.\nRelevant context: the surrounding scene.\nCheck whether a person can physically walk up to the seating edge and sit from the front or from one genuinely open side. Require one clearly usable approach corridor from open room space all the way to the seat edge itself, not merely to the furniture cluster or to an outer corner of the seat. A standard coffee table in front of a sofa or loveseat is normal unless the local renders show that it closes off the usable seating edge. Consider the combined effect of nearby furniture on the front and side approach zones only when that pinch is visually evident in the local renders. Close adjacency to a bed or wall is acceptable only if one normal sitting approach remains.\nIf you choose `degraded` or `fail`, name the blocker or the directly constrained approach or sit/stand zone. Consider the combined effect of nearby furniture only when the local renders clearly show that the front and side access are genuinely pinched or eliminated. A standard coffee table in front of a sofa or loveseat is normal unless the local evidence shows it closes off the usable seating edge. Do not infer exact centimeter clearance from auxiliary geometry or bounding-box numbers.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_dining_chair_1_local.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "spatial_accessibility__kitchen_dining_chair_1",
  "metric": "spatial_accessibility",
  "label": "fail",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.67,
  "evidence_conflict": false,
  "reason": "In the local render, kitchen_dining_chair_1 is overturned and trapped under/against the nearby bistro table, with the round tabletop/base and table support crossing over the chair’s seating area and approach side. This means there is no normal walk-up corridor to the seat edge and no practical front or side sitting approach. The overview renders also show this furniture cluster jammed into the corner by the walls, confirming the chair is not realistically accessible for sitting now. The direct blocker visible in the local render is the overturned bistro table assembly around the chair, with the corner walls further eliminating alternate approach space around the cluster. This judges only current spatial accessibility, not the chair’s semantic affordance class itself.",
  "blocking_objects": [
    "bistro_table",
    "wall/corner enclosure"
  ],
  "confidence": 0.96
}
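The result above can be checked mechanically against the output contract stated in the prompt: the nine required keys, the four allowed labels, and the rule that a `degraded` or `fail` verdict must carry a reason. The following is a minimal sketch of such a check. `validate_result` is a hypothetical helper, not part of the benchmark's code, and a non-empty `reason` is only a rough proxy for "names a direct blocker or directly constrained functional region".

```python
# Output contract as listed in the prompt ("Output JSON with keys: ...").
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used", "asset_fact_confidence",
    "evidence_conflict", "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(result):
    """Return a list of contract violations; an empty list means none found.

    Hypothetical helper for illustration only.
    """
    problems = []

    missing = REQUIRED_KEYS - set(result)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append(f"label not allowed: {label!r}")

    # Assumption: we can only test that a reason exists, not that it
    # actually names a blocker or a constrained functional region.
    if label in {"degraded", "fail"}:
        reason = result.get("reason")
        if not isinstance(reason, str) or not reason.strip():
            problems.append("degraded/fail result has no reason")

    return problems
```

Applied to the `spatial_accessibility__kitchen_dining_chair_1` result, this sketch would report no violations: all nine keys are present, the label is `fail`, and the reason is non-empty.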
Evidence
spatial_accessibility__kitchen_dining_chair_1

Judgement
refrigerator · openable, containable
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `spatial_accessibility` metric for `kitchen_fridge_0` with affordance `openable`. Relevant context: the surrounding scene. Check whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path. If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.
Reason: Consistency rewrite from fail: the paired openable interaction_clearance check passed for kitchen_fridge_0, so the local evidence supports a usable standing/operating zone. Original spatial_accessibility reason: In the local_plan_context render, the green kitchen prep cart sits directly across the fridge’s front operating side, leaving no normal front-facing standing area at the usable face. The local_operating_zone_context and local_depth_probe renders also show the prep cart tightly abutting the same side of the fridge, so the approach is blocked rather than merely overlapping in perspective. A practical path to stand and open the refrigerator is not available now due to kitchen_kitchen_prep_cart_0 occupying the front approach zone.
Blocking Objects: none
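The reason above records a consistency rewrite: the original `spatial_accessibility` verdict was `fail`, but the paired `interaction_clearance` check passed, so the label was changed to `pass` and the blocker list was emptied. The report does not show the rule that performs this. The sketch below is one plausible reading, inferred only from the reason text; `consistency_rewrite` and its arguments are hypothetical names.

```python
def consistency_rewrite(access_result, clearance_label):
    """Rewrite a failed accessibility result when the paired clearance check passed.

    Assumption: the rewrite fires only for fail -> pass, keeps the original
    reason as an audit trail, and clears blocking_objects. Other fields,
    including confidence, are left unchanged because the source does not
    say how they are adjusted.
    """
    if access_result.get("label") != "fail" or clearance_label != "pass":
        return access_result  # nothing to reconcile

    rewritten = dict(access_result)  # shallow copy; do not mutate the input
    rewritten["label"] = "pass"
    rewritten["blocking_objects"] = []
    rewritten["reason"] = (
        "Consistency rewrite from fail: the paired interaction_clearance "
        "check passed. Original spatial_accessibility reason: "
        + access_result.get("reason", "")
    )
    return rewritten
```

Keeping the original reason inside the rewritten one, as the report does, makes the override auditable: a reader can still see that the VLM originally named `kitchen_kitchen_prep_cart_0` as the blocker.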
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: spatial_accessibility
Subject object: kitchen_fridge_0
Target objects: none

Verified asset facts:
- subject `kitchen_fridge_0`: no offline asset YAML loaded; heuristic category `refrigerator`, affordances [openable, containable], usable face `front`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_fridge_0`: category `refrigerator`, group `appliance_storage`, keywords [refrigerator, fridge, cooling cabinet], affordances [openable, containable], task targets [wall, clear_space]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_fridge_0`: category `refrigerator`, room `kitchen`, placement `floor`, footprint `large`, height `mid`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_semi_flush_ceiling_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `mid`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_canister_jar_0`: category `jar`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`
- `kitchen_kitchen_towel_0`: category `towel`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`
- `kitchen_teaspoon_0`: category `teaspoon_teaspoon`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`
- `kitchen_chefs_knife_0_s1_1`: category `chefs_knife_s1`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `small`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `spatial_accessibility` metric for `kitchen_fridge_0` with affordance `openable`.
Relevant context: the surrounding scene.
Check whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path.
If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "spatial_accessibility__kitchen_fridge_0",
  "metric": "spatial_accessibility",
  "subject_id": "kitchen_fridge_0",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: spatial_accessibility\nSubject object: kitchen_fridge_0\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_fridge_0`: no offline asset YAML loaded; heuristic category `refrigerator`, affordances [openable, containable], usable face `front`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_fridge_0`: category `refrigerator`, group `appliance_storage`, keywords [refrigerator, fridge, cooling cabinet], affordances [openable, containable], task targets [wall, clear_space]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_fridge_0`: category `refrigerator`, room `kitchen`, placement `floor`, footprint `large`, height `mid`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_semi_flush_ceiling_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `mid`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_canister_jar_0`: category `jar`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`\n- `kitchen_kitchen_towel_0`: category `towel`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`\n- `kitchen_teaspoon_0`: category `teaspoon_teaspoon`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `tiny`\n- `kitchen_chefs_knife_0_s1_1`: category `chefs_knife_s1`, placement `floor`, proximity `mid`, bbox_relation `nearby`, footprint `small`\nDo not estimate exact centimeter clearance from these coarse cues. 
They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `spatial_accessibility` metric for `kitchen_fridge_0` with affordance `openable`.\nRelevant context: the surrounding scene.\nCheck whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path.\nIf you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_fridge_0_local_operating_zone.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_fridge_0_local_operating_zone_context.png",
    "render_check_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_fridge_0_local_depth_probe.png",
    "render_check_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_fridge_0_local_plan_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
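The prompt states that local check renders always come first, followed by the overview renders. Given a `files` mapping keyed as in the request above (`render_check_image`, `render_check_image_01`, ..., then `render_overview_image`, ...), that ordering can be reproduced as sketched below. `ordered_images` is a hypothetical helper, and sorting by key name is an assumption about how the suffixes encode order.

```python
def ordered_images(files):
    """Return image paths with local check renders first, then overviews.

    Assumption: within each group, lexicographic key order matches the
    intended order (the unsuffixed key sorts before `_01`, `_02`, ...).
    """
    checks = sorted(k for k in files if k.startswith("render_check_image"))
    overviews = sorted(k for k in files if k.startswith("render_overview_image"))
    return [files[k] for k in checks + overviews]
```

For the fridge request this would yield the four local renders (`local_operating_zone`, `local_operating_zone_context`, `local_depth_probe`, `local_plan_context`) followed by `view_00_rgb.png` through `view_03_rgb.png`.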
Result JSON
{
  "check_id": "spatial_accessibility__kitchen_fridge_0",
  "metric": "spatial_accessibility",
  "label": "pass",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.86,
  "evidence_conflict": false,
  "reason": "Consistency rewrite from fail: the paired openable interaction_clearance check passed for kitchen_fridge_0, so the local evidence supports a usable standing/operating zone. Original spatial_accessibility reason: In the local_plan_context render, the green kitchen prep cart sits directly across the fridge’s front operating side, leaving no normal front-facing standing area at the usable face. The local_operating_zone_context and local_depth_probe renders also show the prep cart tightly abutting the same side of the fridge, so the approach is blocked rather than merely overlapping in perspective. A practical path to stand and open the refrigerator is not available now due to kitchen_kitchen_prep_cart_0 occupying the front approach zone.",
  "blocking_objects": [],
  "confidence": 0.85
}
Evidence
spatial_accessibility__kitchen_fridge_0
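
The result above was relabelled from `fail` to `pass` by a consistency step that defers to the paired `interaction_clearance` check. The pipeline's actual rule is not shown on this page, so the following is only a minimal sketch inferred from the wording of the `reason` field. The function name `apply_consistency_rewrite` and its parameters are hypothetical.

```python
from copy import deepcopy


def apply_consistency_rewrite(result: dict, paired_label: str, subject_id: str) -> dict:
    """Hypothetical rule: flip a spatial_accessibility 'fail' to 'pass' when the
    paired openable interaction_clearance check passed for the same subject."""
    if result["metric"] != "spatial_accessibility":
        return result
    if result["label"] != "fail" or paired_label != "pass":
        return result

    rewritten = deepcopy(result)  # leave the original judgement untouched
    rewritten["label"] = "pass"
    rewritten["blocking_objects"] = []  # a passing check lists no blockers
    # Keep the original reason so the overridden judgement stays auditable.
    rewritten["reason"] = (
        "Consistency rewrite from fail: the paired openable interaction_clearance "
        f"check passed for {subject_id}, so the local evidence supports a usable "
        "standing/operating zone. Original spatial_accessibility reason: "
        f"{result['reason']}"
    )
    return rewritten
```

Because the original reason is preserved verbatim, a rewritten record can still describe a blocked approach while carrying a `pass` label, as in the fridge result above. Such records are worth reviewing by hand.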

Judgement
coffee table · supportable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `spatial_accessibility` metric for `kitchen_bistro_table_0` with affordance `supportable`. Relevant context: kitchen_dining_chair_0, kitchen_dining_chair_1. Check whether a person can approach the room-facing side and reach the usable top surface. Do not require circulation on all sides. If you choose `degraded` or `fail`, name at least one direct blocker or the directly constrained functional region. If no direct blocker is visible, prefer `pass` or `unknown`.
Reason: In the local render and the highlighted local_context render, the subject bistro table appears tipped onto its side rather than standing upright, with the round top oriented vertically and the chairs tangled against it. That directly conflicts with the provided asset fact that the table's usable affordance is the supportable top surface, so I cannot reliably judge whether a person can normally approach the room-facing side and reach a usable top surface in the current scene. The overview renders show the same overturned arrangement in the corner, but they do not resolve this asset-fact conflict into a normal accessible tabletop setup.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: spatial_accessibility
Subject object: kitchen_bistro_table_0
Target objects: kitchen_dining_chair_0, kitchen_dining_chair_1

Verified asset facts:
- subject `kitchen_bistro_table_0`: category `coffee table`, affordances [supportable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`
- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `top: [place_or_reach_objects]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_bistro_table_0`: category `coffee table`, group `unknown`, keywords [coffee table], affordances [supportable], task targets [chair, refrigerator, room, graspable_object]
- target `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]
- target `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_bistro_table_0`: category `coffee table`, room `kitchen`, placement `floor`, footprint `medium`, height `mid`, elevation `floor_level`
Target cues:
- target `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`
- target `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`
Nearby same-room context objects that may affect access or clearance:
- `kitchen_coffee_mug_0`: category `mug`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_coffee_mug_1`: category `mug`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_pendant_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `spatial_accessibility` metric for `kitchen_bistro_table_0` with affordance `supportable`.
Relevant context: kitchen_dining_chair_0, kitchen_dining_chair_1.
Check whether a person can approach the room-facing side and reach the usable top surface. Do not require circulation on all sides.
If you choose `degraded` or `fail`, name at least one direct blocker or the directly constrained functional region. If no direct blocker is visible, prefer `pass` or `unknown`.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "spatial_accessibility__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1",
  "metric": "spatial_accessibility",
  "subject_id": "kitchen_bistro_table_0",
  "target_ids": [
    "kitchen_dining_chair_0",
    "kitchen_dining_chair_1"
  ],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local` is the main clean RGB local inspection view. 2. `local_context` is a highlighted context view; subject is red and targets are green.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: spatial_accessibility\nSubject object: kitchen_bistro_table_0\nTarget objects: kitchen_dining_chair_0, kitchen_dining_chair_1\n\nVerified asset facts:\n- subject `kitchen_bistro_table_0`: category `coffee table`, affordances [supportable], usable face `top`, benchmark relevance `functional`, confidence `0.67`, source `asset_annotation`\n- interaction facts: placement class `floor_furniture`, access type `none`, interaction height `none`, surface map `top: [place_or_reach_objects]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_bistro_table_0`: category `coffee table`, group `unknown`, keywords [coffee table], affordances [supportable], task targets [chair, refrigerator, room, graspable_object]\n- target `kitchen_dining_chair_0`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]\n- target `kitchen_dining_chair_1`: category `chair`, group `seating`, keywords [chair, side chair, dining chair, seating], task targets [desk, table]\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_bistro_table_0`: category `coffee table`, room `kitchen`, placement `floor`, footprint `medium`, height `mid`, elevation `floor_level`\nTarget cues:\n- target `kitchen_dining_chair_0`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`\n- target `kitchen_dining_chair_1`: category `chair`, room `kitchen`, placement `floor`, footprint `medium`, height `low`, elevation `floor_level`, subject_proximity `near`\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_coffee_mug_0`: category `mug`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_coffee_mug_1`: category `mug`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_croissant_pastry_0_s0_1`: category `croissant_pastry_s0`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_dessert_plate_0_s0_0`: category `plate`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_bud_vase_0_f0_c`: category `vase`, placement `floor`, proximity `near`, 
bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_pendant_light_0`: category `pathway light`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`\nDo not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `spatial_accessibility` metric for `kitchen_bistro_table_0` with affordance `supportable`.\nRelevant context: kitchen_dining_chair_0, kitchen_dining_chair_1.\nCheck whether a person can approach the room-facing side and reach the usable top surface. Do not require circulation on all sides.\nIf you choose `degraded` or `fail`, name at least one direct blocker or the directly constrained functional region. If no direct blocker is visible, prefer `pass` or `unknown`.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1_local.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1_local_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "spatial_accessibility__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1",
  "metric": "spatial_accessibility",
  "label": "unknown",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.67,
  "evidence_conflict": true,
  "reason": "In the local render and the highlighted local_context render, the subject bistro table appears tipped onto its side rather than standing upright, with the round top oriented vertically and the chairs tangled against it. That directly conflicts with the provided asset fact that the table's usable affordance is the supportable top surface, so I cannot reliably judge whether a person can normally approach the room-facing side and reach a usable top surface in the current scene. The overview renders show the same overturned arrangement in the corner, but they do not resolve this asset-fact conflict into a normal accessible tabletop setup.",
  "blocking_objects": [],
  "confidence": 0.88
}
Evidence
spatial_accessibility__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1
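
The prompt fixes the output keys and the allowed labels, so a result record can be checked mechanically before it is scored. The sketch below is one possible validator, not the benchmark's own code. The name `validate_result` is hypothetical, and the `[0, 1]` range for `confidence` is an assumption based on the values seen in these results.

```python
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used", "asset_fact_confidence",
    "evidence_conflict", "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(result: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the record looks valid."""
    problems = []

    missing = REQUIRED_KEYS - result.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append(f"label not allowed: {label!r}")

    # The prompt requires degraded/fail reasons to name a blocker or constrained
    # region; only a non-empty reason can be checked mechanically here.
    if label in {"degraded", "fail"} and not str(result.get("reason", "")).strip():
        problems.append("degraded/fail requires a non-empty reason")

    confidence = result.get("confidence")
    # Assumed range: confidence is treated as a probability-like value in [0, 1].
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence out of range: {confidence!r}")

    return problems


# Usage with the bistro table result above (reason abbreviated).
bistro_result = {
    "check_id": "spatial_accessibility__kitchen_bistro_table_0__to__kitchen_dining_chair_0__kitchen_dining_chair_1",
    "metric": "spatial_accessibility",
    "label": "unknown",
    "asset_fact_used": True,
    "asset_fact_confidence": 0.67,
    "evidence_conflict": True,
    "reason": "The table appears tipped onto its side, conflicting with the asset fact.",
    "blocking_objects": [],
    "confidence": 0.88,
}
print(validate_result(bistro_result))
```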

Judgement
cabinet · openable, containable, supportable
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `spatial_accessibility` metric for `kitchen_base_cabinet_run_0` with affordance `openable`. Relevant context: the surrounding scene. Check whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path. If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.
Reason: In the local operating zone render and the highlighted local operating zone context view, the subject cabinet run’s usable front side is immediately faced by an adjacent blue cabinet run/island cabinet, leaving only a very tight slit rather than a normal standing approach. The local depth probe and especially the local plan context view confirm that this blue cabinet spans most of the operating side directly in front of the red-highlighted subject, so there is no practical front-facing or slightly offset position from which to reach and open the cabinet normally now. The overview renders are consistent with this pinched aisle condition rather than showing an alternate accessible front approach along the usable side of the subject cabinet run.
Blocking Objects: adjacent blue cabinet run/island cabinet
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: spatial_accessibility
Subject object: kitchen_base_cabinet_run_0
Target objects: none

Verified asset facts:
- subject `kitchen_base_cabinet_run_0`: no offline asset YAML loaded; heuristic category `cabinet`, affordances [openable, containable, supportable], usable face `top`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], affordances [openable, containable, supportable], task targets [wall, clear_space]
- targets: none

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`
Target cues:
- none
Nearby same-room context objects that may affect access or clearance:
- `kitchen_hand_soap_dispenser_0`: category `soap dispenser`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`
- `kitchen_dish_drying_mat_0`: category `place mat`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
- `kitchen_dish_soap_pump_bottle_0`: category `bottle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_kitchen_sponge_scrubber_0`: category `sponge_scrubber_sponge`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_electric_kettle_1`: category `kettle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `spatial_accessibility` metric for `kitchen_base_cabinet_run_0` with affordance `openable`.
Relevant context: the surrounding scene.
Check whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path.
If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "spatial_accessibility__kitchen_base_cabinet_run_0",
  "metric": "spatial_accessibility",
  "subject_id": "kitchen_base_cabinet_run_0",
  "target_ids": [],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: spatial_accessibility\nSubject object: kitchen_base_cabinet_run_0\nTarget objects: none\n\nVerified asset facts:\n- subject `kitchen_base_cabinet_run_0`: no offline asset YAML loaded; heuristic category `cabinet`, affordances [openable, containable, supportable], usable face `top`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. 
If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], affordances [openable, containable, supportable], task targets [wall, clear_space]\n- targets: none\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`\nTarget cues:\n- none\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_hand_soap_dispenser_0`: category `soap dispenser`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_kitchen_prep_cart_0`: category `serving cart`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `medium`\n- `kitchen_dish_drying_mat_0`: category `place mat`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\n- `kitchen_dish_soap_pump_bottle_0`: category `bottle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_kitchen_sponge_scrubber_0`: category `sponge_scrubber_sponge`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `tiny`\n- `kitchen_electric_kettle_1`: category `kettle`, placement `floor`, proximity `near`, bbox_relation `overlapping_or_touching`, footprint `small`\nDo not estimate exact centimeter clearance from these coarse cues. 
They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `spatial_accessibility` metric for `kitchen_base_cabinet_run_0` with affordance `openable`.\nRelevant context: the surrounding scene.\nCheck whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path.\nIf you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. 
If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_base_cabinet_run_0_local_operating_zone.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_base_cabinet_run_0_local_operating_zone_context.png",
    "render_check_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_base_cabinet_run_0_local_depth_probe.png",
    "render_check_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_base_cabinet_run_0_local_plan_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
Result JSON
{
  "check_id": "spatial_accessibility__kitchen_base_cabinet_run_0",
  "metric": "spatial_accessibility",
  "label": "fail",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.87,
  "evidence_conflict": false,
  "reason": "In the local operating zone render and the highlighted local operating zone context view, the subject cabinet run’s usable front side is immediately faced by an adjacent blue cabinet run/island cabinet, leaving only a very tight slit rather than a normal standing approach. The local depth probe and especially the local plan context view confirm that this blue cabinet spans most of the operating side directly in front of the red-highlighted subject, so there is no practical front-facing or slightly offset position from which to reach and open the cabinet normally now. The overview renders are consistent with this pinched aisle condition rather than showing an alternate accessible front approach along the usable side of the subject cabinet run.",
  "blocking_objects": [
    "adjacent blue cabinet run/island cabinet"
  ],
  "confidence": 0.89
}
Evidence
spatial_accessibility__kitchen_base_cabinet_run_0

Judgement
cocktail shaker · graspable, containable, openable VLM
Question
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby. Evaluate only the `spatial_accessibility` metric for `kitchen_pepper_shaker_0` with affordance `openable`. Relevant context: kitchen_base_cabinet_run_0. Check whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path. If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.
Reason: In the local_plan_context render, the red-highlighted shaker sits on the wall shelf with open room-side space directly in front of its usable side; the nearby mug, salt shaker, frame, and plant are adjacent on the shelf but do not occupy the standing/reach zone in front of it. In the overview renders, there is still a practical approach from the room side toward that wall shelf area, and no local view shows the operating side pinched or blocked across its front-facing access region. Therefore at least one normal front or slightly offset reach position remains usable now.
Blocking Objects: none
Prompt
You are an affordance-aware functional critic for 3D indoor scenes.

Core principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.

Workflow:
1. Read the task instruction and target metric.
2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.
3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.
4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.
5. Evaluate ONLY the requested metric and return a JSON object matching the schema.

Scene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. Room shell: 3.80m x 3.40m x 2.70m.
Task instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Metric: spatial_accessibility
Subject object: kitchen_pepper_shaker_0
Target objects: kitchen_base_cabinet_run_0

Verified asset facts:
- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, affordances [graspable, containable, openable], usable face `front`, benchmark relevance `functional`, confidence `0.58`, source `asset_annotation`
- interaction facts: placement class `unknown`, access type `none`, interaction height `none`, surface map `front: [grasp, access_contents, open]`
- low-confidence candidates: `none`
Use these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.

Scene evidence object hints:
- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, group `unknown`, keywords [cocktail shaker], affordances [graspable, containable, openable], task targets [table, desk, nightstand, coffee_table, shelf, cabinet]
- target `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], task targets [wall, clear_space]

Auxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):
- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, room `kitchen`, placement `floor`, footprint `tiny`, height `very_low`, elevation `high`
Target cues:
- target `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`, subject_proximity `far`
Nearby same-room context objects that may affect access or clearance:
- `kitchen_salt_shaker_0`: category `cocktail shaker`, placement `floor`, proximity `touching`, bbox_relation `overlapping_or_touching`, footprint `tiny`
- `kitchen_print_3`: category `print_print`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `small`
- `kitchen_floating_wall_shelf_0`: category `wall shelf`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `medium`
- `kitchen_picture_frame_0`: category `picture frame`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `small`
- `kitchen_coffee_mug_3`: category `mug`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `tiny`
- `kitchen_potted_basil_herb_plant_0`: category `potted plant`, placement `floor`, proximity `near`, bbox_relation `nearby`, footprint `tiny`
Do not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.

Metric-specific rule:
Task: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.
Evaluate only the `spatial_accessibility` metric for `kitchen_pepper_shaker_0` with affordance `openable`.
Relevant context: kitchen_base_cabinet_run_0.
Check whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path.
If you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.

Image order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.
Evidence guardrails:
- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.
- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.
- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.
- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.
- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.
Output JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.
Allowed labels: pass, degraded, fail, unknown.
2D perspective overlap alone is not enough evidence of a blocker.
Normal adjacency priors should not override direct evidence in the local renders. Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.
If you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.
If evidence is insufficient, use unknown instead of guessing.
Raw JSON
Request JSON
{
  "check_id": "spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0",
  "metric": "spatial_accessibility",
  "subject_id": "kitchen_pepper_shaker_0",
  "target_ids": [
    "kitchen_base_cabinet_run_0"
  ],
  "prompt_text": "You are an affordance-aware functional critic for 3D indoor scenes.\n\nCore principle: judge whether the task CAN be done in the current scene, without assuming it has already been pre-solved.\n\nWorkflow:\n1. Read the task instruction and target metric.\n2. Inspect the local check renders first and treat them as the primary evidence for the requested interaction.\n3. Local render order and intended use: 1. `local_operating_zone` is the clean RGB view for the most likely door-facing standing and opening zone. 2. `local_operating_zone_context` is a slightly wider highlighted context view of the same operating side. 3. `local_depth_probe` is an oblique RGB view meant to clarify depth and whether a blocker sits in front of the operating zone. 4. `local_plan_context` is a high local layout view used to check whether a nearby object actually spans or occupies the operating/approach zone rather than merely overlapping in perspective.\n4. Then inspect the 4 attached overview renders for room-scale layout and approach context. These overviews are captured from a slightly elevated height and are not meant to mimic adult eye level.\n5. Evaluate ONLY the requested metric and return a JSON object matching the schema.\n\nScene summary: scenesmith scene with 51 objects. Room type: kitchen. Planned checks: 12. 
Room shell: 3.80m x 3.40m x 2.70m.\nTask instruction: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nMetric: spatial_accessibility\nSubject object: kitchen_pepper_shaker_0\nTarget objects: kitchen_base_cabinet_run_0\n\nVerified asset facts:\n- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, affordances [graspable, containable, openable], usable face `front`, benchmark relevance `functional`, confidence `0.58`, source `asset_annotation`\n- interaction facts: placement class `unknown`, access type `none`, interaction height `none`, surface map `front: [grasp, access_contents, open]`\n- low-confidence candidates: `none`\nUse these as converter-side asset facts. Do not reclassify the subject; judge only whether the listed affordance remains usable in the current scene. If the image clearly contradicts these facts, return unknown and explain the conflict.\n\nScene evidence object hints:\n- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, group `unknown`, keywords [cocktail shaker], affordances [graspable, containable, openable], task targets [table, desk, nightstand, coffee_table, shelf, cabinet]\n- target `kitchen_base_cabinet_run_0`: category `cabinet`, group `storage`, keywords [cabinet, storage cabinet, cupboard], task targets [wall, clear_space]\n\nAuxiliary scene cues (coarse geometry only; use these cues just to identify rough scale and nearby context):\n- subject `kitchen_pepper_shaker_0`: category `cocktail shaker`, room `kitchen`, placement `floor`, footprint `tiny`, height `very_low`, elevation `high`\nTarget cues:\n- target `kitchen_base_cabinet_run_0`: category `cabinet`, room `kitchen`, placement `floor`, footprint `large`, height `low`, elevation `floor_level`, subject_proximity `far`\nNearby same-room context objects that may affect access or clearance:\n- `kitchen_salt_shaker_0`: category `cocktail shaker`, placement `floor`, proximity `touching`, bbox_relation 
`overlapping_or_touching`, footprint `tiny`\n- `kitchen_print_3`: category `print_print`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `small`\n- `kitchen_floating_wall_shelf_0`: category `wall shelf`, placement `floor`, proximity `touching`, bbox_relation `very_close`, footprint `medium`\n- `kitchen_picture_frame_0`: category `picture frame`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `small`\n- `kitchen_coffee_mug_3`: category `mug`, placement `floor`, proximity `near`, bbox_relation `very_close`, footprint `tiny`\n- `kitchen_potted_basil_herb_plant_0`: category `potted plant`, placement `floor`, proximity `near`, bbox_relation `nearby`, footprint `tiny`\nDo not estimate exact centimeter clearance from these coarse cues. They are only for object identity, rough scale, and coarse proximity; the local renders remain the deciding evidence.\n\nMetric-specific rule:\nTask: A kitchen featuring a bistro table with two chairs in a corner and a refrigerator positioned nearby.\nEvaluate only the `spatial_accessibility` metric for `kitchen_pepper_shaker_0` with affordance `openable`.\nRelevant context: kitchen_base_cabinet_run_0.\nCheck whether at least one practical front-facing or slightly offset standing position can reach the operating side so the object can be opened. A nearby bed corner or bench should not cause degraded by itself if a normal reach from the front region is still possible. Use degraded only when that front-region approach is directly pinched, occupied, or effectively unusable across the usable operating side. Do not count a cramped diagonal squeeze, shoulder-turned sidestep, or tiny side sliver as a valid approach path.\nIf you choose `degraded` or `fail`, name the blocker or the blocked standing/opening zone directly in front of the operating side. 
If at least one practical front-facing or slightly offset standing position remains and can still reach the operating side normally, prefer `pass`. Do not fail an openable object merely because a nearby seat or small furniture item is offset near one part of the front; require evidence that the usable operating side is actually occupied or pinched. If the operating side cannot be identified from the evidence, return `unknown` instead of guessing.\n\nImage order is fixed: local check renders always come first, followed by the overview renders. If a local render is highlighted, the subject is red and the targets are green.\nEvidence guardrails:\n- Treat the local renders as the deciding evidence for direct blockage, operating-side clearance, and immediate approach space.\n- Use overview renders only to understand room-scale layout or to break ties when the local evidence is incomplete.\n- To output `degraded` or `fail`, the local evidence should explicitly show a directly constrained operating/approach zone, not just a nearby object or a single perspective overlap.\n- When a `local_plan_context` view is present, use it to verify whether the suspected blocker really spans the usable operating/approach zone; if it shows a normal nearby standing/reach position remains, prefer `pass`.\n- If the overview render suggests a blockage but the local evidence does not clearly show that blockage, prefer `pass` or `unknown`, not `degraded` or `fail`.\nOutput JSON with keys: check_id, metric, label, asset_fact_used, asset_fact_confidence, evidence_conflict, reason, blocking_objects, confidence.\nAllowed labels: pass, degraded, fail, unknown.\n2D perspective overlap alone is not enough evidence of a blocker.\nNormal adjacency priors should not override direct evidence in the local renders. 
Nightstands beside beds, benches touching bed ends, corner wardrobes, and wall decor are acceptable only when they do not directly intrude into the required functional region.\nIf you choose degraded or fail, the reason must name a direct blocker or directly constrained functional region.\nIf evidence is insufficient, use unknown instead of guessing.",
  "files": {
    "render_check_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_operating_zone.png",
    "render_check_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_operating_zone_context.png",
    "render_check_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_depth_probe.png",
    "render_check_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/checks/spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0_local_plan_context.png",
    "render_overview_image": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_00_rgb.png",
    "render_overview_image_01": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_01_rgb.png",
    "render_overview_image_02": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_02_rgb.png",
    "render_overview_image_03": "/data/250010098/SceneBenchmark/res/vlm_runs/scenesmith_ud4/scenesmith-example-data_scene_064/render_bundle/views/view_03_rgb.png"
  }
}
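The request records above appear to follow a naming pattern: the `check_id` joins the metric and subject with `__`, adds `__to__` plus the target when one is listed, and each local render file is the `check_id` followed by a view suffix. The sketch below reproduces that pattern as inferred from these two requests only. The helper names `build_check_id` and `local_render_names` are hypothetical, and the handling of more than one target is an assumption, since no multi-target check appears here.

```python
from typing import Dict, List

# View suffixes observed in the "files" maps above, in the fixed local-render order.
LOCAL_VIEWS = [
    "local_operating_zone",
    "local_operating_zone_context",
    "local_depth_probe",
    "local_plan_context",
]


def build_check_id(metric: str, subject_id: str, target_ids: List[str]) -> str:
    """Hypothetical helper: join metric, subject, and optional targets."""
    check_id = f"{metric}__{subject_id}"
    if target_ids:
        # Assumption: several targets would be joined with the same "__" separator.
        check_id += "__to__" + "__".join(target_ids)
    return check_id


def local_render_names(check_id: str) -> Dict[str, str]:
    """Map request keys (render_check_image, render_check_image_01, ...) to file names."""
    names: Dict[str, str] = {}
    for index, view in enumerate(LOCAL_VIEWS):
        key = "render_check_image" if index == 0 else f"render_check_image_{index:02d}"
        names[key] = f"{check_id}_{view}.png"
    return names


if __name__ == "__main__":
    cid = build_check_id(
        "spatial_accessibility",
        "kitchen_pepper_shaker_0",
        ["kitchen_base_cabinet_run_0"],
    )
    for key, name in local_render_names(cid).items():
        print(key, name)
```

Only file names are produced; the directory prefix is left to the caller because it is specific to each run.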
Result JSON
{
  "check_id": "spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0",
  "metric": "spatial_accessibility",
  "label": "pass",
  "asset_fact_used": true,
  "asset_fact_confidence": 0.58,
  "evidence_conflict": false,
  "reason": "In the local_plan_context render, the red-highlighted shaker sits on the wall shelf with open room-side space directly in front of its usable side; the nearby mug, salt shaker, frame, and plant are adjacent on the shelf but do not occupy the standing/reach zone in front of it. In the overview renders, there is still a practical approach from the room side toward that wall shelf area, and no local view shows the operating side pinched or blocked across its front-facing access region. Therefore at least one normal front or slightly offset reach position remains usable now.",
  "blocking_objects": [],
  "confidence": 0.73
}
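The prompt lists the output keys and the allowed labels, and requires a named blocker or constrained region in the reason for `degraded` or `fail`. A result record like the one above could be checked against those rules roughly as follows. This is a minimal sketch, not the benchmark's own validator: `validate_result` is a hypothetical name, and the 0 to 1 range for the confidence fields is an assumption based on the values seen in these records.

```python
import json
from typing import List

# Keys and labels taken from the prompt's "Output JSON with keys" and "Allowed labels" lines.
REQUIRED_KEYS = {
    "check_id", "metric", "label", "asset_fact_used", "asset_fact_confidence",
    "evidence_conflict", "reason", "blocking_objects", "confidence",
}
ALLOWED_LABELS = {"pass", "degraded", "fail", "unknown"}


def validate_result(raw: str) -> List[str]:
    """Return a list of problems found in one result record (empty if none)."""
    result = json.loads(raw)
    problems: List[str] = []

    missing = REQUIRED_KEYS - result.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    label = result.get("label")
    if label not in ALLOWED_LABELS:
        problems.append(f"label not allowed: {label!r}")

    # The prompt asks degraded/fail reasons to name a blocker or constrained region,
    # so an empty reason is flagged; the wording itself is not inspected here.
    if label in {"degraded", "fail"} and not str(result.get("reason", "")).strip():
        problems.append("degraded/fail requires a non-empty reason")

    # Assumption: confidence values are fractions in [0, 1].
    for key in ("confidence", "asset_fact_confidence"):
        value = result.get(key)
        if isinstance(value, (int, float)) and not 0.0 <= value <= 1.0:
            problems.append(f"{key} outside [0, 1]: {value}")

    return problems


if __name__ == "__main__":
    sample = json.dumps({
        "check_id": "spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0",
        "metric": "spatial_accessibility",
        "label": "pass",
        "asset_fact_used": True,
        "asset_fact_confidence": 0.58,
        "evidence_conflict": False,
        "reason": "Open room-side space remains in front of the shelf.",
        "blocking_objects": [],
        "confidence": 0.73,
    })
    print(validate_result(sample))
```

An empty `blocking_objects` list is not treated as an error for `fail`, because the prompt also accepts naming a constrained zone in the reason instead of an object.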
Evidence
spatial_accessibility__kitchen_pepper_shaker_0__to__kitchen_base_cabinet_run_0