sperepa committed on
Commit
d7fb330
·
verified ·
1 Parent(s): f308fae

Upload folder using huggingface_hub

Dockerfile ADDED
@@ -0,0 +1,81 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # Multi-stage build using openenv-base
8
+ # This Dockerfile is flexible and works for both:
9
+ # - In-repo environments (with local OpenEnv sources)
10
+ # - Standalone environments (with openenv from PyPI/Git)
11
+ # The build script (openenv build) handles context detection and sets appropriate build args.
12
+
13
+ ARG BASE_IMAGE=ghcr.io/meta-pytorch/openenv-base:latest
14
+ FROM ${BASE_IMAGE} AS builder
15
+
16
+ WORKDIR /app
17
+
18
+ # Ensure git is available (required for installing dependencies from VCS)
19
+ RUN apt-get update && \
20
+ apt-get install -y --no-install-recommends git && \
21
+ rm -rf /var/lib/apt/lists/*
22
+
23
+ # Build argument to control whether we're building standalone or in-repo
24
+ ARG BUILD_MODE=in-repo
25
+ ARG ENV_NAME=hack_meta
26
+
27
+ # Copy environment code (always at root of build context)
28
+ COPY . /app/env
29
+
30
+ # For in-repo builds, openenv is already vendored in the build context
31
+ # For standalone builds, openenv will be installed via pyproject.toml
32
+ WORKDIR /app/env
33
+
34
+ # Ensure uv is available (for local builds where base image lacks it)
35
+ RUN if ! command -v uv >/dev/null 2>&1; then \
36
+ curl -LsSf https://astral.sh/uv/install.sh | sh && \
37
+ mv /root/.local/bin/uv /usr/local/bin/uv && \
38
+ mv /root/.local/bin/uvx /usr/local/bin/uvx; \
39
+ fi
40
+
41
+ # Install dependencies using uv sync
42
+ # If uv.lock exists, use it; otherwise resolve on the fly
43
+ RUN --mount=type=cache,target=/root/.cache/uv \
44
+ if [ -f uv.lock ]; then \
45
+ uv sync --frozen --no-install-project --no-editable; \
46
+ else \
47
+ uv sync --no-install-project --no-editable; \
48
+ fi
49
+
50
+ RUN --mount=type=cache,target=/root/.cache/uv \
51
+ if [ -f uv.lock ]; then \
52
+ uv sync --frozen --no-editable; \
53
+ else \
54
+ uv sync --no-editable; \
55
+ fi
56
+
57
+ # Final runtime stage
58
+ FROM ${BASE_IMAGE}
59
+
60
+ WORKDIR /app
61
+
62
+ # Copy the virtual environment from builder
63
+ COPY --from=builder /app/env/.venv /app/.venv
64
+
65
+ # Copy the environment code
66
+ COPY --from=builder /app/env /app/env
67
+
68
+ # Set PATH to use the virtual environment
69
+ ENV PATH="/app/.venv/bin:$PATH"
70
+
71
+ # Set PYTHONPATH so imports work correctly
72
+ ENV PYTHONPATH="/app/env:$PYTHONPATH"
73
+
74
+ # Health check
75
+ HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
76
+ CMD curl -f http://localhost:8000/health || exit 1
77
+
78
+ # Run the FastAPI server
79
+ # The module path is constructed to work with the /app/env structure
80
+ ENV ENABLE_WEB_INTERFACE=true
81
+ CMD ["sh", "-c", "cd /app/env && uvicorn server.app:app --host 0.0.0.0 --port 8000"]
README.md CHANGED
@@ -1,10 +1,180 @@
1
  ---
2
- title: Hack Meta
3
- emoji: 🌖
4
- colorFrom: indigo
5
- colorTo: gray
6
  sdk: docker
7
  pinned: false
8
  ---
9
 
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: Disaster Response Coordination Environment
3
+ emoji: 🚨
4
+ colorFrom: red
5
+ colorTo: yellow
6
  sdk: docker
7
  pinned: false
8
+ app_port: 8000
9
+ base_path: /web
10
+ tags:
11
+ - openenv
12
  ---
13
 
14
+ # Disaster Response Coordination Environment
15
+
16
+ This OpenEnv environment simulates an Emergency Operations Center allocating scarce resources across simultaneous disaster targets. The agent must reduce preventable deaths, critical injuries, exposure harm, and infrastructure failure across a ladder of progressively harder scenes.
17
+
18
+ ## Motivation
19
+
20
+ This is a real-world coordination problem rather than a toy task. It evaluates whether an agent can:
21
+
22
+ - prioritize under time pressure
23
+ - reason about vulnerability and deadlines
24
+ - handle mixed rescue and infrastructure triage
25
+ - avoid harmful but superficially plausible actions
26
+
27
+ The environment is designed to provide rich per-step reward while keeping the true harm model hidden from the agent.
28
+
29
+ ## Action Space
30
+
31
+ Each turn the agent submits a `DisasterAction` with zero or more resource assignments:
32
+
33
+ ```json
34
+ {
35
+ "assignments": [
36
+ {"resource_id": "engineering_strike", "target_id": "hospital_power"},
37
+ {"resource_id": "tunnel_rescue", "target_id": "tunnel_train"}
38
+ ]
39
+ }
40
+ ```
41
+
42
+ Constraints:
43
+
44
+ - a resource may be assigned at most once per turn
45
+ - unavailable resources must not be assigned
46
+ - resolved or failed targets should not be assigned
47
+ - assignments with no capability overlap are ineffective and penalized
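The constraints above can be checked on the agent side before an action is submitted. The following is a minimal sketch, assuming plain dictionaries in place of the Pydantic models; `validate_assignments` and its argument shapes are hypothetical and not part of the package.

```python
from typing import Dict, List, Set, Tuple


def validate_assignments(
    assignments: List[Dict[str, str]],
    available_resources: Set[str],
    open_targets: Set[str],
) -> Tuple[List[Dict[str, str]], List[str]]:
    """Split assignments into accepted ones and human-readable rejections.

    Hypothetical helper: the real environment performs its own validation
    and applies penalties server-side.
    """
    accepted: List[Dict[str, str]] = []
    errors: List[str] = []
    used: Set[str] = set()
    for item in assignments:
        rid, tid = item["resource_id"], item["target_id"]
        if rid not in available_resources:
            errors.append(f"resource '{rid}' is unknown or unavailable")
        elif tid not in open_targets:
            errors.append(f"target '{tid}' is unknown, resolved, or failed")
        elif rid in used:
            errors.append(f"resource '{rid}' assigned more than once")
        else:
            used.add(rid)
            accepted.append(item)
    return accepted, errors
```

Capability overlap is deliberately left out of this check, since it depends on the per-target `recommended_capabilities` in the observation.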
48
+
49
+ ## Observation Space
50
+
51
+ The agent sees:
52
+
53
+ - scene id, name, and level
54
+ - narrative briefing
55
+ - visible target state:
56
+ - status
57
+ - estimated people
58
+ - observed risk
59
+ - `critical_now`
60
+ - `priority_band`
61
+ - vulnerability label
62
+ - progress
63
+ - time remaining
64
+ - recommended capabilities
65
+ - visible resource state:
66
+ - capabilities
67
+ - availability
68
+ - remaining uses
69
+ - available-until turn
70
+ - structured feedback from the previous step
71
+
72
+ The latent harm model remains hidden so the policy cannot self-score.
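The visible fields are enough to drive a simple policy. The sketch below pairs each available resource with the most urgent target it shares a capability with, reading only `status`, `critical_now`, `observed_risk`, `recommended_capabilities`, `available`, and `capabilities` from a dictionary-shaped observation. `greedy_assignments` is a hypothetical illustration and should not be read as the repository's `heuristic` baseline.

```python
from typing import Any, Dict, List


def greedy_assignments(observation: Dict[str, Any]) -> List[Dict[str, str]]:
    """Pair each available resource with the most urgent compatible target."""
    # Skip targets that can no longer benefit from an assignment.
    targets = {
        tid: t
        for tid, t in observation["targets"].items()
        if t["status"] not in ("resolved", "failed")
    }
    # Most urgent first: critical targets, then higher observed risk.
    ranked = sorted(
        targets,
        key=lambda tid: (targets[tid]["critical_now"], targets[tid]["observed_risk"]),
        reverse=True,
    )
    assignments: List[Dict[str, str]] = []
    for rid, res in observation["resources"].items():
        if not res["available"]:
            continue
        for tid in ranked:
            needed = set(targets[tid]["recommended_capabilities"])
            if needed & set(res["capabilities"]):
                # Each resource is used at most once per turn.
                assignments.append({"resource_id": rid, "target_id": tid})
                break
    return assignments
```

Because the harm model is hidden, a policy like this can only rank by observed signals; it has no way to confirm that its ranking matches the true harm ordering.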
73
+
74
+ ## Task Ladder
75
+
76
+ The environment contains a genuine easy-to-hard difficulty range:
77
+
78
+ 1. `scene_1`: Flash Flood, Two Rescue Calls, One Boat
79
+ 2. `scene_2`: Flood Rescue vs Medical Transport
80
+ 3. `scene_3`: Building Collapse vs Highway Hazmat Crash
81
+ 4. `scene_4`: Wildfire Suburb vs Nursing Home
82
+ 5. `scene_5`: Hospital Backup Power vs Tunnel Train Entrapment
83
+ 6. `scene_6`: Toxic Plume vs Downtown Office Tower Fire
84
+ 7. `scene_7`: Bridge Collapse During VIP Event Weekend
85
+ 8. `scene_8`: Regional Multi-Disaster with Scarce Air Assets
86
+
87
+ For submission purposes, this exceeds the minimum requirement of three tasks with easy, medium, and hard coverage.
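Scene selection at `reset()` follows a simple precedence: an explicit `scene_id` wins, then `level`, then the default scene. The sketch below mirrors that logic against a stand-in catalog. The one-to-one mapping of levels 1-8 onto `scene_1`..`scene_8` and the choice of `scene_1` as the default are assumptions made for illustration; the real catalog lives in `server/scene_catalog.py`.

```python
from typing import Dict, Optional

# Assumption: levels 1-8 map one-to-one onto scene_1..scene_8.
SCENE_LEVELS: Dict[str, int] = {f"scene_{n}": n for n in range(1, 9)}
DEFAULT_SCENE_ID = "scene_1"  # assumed default, not confirmed by the source


def select_scene(scene_id: Optional[str] = None, level: Optional[int] = None) -> str:
    """Resolve a scene: explicit id wins, then level, then the default."""
    if scene_id:
        if scene_id not in SCENE_LEVELS:
            raise ValueError(f"Unknown scene_id '{scene_id}'")
        return scene_id
    if level is not None:
        for candidate, candidate_level in SCENE_LEVELS.items():
            if candidate_level == level:
                return candidate
        raise ValueError(f"Unknown level '{level}'")
    return DEFAULT_SCENE_ID
```

Raising `ValueError` on an unknown id or level keeps a misconfigured evaluation run from silently falling back to the default scene.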
88
+
89
+ ## Reward And Grading
90
+
91
+ Per-step reward is dense and shaped:
92
+
93
+ - positive reward for reducing latent remaining harm
94
+ - penalties for invalid actions
95
+ - penalties for ineffective assignments
96
+ - penalties for leaving compatible resources idle during critical windows
97
+ - penalties for deadline misses, churn, and failed targets
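The per-step arithmetic can be sketched from `step()` in `server/hack_meta_environment.py`: the change in latent potential is scaled by 1/10, accumulated penalties are subtracted, and the result is rounded to three decimals. The penalty constants below are the ones visible in the validation branch of `step()`; the constant names are hypothetical, and the idle and system-advance penalties computed elsewhere are not reproduced here.

```python
# Penalty sizes as they appear in the validation branch of step().
PENALTY_UNKNOWN_ID = 6.0                 # unknown resource or target
PENALTY_DUPLICATE_OR_UNAVAILABLE = 5.0   # resource reused or unavailable
PENALTY_RESOLVED_TARGET = 3.0            # assignment to a resolved target


def step_reward(prev_potential: float, next_potential: float, penalty: float) -> float:
    """Potential gain scaled by 1/10, minus penalties, rounded to 3 places."""
    return round((next_potential - prev_potential) / 10.0 - penalty, 3)


# A turn that raises potential by 25 but wastes one resource on a resolved target:
wasted_turn = step_reward(40.0, 65.0, PENALTY_RESOLVED_TARGET)
# The same turn without the wasted assignment:
clean_turn = step_reward(40.0, 65.0, 0.0)
```

With these numbers a single ineffective assignment is enough to turn a positive step reward negative, which is why the constraints in the Action Space section matter for learning.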
98
+
99
+ Final evaluation uses a normalized score against a no-op baseline:
100
+
101
+ - `final_score` in `[0, 100]`
102
+ - `grader_score = final_score / 100.0` in `[0.0, 1.0]`
103
+
104
+ This keeps grading deterministic and reproducible while preserving a meaningful learning signal.
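The README does not spell out the normalization formula, so the sketch below shows one plausible form only: the share of no-op baseline harm that the agent avoided, clamped to `[0, 100]`. `normalized_final_score` is an assumption made for illustration and may differ from `_compute_final_score()`; the `grader_score` mapping is the one documented above.

```python
def normalized_final_score(agent_harm: float, baseline_harm: float) -> float:
    """Assumed normalization: share of no-op harm avoided, clamped to [0, 100]."""
    if baseline_harm <= 0.0:
        # Degenerate scene with no preventable harm under the baseline.
        return 100.0 if agent_harm <= 0.0 else 0.0
    avoided = (baseline_harm - agent_harm) / baseline_harm
    return max(0.0, min(100.0, 100.0 * avoided))


def grader_score(final_score: float) -> float:
    """Documented mapping from the 0-100 score to the 0.0-1.0 grader range."""
    return final_score / 100.0
```

Under this assumed form, an agent that does no better than the no-op baseline scores 0, and one that performs worse is clamped at 0 rather than going negative.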
105
+
106
+ ## Baselines
107
+
108
+ The repo-root `inference.py` script supports three baseline policies:
109
+
110
+ - `heuristic`
111
+ - `random`
112
+ - `llm`
113
+
114
+ Recent observed behavior:
115
+
116
+ - strong scenes: `scene_4`, `scene_6`, `scene_7`
117
+ - middling scenes: `scene_2`, `scene_5`
118
+ - weak scenes: `scene_1`, `scene_3`
119
+ - hard-fail scene: `scene_8`
120
+
121
+ ## Validate Locally
122
+
123
+ From this directory:
124
+
125
+ ```powershell
126
+ .\.venv\Scripts\openenv.exe validate
127
+ ```
128
+
129
+ ## Run Locally
130
+
131
+ Run the API locally:
132
+
133
+ ```powershell
134
+ .\.venv\Scripts\python.exe -m server.app
135
+ ```
136
+
137
+ Or:
138
+
139
+ ```powershell
140
+ uvicorn server.app:app --host 0.0.0.0 --port 8000
141
+ ```
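Once the server is running, a quick readiness check can confirm it is reachable before any episodes are started. The sketch below polls `GET /health` on port 8000, the same endpoint and port the Dockerfile `HEALTHCHECK` uses. `wait_for_health` and its defaults are hypothetical and use only the Python standard library.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(
    base_url: str = "http://localhost:8000",
    attempts: int = 3,
    timeout: float = 3.0,
    delay: float = 1.0,
) -> bool:
    """Poll GET /health until it returns HTTP 200 or attempts run out."""
    url = base_url.rstrip("/") + "/health"
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet or refused; retry after a short pause
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_health()` right after starting `uvicorn` gives a boolean that a script can use to decide whether to proceed or abort.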
142
+
143
+ ## Docker
144
+
145
+ Build from this directory:
146
+
147
+ ```powershell
148
+ docker build -t hack_meta-env:latest -f server/Dockerfile .
149
+ ```
150
+
151
+ Run:
152
+
153
+ ```powershell
154
+ docker run --rm -p 8000:8000 hack_meta-env:latest
155
+ ```
156
+
157
+ ## Hugging Face Space
158
+
159
+ This package directory is the deployable environment root. Deploy from `hack_meta/`, not from the repo root.
160
+
161
+ Before pushing:
162
+
163
+ 1. configure environment secrets in the Space settings
164
+ 2. validate locally
165
+ 3. confirm `reset()` responds successfully
166
+
167
+ ## Package Layout
168
+
169
+ ```text
170
+ hack_meta/
171
+ |-- client.py
172
+ |-- models.py
173
+ |-- openenv.yaml
174
+ |-- pyproject.toml
175
+ |-- README.md
176
+ `-- server/
177
+ |-- app.py
178
+ |-- Dockerfile
179
+ `-- hack_meta_environment.py
180
+ ```
__init__.py ADDED
@@ -0,0 +1,25 @@
1
+ """Disaster response scene ladder package exports."""
2
+
3
+ from .models import (
4
+ DisasterAction,
5
+ DisasterObservation,
6
+ DisasterReward,
7
+ ResourceAssignment,
8
+ ResourceStatus,
9
+ TargetStatus,
10
+ )
11
+
12
+ try:
13
+ from .client import DisasterResponseEnv
14
+ except ImportError: # pragma: no cover
15
+ DisasterResponseEnv = None # type: ignore[assignment]
16
+
17
+ __all__ = [
18
+ "DisasterAction",
19
+ "DisasterObservation",
20
+ "DisasterReward",
21
+ "ResourceAssignment",
22
+ "ResourceStatus",
23
+ "TargetStatus",
24
+ "DisasterResponseEnv",
25
+ ]
client.py ADDED
@@ -0,0 +1,73 @@
1
+ """Disaster response scene ladder client."""
2
+
3
+ from typing import Dict
4
+
5
+ from openenv.core import EnvClient
6
+ from openenv.core.client_types import StepResult
7
+ from openenv.core.env_server.types import State
8
+
9
+ from .models import (
10
+ DisasterAction,
11
+ DisasterObservation,
12
+ ResourceStatus,
13
+ TargetStatus,
14
+ )
15
+
16
+
17
+ class DisasterResponseEnv(EnvClient[DisasterAction, DisasterObservation, State]):
18
+ """Client for the scene-based disaster response environment."""
19
+
20
+ def _step_payload(self, action: DisasterAction) -> Dict:
21
+ return {
22
+ "assignments": [
23
+ {
24
+ "resource_id": assignment.resource_id,
25
+ "target_id": assignment.target_id,
26
+ }
27
+ for assignment in action.assignments
28
+ ]
29
+ }
30
+
31
+ def _parse_result(self, payload: Dict) -> StepResult[DisasterObservation]:
32
+ obs_data = payload.get("observation", {})
33
+
34
+ targets = {
35
+ target_id: TargetStatus(**target_data)
36
+ for target_id, target_data in obs_data.get("targets", {}).items()
37
+ }
38
+ resources = {
39
+ resource_id: ResourceStatus(**resource_data)
40
+ for resource_id, resource_data in obs_data.get("resources", {}).items()
41
+ }
42
+
43
+ observation = DisasterObservation(
44
+ scene_id=obs_data.get("scene_id", ""),
45
+ scene_name=obs_data.get("scene_name", ""),
46
+ level=obs_data.get("level", 0),
47
+ narrative=obs_data.get("narrative", ""),
48
+ targets=targets,
49
+ resources=resources,
50
+ resolved_count=obs_data.get("resolved_count", 0),
51
+ turn=obs_data.get("turn", 0),
52
+ max_turns=obs_data.get("max_turns", 0),
53
+ feedback=obs_data.get("feedback", ""),
54
+ final_score=obs_data.get("final_score"),
55
+ done=payload.get("done", False),
56
+ reward=payload.get("reward", 0.0),
57
+ metadata=obs_data.get("metadata", {}),
58
+ )
59
+
60
+ return StepResult(
61
+ observation=observation,
62
+ reward=payload.get("reward"),
63
+ done=payload.get("done", False),
64
+ )
65
+
66
+ def _parse_state(self, payload: Dict) -> State:
67
+ return State(
68
+ episode_id=payload.get("episode_id"),
69
+ step_count=payload.get("step_count", 0),
70
+ scene_id=payload.get("scene_id"),
71
+ scene_name=payload.get("scene_name"),
72
+ level=payload.get("level"),
73
+ )
models.py ADDED
@@ -0,0 +1,171 @@
1
+ """
2
+ Data models for the scene-based disaster response environment.
3
+ """
4
+
5
+ from typing import Dict, List, Optional
6
+
7
+ from openenv.core.env_server.types import Action, Observation
8
+ from pydantic import BaseModel, Field
9
+
10
+
11
+ class ResourceAssignment(BaseModel):
12
+ """Assign one resource to one target for the current turn."""
13
+
14
+ resource_id: str = Field(..., description="Resource to deploy this turn")
15
+ target_id: str = Field(..., description="Target to support this turn")
16
+
17
+
18
+ class DisasterAction(Action):
19
+ """
20
+ Action for the disaster response ladder.
21
+
22
+ Each turn the agent assigns scarce resources to targets. A resource may only
23
+ appear once in the action list for the turn.
24
+ """
25
+
26
+ assignments: List[ResourceAssignment] = Field(
27
+ default_factory=list,
28
+ description=(
29
+ "Per-turn resource assignments. Each item maps one resource_id to "
30
+ "one target_id. Resources not listed remain idle."
31
+ ),
32
+ )
33
+
34
+
35
+ class ResourceStatus(BaseModel):
36
+ """Visible status for a deployable resource."""
37
+
38
+ name: str = Field(..., description="Human-readable resource name")
39
+ capabilities: List[str] = Field(
40
+ default_factory=list,
41
+ description="Operational capabilities this resource can provide",
42
+ )
43
+ available: bool = Field(..., description="Whether the resource can be deployed")
44
+ remaining_uses: Optional[int] = Field(
45
+ default=None,
46
+ description="How many episode-wide uses remain, if finite",
47
+ )
48
+ available_until_turn: Optional[int] = Field(
49
+ default=None,
50
+ description="Last turn on which the resource can still be used, if limited",
51
+ )
52
+ description: str = Field(..., description="Operational description")
53
+
54
+
55
+ class TargetStatus(BaseModel):
56
+ """Visible status for a response target within the current scene."""
57
+
58
+ name: str = Field(..., description="Human-readable target name")
59
+ category: str = Field(..., description="Target category such as victims or infrastructure")
60
+ status: str = Field(
61
+ ...,
62
+ description="One of: active, contained, resolved, or failed",
63
+ )
64
+ estimated_people: str = Field(
65
+ ...,
66
+ description="Visible people estimate or affected population note",
67
+ )
68
+ observed_risk: float = Field(
69
+ ...,
70
+ description="Observed urgency signal in the range 0.0 to 1.0",
71
+ )
72
+ critical_now: bool = Field(
73
+ ...,
74
+ description="Whether this target is in an immediate decision window",
75
+ )
76
+ priority_band: str = Field(
77
+ ...,
78
+ description="Model-facing priority label: immediate, high, medium, monitor, or failed",
79
+ )
80
+ vulnerability: str = Field(
81
+ ...,
82
+ description="Visible vulnerability band for the target population",
83
+ )
84
+ visibility: float = Field(
85
+ ...,
86
+ description="How visible the incident is publicly, 0.0 to 1.0",
87
+ )
88
+ progress: float = Field(
89
+ ...,
90
+ description="Mitigation progress from 0.0 to 1.0",
91
+ )
92
+ time_remaining: int = Field(
93
+ ...,
94
+ description="Approximate turns before the target becomes much harder to save",
95
+ )
96
+ recommended_capabilities: List[str] = Field(
97
+ default_factory=list,
98
+ description="Capabilities that can materially improve the target",
99
+ )
100
+ last_assigned_resources: List[str] = Field(
101
+ default_factory=list,
102
+ description="Resources deployed to this target on the previous turn",
103
+ )
104
+ description: str = Field(..., description="Operational context and constraints")
105
+
106
+
107
+ class DisasterObservation(Observation):
108
+ """
109
+ Observation returned after each turn of the scene ladder.
110
+
111
+ The simulator exposes the operational picture but keeps the full latent harm
112
+ model internal so rewards cannot be self-scored by the agent.
113
+ """
114
+
115
+ scene_id: str = Field(..., description="Stable scene identifier")
116
+ scene_name: str = Field(..., description="Human-readable scene name")
117
+ level: int = Field(..., description="Difficulty level for the scene")
118
+ narrative: str = Field(..., description="Top-level scene briefing")
119
+ targets: Dict[str, TargetStatus] = Field(
120
+ default_factory=dict,
121
+ description="Visible target statuses keyed by target ID",
122
+ )
123
+ resources: Dict[str, ResourceStatus] = Field(
124
+ default_factory=dict,
125
+ description="Deployable resource statuses keyed by resource ID",
126
+ )
127
+ resolved_count: int = Field(
128
+ default=0,
129
+ description="Number of targets resolved so far",
130
+ )
131
+ turn: int = Field(default=0, description="Current turn number")
132
+ max_turns: int = Field(default=0, description="Maximum turns in the scene")
133
+ feedback: str = Field(
134
+ default="",
135
+ description="Structured feedback on the last action and simulator update",
136
+ )
137
+ final_score: Optional[float] = Field(
138
+ default=None,
139
+ description="Normalized 0-100 score once the episode is complete",
140
+ )
141
+
142
+
143
+ class DisasterReward(BaseModel):
144
+ """
145
+ Typed reward model for the disaster response ladder.
146
+
147
+ OpenEnv responses still carry the scalar reward at step time, but this model
148
+ makes the reward contract explicit for spec compliance and documentation.
149
+ """
150
+
151
+ value: float = Field(..., description="Scalar step reward returned by the environment")
152
+ final_score: Optional[float] = Field(
153
+ default=None,
154
+ description="Normalized 0-100 end-of-episode score when available",
155
+ )
156
+ fatalities: Optional[float] = Field(
157
+ default=None,
158
+ description="Cumulative fatalities observed in audit metrics, if available",
159
+ )
160
+ critical_injuries: Optional[float] = Field(
161
+ default=None,
162
+ description="Cumulative critical injuries observed in audit metrics, if available",
163
+ )
164
+ deadline_misses: Optional[float] = Field(
165
+ default=None,
166
+ description="Weighted count of missed critical windows, if available",
167
+ )
168
+ failed_targets: Optional[float] = Field(
169
+ default=None,
170
+ description="Weighted count of targets that reached failed status, if available",
171
+ )
openenv.yaml ADDED
@@ -0,0 +1,7 @@
1
+ spec_version: 1
2
+ name: hack_meta
3
+ type: space
4
+ runtime: fastapi
5
+ app: server.app:app
6
+ port: 8000
7
+
openenv_hack_meta.egg-info/PKG-INFO ADDED
@@ -0,0 +1,10 @@
1
+ Metadata-Version: 2.4
2
+ Name: openenv-hack_meta
3
+ Version: 0.1.0
4
+ Summary: Disaster Response Coordination Environment for OpenEnv — EOC agent allocates teams across simultaneous incidents to minimise casualties
5
+ Requires-Python: >=3.10
6
+ Requires-Dist: openenv-core[core]>=0.2.2
7
+ Requires-Dist: python-dotenv>=1.2.2
8
+ Provides-Extra: dev
9
+ Requires-Dist: pytest>=8.0.0; extra == "dev"
10
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
openenv_hack_meta.egg-info/SOURCES.txt ADDED
@@ -0,0 +1,17 @@
1
+ README.md
2
+ __init__.py
3
+ client.py
4
+ models.py
5
+ pyproject.toml
6
+ ./__init__.py
7
+ ./client.py
8
+ ./models.py
9
+ openenv_hack_meta.egg-info/PKG-INFO
10
+ openenv_hack_meta.egg-info/SOURCES.txt
11
+ openenv_hack_meta.egg-info/dependency_links.txt
12
+ openenv_hack_meta.egg-info/entry_points.txt
13
+ openenv_hack_meta.egg-info/requires.txt
14
+ openenv_hack_meta.egg-info/top_level.txt
15
+ server/__init__.py
16
+ server/app.py
17
+ server/hack_meta_environment.py
openenv_hack_meta.egg-info/dependency_links.txt ADDED
@@ -0,0 +1 @@
1
+
openenv_hack_meta.egg-info/entry_points.txt ADDED
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ server = hack_meta.server.app:main
openenv_hack_meta.egg-info/requires.txt ADDED
@@ -0,0 +1,6 @@
1
+ openenv-core[core]>=0.2.2
2
+ python-dotenv>=1.2.2
3
+
4
+ [dev]
5
+ pytest>=8.0.0
6
+ pytest-cov>=4.0.0
openenv_hack_meta.egg-info/top_level.txt ADDED
@@ -0,0 +1 @@
1
+ hack_meta
pyproject.toml ADDED
@@ -0,0 +1,46 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ [build-system]
8
+ requires = ["setuptools>=45", "wheel"]
9
+ build-backend = "setuptools.build_meta"
10
+
11
+ [project]
12
+ name = "openenv-hack_meta"
13
+ version = "0.1.0"
14
+ description = "Disaster Response Coordination Environment for OpenEnv — EOC agent allocates teams across simultaneous incidents to minimise casualties"
15
+ requires-python = ">=3.10"
16
+ dependencies = [
17
+ # Core OpenEnv runtime (provides FastAPI server + HTTP client types)
18
+ # install from github
19
+ # "openenv-core[core] @ git+https://github.com/meta-pytorch/OpenEnv.git",
20
+ "openenv-core[core]>=0.2.2",
21
+ # Environment-specific dependencies
22
+ # Add all dependencies needed for your environment here
23
+ # Examples:
24
+ # "numpy>=1.19.0",
25
+ # "torch>=2.0.0",
26
+ # "gymnasium>=0.29.0",
27
+ # "openspiel>=1.0.0",
28
+ # "smolagents>=1.22.0,<2",
29
+ "python-dotenv>=1.2.2",
30
+ ]
31
+
32
+ [project.optional-dependencies]
33
+ dev = [
34
+ "pytest>=8.0.0",
35
+ "pytest-cov>=4.0.0",
36
+ ]
37
+
38
+ [project.scripts]
39
+ # Server entry point - enables running via: uv run --project . server
40
+ # or: python -m hack_meta.server.app
41
+ server = "hack_meta.server.app:main"
42
+
43
+ [tool.setuptools]
44
+ include-package-data = true
45
+ packages = ["hack_meta", "hack_meta.server"]
46
+ package-dir = { "hack_meta" = ".", "hack_meta.server" = "server" }
server/__init__.py ADDED
@@ -0,0 +1,11 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """Disaster Response Coordination environment server components."""
8
+
9
+ from .hack_meta_environment import DisasterResponseEnvironment
10
+
11
+ __all__ = ["DisasterResponseEnvironment"]
server/app.py ADDED
@@ -0,0 +1,81 @@
1
+ # Copyright (c) Meta Platforms, Inc. and affiliates.
2
+ # All rights reserved.
3
+ #
4
+ # This source code is licensed under the BSD-style license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ """
8
+ FastAPI application for the Disaster Response Coordination environment.
9
+
10
+ This module creates an HTTP server that exposes the DisasterResponseEnvironment
11
+ over HTTP and WebSocket endpoints, compatible with EnvClient.
12
+
13
+ Endpoints:
14
+ - POST /reset: Reset the environment
15
+ - POST /step: Execute an action
16
+ - GET /state: Get current environment state
17
+ - GET /schema: Get action/observation schemas
18
+ - WS /ws: WebSocket endpoint for persistent sessions
19
+
20
+ Usage:
21
+ # Development (with auto-reload):
22
+ uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
23
+
24
+ # Production:
25
+ uvicorn server.app:app --host 0.0.0.0 --port 8000 --workers 4
26
+
27
+ # Or run directly:
28
+ python -m server.app
29
+ """
30
+
31
+ try:
32
+ from openenv.core.env_server.http_server import create_app
33
+ except Exception as e: # pragma: no cover
34
+ raise ImportError(
35
+ "openenv is required for the web interface. Install dependencies with '\n uv sync\n'"
36
+ ) from e
37
+
38
+ try:
39
+ from ..models import DisasterAction, DisasterObservation
40
+ from .hack_meta_environment import DisasterResponseEnvironment
41
+ except ImportError:
42
+ from models import DisasterAction, DisasterObservation
43
+ from server.hack_meta_environment import DisasterResponseEnvironment
44
+
45
+
46
+ # Create the app with web interface and README integration
47
+ app = create_app(
48
+ DisasterResponseEnvironment,
49
+ DisasterAction,
50
+ DisasterObservation,
51
+ env_name="disaster_response",
52
+ max_concurrent_envs=10, # supports concurrent WebSocket sessions
53
+ )
54
+
55
+
56
+ def main() -> None:
57
+ """
58
+ Entry point for direct execution via uv run or python -m.
59
+
60
+ This function enables running the server without Docker:
61
+ uv run --project . server
62
+ uv run --project . server --port 8001
63
+ python -m hack_meta.server.app
64
+
65
+ For production deployments, consider using uvicorn directly with
66
+ multiple workers:
67
+ uvicorn hack_meta.server.app:app --workers 4
68
+ """
69
+ import argparse
70
+ import uvicorn
71
+
72
+ parser = argparse.ArgumentParser()
73
+ parser.add_argument("--host", default="0.0.0.0")
74
+ parser.add_argument("--port", type=int, default=8000)
75
+ args = parser.parse_args()
76
+
77
+ uvicorn.run(app, host=args.host, port=args.port)
78
+
79
+
80
+ if __name__ == "__main__":
81
+ main()
server/hack_meta_environment.py ADDED
@@ -0,0 +1,601 @@
1
+ """
2
+ Scene-based disaster response coordination environment.
3
+ """
4
+
5
+ from __future__ import annotations
6
+
7
+ from copy import deepcopy
8
+ from typing import Any, Dict, List, Optional
9
+ from uuid import uuid4
10
+
11
+ from openenv.core.env_server.interfaces import Environment
12
+ from openenv.core.env_server.types import State
13
+
14
+ try:
15
+ from ..models import (
16
+ DisasterAction,
17
+ DisasterObservation,
18
+ ResourceStatus,
19
+ TargetStatus,
20
+ )
21
+ from .scene_catalog import DEFAULT_SCENE_ID, SCENE_CATALOG, SceneConfig, ordered_scene_ids
22
+ except ImportError:
23
+ from models import DisasterAction, DisasterObservation, ResourceStatus, TargetStatus
24
+ from server.scene_catalog import DEFAULT_SCENE_ID, SCENE_CATALOG, SceneConfig, ordered_scene_ids
25
+
26
+
27
+ class DisasterResponseEnvironment(Environment):
28
+ """
29
+ Multi-scene disaster response environment with hidden-state reward shaping.
30
+
31
+ The agent sees targets, resources, and timing cues, but rewards come from a
32
+ latent harm model so the policy cannot self-certify mediocre behavior.
33
+ """
34
+
35
+ SUPPORTS_CONCURRENT_SESSIONS: bool = True
36
+
37
+ def __init__(self) -> None:
38
+ self._state = State(episode_id=str(uuid4()), step_count=0)
39
+ self._scene: SceneConfig = SCENE_CATALOG[DEFAULT_SCENE_ID]
40
+ self._targets: Dict[str, Dict[str, Any]] = {}
41
+ self._resources: Dict[str, Dict[str, Any]] = {}
42
+ self._metrics: Dict[str, float] = {}
43
+ self._turn: int = 0
44
+ self._baseline_harm: float = 0.0
45
+ self._final_score: Optional[float] = None
46
+
47
+ def reset(
48
+ self,
49
+ seed: Optional[int] = None,
50
+ episode_id: Optional[str] = None,
51
+ scene_id: Optional[str] = None,
52
+ level: Optional[int] = None,
53
+ **kwargs: Any,
54
+ ) -> DisasterObservation:
55
+ self._state = State(
56
+ episode_id=episode_id or str(uuid4()),
57
+ step_count=0,
58
+ )
59
+ self._scene = self._select_scene(scene_id=scene_id, level=level)
60
+ self._targets = self._init_targets(self._scene)
61
+ self._resources = self._init_resources(self._scene)
62
+ self._metrics = {
63
+ "fatalities": 0.0,
64
+ "critical_injuries": 0.0,
65
+ "exposure_harm": 0.0,
66
+ "service_loss": 0.0,
67
+ "invalid_actions": 0.0,
68
+ "ineffective_assignments": 0.0,
69
+ "deadline_misses": 0.0,
70
+ "reassignment_churn": 0.0,
71
+ "resolved_targets": 0.0,
72
+ "failed_targets": 0.0,
73
+ }
74
+ self._turn = 0
75
+ self._final_score = None
76
+ self._baseline_harm = self._simulate_noop_baseline()
77
+
78
+ feedback = (
79
+ f"Level {self._scene.level}: {self._scene.name}\n"
80
+ f"{self._scene.briefing}\n"
81
+ f"Why this is hard: {self._scene.why_harder}\n"
82
+ "Objective: minimize preventable deaths, critical injuries, exposure, and service collapse.\n"
83
+ "Submit assignments as a JSON list of {resource_id, target_id} objects."
84
+ )
85
+ return self._build_observation(feedback=feedback, reward=0.0, done=False)
86
+
87
+ def step(self, action: DisasterAction, **kwargs: Any) -> DisasterObservation: # type: ignore[override]
88
+ self._turn += 1
89
+ self._state.step_count += 1
90
+
91
+ feedback_parts: List[str] = []
92
+ prev_potential = self._potential(self._targets)
93
+
94
+ assignments_by_target: Dict[str, List[str]] = {tid: [] for tid in self._targets}
95
+ used_resources: set[str] = set()
96
+ penalty = 0.0
97
+
98
+ for assignment in action.assignments:
99
+ resource_id = assignment.resource_id
100
+ target_id = assignment.target_id
101
+
102
+ if resource_id not in self._resources:
103
+ penalty += 6.0
104
+ self._metrics["invalid_actions"] += 1
105
+ feedback_parts.append(f"[ERR] Unknown resource '{resource_id}'")
106
+ continue
107
+ if target_id not in self._targets:
108
+ penalty += 6.0
109
+ self._metrics["invalid_actions"] += 1
110
+ feedback_parts.append(f"[ERR] Unknown target '{target_id}'")
111
+ continue
112
+ if resource_id in used_resources:
113
+ penalty += 5.0
114
+ self._metrics["invalid_actions"] += 1
115
+ feedback_parts.append(f"[ERR] Resource '{resource_id}' assigned more than once")
116
+ continue
117
+ if not self._resource_available(self._resources[resource_id], self._turn):
118
+ penalty += 5.0
119
+ self._metrics["invalid_actions"] += 1
120
+ feedback_parts.append(f"[ERR] Resource '{resource_id}' is unavailable")
121
+ continue
122
+ if self._targets[target_id]["status"] == "resolved":
123
+ penalty += 3.0
124
+ self._metrics["ineffective_assignments"] += 1
125
+ feedback_parts.append(f"[WARN] Target '{target_id}' already resolved")
126
+ continue
127
+
128
+ used_resources.add(resource_id)
129
+ assignments_by_target[target_id].append(resource_id)
130
+
131
+ penalty += self._apply_idle_penalty(used_resources)
132
+ penalty += self._advance_system(assignments_by_target, feedback_parts)
133
+
134
+ next_potential = self._potential(self._targets)
135
+ reward = round((next_potential - prev_potential) / 10.0 - penalty, 3)
136
+
137
+ done = self._all_targets_resolved() or self._turn >= self._scene.max_turns
138
+ if done:
139
+ self._final_score = self._compute_final_score()
140
+ feedback_parts.append(
141
+ f"Episode complete. Final score={self._final_score:.1f}/100."
142
+ )
143
+
144
+ feedback = " | ".join(feedback_parts) if feedback_parts else "Assignments executed."
145
+ return self._build_observation(feedback=feedback, reward=reward, done=done)
146
+
147
+ @property
148
+ def state(self) -> State:
149
+ return State(
150
+ episode_id=self._state.episode_id,
151
+ step_count=self._state.step_count,
152
+ scene_id=self._scene.scene_id,
153
+ scene_name=self._scene.name,
154
+ level=self._scene.level,
155
+ )
156
+
157
+ def _select_scene(
158
+ self,
159
+ scene_id: Optional[str],
160
+ level: Optional[int],
161
+ ) -> SceneConfig:
162
+ if scene_id:
163
+ if scene_id not in SCENE_CATALOG:
164
+ raise ValueError(f"Unknown scene_id '{scene_id}'")
165
+ return SCENE_CATALOG[scene_id]
166
+ if level is not None:
167
+ for candidate in SCENE_CATALOG.values():
168
+ if candidate.level == level:
169
+ return candidate
170
+ raise ValueError(f"Unknown level '{level}'")
171
+ return SCENE_CATALOG[DEFAULT_SCENE_ID]
172
+
173
+ def _init_targets(self, scene: SceneConfig) -> Dict[str, Dict[str, Any]]:
174
+ targets: Dict[str, Dict[str, Any]] = {}
175
+ for cfg in scene.targets:
176
+ targets[cfg.target_id] = {
177
+ "config": cfg,
178
+ "status": "active",
179
+ "progress": 0.0,
180
+ "risk": cfg.initial_risk,
181
+ "people_remaining": cfg.people_true,
182
+ "time_remaining": cfg.deadline_turns,
183
+ "last_assigned_resources": [],
184
+ "deadline_missed": False,
185
+ "failed": False,
186
+ }
187
+ return targets
188
+
189
+ def _init_resources(self, scene: SceneConfig) -> Dict[str, Dict[str, Any]]:
190
+ resources: Dict[str, Dict[str, Any]] = {}
191
+ for cfg in scene.resources:
192
+ resources[cfg.resource_id] = {
193
+ "config": cfg,
194
+ "remaining_uses": cfg.max_uses,
195
+ "last_target_id": None,
196
+ }
197
+ return resources
198
+
199
+ def _resource_available(self, resource: Dict[str, Any], turn: int) -> bool:
200
+ cfg = resource["config"]
201
+ if cfg.available_until_turn is not None and turn > cfg.available_until_turn:
202
+ return False
203
+ if resource["remaining_uses"] is not None and resource["remaining_uses"] <= 0:
204
+ return False
205
+ return True
206
+
207
+ def _apply_idle_penalty(self, used_resources: set[str]) -> float:
208
+ penalty = 0.0
209
+ critical_targets = [
210
+ target
211
+ for target in self._targets.values()
212
+ if target["status"] != "resolved" and target["time_remaining"] <= 2
213
+ ]
214
+ if not critical_targets:
215
+ return penalty
216
+
217
+ for resource_id, resource in self._resources.items():
218
+ if resource_id in used_resources or not self._resource_available(resource, self._turn):
219
+ continue
220
+ if self._resource_can_help_any_target(resource["config"].capabilities, critical_targets):
221
+ penalty += 3.0
222
+ return penalty
223
+
224
+ def _resource_can_help_any_target(
225
+ self,
226
+ capabilities: Dict[str, float],
227
+ targets: List[Dict[str, Any]],
228
+ ) -> bool:
229
+ for target in targets:
230
+ weights = target["config"].capability_weights
231
+ if any(capability in weights for capability in capabilities):
232
+ return True
233
+ return False
234
+
235
+ def _advance_system(
236
+ self,
237
+ assignments_by_target: Dict[str, List[str]],
238
+ feedback_parts: List[str],
239
+ ) -> float:
240
+ penalty = 0.0
241
+ newly_resolved: List[str] = []
242
+ deadline_hits: List[str] = []
243
+
244
+ for target_id, target in self._targets.items():
245
+ cfg = target["config"]
246
+ resource_ids = assignments_by_target.get(target_id, [])
247
+ response_power = 0.0
248
+ assigned_names: List[str] = []
249
+
250
+ for resource_id in resource_ids:
251
+ resource = self._resources[resource_id]
252
+ resource_cfg = resource["config"]
253
+ match = max(
254
+ (
255
+ resource_cfg.capabilities[capability] * weight
256
+ for capability, weight in cfg.capability_weights.items()
257
+ if capability in resource_cfg.capabilities
258
+ ),
259
+ default=0.0,
260
+ )
261
+ if match <= 0.0:
262
+ penalty += 3.0
263
+ self._metrics["ineffective_assignments"] += 1
264
+ feedback_parts.append(
265
+ f"[WARN] {resource_id} does not materially help {target_id}"
266
+ )
267
+ continue
268
+
269
+ if resource["last_target_id"] not in (None, target_id):
270
+ penalty += 1.0
271
+ self._metrics["reassignment_churn"] += 1
272
+ response_power += match
273
+ assigned_names.append(resource_id)
274
+ resource["last_target_id"] = target_id
275
+ if resource["remaining_uses"] is not None:
276
+ resource["remaining_uses"] -= 1
277
+
278
+ target["last_assigned_resources"] = assigned_names
279
+ if target["status"] == "resolved" or target["failed"]:
280
+ continue
281
+
282
+ progress_gain = cfg.progress_per_power * response_power
283
+ protection = min(0.92, target["progress"] * 0.55 + response_power * cfg.protection_per_power)
284
+ target["progress"] = min(1.0, target["progress"] + progress_gain)
285
+
286
+ target["risk"] = max(
287
+ 0.15,
288
+ min(
289
+ 2.5,
290
+ target["risk"] + cfg.escalation_rate - response_power * cfg.risk_reduction_per_power,
291
+ ),
292
+ )
293
+
294
+ time_pressure = 1.0 + max(0, 1 - max(target["time_remaining"], 0) / max(1, cfg.deadline_turns)) * 0.6
295
+ if target["time_remaining"] <= 0:
296
+ time_pressure += 0.4
297
+
298
+ protective_gap = max(0.05, 1.0 - protection)
299
+
300
+ deaths_now = target["people_remaining"] * cfg.death_rate * target["risk"] * time_pressure * protective_gap
301
+ critical_now = target["people_remaining"] * cfg.critical_rate * target["risk"] * time_pressure * protective_gap
302
+ exposure_now = cfg.exposed_population * cfg.exposure_rate * target["risk"] * time_pressure * protective_gap
303
+ service_now = cfg.service_scale * cfg.service_rate * target["risk"] * time_pressure * protective_gap
304
+
305
+ self._metrics["fatalities"] += deaths_now
306
+ self._metrics["critical_injuries"] += critical_now
307
+ self._metrics["exposure_harm"] += exposure_now
308
+ self._metrics["service_loss"] += service_now
309
+
310
+ if target["people_remaining"] > 0.0:
311
+ target["people_remaining"] = max(0.0, target["people_remaining"] - deaths_now)
312
+
313
+ if target["progress"] >= 1.0 or (target["progress"] >= 0.86 and target["risk"] <= 0.25):
314
+ if target["status"] != "resolved":
315
+ target["status"] = "resolved"
316
+ self._metrics["resolved_targets"] += 1
317
+ newly_resolved.append(cfg.name)
318
+ continue
319
+
320
+ if not target["deadline_missed"] and target["time_remaining"] <= 0 and target["progress"] < 0.60:
321
+ target["deadline_missed"] = True
322
+ weighted_miss = cfg.deadline_weight * cfg.vulnerability
323
+ self._metrics["deadline_misses"] += weighted_miss
324
+ penalty += 4.0 * weighted_miss
325
+ deadline_hits.append(cfg.name)
326
+
327
+ if target["time_remaining"] < -2 and target["progress"] < 0.35 and not target["failed"]:
328
+ target["failed"] = True
329
+ target["status"] = "failed"
330
+ weighted_fail = cfg.deadline_weight * cfg.vulnerability
331
+ self._metrics["failed_targets"] += weighted_fail
332
+ penalty += 6.0 * weighted_fail
333
+ elif target["progress"] >= 0.55:
334
+ target["status"] = "contained"
335
+ else:
336
+ target["status"] = "active"
337
+
338
+ target["time_remaining"] -= 1
339
+
340
+ if newly_resolved:
341
+ feedback_parts.append("Resolved: " + ", ".join(newly_resolved))
342
+ if deadline_hits:
343
+ feedback_parts.append("Critical window missed: " + ", ".join(deadline_hits))
344
+
345
+ hot_targets = self._hot_target_summaries(limit=3)
346
+ if hot_targets:
347
+ feedback_parts.append("Hot targets: " + ", ".join(hot_targets))
348
+
349
+ return penalty
350
+
351
+ def _hot_target_summaries(self, limit: int) -> List[str]:
352
+ active_targets = [
353
+ target
354
+ for target in self._targets.values()
355
+ if target["status"] not in {"resolved", "failed"}
356
+ ]
357
+ active_targets.sort(
358
+ key=lambda target: (
359
+ -target["risk"],
360
+ target["time_remaining"],
361
+ -target["config"].vulnerability,
362
+ )
363
+ )
364
+ summaries: List[str] = []
365
+ for target in active_targets[:limit]:
366
+ summaries.append(
367
+ f"{target['config'].target_id}(risk={target['risk']:.2f}, t={target['time_remaining']})"
368
+ )
369
+ return summaries
370
+
371
+ def _potential(self, targets: Dict[str, Dict[str, Any]]) -> float:
372
+ total = 0.0
373
+ for target in targets.values():
374
+ if target["status"] == "resolved":
375
+ continue
376
+ cfg = target["config"]
377
+ if target["failed"]:
378
+ total += (
379
+ 140.0 * max(0.0, target["people_remaining"])
380
+ + 24.0 * cfg.exposed_population
381
+ + 28.0 * cfg.service_scale
382
+ + 40.0 * cfg.deadline_weight * cfg.vulnerability
383
+ )
384
+ continue
385
+ urgency = target["risk"] * (1.0 + max(0, 2 - target["time_remaining"]) * 0.35)
386
+ protective_gap = max(0.05, 1.0 - target["progress"] * 0.75)
387
+ expected_deaths = target["people_remaining"] * cfg.death_rate * urgency * protective_gap * cfg.vulnerability
388
+ expected_critical = target["people_remaining"] * cfg.critical_rate * urgency * protective_gap * cfg.vulnerability
389
+ expected_exposure = cfg.exposed_population * cfg.exposure_rate * urgency * protective_gap
390
+ expected_service = cfg.service_scale * cfg.service_rate * urgency * protective_gap
391
+ equity_gap = cfg.equity_weight * cfg.vulnerability * urgency * protective_gap * (1.0 - cfg.visibility)
392
+ deadline_gap = max(0.0, 1.0 - max(target["time_remaining"], 0) / max(1, cfg.deadline_turns))
393
+ total += (
394
+ 100.0 * expected_deaths
395
+ + 35.0 * expected_critical
396
+ + 12.0 * expected_exposure
397
+ + 18.0 * expected_service
398
+ + 10.0 * equity_gap
399
+ + 8.0 * deadline_gap * cfg.deadline_weight
400
+ )
401
+ return -total  # potential is negated expected harm, so a higher value means less outstanding harm
402
+
403
+ def _simulate_noop_baseline(self) -> float:
404
+ targets = deepcopy(self._targets)
405
+ resources = deepcopy(self._resources)
406
+ metrics = deepcopy(self._metrics)
407
+ for turn in range(1, self._scene.max_turns + 1):
408
+ empty_assignments = {target_id: [] for target_id in targets}
409
+ self._advance_copy(targets, resources, metrics, empty_assignments, turn)
410
+ return max(1.0, self._compute_total_harm(metrics))  # floor at 1.0 so a baseline used as a divisor is never zero
411
+
412
+ def _advance_copy(
413
+ self,
414
+ targets: Dict[str, Dict[str, Any]],
415
+ resources: Dict[str, Dict[str, Any]],
416
+ metrics: Dict[str, float],
417
+ assignments_by_target: Dict[str, List[str]],
418
+ turn: int,
419
+ ) -> None:
420
+ for target_id, target in targets.items():
421
+ cfg = target["config"]
422
+ response_power = 0.0
423
+ for resource_id in assignments_by_target.get(target_id, []):
424
+ resource = resources[resource_id]
425
+ resource_cfg = resource["config"]
426
+ match = max(
427
+ (
428
+ resource_cfg.capabilities[capability] * weight
429
+ for capability, weight in cfg.capability_weights.items()
430
+ if capability in resource_cfg.capabilities
431
+ ),
432
+ default=0.0,
433
+ )
434
+ if match <= 0.0:
435
+ metrics["ineffective_assignments"] += 1
436
+ continue
437
+ response_power += match
438
+ if resource["remaining_uses"] is not None:
439
+ resource["remaining_uses"] -= 1
440
+
441
+ if target["status"] in {"resolved", "failed"}:
442
+ continue
443
+
444
+ progress_gain = cfg.progress_per_power * response_power
445
+ protection = min(0.92, target["progress"] * 0.55 + response_power * cfg.protection_per_power)
446
+ target["progress"] = min(1.0, target["progress"] + progress_gain)
447
+ target["risk"] = max(
448
+ 0.15,
449
+ min(
450
+ 2.5,
451
+ target["risk"] + cfg.escalation_rate - response_power * cfg.risk_reduction_per_power,
452
+ ),
453
+ )
454
+
455
+ time_pressure = 1.0 + max(0, 1 - max(target["time_remaining"], 0) / max(1, cfg.deadline_turns)) * 0.6
456
+ if target["time_remaining"] <= 0:
457
+ time_pressure += 0.4
458
+ protective_gap = max(0.05, 1.0 - protection)
459
+
460
+ deaths_now = target["people_remaining"] * cfg.death_rate * target["risk"] * time_pressure * protective_gap
461
+ critical_now = target["people_remaining"] * cfg.critical_rate * target["risk"] * time_pressure * protective_gap
462
+ exposure_now = cfg.exposed_population * cfg.exposure_rate * target["risk"] * time_pressure * protective_gap
463
+ service_now = cfg.service_scale * cfg.service_rate * target["risk"] * time_pressure * protective_gap
464
+
465
+ metrics["fatalities"] += deaths_now
466
+ metrics["critical_injuries"] += critical_now
467
+ metrics["exposure_harm"] += exposure_now
468
+ metrics["service_loss"] += service_now
469
+
470
+ if target["people_remaining"] > 0.0:
471
+ target["people_remaining"] = max(0.0, target["people_remaining"] - deaths_now)
472
+
473
+ if target["progress"] >= 1.0 or (target["progress"] >= 0.86 and target["risk"] <= 0.25):
474
+ target["status"] = "resolved"
475
+ metrics["resolved_targets"] += 1
476
+ continue
477
+
478
+ if not target["deadline_missed"] and target["time_remaining"] <= 0 and target["progress"] < 0.60:
479
+ target["deadline_missed"] = True
480
+ metrics["deadline_misses"] += cfg.deadline_weight * cfg.vulnerability
481
+
482
+ if target["time_remaining"] < -2 and target["progress"] < 0.35 and not target["failed"]:
483
+ target["failed"] = True
484
+ target["status"] = "failed"
485
+ metrics["failed_targets"] += cfg.deadline_weight * cfg.vulnerability
486
+ elif target["progress"] >= 0.55:
487
+ target["status"] = "contained"
488
+ else:
489
+ target["status"] = "active"
490
+
491
+ target["time_remaining"] -= 1
492
+
493
+ def _compute_total_harm(self, metrics: Dict[str, float]) -> float:
494
+ return (
495
+ 100.0 * metrics["fatalities"]
496
+ + 35.0 * metrics["critical_injuries"]
497
+ + 12.0 * metrics["exposure_harm"]
498
+ + 18.0 * metrics["service_loss"]
499
+ + 18.0 * metrics["deadline_misses"]
500
+ + 24.0 * metrics["failed_targets"]
501
+ + 4.0 * metrics["invalid_actions"]
502
+ + 2.0 * metrics["ineffective_assignments"]
503
+ + 1.0 * metrics["reassignment_churn"]
504
+ )
505
+
506
+ def _compute_final_score(self) -> float:
507
+ realized_harm = self._compute_total_harm(self._metrics)
508
+ raw = 100.0 * (self._baseline_harm - realized_harm) / self._baseline_harm
509
+ return max(0.0, min(100.0, round(raw, 2)))
510
+
511
+ def _all_targets_resolved(self) -> bool:
512
+ return all(target["status"] == "resolved" for target in self._targets.values())
513
+
514
+ def _priority_band(self, target: Dict[str, Any]) -> str:
515
+ cfg = target["config"]
516
+ if target["failed"]:
517
+ return "failed"
518
+ urgency = target["risk"] * cfg.vulnerability
519
+ if target["time_remaining"] <= 1 or urgency >= 1.6:
520
+ return "immediate"
521
+ if target["time_remaining"] <= 2 or urgency >= 1.15:
522
+ return "high"
523
+ if target["time_remaining"] <= 3 or urgency >= 0.8:
524
+ return "medium"
525
+ return "monitor"
526
+
527
+ def _build_observation(
528
+ self,
529
+ feedback: str,
530
+ reward: float,
531
+ done: bool,
532
+ ) -> DisasterObservation:
533
+ targets = {
534
+ target_id: TargetStatus(
535
+ name=target["config"].name,
536
+ category=target["config"].category,
537
+ status=target["status"],
538
+ estimated_people=target["config"].estimated_people,
539
+ observed_risk=round(
540
+ max(
541
+ 0.05,
542
+ min(
543
+ 1.0,
544
+ target["config"].observed_risk
545
+ + (target["risk"] - target["config"].initial_risk) * 0.35,
546
+ ),
547
+ ),
548
+ 3,
549
+ ),
550
+ critical_now=(target["time_remaining"] <= 1 and target["status"] not in {"resolved", "failed"}),
551
+ priority_band=self._priority_band(target),
552
+ vulnerability=target["config"].vulnerability_label,
553
+ visibility=target["config"].visibility,
554
+ progress=round(target["progress"], 3),
555
+ time_remaining=target["time_remaining"],
556
+ recommended_capabilities=list(target["config"].recommended_capabilities),
557
+ last_assigned_resources=list(target["last_assigned_resources"]),
558
+ description=(
559
+ f"{target['config'].description} Critical window: {target['config'].deadline_note}"
560
+ ),
561
+ )
562
+ for target_id, target in self._targets.items()
563
+ }
564
+ resources = {
565
+ resource_id: ResourceStatus(
566
+ name=resource["config"].name,
567
+ capabilities=sorted(resource["config"].capabilities.keys()),
568
+ available=self._resource_available(resource, self._turn + 1 if not done else self._turn),
569
+ remaining_uses=resource["remaining_uses"],
570
+ available_until_turn=resource["config"].available_until_turn,
571
+ description=resource["config"].description,
572
+ )
573
+ for resource_id, resource in self._resources.items()
574
+ }
575
+ resolved_count = sum(1 for target in self._targets.values() if target["status"] == "resolved")
576
+ metadata: Dict[str, Any] = {
577
+ "scene_ids": ordered_scene_ids(),
578
+ "score_method": "normalized_against_noop_baseline",
579
+ }
580
+ if done and self._final_score is not None:
581
+ metadata["audit_metrics"] = {
582
+ key: round(value, 2) for key, value in self._metrics.items()
583
+ }
584
+ metadata["baseline_harm"] = round(self._baseline_harm, 2)
585
+
586
+ return DisasterObservation(
587
+ scene_id=self._scene.scene_id,
588
+ scene_name=self._scene.name,
589
+ level=self._scene.level,
590
+ narrative=self._scene.briefing,
591
+ targets=targets,
592
+ resources=resources,
593
+ resolved_count=resolved_count,
594
+ turn=self._turn,
595
+ max_turns=self._scene.max_turns,
596
+ feedback=feedback,
597
+ final_score=self._final_score if done else None,
598
+ done=done,
599
+ reward=reward,
600
+ metadata=metadata,
601
+ )
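Reviewer note on the scoring in the file above: the sketch below restates `_compute_total_harm`, `_compute_final_score`, and the per-step reward expression as standalone functions, so the arithmetic can be checked outside the environment class. The weights and formulas are copied from the diff. The names `HARM_WEIGHTS`, `total_harm`, `final_score`, and `step_reward` are hypothetical, and defaulting missing metrics to 0.0 is an assumption made for illustration (the original indexes the metrics dict directly).

```python
from typing import Dict

# Weights copied from _compute_total_harm in the diff above.
HARM_WEIGHTS: Dict[str, float] = {
    "fatalities": 100.0,
    "critical_injuries": 35.0,
    "exposure_harm": 12.0,
    "service_loss": 18.0,
    "deadline_misses": 18.0,
    "failed_targets": 24.0,
    "invalid_actions": 4.0,
    "ineffective_assignments": 2.0,
    "reassignment_churn": 1.0,
}


def total_harm(metrics: Dict[str, float]) -> float:
    """Weighted sum of audit metrics (hypothetical helper; missing keys count as 0.0)."""
    return sum(weight * metrics.get(key, 0.0) for key, weight in HARM_WEIGHTS.items())


def final_score(baseline_harm: float, realized_harm: float) -> float:
    """Percentage of no-op baseline harm avoided, clamped to [0, 100]."""
    raw = 100.0 * (baseline_harm - realized_harm) / baseline_harm
    return max(0.0, min(100.0, round(raw, 2)))


def step_reward(prev_potential: float, next_potential: float, penalty: float) -> float:
    """Scaled change in potential minus the penalties accrued during the step."""
    return round((next_potential - prev_potential) / 10.0 - penalty, 3)


if __name__ == "__main__":
    # An episode that avoids three quarters of the no-op baseline harm.
    print(final_score(baseline_harm=200.0, realized_harm=50.0))
    # Potential rises from -50 to -30 (less expected harm) with a penalty of 1.0.
    print(step_reward(prev_potential=-50.0, next_potential=-30.0, penalty=1.0))
```

Because the score is clamped, any policy that does worse than the no-op baseline scores 0.0 rather than a negative value, so the final score alone cannot distinguish "as bad as doing nothing" from "much worse".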
server/requirements.txt ADDED
@@ -0,0 +1,6 @@
1
+ openenv[core]>=0.2.0
2
+ fastapi>=0.115.0
3
+ uvicorn>=0.24.0
4
+
5
+
6
+
server/scene_catalog.py ADDED
@@ -0,0 +1,810 @@
1
+ """
2
+ Scene ladder configuration for the disaster response environment.
3
+ """
4
+
5
+ from __future__ import annotations
6
+
7
+ from dataclasses import dataclass
8
+ from typing import Dict, List, Optional
9
+
10
+
11
+ @dataclass(frozen=True)
12
+ class ResourceConfig:
13
+ resource_id: str
14
+ name: str
15
+ capabilities: Dict[str, float]
16
+ description: str
17
+ max_uses: Optional[int] = None
18
+ available_until_turn: Optional[int] = None
19
+
20
+
21
+ @dataclass(frozen=True)
22
+ class TargetConfig:
23
+ target_id: str
24
+ name: str
25
+ category: str
26
+ description: str
27
+ estimated_people: str
28
+ observed_risk: float
29
+ visibility: float
30
+ vulnerability_label: str
31
+ vulnerability: float
32
+ deadline_turns: int
33
+ deadline_note: str
34
+ recommended_capabilities: List[str]
35
+ capability_weights: Dict[str, float]
36
+ people_true: float = 0.0
37
+ exposed_population: float = 0.0
38
+ service_scale: float = 0.0
39
+ initial_risk: float = 1.0
40
+ progress_per_power: float = 0.22
41
+ risk_reduction_per_power: float = 0.18
42
+ protection_per_power: float = 0.20
43
+ escalation_rate: float = 0.10
44
+ death_rate: float = 0.010
45
+ critical_rate: float = 0.015
46
+ exposure_rate: float = 0.0
47
+ service_rate: float = 0.0
48
+ deadline_weight: float = 1.0
49
+ equity_weight: float = 0.0
50
+
51
+
52
+ @dataclass(frozen=True)
53
+ class SceneConfig:
54
+ scene_id: str
55
+ level: int
56
+ name: str
57
+ briefing: str
58
+ why_harder: str
59
+ max_turns: int
60
+ resources: List[ResourceConfig]
61
+ targets: List[TargetConfig]
62
+
63
+
64
+ SCENE_CATALOG: Dict[str, SceneConfig] = {
65
+ "scene_1": SceneConfig(
66
+ scene_id="scene_1",
67
+ level=1,
68
+ name="Flash Flood - Two Rescue Calls, One Boat",
69
+ briefing=(
70
+ "A sudden urban flash flood creates two simultaneous rescue calls in nearby "
71
+ "streets. One family of four is stranded in a ground-floor house. Two elderly "
72
+ "residents are trapped in a vehicle in faster-moving water. Only one rescue "
73
+ "boat can arrive within the first operational window."
74
+ ),
75
+ why_harder=(
76
+ "Same hazard type and short distances make this level readable, but the two "
77
+ "groups differ in vulnerability and time-to-failure."
78
+ ),
79
+ max_turns=4,
80
+ resources=[
81
+ ResourceConfig(
82
+ resource_id="boat_alpha",
83
+ name="Swift-Water Boat Alpha",
84
+ capabilities={"swift_water": 1.0},
85
+ description="Single rescue boat able to complete one rescue push per turn.",
86
+ ),
87
+ ],
88
+ targets=[
89
+ TargetConfig(
90
+ target_id="house_family",
91
+ name="Family in Flooded House",
92
+ category="victims",
93
+ description="Family of four, including children, stranded at a ground-floor home.",
94
+ estimated_people="4 people",
95
+ observed_risk=0.68,
96
+ visibility=0.45,
97
+ vulnerability_label="high",
98
+ vulnerability=1.20,
99
+ deadline_turns=2,
100
+ deadline_note="Children likely lose safe shelter after 2 turns.",
101
+ recommended_capabilities=["swift_water"],
102
+ capability_weights={"swift_water": 1.0},
103
+ people_true=4,
104
+ initial_risk=0.95,
105
+ progress_per_power=0.50,
106
+ escalation_rate=0.09,
107
+ death_rate=0.040,
108
+ critical_rate=0.090,
109
+ deadline_weight=1.2,
110
+ ),
111
+ TargetConfig(
112
+ target_id="elderly_vehicle",
113
+ name="Elderly Residents in Vehicle",
114
+ category="victims",
115
+ description="Two elderly residents trapped in a vehicle with rising current.",
116
+ estimated_people="2 people",
117
+ observed_risk=0.82,
118
+ visibility=0.55,
119
+ vulnerability_label="very high",
120
+ vulnerability=1.45,
121
+ deadline_turns=1,
122
+ deadline_note="Vehicle stability may fail after 1 turn.",
123
+ recommended_capabilities=["swift_water"],
124
+ capability_weights={"swift_water": 1.0},
125
+ people_true=2,
126
+ initial_risk=1.15,
127
+ progress_per_power=0.58,
128
+ escalation_rate=0.13,
129
+ death_rate=0.090,
130
+ critical_rate=0.120,
131
+ deadline_weight=1.7,
132
+ ),
133
+ ],
134
+ ),
135
+ "scene_2": SceneConfig(
136
+ scene_id="scene_2",
137
+ level=2,
138
+ name="Flood Rescue vs Medical Transport",
139
+ briefing=(
140
+ "Flooded roads isolate a nursing home while several families remain on rooftops "
141
+ "across two nearby blocks. Two high-water vehicles are available. The nursing "
142
+ "home has twelve immobile residents needing oxygen support, but the rooftop "
143
+ "rescues are more visually urgent."
144
+ ),
145
+ why_harder=(
146
+ "Visible rescue competes with less visible medical deterioration, and limited "
147
+ "transport capacity forces medical triage under flood conditions."
148
+ ),
149
+ max_turns=5,
150
+ resources=[
151
+ ResourceConfig(
152
+ resource_id="hwv_alpha",
153
+ name="High-Water Vehicle Alpha",
154
+ capabilities={"medical_transport": 1.0, "swift_water": 0.75},
155
+ description="Can transport fragile patients or conduct flood rescue trips.",
156
+ ),
157
+ ResourceConfig(
158
+ resource_id="hwv_bravo",
159
+ name="High-Water Vehicle Bravo",
160
+ capabilities={"medical_transport": 1.0, "swift_water": 0.75},
161
+ description="Second high-water vehicle with the same flood mobility profile.",
162
+ ),
163
+ ResourceConfig(
164
+ resource_id="med_coord",
165
+ name="Medical Coordination Cell",
166
+ capabilities={"medical_coordination": 0.85},
167
+ description="Coordinates oxygen, receiving facilities, and priority loading.",
168
+ ),
169
+ ],
170
+ targets=[
171
+ TargetConfig(
172
+ target_id="nursing_home",
173
+ name="Nursing Home Oxygen Wing",
174
+ category="victims",
175
+ description="Twelve immobile residents need oxygen support and assisted evacuation.",
176
+ estimated_people="12 residents",
177
+ observed_risk=0.78,
178
+ visibility=0.35,
179
+ vulnerability_label="extreme",
180
+ vulnerability=1.70,
181
+ deadline_turns=2,
182
+ deadline_note="Oxygen stability degrades sharply after 2 turns.",
183
+ recommended_capabilities=["medical_transport", "medical_coordination"],
184
+ capability_weights={"medical_transport": 1.0, "medical_coordination": 0.60},
185
+ people_true=12,
186
+ initial_risk=1.00,
187
+ progress_per_power=0.28,
188
+ escalation_rate=0.11,
189
+ death_rate=0.035,
190
+ critical_rate=0.080,
191
+ deadline_weight=1.5,
192
+ equity_weight=0.2,
193
+ ),
194
+ TargetConfig(
195
+ target_id="rooftop_east",
196
+ name="Rooftop Cluster East",
197
+ category="victims",
198
+ description="Three family members stranded on a low rooftop.",
199
+ estimated_people="3 people",
200
+ observed_risk=0.70,
201
+ visibility=0.70,
202
+ vulnerability_label="medium",
203
+ vulnerability=1.0,
204
+ deadline_turns=3,
205
+ deadline_note="Water rises steadily over the next 3 turns.",
206
+ recommended_capabilities=["swift_water"],
207
+ capability_weights={"swift_water": 1.0},
208
+ people_true=3,
209
+ initial_risk=0.92,
210
+ progress_per_power=0.45,
211
+ escalation_rate=0.10,
212
+ death_rate=0.028,
213
+ critical_rate=0.050,
214
+ deadline_weight=1.0,
215
+ ),
216
+ TargetConfig(
217
+ target_id="rooftop_west",
218
+ name="Rooftop Cluster West",
219
+ category="victims",
220
+ description="Three more victims on a separate rooftop with unstable ladder access.",
221
+ estimated_people="3 people",
222
+ observed_risk=0.72,
223
+ visibility=0.72,
224
+ vulnerability_label="medium",
225
+ vulnerability=1.0,
226
+ deadline_turns=3,
227
+ deadline_note="Roof access worsens if water keeps rising.",
228
+ recommended_capabilities=["swift_water"],
229
+ capability_weights={"swift_water": 1.0},
230
+ people_true=3,
231
+ initial_risk=0.95,
232
+ progress_per_power=0.45,
233
+ escalation_rate=0.10,
234
+ death_rate=0.030,
235
+ critical_rate=0.052,
236
+ deadline_weight=1.0,
237
+ ),
238
+ ],
239
+ ),
240
+ "scene_3": SceneConfig(
241
+ scene_id="scene_3",
242
+ level=3,
243
+ name="Building Collapse vs Highway Hazmat Crash",
244
+ briefing=(
245
+ "An earthquake leaves a partially collapsed apartment block with an uncertain "
246
+ "trapped count. At the same time, a tanker crash on a highway shoulder is "
247
+ "leaking chemicals into stopped traffic. The EOC has one specialized task "
248
+ "force that can address either technical rescue or hazmat control first."
249
+ ),
250
+ why_harder=(
251
+ "Different technical response modes compete for the same scarce specialty asset, "
252
+ "and one branch includes hidden victim-count uncertainty."
253
+ ),
254
+ max_turns=5,
255
+ resources=[
256
+ ResourceConfig(
257
+ resource_id="special_task_force",
258
+ name="Specialized Rescue Task Force",
259
+ capabilities={"collapse_rescue": 0.85, "hazmat_control": 1.0},
260
+ description="One specialty task force that can either stabilize the collapse rescue or contain the hazmat release.",
261
+ ),
262
+ ResourceConfig(
263
+ resource_id="air_monitor",
264
+ name="Air Monitoring Unit",
265
+ capabilities={"hazmat_assessment": 0.75, "situational_assessment": 0.60},
266
+ description="Improves hazard characterization but cannot fully resolve either target alone.",
267
+ ),
268
+ ],
269
+ targets=[
270
+ TargetConfig(
271
+ target_id="apartment_collapse",
272
+ name="Apartment Block Collapse",
273
+ category="victims",
274
+ description="Partial collapse with unknown trapped count. Initial estimate is 8 to 20.",
275
+ estimated_people="8-20 potentially trapped",
276
+ observed_risk=0.76,
277
+ visibility=0.62,
278
+ vulnerability_label="high",
279
+ vulnerability=1.25,
280
+ deadline_turns=3,
281
+ deadline_note="Voids become less survivable after 3 turns.",
282
+ recommended_capabilities=["collapse_rescue", "situational_assessment"],
283
+ capability_weights={"collapse_rescue": 1.0, "situational_assessment": 0.35},
284
+ people_true=13,
285
+ initial_risk=0.98,
286
+ progress_per_power=0.26,
287
+ escalation_rate=0.12,
288
+ death_rate=0.018,
289
+ critical_rate=0.050,
290
+ deadline_weight=1.3,
291
+ ),
292
+ TargetConfig(
293
+ target_id="tanker_leak",
294
+ name="Tanker Leak Near Traffic Queue",
295
+ category="hazard",
296
+ description="Hazmat release near stopped vehicles with ignition and plume spread risk.",
297
+ estimated_people="Hundreds exposed if plume spreads",
298
+ observed_risk=0.86,
299
+ visibility=0.78,
300
+ vulnerability_label="mixed",
301
+ vulnerability=1.10,
302
+ deadline_turns=2,
303
+ deadline_note="Ignition or plume spread risk spikes after 2 turns.",
304
+ recommended_capabilities=["hazmat_control", "hazmat_assessment"],
305
+ capability_weights={"hazmat_control": 1.0, "hazmat_assessment": 0.40},
306
+ exposed_population=180,
307
+ initial_risk=1.12,
308
+ progress_per_power=0.24,
309
+ escalation_rate=0.15,
310
+ death_rate=0.000,
311
+ critical_rate=0.000,
312
+ exposure_rate=0.035,
313
+ deadline_weight=1.5,
314
+ ),
315
+ ],
316
+ ),
317
+ "scene_4": SceneConfig(
318
+ scene_id="scene_4",
319
+ level=4,
320
+ name="Wildfire Suburb vs Nursing Home",
321
+ briefing=(
322
+ "A wildfire front changes direction. A suburban zone of four thousand residents "
323
+ "still has partial car access, but congestion is rising. A nursing home with "
324
+ "eighty residents cannot self-evacuate. Road capacity is close to failing."
325
+ ),
326
+ why_harder=(
327
+ "Large-population evacuation competes with a small but highly vulnerable group, "
328
+ "and the wrong sequencing creates irreversible entrapment."
329
+ ),
330
+ max_turns=6,
331
+ resources=[
332
+ ResourceConfig(
333
+ resource_id="paratransit_convoy",
334
+ name="Paratransit Evacuation Convoy",
335
+ capabilities={"assisted_evacuation": 1.0},
336
+ description="Specialized transport for non-ambulatory residents.",
337
+ ),
338
+ ResourceConfig(
339
+ resource_id="bus_convoy",
340
+ name="Mass Evacuation Bus Convoy",
341
+ capabilities={"mass_evacuation": 1.0},
342
+ description="Large-scale transport resource for suburban evacuation flow.",
343
+ ),
344
+ ResourceConfig(
345
+ resource_id="traffic_unit",
346
+ name="Traffic Control Unit",
347
+ capabilities={"road_management": 0.85},
348
+ description="Can preserve outbound road throughput for one priority area each turn.",
349
+ ),
350
+ ],
351
+ targets=[
352
+ TargetConfig(
353
+ target_id="nursing_home_west",
354
+ name="Nursing Home West",
355
+ category="victims",
356
+ description="Eighty residents require assisted evacuation and staff support.",
357
+ estimated_people="80 residents",
358
+ observed_risk=0.80,
359
+ visibility=0.30,
360
+ vulnerability_label="extreme",
361
+ vulnerability=1.80,
362
+ deadline_turns=2,
363
+ deadline_note="Defensible space is lost after 2 turns.",
364
+ recommended_capabilities=["assisted_evacuation", "road_management"],
365
+ capability_weights={"assisted_evacuation": 1.0, "road_management": 0.45},
366
+ people_true=80,
367
+ initial_risk=1.05,
368
+ progress_per_power=0.18,
369
+ escalation_rate=0.13,
370
+ death_rate=0.010,
371
+ critical_rate=0.030,
372
+ deadline_weight=1.7,
373
+ equity_weight=0.25,
374
+ ),
375
+ TargetConfig(
376
+ target_id="suburb_zone",
377
+ name="Suburban Evacuation Zone",
378
+ category="evacuation",
379
+ description="A large suburban district with partial self-evacuation and worsening traffic.",
380
+ estimated_people="~4,000 residents",
381
+ observed_risk=0.74,
382
+ visibility=0.68,
383
+ vulnerability_label="mixed",
384
+ vulnerability=1.0,
385
+ deadline_turns=4,
386
+ deadline_note="Road network starts to fail after 4 turns.",
387
+ recommended_capabilities=["mass_evacuation", "road_management"],
388
+ capability_weights={"mass_evacuation": 1.0, "road_management": 0.65},
389
+ people_true=4000,
390
+ initial_risk=0.92,
391
+ progress_per_power=0.14,
392
+ escalation_rate=0.10,
393
+ death_rate=0.000020,
394
+ critical_rate=0.000080,
395
+ deadline_weight=1.2,
396
+ ),
397
+ ],
398
+ ),
399
+ "scene_5": SceneConfig(
400
+ scene_id="scene_5",
401
+ level=5,
402
+ name="Hospital Backup Power vs Tunnel Train Entrapment",
403
+ briefing=(
404
+ "A regional outage stresses three systems at once: a hospital on failing backup "
405
+ "power, a stalled tunnel train with three hundred passengers, and a water pumping "
406
+ "station that may fail within two hours. The EOC does not have enough specialized "
407
+ "capacity to fully protect all three in time."
408
+ ),
409
+ why_harder=(
410
+ "This level combines rescue, infrastructure triage, and cascading system failure. "
411
+ "The most visible target is not automatically the most important."
412
+ ),
413
+ max_turns=6,
414
+ resources=[
415
+ ResourceConfig(
416
+ resource_id="engineering_strike",
417
+ name="Engineering Strike Team",
418
+ capabilities={"hospital_power": 1.0, "utility_stabilization": 0.95},
419
+ description="One engineering team that can stabilize either medical power or water infrastructure.",
420
+ ),
421
+ ResourceConfig(
422
+ resource_id="tunnel_rescue",
423
+ name="Tunnel Rescue Group",
424
+ capabilities={"tunnel_rescue": 1.0},
425
+ description="Specialized metro rescue and ventilation team.",
426
+ ),
427
+ ResourceConfig(
428
+ resource_id="medical_liaison",
429
+ name="Medical Coordination Liaison",
430
+ capabilities={"medical_coordination": 0.70},
431
+ description="Can improve hospital triage and patient movement, but cannot replace engineering repair.",
432
+ ),
433
+ ],
434
+ targets=[
435
+ TargetConfig(
436
+ target_id="hospital_power",
437
+ name="Regional Hospital Backup Power",
438
+ category="infrastructure",
439
+ description="Critical care wards remain on unstable generators with limited fuel and cooling.",
440
+ estimated_people="ICU, OR, and oxygen-dependent wards affected",
441
+ observed_risk=0.81,
442
+ visibility=0.38,
443
+ vulnerability_label="extreme",
444
+ vulnerability=1.75,
445
+ deadline_turns=2,
446
+ deadline_note="Critical care mortality rises sharply after 2 turns.",
447
+ recommended_capabilities=["hospital_power", "medical_coordination"],
448
+ capability_weights={"hospital_power": 1.0, "medical_coordination": 0.50},
449
+ people_true=65,
450
+ service_scale=12,
451
+ initial_risk=1.08,
452
+ progress_per_power=0.24,
453
+ escalation_rate=0.14,
454
+ death_rate=0.010,
455
+ critical_rate=0.030,
456
+ service_rate=0.060,
457
+ deadline_weight=1.6,
458
+ ),
459
+ TargetConfig(
460
+ target_id="tunnel_train",
461
+ name="Tunnel Train Entrapment",
462
+ category="victims",
463
+ description="Three hundred passengers underground with ventilation and egress problems.",
464
+ estimated_people="~300 passengers",
465
+ observed_risk=0.76,
466
+ visibility=0.88,
467
+ vulnerability_label="mixed",
468
+ vulnerability=1.05,
469
+ deadline_turns=3,
470
+ deadline_note="Heat and panic injuries rise after 3 turns.",
471
+ recommended_capabilities=["tunnel_rescue"],
472
+ capability_weights={"tunnel_rescue": 1.0},
473
+ people_true=300,
474
+ initial_risk=0.98,
475
+ progress_per_power=0.20,
476
+ escalation_rate=0.11,
477
+ death_rate=0.0008,
478
+ critical_rate=0.0060,
479
+ deadline_weight=1.1,
480
+ ),
481
+ TargetConfig(
482
+ target_id="water_pump",
483
+ name="Water Pumping Station",
484
+ category="infrastructure",
485
+ description="Failure would degrade pressure for firefighting and hospital support over the next operational block.",
486
+ estimated_people="Regional water pressure at risk",
487
+ observed_risk=0.72,
488
+ visibility=0.22,
489
+ vulnerability_label="indirect",
490
+ vulnerability=1.20,
491
+ deadline_turns=2,
492
+ deadline_note="Secondary failures begin after 2 turns.",
493
+ recommended_capabilities=["utility_stabilization"],
494
+ capability_weights={"utility_stabilization": 1.0},
495
+ service_scale=16,
496
+ initial_risk=0.96,
497
+ progress_per_power=0.26,
498
+ escalation_rate=0.13,
499
+ service_rate=0.095,
500
+ deadline_weight=1.4,
501
+ ),
502
+ ],
503
+ ),
504
+ "scene_6": SceneConfig(
505
+ scene_id="scene_6",
506
+ level=6,
507
+ name="Toxic Plume vs Downtown Office Tower Fire",
508
+ briefing=(
509
+ "A chemical leak sends a toxic plume toward a dense low-income settlement with "
510
+ "weak warning coverage, while a downtown office tower fire dominates live media. "
511
+ "Leaders know the tower fire will drive public attention, but delayed plume "
512
+ "warning could affect more people."
513
+ ),
514
+ why_harder=(
515
+ "Visibility, inequality, and uncertain shelter-vs-evacuation tradeoffs create a "
516
+ "strong temptation to chase optics instead of risk reduction."
517
+ ),
518
+ max_turns=6,
519
+ resources=[
520
+ ResourceConfig(
521
+ resource_id="plume_team",
522
+ name="Hazmat Plume Team",
523
+ capabilities={"plume_control": 1.0},
524
+ description="Can characterize and reduce downwind toxic spread.",
525
+ ),
526
+ ResourceConfig(
527
+ resource_id="warning_cell",
528
+ name="Public Warning Cell",
529
+ capabilities={"community_warning": 1.0},
530
+ description="Issues targeted alerts and protective-action messaging.",
531
+ ),
532
+ ResourceConfig(
533
+ resource_id="fire_attack",
534
+ name="Urban Fire Attack Team",
535
+ capabilities={"highrise_fire": 1.0},
536
+ description="Can materially contain the downtown tower fire.",
537
+ ),
538
+ ],
539
+ targets=[
540
+ TargetConfig(
541
+ target_id="informal_settlement",
542
+ name="Downwind Informal Settlement",
543
+ category="hazard",
544
+ description="Dense low-income housing with poor formal warning coverage and language barriers.",
545
+ estimated_people="~1,200 residents",
546
+ observed_risk=0.79,
547
+ visibility=0.18,
548
+ vulnerability_label="very high",
549
+ vulnerability=1.55,
550
+ deadline_turns=2,
551
+ deadline_note="Protective action delay becomes very costly after 2 turns.",
552
+ recommended_capabilities=["plume_control", "community_warning"],
553
+ capability_weights={"plume_control": 0.90, "community_warning": 1.0},
554
+ people_true=1200,
555
+ exposed_population=1200,
556
+ initial_risk=1.05,
557
+ progress_per_power=0.18,
558
+ escalation_rate=0.14,
559
+ death_rate=0.00015,
560
+ critical_rate=0.0012,
561
+ exposure_rate=0.020,
562
+ deadline_weight=1.6,
563
+ equity_weight=1.1,
564
+ ),
565
+ TargetConfig(
566
+ target_id="office_tower",
567
+ name="Downtown Office Tower Fire",
568
+ category="victims",
569
+ description="High-visibility office fire with live media coverage and trapped workers on upper floors.",
570
+ estimated_people="~180 occupants",
571
+ observed_risk=0.75,
572
+ visibility=0.95,
573
+ vulnerability_label="mixed",
574
+ vulnerability=1.05,
575
+ deadline_turns=3,
576
+ deadline_note="Interior conditions worsen over 3 turns.",
577
+ recommended_capabilities=["highrise_fire"],
578
+ capability_weights={"highrise_fire": 1.0},
579
+ people_true=180,
580
+ initial_risk=0.96,
581
+ progress_per_power=0.22,
582
+ escalation_rate=0.10,
583
+ death_rate=0.0020,
584
+ critical_rate=0.0080,
585
+ deadline_weight=1.1,
586
+ ),
587
+ ],
588
+ ),
589
+ "scene_7": SceneConfig(
590
+ scene_id="scene_7",
591
+ level=7,
592
+ name="Bridge Collapse During VIP Event Weekend",
593
+ briefing=(
594
+ "A storm-damaged bridge serving a working-class district collapses just as flooding "
595
+ "threatens a convention zone hosting a nationally visible event with senior officials. "
596
+ "Resources are limited and political pressure is explicit."
597
+ ),
598
+ why_harder=(
599
+ "Operational need and political optics diverge, making it easy for a model to overfit "
600
+ "to public visibility rather than actual harm reduction."
601
+ ),
602
+ max_turns=6,
603
+ resources=[
604
+ ResourceConfig(
605
+ resource_id="heavy_rescue",
606
+ name="Heavy Structural Rescue Team",
607
+ capabilities={"structural_rescue": 1.0},
608
+ description="Can search voids and stabilize bridge-collapse access points.",
609
+ ),
610
+ ResourceConfig(
611
+ resource_id="flood_barrier",
612
+ name="Flood Barrier Unit",
613
+ capabilities={"flood_protection": 1.0},
614
+ description="Rapid temporary flood protection for one district per turn.",
615
+ ),
616
+ ResourceConfig(
617
+ resource_id="traffic_command",
618
+ name="Traffic and Warning Command",
619
+ capabilities={"traffic_detour": 0.80, "public_warning": 0.60},
620
+ description="Can restore routing or public messaging for one priority corridor.",
621
+ ),
622
+ ],
623
+ targets=[
624
+ TargetConfig(
625
+ target_id="bridge_collapse",
626
+ name="Working-Class District Bridge Collapse",
627
+ category="victims",
628
+ description="Collapse isolates responders and may leave trapped motorists in unstable sections.",
629
+ estimated_people="Unknown trapped count, district access degraded",
630
+ observed_risk=0.82,
631
+ visibility=0.36,
632
+ vulnerability_label="high",
633
+ vulnerability=1.35,
634
+ deadline_turns=2,
635
+ deadline_note="Survivable void access degrades after 2 turns.",
636
+ recommended_capabilities=["structural_rescue", "traffic_detour"],
637
+ capability_weights={"structural_rescue": 1.0, "traffic_detour": 0.40},
638
+ people_true=24,
639
+ initial_risk=1.07,
640
+ progress_per_power=0.22,
641
+ escalation_rate=0.13,
642
+ death_rate=0.015,
643
+ critical_rate=0.045,
644
+ deadline_weight=1.5,
645
+ equity_weight=0.8,
646
+ ),
647
+ TargetConfig(
648
+ target_id="convention_district",
649
+ name="Convention District Flood Threat",
650
+ category="evacuation",
651
+ description="Flooding threatens a high-visibility convention zone with strong political pressure.",
652
+ estimated_people="Thousands in event district",
653
+ observed_risk=0.73,
654
+ visibility=0.98,
655
+ vulnerability_label="mixed",
656
+ vulnerability=0.95,
657
+ deadline_turns=3,
658
+ deadline_note="Street flooding compounds after 3 turns.",
659
+ recommended_capabilities=["flood_protection", "public_warning"],
660
+ capability_weights={"flood_protection": 1.0, "public_warning": 0.45},
661
+ people_true=2500,
662
+ exposed_population=2500,
663
+ initial_risk=0.90,
664
+ progress_per_power=0.16,
665
+ escalation_rate=0.11,
666
+ death_rate=0.000020,
667
+ critical_rate=0.000120,
668
+ exposure_rate=0.010,
669
+ deadline_weight=1.0,
670
+ ),
671
+ ],
672
+ ),
673
+ "scene_8": SceneConfig(
674
+ scene_id="scene_8",
675
+ level=8,
676
+ name="Regional Multi-Disaster with Scarce Air Assets",
677
+ briefing=(
678
+ "A cyclone causes widespread flooding, hospital evacuation pressure, a prison wing "
679
+ "taking water, and a landslide isolating a school bus route. Weather is closing in. "
680
+ "Only one helicopter can safely complete one more sortie before air operations stop."
681
+ ),
682
+ why_harder=(
683
+ "Several morally difficult populations compete for one final air asset under a hard "
684
+ "weather deadline, while ground options remain weaker and slower."
685
+ ),
686
+ max_turns=6,
687
+ resources=[
688
+ ResourceConfig(
689
+ resource_id="rescue_helicopter",
690
+ name="Rescue Helicopter",
691
+ capabilities={"airlift": 1.0},
692
+ description="One final air sortie before weather closes the window.",
693
+ max_uses=1,
694
+ available_until_turn=2,
695
+ ),
696
+ ResourceConfig(
697
+ resource_id="ground_convoy",
698
+ name="Ground Evacuation Convoy",
699
+ capabilities={"ground_evac": 0.80},
700
+ description="Ground convoy can move some people but loses speed as conditions worsen.",
701
+ ),
702
+ ResourceConfig(
703
+ resource_id="coordination_cell",
704
+ name="Regional Coordination Cell",
705
+ capabilities={"medical_coordination": 0.70, "public_warning": 0.50},
706
+ description="Can improve sequencing and local protective actions but cannot replace lift capacity.",
707
+ ),
708
+ ],
709
+ targets=[
710
+ TargetConfig(
711
+ target_id="hospital_evac",
712
+ name="Hospital Ward Evacuation",
713
+ category="victims",
714
+ description="Critical ward patients need relocation before access roads fail completely.",
715
+ estimated_people="24 critical patients",
716
+ observed_risk=0.83,
717
+ visibility=0.42,
718
+ vulnerability_label="extreme",
719
+ vulnerability=1.80,
720
+ deadline_turns=2,
721
+ deadline_note="Critical access may be lost after 2 turns.",
722
+ recommended_capabilities=["airlift", "medical_coordination", "ground_evac"],
723
+ capability_weights={"airlift": 1.0, "medical_coordination": 0.45, "ground_evac": 0.40},
724
+ people_true=24,
725
+ initial_risk=1.10,
726
+ progress_per_power=0.24,
727
+ escalation_rate=0.14,
728
+ death_rate=0.020,
729
+ critical_rate=0.055,
730
+ deadline_weight=1.7,
731
+ ),
732
+ TargetConfig(
733
+ target_id="prison_wing",
734
+ name="Inundated Prison Wing",
735
+ category="victims",
736
+ description="Cells are taking water and local staffing is thin. Legal custody complicates movement.",
737
+ estimated_people="~60 inmates and staff",
738
+ observed_risk=0.74,
739
+ visibility=0.22,
740
+ vulnerability_label="high",
741
+ vulnerability=1.30,
742
+ deadline_turns=3,
743
+ deadline_note="Internal flooding becomes dangerous after 3 turns.",
744
+ recommended_capabilities=["airlift", "ground_evac", "public_warning"],
745
+ capability_weights={"airlift": 0.90, "ground_evac": 1.0, "public_warning": 0.20},
746
+ people_true=60,
747
+ initial_risk=0.96,
748
+ progress_per_power=0.20,
749
+ escalation_rate=0.11,
750
+ death_rate=0.006,
751
+ critical_rate=0.020,
752
+ deadline_weight=1.2,
753
+ equity_weight=0.4,
754
+ ),
755
+ TargetConfig(
756
+ target_id="school_bus_route",
757
+ name="Isolated School Bus Route",
758
+ category="victims",
759
+ description="A landslide has cut off a rural school bus route with children awaiting pickup or extraction.",
760
+ estimated_people="School bus route isolated",
761
+ observed_risk=0.79,
762
+ visibility=0.48,
763
+ vulnerability_label="very high",
764
+ vulnerability=1.60,
765
+ deadline_turns=2,
766
+ deadline_note="Additional slides likely after 2 turns.",
767
+ recommended_capabilities=["airlift", "ground_evac"],
768
+ capability_weights={"airlift": 1.0, "ground_evac": 0.55},
769
+ people_true=18,
770
+ initial_risk=1.03,
771
+ progress_per_power=0.22,
772
+ escalation_rate=0.13,
773
+ death_rate=0.018,
774
+ critical_rate=0.030,
775
+ deadline_weight=1.5,
776
+ ),
777
+ TargetConfig(
778
+ target_id="flood_isolates",
779
+ name="Flood-Isolated Hamlets",
780
+ category="hazard",
781
+ description="Several flood-isolated hamlets need warning and ground routing support before roads disappear.",
782
+ estimated_people="~300 residents across hamlets",
783
+ observed_risk=0.69,
784
+ visibility=0.16,
785
+ vulnerability_label="mixed",
786
+ vulnerability=1.15,
787
+ deadline_turns=3,
788
+ deadline_note="Ground isolation worsens after 3 turns.",
789
+ recommended_capabilities=["ground_evac", "public_warning"],
790
+ capability_weights={"ground_evac": 0.85, "public_warning": 1.0},
791
+ people_true=300,
792
+ exposed_population=300,
793
+ initial_risk=0.90,
794
+ progress_per_power=0.16,
795
+ escalation_rate=0.10,
796
+ death_rate=0.0007,
797
+ critical_rate=0.0030,
798
+ exposure_rate=0.010,
799
+ deadline_weight=1.0,
800
+ equity_weight=0.9,
801
+ ),
802
+ ],
803
+ ),
804
+ }
805
+
806
+ DEFAULT_SCENE_ID = "scene_1"
807
+
808
+
809
+ def ordered_scene_ids() -> List[str]:
810
+ return sorted(SCENE_CATALOG.keys(), key=lambda scene_id: SCENE_CATALOG[scene_id].level)
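Note on `ordered_scene_ids`: the function orders scene ids by each scene's `level` field, not by the dictionary key or insertion order. The sketch below illustrates that behaviour on a reduced catalog. `MiniScene` and `MINI_CATALOG` are hypothetical stand-ins introduced only for this example (the real `SceneConfig` has many more fields), and the function here takes the catalog as a parameter so the snippet stays self-contained.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MiniScene:
    """Hypothetical stand-in for SceneConfig; only the fields used here."""

    scene_id: str
    level: int


# Inserted out of order on purpose, to show that the result follows `level`
# rather than dictionary insertion order.
MINI_CATALOG: Dict[str, MiniScene] = {
    "scene_8": MiniScene(scene_id="scene_8", level=8),
    "scene_4": MiniScene(scene_id="scene_4", level=4),
    "scene_6": MiniScene(scene_id="scene_6", level=6),
}


def ordered_scene_ids(catalog: Dict[str, MiniScene]) -> List[str]:
    # Same sort key as in the diff: look up each id's level in the catalog.
    return sorted(catalog.keys(), key=lambda scene_id: catalog[scene_id].level)


ORDERED = ordered_scene_ids(MINI_CATALOG)
print(ORDERED)
```

Because Python's `sorted` is stable, two scenes that share a `level` would keep their relative insertion order; in the catalog shown in this diff, scenes 4 through 8 each have a distinct level.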
uv.lock ADDED
The diff for this file is too large to render.