Meta x OpenEnv Hackathon (India 2026)

SafeGen Arena

An OpenEnv-compliant RL environment that trains a small Defender LM to rewrite image-generation prompts safely. For each prompt, the Defender chooses to allow, transform, or reject it, preserving the user's intent.

Read first

Blog post (full story)
Why this loop works — for judges
README
Architecture, results, repro steps

Code & runs

GitHub repo
Somin-Aggarwal/SafeGen-Arena
Colab training notebook
Judge-runnable end-to-end
WandB run (v4 shipped)
1300 GRPO steps, +0.33 plateau
Shipped LoRA adapter (v4)
17 MB, GRPO-trained

OpenEnv API (live)

GET /health
GET /state
POST /reset
POST /step
GET /docs (OpenAPI)
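
The endpoints listed above can be exercised with a few lines of standard-library Python. This is a minimal sketch, not the project's own client. The base URL, the empty `/reset` body, and the shape of the `/step` action payload are all assumptions made for illustration, and the helper names (`build_request`, `call`, `run_one_step`) are hypothetical. The authoritative request and response schemas are whatever `GET /docs` reports.

```python
import json
from urllib import request

# Assumption: the environment server is reachable at this address.
# Replace it with the real URL of the live deployment.
BASE_URL = "http://localhost:8000"


def build_request(method, path, payload=None, base_url=BASE_URL):
    """Build (but do not send) an HTTP request for one of the endpoints."""
    data = None
    headers = {}
    if payload is not None:
        # JSON-encode the body only when there is one to send.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return request.Request(url, data=data, headers=headers, method=method)


def call(method, path, payload=None, base_url=BASE_URL, timeout=10.0):
    """Send the request and decode the JSON response body."""
    req = build_request(method, path, payload, base_url)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_one_step(base_url=BASE_URL):
    """Health check, reset, one step, then read back the state."""
    health = call("GET", "/health", base_url=base_url)
    observation = call("POST", "/reset", payload={}, base_url=base_url)
    # Hypothetical action payload: the field names below are assumptions,
    # not the documented schema. Check GET /docs before relying on them.
    action = {"action": {"decision": "transform",
                         "prompt": "a safe rewrite of the incoming prompt"}}
    result = call("POST", "/step", payload=action, base_url=base_url)
    state = call("GET", "/state", base_url=base_url)
    return {"health": health, "reset": observation,
            "step": result, "state": state}
```

Calling `run_one_step("https://<your-deployment>")` against a running server returns the four decoded responses in one dictionary. `build_request` is kept separate from `call` so the URL, method, and body can be inspected without touching the network.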