Rare and challenging driving scenarios are critical for autonomous vehicle development. Because they are difficult to encounter, simulating them or producing them with generative models is a popular approach. Building on previous efforts to structure driving scenario representations as a layer model, we propose a structured five-layer model to improve the evaluation and generation of rare scenarios. We use this model alongside large foundation models to generate new driving scenarios through a data augmentation strategy. Unlike previous representations, our structure introduces subclasses and characteristics for every agent of the scenario, allowing us to compare them using an embedding in our layer-model space. We study and adapt two metrics to evaluate the relevance of a synthetic dataset in the context of a structured representation: the diversity score estimates how different the scenarios of a dataset are from one another, while the originality score measures how similar a synthetic dataset is to a real reference set. This paper showcases both metrics in different generation setups, along with a qualitative evaluation of synthetic videos generated from structured scenario descriptions.
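
To make the two metrics concrete, the following is a minimal sketch of one plausible reading of them, not the formulation used in the paper. It assumes each scenario has already been embedded as a plain vector, that cosine distance is the comparison function, that diversity is the mean pairwise distance within a set, and that originality is the mean distance from each synthetic scenario to its nearest reference scenario. The function names `cosine_distance`, `diversity`, and `originality` are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors (assumed comparison function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def diversity(embeddings):
    """Mean pairwise distance inside one dataset (assumed definition).

    Needs at least two embeddings.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


def originality(synthetic, reference):
    """Mean distance from each synthetic embedding to its nearest
    reference embedding (assumed definition)."""
    nearest = [min(cosine_distance(s, r) for r in reference) for s in synthetic]
    return sum(nearest) / len(nearest)
```

Under these assumptions, a set of identical scenarios has a diversity of 0, and a synthetic set that merely copies reference scenarios has an originality of 0. Computing the scores per layer would only require embedding each layer's description separately.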

Layer-wise evaluation:

| Scenes | Metric | L1 | L2 | L3 | L4 | L5 |
|---|---|---|---|---|---|---|
| Generated Scenes | CO | 0.93 | 0.59 | n/a | 0.85 | 0.81 |
| Generated Scenes | CD | 0.68 | 0.18 | n/a | 0.63 | 0.50 |
| Reference Scenes | CD | 0.54 | 0.64 | n/a | 0.91 | 0.52 |

Scenario structure evaluation:

| Metric | Structure | L1 | L2 | L3 | L4 | L5 |
|---|---|---|---|---|---|---|
| Originality | Unstructured | 0.91 | 0.88 | 0.85 | 0.89 | 0.83 |
| Originality | Soft | 0.87 | 0.84 | 0.83 | 0.84 | 0.82 |
| Originality | Hard | 0.86 | 0.84 | 0.84 | 0.86 | 0.81 |
| Diversity | Reference | 0.92 | 0.92 | 0.89 | 0.92 | 0.96 |
| Diversity | Unstructured | 0.88 | 0.85 | 0.78 | 0.84 | 0.82 |
| Diversity | Soft | 0.81 | 0.78 | 0.76 | 0.78 | 0.80 |
| Diversity | Hard | 0.88 | 0.83 | 0.83 | 0.86 | 0.81 |

System prompts:
Role: You are a driving scenario generator. Your purpose is to generate new confusing and challenging Edge Case scenarios from the input scenario.
Format: Your output must follow the 5 layer model of the input scenario description where:
For each layer, your textual description must be concise, but as exhaustive as possible. For the fourth layer in particular, define each component in relation to the ego vehicle.
Task: Please only modify the layer specified in the prompt to generate an Edge Case and change nothing in the other layers (MOST IMPORTANT) Your output must contains EXACTLY THE SAME TEXT in every layer other than the one you are tasked to modify (MOST IMPORTANT)
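
The Task constraint above (modify only the specified layer, keep every other layer verbatim) can be checked automatically on a generated scenario. The sketch below assumes a scenario is stored as a dictionary mapping layer numbers 1 to 5 to their text. That storage format and the helper names `changed_layers` and `respects_task` are hypothetical, and the layer texts are placeholders rather than real scenario descriptions.

```python
def changed_layers(original, generated):
    """Return the sorted layer numbers whose text differs between two scenarios."""
    layers = set(original) | set(generated)
    return sorted(n for n in layers if original.get(n) != generated.get(n))


def respects_task(original, generated, target_layer):
    """True if the target layer was modified and all other layers are identical."""
    return changed_layers(original, generated) == [target_layer]


# Placeholder scenario: five layers with dummy text (hypothetical data).
original = {n: f"placeholder text for layer {n}" for n in range(1, 6)}

# Simulated model output in which only layer 4 was rewritten.
edited = dict(original)
edited[4] = "placeholder edge-case text for layer 4"

print(respects_task(original, edited, target_layer=4))  # True
```

A strict string comparison mirrors the "EXACTLY THE SAME TEXT" wording of the prompt. A more lenient check, for example one that ignores whitespace, would be a separate design choice.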
[Figure: synthetic scenes generated by Veo 3 after editing layer 4. Each scene is shown in two panels: structured, image-guided and unstructured, image-guided. The second pair of panels is labeled scene 3.]