---
language:
- en
pipeline_tag: text-generation
tags:
- gguf
- llama.cpp
- lmstudio
- traffic-signal-control
- simulation
license: cc-by-nc-4.0
---

# DeepSignal (GGUF)

This repository provides GGUF model files for local inference (e.g., `llama.cpp` / LM Studio). It contains two models for traffic-signal-control tasks.
For details, see our repository at [`AIMSLaboratory/DeepSignal`](https://github.com/AIMSLaboratory/DeepSignal).

## Models

This repository contains two models:

- **DeepSignal-Phase-4B-V1** — next signal-phase prediction (predicts which phase to activate next and for how long)
- **DeepSignal-CyclePlan-4B-V1** — signal-cycle timing optimization (outputs green-time allocations for every phase in the upcoming cycle)

## Model Files

| Filename | Model | Quantization | Size | Description |
|:---|:---:|:---:|:---:|:---|
| `DeepSignal-Phase-4B_V1.F16.gguf` | Phase | F16 (full precision) | ~8 GB | Phase model, full precision |
| `DeepSignal-CyclePlan-4B_V1.F16.gguf` | CyclePlan | F16 (full precision) | ~8 GB | CyclePlan model, full precision |
| `DeepSignal-CyclePlan-4B_V1.Q4_K_M.gguf` | CyclePlan | Q4_K_M (4-bit quantized) | ~2.5 GB | CyclePlan model, quantized (recommended for local inference) |

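If you want to script against these files rather than use LM Studio's UI, one option is to fetch a file with `huggingface_hub` and load it through the `llama-cpp-python` bindings. This is a minimal sketch, not official tooling: the repo id is a placeholder for this repository's actual id, and `n_ctx` is an arbitrary example value.

```python
# Sketch: download a GGUF file and load it with llama-cpp-python (assumed tooling).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="<this-repo-id>",  # placeholder: substitute this repository's actual id
    filename="DeepSignal-CyclePlan-4B_V1.Q4_K_M.gguf",
)

# n_ctx is an example value; size it to fit your prompts.
llm = Llama(model_path=model_path, n_ctx=4096)
```
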
## DeepSignal-Phase-4B-V1

DeepSignal-Phase-4B-V1 is designed for **next signal-phase prediction**. Given the current traffic scene and state at an intersection, it predicts which signal phase to activate next and for how long.

**Quickstart (llama.cpp):**

```bash
llama-cli -m DeepSignal-Phase-4B_V1.F16.gguf -p "You are a traffic management expert. You can use your traffic knowledge to solve the traffic signal control task.
Based on the given traffic scene and state, predict the next signal phase and its duration.
You must answer directly, the format must be: next signal phase: {number}, duration: {seconds} seconds
where the number is the phase index (starting from 0) and the seconds is the duration (usually between 20-90 seconds)."
```

*You also need to include the scene (total number of phases, which phases control which lanes/directions, current phase ID/number, etc.) and the state (number of queuing vehicles per lane, throughput per lane during the current phase, etc.) in the prompt; a sketch of how such a prompt might be assembled is shown below.*

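The sketch below shows one way such a prompt could be assembled in Python. The scene/state wording, lane counts, and vehicle numbers are invented for illustration; the model card does not fix an exact textual schema, so adapt the description to your own intersection data.

```python
# Illustrative only: build a Phase-model prompt from example scene/state data.
# The scene/state layout below is an assumption, not a fixed input schema.
instruction = (
    "You are a traffic management expert. You can use your traffic knowledge "
    "to solve the traffic signal control task.\n"
    "Based on the given traffic scene and state, predict the next signal phase and its duration.\n"
    "You must answer directly, the format must be: "
    "next signal phase: {number}, duration: {seconds} seconds\n"
    "where the number is the phase index (starting from 0) and the seconds is "
    "the duration (usually between 20-90 seconds)."
)

scene = (
    "Scene: the intersection has 4 phases. Phase 0 controls north-south through lanes, "
    "phase 1 controls north-south left turns, phase 2 controls east-west through lanes, "
    "phase 3 controls east-west left turns. Current phase: 2."
)
state = (
    "State: queuing vehicles per lane: [12, 8, 3, 5, 9, 2, 4, 6]; "
    "throughput per lane during the current phase: [4, 3, 6, 5, 2, 1, 3, 2]."
)

prompt = f"{instruction}\n{scene}\n{state}"
print(prompt)  # pass this string to llama-cli via -p, or to a llama.cpp binding
```
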
## DeepSignal-CyclePlan-4B-V1

DeepSignal-CyclePlan-4B-V1 is designed for **signal-cycle timing optimization**. It takes predicted traffic state data for the upcoming cycle as input and outputs green-time allocations for every phase.

**System Prompt:**

```
You are a traffic signal timing optimization expert.
Please carefully analyze the predicted traffic states for each phase in the next cycle, provide the timing plan for the next cycle, and give your reasoning process.
Place the reasoning process between <start_working_out> and <end_working_out>.
Then, place your final plan between <SOLUTION> and </SOLUTION>.
```

**Quickstart (llama.cpp, Q4_K_M recommended for local inference):**

```bash
llama-cli -m DeepSignal-CyclePlan-4B_V1.Q4_K_M.gguf \
  -p 'You are a traffic signal timing optimization expert.
Please carefully analyze the predicted traffic states for each phase in the next cycle, provide the timing plan for the next cycle, and give your reasoning process.
Place the reasoning process between <start_working_out> and <end_working_out>.
Then, place your final plan between <SOLUTION> and </SOLUTION>.

【cycle_predict_input_json】{
  "prediction": {
    "as_of": "2026-02-22T10:00:00",
    "phase_waits": [
      {"phase_id": 0, "pred_saturation": 0.8, "min_green": 20, "max_green": 60, "capacity": 100},
      {"phase_id": 1, "pred_saturation": 0.5, "min_green": 15, "max_green": 45, "capacity": 80}
    ]
  }
}【/cycle_predict_input_json】

Task (must complete):
Mainly based on prediction.phase_waits pred_saturation (already calculated), output the final green light time for each phase in the next cycle (unit: seconds), while satisfying hard constraints.'
```

**Input format**: JSON wrapped in `【cycle_predict_input_json】...【/cycle_predict_input_json】` tags, containing `prediction.phase_waits` — an array of per-phase objects with `phase_id`, `pred_saturation`, `min_green`, `max_green`, and `capacity`. Here `pred_saturation = pred_wait / capacity`, where `pred_wait` is the predicted number of waiting vehicles for this phase in the next cycle, which can be computed using forecasting models (e.g., LSTM, TCN) based on historical traffic data.

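As a concrete illustration, the snippet below derives `pred_saturation` from example `pred_wait` values (in practice these would come from your forecasting model) and wraps the result in the expected tags. The numbers match the quickstart example above.

```python
import json

# Example per-phase predictions; pred_wait values are placeholders that would
# normally come from a forecasting model (e.g., LSTM or TCN).
phases = [
    {"phase_id": 0, "pred_wait": 80, "min_green": 20, "max_green": 60, "capacity": 100},
    {"phase_id": 1, "pred_wait": 40, "min_green": 15, "max_green": 45, "capacity": 80},
]

phase_waits = [
    {
        "phase_id": p["phase_id"],
        # pred_saturation = pred_wait / capacity, as defined above
        "pred_saturation": round(p["pred_wait"] / p["capacity"], 2),
        "min_green": p["min_green"],
        "max_green": p["max_green"],
        "capacity": p["capacity"],
    }
    for p in phases
]

payload = {"prediction": {"as_of": "2026-02-22T10:00:00", "phase_waits": phase_waits}}
model_input = (
    "【cycle_predict_input_json】"
    + json.dumps(payload, ensure_ascii=False)
    + "【/cycle_predict_input_json】"
)
print(model_input)  # append this after the system prompt, as in the quickstart above
```
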
**Output format**: A JSON array of objects `[{"phase_id": <int>, "final": <int>}, ...]`, where `final` is the allocated green time in integer seconds for each phase.

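Assuming the model follows the system prompt above and emits its plan between the `<SOLUTION>` tags, the timing plan can be recovered from the raw response text with a small parser like this (the example response string and green times are illustrative):

```python
import json
import re

def extract_plan(response_text: str):
    """Return the JSON timing plan from a response that follows the system prompt."""
    match = re.search(r"<SOLUTION>(.*?)</SOLUTION>", response_text, re.DOTALL)
    if match is None:
        raise ValueError("no <SOLUTION> block found in the response")
    return json.loads(match.group(1))

# Abridged example response; real outputs also contain the reasoning block.
raw = '<start_working_out>...<end_working_out><SOLUTION>[{"phase_id": 0, "final": 48}, {"phase_id": 1, "final": 24}]</SOLUTION>'
for item in extract_plan(raw):
    print(f'phase {item["phase_id"]}: {item["final"]} s green')
```
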
## Evaluation (Traffic Simulation)

### Performance Metrics Comparison by Model (Phase) *

| Model | Avg Saturation | Avg Cumulative Queue Length (veh⋅min) | Avg Throughput (veh/5min) | Avg Response Time (s) |
|:---:|:---:|:---:|:---:|:---:|
| [`GPT-OSS-20B (thinking)`](https://huggingface.co/openai/gpt-oss-20b) | 0.380 | 14.088 | 77.910 | 6.768 |
| **DeepSignal-Phase-4B (thinking, Ours)** | 0.422 | 15.703 | **79.883** | 2.131 |
| [`Qwen3-30B-A3B`](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct) | 0.431 | 17.046 | 79.059 | 2.727 |
| [`Qwen3-4B`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | 0.466 | 57.699 | 75.712 | 1.994 |
| Max Pressure | 0.465 | 23.022 | 77.236 | ** |
| [`LightGPT-8B-Llama3`](https://huggingface.co/lightgpt/LightGPT-8B-Llama3) | 0.523 | 54.384 | 75.512 | 3.025*** |

`*`: Each simulation scenario runs for 60 minutes. We discard the first **5 minutes** as warm-up, then compute metrics over the next **20 minutes** (minute 5 to 25). We cap the evaluation window because, when an LLM controls signal timing for only a single intersection, spillback from neighboring intersections may occur after ~20+ minutes and destabilize the scenario. All evaluations are conducted on a **Mac Studio M3 Ultra**.
`**`: Max Pressure is a fixed signal-timing optimization algorithm (not an LLM), so we omit its Avg Response Time; this metric is only defined for LLM-based signal-timing optimization.
`***`: For LightGPT-8B-Llama3, Avg Response Time is computed using only the successful responses.

**Conclusion**: Among thinking-enabled models, **DeepSignal-Phase-4B** achieves the highest throughput (79.883 veh/5min) with a response time of only 2.131s. GPT-OSS-20B achieves the best saturation (0.380) but with higher response latency (6.768s).

### Performance Metrics Comparison by Model (CyclePlan) *

| Model | Format Success Rate (%) | Avg Queue Vehicles | Avg Delay per Vehicle (s) | Throughput (veh/min) | Avg Response Time (s) |
|:---:|:---:|:---:|:---:|:---:|:---:|
| **DeepSignal-CyclePlan-4B-V1 F16 (thinking, Ours)** | **100.0** | **3.504** | **27.747** | **8.611** | 4.351 |
| [`GLM-4.7-Flash (thinking)`](https://huggingface.co/zai-org/glm-4.7-flash) | 100.0 | 7.323 | 29.422 | 8.567 | 36.388 |
| DeepSignal-CyclePlan-4B-V1 Q4_K_M (thinking, Ours) | 98.1 | 4.783 | 29.891 | 7.722 | 1.674 |
| [`Qwen3-30B-A3B`](https://huggingface.co/Qwen/Qwen3-30B-A3B-2507) | 97.1 | 6.938 | 31.135 | 7.578 | 7.885 |
| [`LightGPT-8B-Llama3`](https://huggingface.co/lightgpt/LightGPT-8B-Llama3) | 68.0 | 5.026 | 31.266 | 7.380 | 167.373 |
| [`GPT-OSS-20B (thinking)`](https://huggingface.co/openai/gpt-oss-20b) | 65.4 | 6.289 | 31.947 | 7.247 | 4.919 |
| [`Qwen3-4B (thinking)`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | 54.1 | 10.060 | 48.895 | 7.096 | 122.333 |

`*`: Each simulation scenario runs for 60 minutes. We discard the first **5 minutes** as warm-up, then compute metrics over the next **20 minutes** (minute 5 to 25). All evaluations are conducted on a **Mac Studio M3 Ultra**.

**Conclusion**: DeepSignal-CyclePlan-4B-V1 (F16) achieves a 100% format success rate, the lowest average number of queued vehicles (3.504), and the highest throughput (8.611 veh/min) among all evaluated models. The Q4_K_M quantized version maintains strong performance with a 98.1% format success rate while offering the fastest response time (1.674s).

## License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
Commercial use is strictly prohibited.