Primus
Paper • 2502.11191 • Note: Start by reading the Primus paper! To the best of our knowledge, we are the first to release datasets covering cybersecurity pretraining, IFT, and reasoning distillation. Of course, we are also the first to pretrain an LLM on a large-scale cybersecurity corpus.
trend-cybertron/Llama-Primus-Base
Text Generation • Note: Based on Llama-3.1-8B-Instruct, continually pretrained on 2.77B tokens of cybersecurity text, achieving a 15.88% improvement in the aggregated score across multiple cybersecurity benchmarks.
trend-cybertron/Llama-Primus-Merged
Text Generation • Note: Instruct model! It maintains nearly the same instruction-following capability as Llama-3.1-8B-Instruct while achieving a 14.84% improvement across multiple cybersecurity benchmarks.
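As a quick start for the instruct model, the sketch below loads Llama-Primus-Merged with the Hugging Face transformers library and runs one chat turn. It assumes the repository exposes a Llama-3.1-style chat template and that the usual Hub access requirements are met; the prompt is purely illustrative.

```python
# Minimal sketch (not from the paper): chat-style inference with Llama-Primus-Merged.
# Assumes a Llama-3.1-style chat template and enough memory for an 8B model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trend-cybertron/Llama-Primus-Merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "Explain the MITRE ATT&CK tactic 'Lateral Movement'."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Strip the prompt tokens and decode only the newly generated answer.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```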
trend-cybertron/Llama-Primus-Reasoning
Text Generation • Note: Distilled on reasoning and reflection data from o1-preview for cybersecurity tasks, achieving a 10% improvement on CISSP.
trend-cybertron/Primus-Seed
Note: Includes high-quality cybersecurity texts manually collected from reputable sources such as Wikipedia, MITRE, cybersecurity company websites, CTI, and more.
trend-cybertron/Primus-FineWeb
Note: Includes 2.57B tokens of cybersecurity text filtered from FineWeb.
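Given its size, Primus-FineWeb is most conveniently consumed in streaming mode. The sketch below uses the datasets library; the "train" split and "text" column names are assumptions, so check the dataset card for the actual schema.

```python
# Minimal sketch (assumed split/column names): streaming Primus-FineWeb
# without downloading the full corpus to disk.
from datasets import load_dataset

stream = load_dataset("trend-cybertron/Primus-FineWeb", split="train", streaming=True)

for i, example in enumerate(stream):
    print(example["text"][:200])  # preview the first 200 characters of each document
    if i == 2:                     # stop after a few samples
        break
```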
trend-cybertron/Primus-Instruct
Note: Includes approximately 1K QA pairs covering common cybersecurity business scenarios.
trend-cybertron/Primus-Reasoning
Note: Includes reasoning and reflection data generated by o1-preview on cybersecurity tasks for distillation.
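The smaller instruction and reasoning datasets can be loaded in full for inspection or fine-tuning. As above, the split names and record layout are assumptions based on common Hub conventions rather than details confirmed here.

```python
# Minimal sketch (assumed "train" splits): loading the IFT and distillation data.
from datasets import load_dataset

instruct = load_dataset("trend-cybertron/Primus-Instruct", split="train")
reasoning = load_dataset("trend-cybertron/Primus-Reasoning", split="train")

print(len(instruct), "instruction-tuning examples")
print(len(reasoning), "reasoning/reflection examples")
print(instruct[0])  # inspect the schema of one QA pair
```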