---
license: mit
datasets:
- trendmicro-ailab/Primus-FineWeb
- trendmicro-ailab/Primus-Seed
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- cybersecurity
- pretraining
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Commercial
      - label: Other
        value: other
  Job title:
    type: select
    options:
      - Student
      - Research graduate
      - AI researcher
      - AI developer/engineer
      - Cybersecurity researcher
      - Reporter
      - Other
  geo: ip_location
---

# Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training

<img src="https://i.imgur.com/PtqeTZw.png" alt="Primus Overview" width="60%">

> TL;DR: Llama-Primus-Base is a foundation model based on Llama-3.1-8B-Instruct, continually pre-trained on Primus-Seed (0.2B) and Primus-FineWeb (2.57B). Primus-Seed is a high-quality, manually curated cybersecurity text dataset, while Primus-FineWeb consists of cybersecurity texts filtered from FineWeb, a refined version of Common Crawl. By pretraining on such a large-scale cybersecurity corpus, it achieves a 🚀**15.88%** improvement in aggregated scores across multiple cybersecurity benchmarks, demonstrating the effectiveness of cybersecurity-specific pretraining.

**🔥 For more details, please refer to the paper: [[📄Paper]](https://arxiv.org/abs/2502.11191).**

## Introduction

Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, with promising applications in specialized domains such as finance, law, and biomedicine. In the domain of cybersecurity, however, we noticed a lack of open-source datasets specifically designed for LLM pre-training, even though extensive research has shown that LLMs acquire their knowledge during pre-training. To fill this gap, we present a collection of datasets covering multiple stages of cybersecurity LLM training, including pre-training (_Primus-Seed_ and _Primus-FineWeb_), instruction fine-tuning (_Primus-Instruct_), and reasoning data for distillation (_Primus-Reasoning_). Based on these datasets and Llama-3.1-8B-Instruct, we developed _Llama-Primus-Base_, _Llama-Primus-Merged_, and _Llama-Primus-Reasoning_. This model card is for **Llama-Primus-Base**.

> **Note:** No Trend Micro customer information is included.

## Cybersecurity Benchmark Results

| **Metric** (5-shot, w/o CoT)            | **Llama-3.1-8B-Instruct** | **Llama-Primus-Base**    |
|-----------------------------------------|---------------------------|--------------------------|
| **CISSP (Exams in book)**               | 0.7073                    | **0.7230**               |
| **CTI-Bench (MCQ)**                     | 0.6420                    | **0.6676**               |
| **CTI-Bench (CVE → CWE)**               | 0.5910                    | **0.6780**               |
| **CTI-Bench (CVSS, _lower is better_)** | 1.2712                    | **1.0912**               |
| **CTI-Bench (ATE)**                     | 0.2721                    | **0.3140**               |
| **CyberMetric (500)**                   | 0.8560                    | **0.8660**               |
| **SecEval**                             | 0.4966                    | **0.5007**               |
| **_Agg._**                              | 2.29                      | **2.66** ↑**15.88%** 🔥  |

CTI-Bench (CVSS) is scored using Mean Absolute Deviation (_lower is better_), CTI-Bench (ATE) uses F1 score, and the others use accuracy. The aggregate score (_Agg._) is the sum of all benchmark scores, with CTI-Bench (CVSS) negated.
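The aggregation rule above can be reproduced directly from the table. A minimal sketch (the score lists simply restate the per-benchmark values in table order; nothing here is part of the evaluation code itself):

```python
# Reproduce the aggregate score (Agg.) from the per-benchmark results above.
# CTI-Bench (CVSS) is a Mean Absolute Deviation, so it is negated before summing.
scores = {
    "Llama-3.1-8B-Instruct": [0.7073, 0.6420, 0.5910, 1.2712, 0.2721, 0.8560, 0.4966],
    "Llama-Primus-Base":     [0.7230, 0.6676, 0.6780, 1.0912, 0.3140, 0.8660, 0.5007],
}
CVSS_INDEX = 3  # position of CTI-Bench (CVSS) in the lists above

def aggregate(values):
    """Sum all benchmark scores, negating the CVSS deviation (lower is better)."""
    return sum(-v if i == CVSS_INDEX else v for i, v in enumerate(values))

base = aggregate(scores["Llama-3.1-8B-Instruct"])  # ≈ 2.29
primus = aggregate(scores["Llama-Primus-Base"])    # ≈ 2.66
improvement = (primus - base) / base * 100         # ≈ 15.88%
print(f"{base:.2f} -> {primus:.2f} (+{improvement:.2f}%)")
```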

References:
- **CyberMetric**: [CyberMetric: A Benchmark Dataset based on Retrieval-Augmented...](https://arxiv.org/abs/2402.07688)
- **CTI-Bench**: [CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence](https://arxiv.org/abs/2406.07599)
- **SecEval**: [SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models](https://xuanwuai.github.io/SecEval/)

## About _Primus_

_Primus_ is Trend Micro's pioneering family of lightweight, state-of-the-art open cybersecurity language models and datasets. Developed through our cutting-edge research initiatives and advanced technology, these resources share the innovative foundation that powers our enterprise-class [Trend Cybertron](https://newsroom.trendmicro.com/2025-02-25-Trend-Micro-Puts-Industry-Ahead-of-Cyberattacks-with-Industrys-First-Proactive-Cybersecurity-AI) solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these powerful, efficiency-optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.

## License

This model is released under the MIT license, but you must also comply with the Llama 3.1 Community License Agreement.