I fine-tuned a RobertaForSequenceClassification model, initialized from CodeBERT [https://huggingface.co/microsoft/codebert-base], to classify whether a piece of code is vulnerable or not. The training, validation, and test sets are balanced samples selected from the MSR dataset [https://github.com/ZeoVan/MSR_20_Code_vulnerability_CSV_Dataset]. The "func_before" field is used as the input code for classification. All the data is in the file "msr.csv".

Test Results: accuracy 0.7022935779816514, F1 0.6482384823848238, precision 0.7920529801324503, recall 0.5486238532110091
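
As a quick sanity check, the reported F1 can be recomputed from the reported precision and recall. This is a minimal sketch that assumes F1 here is the standard binary F1 (the harmonic mean of precision and recall) with "vulnerable" as the positive class; the helper name `f1_from` is only for illustration and is not part of the training code.

```python
# Reported test metrics, copied from the results line above.
precision = 0.7920529801324503
recall = 0.5486238532110091
reported_f1 = 0.6482384823848238


def f1_from(precision: float, recall: float) -> float:
    """Binary F1: harmonic mean of precision and recall (illustrative helper)."""
    if precision + recall == 0:
        return 0.0  # convention when there are no true positives at all
    return 2 * precision * recall / (precision + recall)


computed_f1 = f1_from(precision, recall)
print(f"computed F1 = {computed_f1:.6f}, reported F1 = {reported_f1:.6f}")
```

Under that assumption, the numbers are consistent with each other. Recall is well below precision, so the model misses roughly 45% of the vulnerable samples in the test set, while about 79% of the samples it flags are actually vulnerable.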
