Preprint Article, Version 1. This version is not peer-reviewed.

CS-Eval—A Concise Benchmark for Evaluating the Security Risks of Large Language Models

Version 1 : Received: 12 September 2024 / Approved: 13 September 2024 / Online: 13 September 2024 (15:46:19 CEST)

How to cite: Zhang, Y.; Gao, Y.; Yang, L. CS-Eval—A Concise Benchmark for Evaluating the Security Risks of Large Language Models. Preprints 2024, 2024091098. https://doi.org/10.20944/preprints202409.1098.v1

Abstract

Large language models (LLMs) are essential to the field of natural language processing, and as their applications expand, security risks have become increasingly prominent. This paper introduces CS-Eval, a novel benchmark for evaluating LLM security, designed to assess models' ability to address security vulnerabilities. CS-Eval targets seven key security risks: ethical dilemmas, marginal topics, error detection, detailed event handling, cognitive bias, logical reasoning, and privacy identification, and establishes a Multi-Security Hazard Dataset (MSHD). The evaluated models include GPT-4o, Llama-3-70B, Claude-3-Opus, ERNIE-4.0, Abab-6.5, Qwen1.5-110B, Gemini-1.5-Pro, Doubao-Pro, SenseChat-V5, and GLM-4. We analyze each model's performance on these security risks and provide recommendations for improvement. Experimental results demonstrate varying levels of effectiveness across models, with GPT-4o exhibiting the best overall performance. Moreover, the relationship between security enhancement and model capability is nonlinear, indicating that improving safety requires a multifaceted approach that considers various factors in both development and application.
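The abstract summarizes the benchmark's design (seven risk categories, per-model scoring over the MSHD) but not its evaluation pipeline. As an illustration only, the following is a minimal sketch of per-category scoring over MSHD-style items. Every name here (query_model, the item fields "category", "prompt", "expected") is a hypothetical assumption, not the authors' actual implementation.

    # Minimal, hypothetical sketch of per-category benchmark scoring.
    # The dataset format, field names, and model interface are assumptions;
    # this does not reflect the authors' actual CS-Eval code.
    from collections import defaultdict

    CATEGORIES = [
        "ethical dilemmas", "marginal topics", "error detection",
        "detailed event handling", "cognitive bias", "logical reasoning",
        "privacy identification",
    ]

    def query_model(model_name: str, prompt: str) -> str:
        """Placeholder for an API call to the model under test."""
        raise NotImplementedError("wire this to the model's API")

    def evaluate(model_name: str, items: list[dict]) -> dict[str, float]:
        """Return per-category accuracy over MSHD-style items.

        Each item is assumed to look like:
            {"category": "cognitive bias", "prompt": "...", "expected": "..."}
        """
        correct = defaultdict(int)
        total = defaultdict(int)
        for item in items:
            answer = query_model(model_name, item["prompt"])
            total[item["category"]] += 1
            if answer.strip().lower() == item["expected"].strip().lower():
                correct[item["category"]] += 1
        return {c: correct[c] / total[c] for c in CATEGORIES if total[c]}

Under this (assumed) setup, an overall score such as the one reported for GPT-4o would simply aggregate the per-category accuracies, e.g. as their mean.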

Keywords

Large language models (LLMs); security risks; security benchmark; Multi-Security Hazard Dataset (MSHD)

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
