Evaluation Setup
Qualitative security assessment of the OpenAI ChatGPT Plugin ecosystem
Benchmarks:
- OpenAI Plugin Store (Real-world ecosystem analysis)
Metrics:
- Presence of vulnerable/malicious logic
- Feasibility of attack execution
- Statistical methodology: Not explicitly reported in the paper
Key Results
| Benchmark |
Metric |
Baseline |
This Paper |
Δ |
| The authors applied their taxonomy to 268 OpenAI plugins and found concrete evidence for multiple attack categories. |
| OpenAI Plugin Store |
Count of plugins harvesting data |
0 |
35 |
+35
|
| OpenAI Plugin Store |
Count of plugins hijacking account/machine |
0 |
29 |
+29
|
| OpenAI Plugin Store |
Count of plugins hijacking LLM Platform |
0 |
6 |
+6
|
| OpenAI Plugin Store |
Count of plugins manipulating users |
0 |
37 |
+37
|
| OpenAI Plugin Store |
Count of plugins squatting others |
0 |
26 |
+26
|
Main Takeaways
- Trust assumptions are broken: Plugins are treated as semi-trusted components but can be malicious, requesting SSH keys or exfiltrating chat history.
- Natural language is a porous interface: 'Description-for-model' fields allow plugins to reprogram the LLM's behavior (session hijacking) invisible to the user.
- Plugin Squatting is a major risk: Multiple plugins with identical code bases were found, allowing malicious clones to steal user prompts intended for legitimate services.
- Platform controls are insufficient: Existing reviews did not catch plugins that violate basic security principles, such as asking for cleartext passwords.