Endpoint | Description |
---|---|
/low | Easy: Evaluates a basic model configuration without hardening techniques. No canary words or guardrails are applied. |
/medium | Medium: Tests a model with prompt hardening techniques for improved robustness. Canary words and guardrails are still excluded. |
/high | Hard: Challenges a model with both prompt hardening, the inclusion of canary words, and the application of guardrails for enhanced protection. |
tested_chatbots
module and can be easily modified.
By separating the tested models from the chat interface, we have prepared the abstraction necessary to support the integration of external endpoints in future iterations, allowing seamless testing of custom or third-party models.
The default agents accessed through these endpoints simulate a highly personalized automotive sales assistant, tailored for a fictional automotive company. They leverage confidential pricing strategies to maximize customer satisfaction and sales success. You can explore their prompts here.
PromptGeneratorAgent
: Responsible for creating adversarial prompts aimed at probing LLMs.PromptLeakageClassifierAgent
: Evaluates the model’s responses to identify prompt leakage./low
, /medium
, and /high
endpoints, creating a ready-to-use testing lab for exploring prompt leakage security.http://localhost:8888
. You will be presented with a workflow selection screen.
PromptLeakageClassifierAgent
will classify the responses for potential prompt leakage. You can view the reports by selecting “Report on the prompt leak attempt” on the workflow selection screen.