# Security

Given the impressive capabilities of OpenHands and similar coding agents, ensuring robust security measures is essential to prevent unintended actions or security breaches. The SecurityAnalyzer framework provides a structured approach to monitor and analyze agent actions for potential security risks.

To enable this feature:
* From the web interface
    * Open Configuration (by clicking the gear icon in the bottom right)
    * Select a Security Analyzer from the dropdown
    * Save settings
    * To disable, repeat the same steps but click the X in the Security Analyzer dropdown
* From config.toml
```toml
[security]

# Enable confirmation mode
confirmation_mode = true

# The security analyzer to use
security_analyzer = "your-security-analyzer"
```
To disable, remove these lines from config.toml.

## SecurityAnalyzer Base Class

The `SecurityAnalyzer` class (analyzer.py) is an abstract base class that listens to the event stream, analyzes actions for security risks, and can act on them before they are executed. Below is a detailed explanation of its components and methods:

### Initialization

- **event_stream**: An instance of `EventStream` that the analyzer will listen to for events.

### Event Handling

- **on_event(event: Event)**: Handles incoming events. If the event is an `Action`, it evaluates its security risk and acts upon it.

### Abstract Methods

- **handle_api_request(request: Request)**: Handles requests made to the analyzer's API endpoints.
- **log_event(event: Event)**: Logs events.
- **act(event: Event)**: Defines actions to take based on the analyzed event.
- **security_risk(event: Action)**: Evaluates the security risk of an action and returns the risk level.
- **close()**: Cleans up resources used by the security analyzer.

In conclusion, a concrete security analyzer should evaluate the risk of each event and act accordingly (e.g. auto-confirm, send a Slack message, etc.).
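As a rough sketch (not the canonical implementation), a concrete analyzer might look like the following; the import paths, the `ActionSecurityRisk` enum location, and the async signatures are assumptions, so verify them against [analyzer.py](/openhands/security/analyzer.py) in your checkout.

```python
# Minimal sketch of a concrete analyzer. Import paths and the risk enum
# location are assumptions; verify against openhands/security/analyzer.py.
from openhands.core.schema.action import ActionSecurityRisk  # assumed enum location
from openhands.events.action.action import Action            # assumed path
from openhands.events.event import Event                     # assumed path
from openhands.security.analyzer import SecurityAnalyzer     # assumed path


class CautiousAnalyzer(SecurityAnalyzer):
    """Flags shell commands that look destructive; everything else is low risk."""

    async def security_risk(self, event: Action) -> ActionSecurityRisk:
        # Inspect the action and return a risk level for the core to act on.
        command = getattr(event, 'command', '') or ''
        if any(pattern in command for pattern in ('rm -rf', 'mkfs', 'dd if=')):
            return ActionSecurityRisk.HIGH
        return ActionSecurityRisk.LOW

    async def act(self, event: Event) -> None:
        # e.g. auto-confirm low-risk actions or notify a reviewer for risky ones.
        pass

    async def close(self) -> None:
        # Release any resources (HTTP clients, open files, ...) held by the analyzer.
        pass
```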

For customization and decoupling from the OpenHands core logic, the security analyzer can define its own API endpoints, which can then be accessed from the frontend. These endpoints must be secured so that they do not expose more capabilities than the core logic provides.
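For example, a settings endpoint could be served through `handle_api_request` roughly as follows; the forwarding of `/api/security/{path:path}` requests to this method follows the steps in the next section, while the `risk_threshold` setting is made up purely for illustration.

```python
# Sketch of serving analyzer-specific settings; `risk_threshold` is a
# hypothetical setting, and the routing assumptions are noted above.
from fastapi import Request

from openhands.security.analyzer import SecurityAnalyzer  # assumed path, as above


class CautiousAnalyzer(SecurityAnalyzer):  # continuing the sketch above
    risk_threshold = 'MEDIUM'  # hypothetical analyzer-specific setting

    async def handle_api_request(self, request: Request):
        if request.method == 'GET' and request.url.path.endswith('/settings'):
            return {'risk_threshold': self.risk_threshold}
        if request.method == 'POST' and request.url.path.endswith('/settings'):
            body = await request.json()
            # Validate before applying: never expose more capability than the core provides.
            self.risk_threshold = str(body.get('risk_threshold', self.risk_threshold))
            return {'ok': True}
        return {'error': 'unsupported endpoint'}
```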

## How to implement your own Security Analyzer

1. Create a submodule in [security](/openhands/security/) with your analyzer's desired name
    * Have your main class inherit from [SecurityAnalyzer](/openhands/security/analyzer.py)
    * Optional: define API endpoints under `/api/security/{path:path}` to manage settings
2. Add your analyzer class to the [options](/openhands/security/options.py) so that it is visible in the frontend dropdown (see the sketch after this list)
3. Optional: implement your modal frontend (for when you click on the lock) in [security](/frontend/src/components/modals/security/) and add your component to [Security.tsx](/frontend/src/components/modals/security/Security.tsx)
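As an illustration of step 2, the registration might look roughly like this; the exact name and shape of the mapping in options.py may differ, and `cautious_analyzer` is a hypothetical submodule name.

```python
# openhands/security/options.py (sketch; the mapping's actual name and shape may differ)
from openhands.security.invariant.analyzer import InvariantAnalyzer         # existing analyzer
from openhands.security.cautious_analyzer.analyzer import CautiousAnalyzer  # hypothetical submodule

SecurityAnalyzers = {
    'invariant': InvariantAnalyzer,
    'cautious': CautiousAnalyzer,  # the key is what appears in the frontend dropdown
}
```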

## Implemented Security Analyzers

### Invariant

This analyzer uses the [Invariant Analyzer](https://github.com/invariantlabs-ai/invariant) to analyze traces and detect potential issues with OpenHands's workflow, and relies on confirmation mode to ask for user confirmation on potentially risky actions.

This allows the agent to run autonomously without fear that it will inadvertently compromise security or perform unintended actions that could be harmful.

Features:

* Detects:
    * potential secret leaks by the agent
    * security issues in Python code
    * malicious bash commands
    * dangerous user tasks (browsing agent setting)
    * harmful content generation (browsing agent setting)
* Logs:
    * actions and their associated risk
    * OpenHands traces in JSON format
* Run-time settings:
    * the [invariant policy](https://github.com/invariantlabs-ai/invariant?tab=readme-ov-file#policy-language)
    * acceptable risk threshold
    * (Optional) `check_browsing_alignment` flag
    * (Optional) `guardrail_llm` that assesses whether the agent's behaviour is safe

Browsing Agent Safety:

* Guardrail feature that uses the underlying LLM of the agent to:
    * Examine the user's request and check if it is harmful.
    * Examine the content entered by the agent in a textbox (the argument of the `fill` browser action) and check if it is harmful.
* If the guardrail evaluates either of these conditions to be true, it emits a `change_agent_state` action and sets the `AgentState` to `ERROR`, which stops the agent from proceeding further.
* To enable this feature, set the `check_browsing_alignment` attribute of the `InvariantAnalyzer` object to `True` and initialize its `guardrail_llm` attribute with an LLM object, as sketched below.
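A sketch of what that might look like is shown below; the import paths, the `InvariantAnalyzer` constructor arguments, and the LLM configuration fields are assumptions (and `event_stream` stands for the session's `EventStream`), so check the analyzer's source for the exact interface.

```python
# Sketch only: import paths, constructor arguments, and config fields are assumptions.
from openhands.core.config import LLMConfig                          # assumed path
from openhands.llm.llm import LLM                                    # assumed path
from openhands.security.invariant.analyzer import InvariantAnalyzer  # assumed path

analyzer = InvariantAnalyzer(event_stream)           # event_stream: the session's EventStream
analyzer.check_browsing_alignment = True             # turn on the guardrail checks
analyzer.guardrail_llm = LLM(config=LLMConfig(model='gpt-4o'))  # any configured LLM object
```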