Update README.md
README.md
CHANGED
@@ -25,13 +25,26 @@ model-index:
 This is a trained model of a **PPO** agent playing **MountainCar-v0**
 using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
 
-
-TODO: Add your code
-
-
-```python
-
-
-
-
-```
+# Model Details
+```python
+- Model Name: ppo-MountainCar-v0
+- Model Type: Proximal Policy Optimization (PPO)
+- Policy Architecture: Multi-Layer Perceptron (MlpPolicy)
+- Environment: MountainCar-v0
+```
+- Training Data: the model was trained in three consecutive sessions:
+  - First session: total timesteps = 1,000,000
+  - Second session: total timesteps = 500,000
+  - Third session: total timesteps = 500,000
+
+# Model Parameters
+```python
+- n_steps: 2048
+- batch_size: 64
+- n_epochs: 8
+- gamma: 0.999
+- gae_lambda: 0.95
+- ent_coef: 0.01
+- max_grad_norm: 0.5
+- verbose: 1
+```
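
The diff adds documentation but still no runnable code: the template's `TODO: Add your code` block was deleted rather than filled in. Below is a minimal training sketch, assuming stable-baselines3 defaults for everything the card does not list. It is a reconstruction from the stated hyperparameters, not the author's actual script, and the save name `ppo-MountainCar-v0` is borrowed from the card's Model Name.

```python
# Reconstructed sketch from the card's listed hyperparameters; not the
# author's actual training script (the card does not include one).
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("MountainCar-v0")

# Hyperparameters as listed under "# Model Parameters".
model = PPO(
    "MlpPolicy",              # the card's "Multi-Layer Perceptron" policy
    env,
    n_steps=2048,
    batch_size=64,
    n_epochs=8,
    gamma=0.999,
    gae_lambda=0.95,
    ent_coef=0.01,
    max_grad_norm=0.5,
    verbose=1,
)

# "Three consecutive training sessions" as described under "# Model Details".
# reset_num_timesteps=False keeps the timestep counter running across calls,
# one plausible reading of consecutive sessions on the same model.
model.learn(total_timesteps=1_000_000)
model.learn(total_timesteps=500_000, reset_num_timesteps=False)
model.learn(total_timesteps=500_000, reset_num_timesteps=False)

model.save("ppo-MountainCar-v0")  # assumed filename, matching the Model Name
```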
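
A matching usage sketch for the saved policy, again an assumption rather than anything the card specifies:

```python
# Load the saved policy and roll out one episode.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("MountainCar-v0", render_mode="human")
model = PPO.load("ppo-MountainCar-v0")  # assumed filename, as above

obs, _ = env.reset()
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
env.close()
```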