---
license: apache-2.0
tags:
- Computer
- computervision
---
# Uses
This LLM is trained on data generated by my code for the YOLOv8 model ([GitHub code](https://github.com/bauerhartmut/yolov8-Computervision)).
The model can briefly describe what the YOLOv8 model has detected and can also execute a command (`/click`).
When the command is triggered, the model generates a dictionary containing the key data of the object to be clicked.
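
As a rough illustration of how such a dictionary might be turned into a click position, here is a minimal sketch. It assumes the dictionary resembles an `interactions` entry from the test data in this card and that `coords` is ordered `[x_min, y_min, x_max, y_max]`. Neither assumption is confirmed by this card, and the helper `box_center` is hypothetical.

```python
from typing import Sequence, Tuple


def box_center(coords: Sequence[float]) -> Tuple[float, float]:
    """Return the center (x, y) of a box.

    Assumption: coords is ordered [x_min, y_min, x_max, y_max].
    """
    x_min, y_min, x_max, y_max = coords
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


# Hypothetical /click result, modeled on an "interactions" entry
# from the test data in this card.
click_target = {
    "label": "close_window",
    "interaction_type": 1,
    "coords": [
        1114.04541015625,
        84.65348815917969,
        1149.1778564453125,
        113.41248321533203,
    ],
}

x, y = box_center(click_target["coords"])
print(f"click {click_target['label']} at ({x:.1f}, {y:.1f})")
```

The resulting point could then be handed to whatever mouse-automation tool you use. Clicking the center of the box is a design choice here, not something the model prescribes.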
# Testing
You can test the model by giving it the following information:
```json
{
  "Object": [
    {
      "index": "window_0",
      "label": "window",
      "property": "toplayer",
      "coords": [
        189.06007385253906,
        79.33326721191406,
        1156.018798828125,
        750.1478271484375
      ],
      "textes": 24,
      "interactions": [
        {
          "label": "close_window",
          "interaction_type": 1,
          "coords": [
            1114.04541015625,
            84.65348815917969,
            1149.1778564453125,
            113.41248321533203
          ]
        },
        {
          "label": "maximize",
          "interaction_type": 1,
          "coords": [
            1067.0111083984375,
            84.82215118408203,
            1099.86328125,
            112.69491577148438
          ]
        },
        {
          "label": "minize_window",
          "interaction_type": 1,
          "coords": [
            1024.7701416015625,
            85.06327819824219,
            1053.4327392578125,
            111.52396392822266
          ]
        }
      ]
    }
  ]
}
```
Pass this information to the model together with a prompt such as "Was siehst du" ("What do you see?") or "Kannst du das Fenster schließen" ("Can you close the window?").
At the moment, the model is trained only on German, so prompts must be written in German.
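
This card does not specify how the detection data and the prompt should be combined. The following is a minimal sketch that assumes the JSON is serialized on one line and followed by the German question on the next line. The helper `build_model_input` and this layout are assumptions, not the documented input format.

```python
import json


def build_model_input(detections: dict, question: str) -> str:
    """Join detection data and a German question into one input string.

    Assumption: the model accepts the JSON followed by the question;
    the exact layout is not documented in this card.
    """
    # ensure_ascii=False keeps characters such as "ß" readable.
    return json.dumps(detections, ensure_ascii=False) + "\n" + question


# Shortened example data in the same shape as the test data above.
detections = {
    "Object": [
        {"index": "window_0", "label": "window", "property": "toplayer"}
    ]
}

model_input = build_model_input(detections, "Kannst du das Fenster schließen")
print(model_input)
```

The string can then be passed to the model through whichever inference interface you use.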