BauerHartmut
/

llama_3.1-8B-4bit-Computer_Vision_1.5.3


	---
	license: apache-2.0
	tags:
	- Computer
	- computervision
	---

	# Uses

	This LLM was trained on data generated by my code for the YOLOv8 model ([GitHub code](https://github.com/bauerhartmut/yolov8-Computervision)).
	The model can briefly describe what the YOLOv8 model detects and can also execute a command (`/click`).
	When the command is triggered, the model generates a dictionary containing the key data of the object to be clicked.
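
The layout of the generated dictionary is not documented here. As a rough sketch, assuming it carries a `coords` list in the same four-number form as the test input in the Testing section, and assuming those numbers mean `[x_min, y_min, x_max, y_max]` in pixels, a click position could be taken as the center of the box. The function name `box_center` is hypothetical and not part of the linked repository.

```python
from typing import Sequence, Tuple


def box_center(coords: Sequence[float]) -> Tuple[float, float]:
    """Return the center point of a box.

    Assumption: coords is [x_min, y_min, x_max, y_max] in pixels.
    """
    if len(coords) != 4:
        raise ValueError("expected four numbers: x_min, y_min, x_max, y_max")
    x_min, y_min, x_max, y_max = coords
    # The midpoint of each axis gives a point safely inside the box.
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


# Simple example with round numbers.
print(box_center([10, 20, 30, 40]))  # (20.0, 30.0)
```

The resulting point could then be passed to whatever mouse-automation tool is in use; that step is outside the scope of this card.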

	# Testing
	You can test the model by giving it the following information:

	```json
	{
	  "Object": [
	    {
	      "index": "window_0",
	      "label": "window",
	      "property": "toplayer",
	      "coords": [
	        189.06007385253906,
	        79.33326721191406,
	        1156.018798828125,
	        750.1478271484375
	      ],
	      "textes": 24,
	      "interactions": [
	        {
	          "label": "close_window",
	          "interaction_type": 1,
	          "coords": [
	            1114.04541015625,
	            84.65348815917969,
	            1149.1778564453125,
	            113.41248321533203
	          ]
	        },
	        {
	          "label": "maximize",
	          "interaction_type": 1,
	          "coords": [
	            1067.0111083984375,
	            84.82215118408203,
	            1099.86328125,
	            112.69491577148438
	          ]
	        },
	        {
	          "label": "minize_window",
	          "interaction_type": 1,
	          "coords": [
	            1024.7701416015625,
	            85.06327819824219,
	            1053.4327392578125,
	            111.52396392822266
	          ]
	        }
	      ]
	    }
	  ]
	}
	```
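
To make the nesting of this input easier to follow, here is a small sketch that loads a trimmed copy of the JSON above (one interaction, coordinates rounded for brevity) and looks up an interaction by its label. The helper name `find_interaction` is hypothetical; only the key names (`Object`, `index`, `interactions`, `label`, `coords`) come from the example.

```python
import json
from typing import Optional

# Trimmed copy of the example input; coordinates are rounded for brevity.
SAMPLE = """
{
  "Object": [
    {
      "index": "window_0",
      "label": "window",
      "interactions": [
        {"label": "close_window", "interaction_type": 1,
         "coords": [1114.0, 84.7, 1149.2, 113.4]}
      ]
    }
  ]
}
"""


def find_interaction(data: dict, object_index: str, label: str) -> Optional[dict]:
    """Return the interaction with the given label inside the given object, or None."""
    for obj in data.get("Object", []):
        if obj.get("index") != object_index:
            continue
        for interaction in obj.get("interactions", []):
            if interaction.get("label") == label:
                return interaction
    return None


data = json.loads(SAMPLE)
close_button = find_interaction(data, "window_0", "close_window")
print(close_button["coords"])  # [1114.0, 84.7, 1149.2, 113.4]
```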

	Pass this information to the model together with a prompt such as "Was siehst du" ("What do you see") or "Kannst du das Fenster schließen" ("Can you close the window").
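
The exact way the detection data and the question should be combined is not specified here. As one possible sketch, the JSON could be serialized and placed before the German question, separated by a blank line. The function name `build_prompt` and this layout are assumptions, not the format used during training.

```python
import json


def build_prompt(detections: dict, question: str) -> str:
    """Combine detection data and a question into one prompt string.

    Assumption: JSON first, then a blank line, then the question.
    """
    # ensure_ascii=False keeps non-ASCII characters readable in the payload.
    payload = json.dumps(detections, ensure_ascii=False)
    return f"{payload}\n\n{question}"


# Minimal stand-in for the full detection dictionary shown above.
detections = {"Object": [{"index": "window_0", "label": "window"}]}
prompt = build_prompt(detections, "Kannst du das Fenster schließen")
print(prompt)
```

The resulting string can then be sent to the model through whichever inference interface is in use.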

	The model is currently trained only on German.