PJMixers-Images/Florence-2-base-danbooru2022-316k

	---
	datasets:
	- animelover/danbooru2022
	base_model:
	- microsoft/Florence-2-base
	---
	This model is a proof of concept. You will very likely get better captioning results with [`SmilingWolf/wd-eva02-large-tagger-v3`](https://huggingface.co/SmilingWolf/wd-eva02-large-tagger-v3).

	Trained with [Florence-2ner](https://github.com/xzuyn/Florence-2ner) on 316K images from the [`animelover/danbooru2022` dataset](https://huggingface.co/datasets/animelover/danbooru2022) (`data-0880.zip` to `data-0943.zip`), using the config below.

	```json
	{
	"model_name": "microsoft/Florence-2-base",
	"dataset_path": "./0000_Datasets/danbooru2022",
	"run_name": "Florence-2-base-danbooru2022-316k-run1",
	"epochs": 1,
	"learning_rate": 1e-5,
	"gradient_checkpointing": true,
	"freeze_vision": false,
	"freeze_language": false,
	"freeze_other": false,
	"train_batch_size": 8,
	"eval_batch_size": 16,
	"gradient_accumulation_steps": 32,
	"clip_grad_norm": 1,
	"weight_decay": 1e-5,
	"save_total_limit": 3,
	"save_steps": 50,
	"eval_steps": 50,
	"warmup_steps": 50,
	"eval_split_ratio": 0.01,
	"seed": 42,
	"filtering_processes": 128,
	"attn_implementation": "sdpa"
	}
	```
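
As a rough sanity check on what this config implies, the sketch below works out the effective batch size and the approximate number of optimizer steps for the single epoch. It rests on several assumptions that the card does not state: training on one device, exactly 316,000 images (the card only says "316K"), no images dropped by the filtering step, an eval split taken out of those images, and `eval_steps` counted in optimizer steps.

```python
import math

# Values copied from the training config above.
train_batch_size = 8
gradient_accumulation_steps = 32
eval_split_ratio = 0.01
eval_steps = 50
warmup_steps = 50

# Assumptions, not stated in the model card.
num_devices = 1          # single-GPU training
total_images = 316_000   # "316K" taken literally, nothing filtered out

# Samples consumed per optimizer step.
effective_batch = train_batch_size * gradient_accumulation_steps * num_devices

# Hold out the eval split, then train on the remainder for one epoch.
eval_images = round(total_images * eval_split_ratio)
train_images = total_images - eval_images
optimizer_steps = math.ceil(train_images / effective_batch)

# Number of evaluation passes if eval_steps counts optimizer steps.
eval_runs = optimizer_steps // eval_steps
warmup_fraction = warmup_steps / optimizer_steps

print(f"effective batch size: {effective_batch}")
print(f"train / eval images:  {train_images} / {eval_images}")
print(f"optimizer steps:      {optimizer_steps}")
print(f"evaluation passes:    {eval_runs}")
print(f"warmup fraction:      {warmup_fraction:.1%}")
```

Under these assumptions each optimizer step sees 256 images, so the run is on the order of 1,200 steps, with warmup covering only the first few percent of training.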

	![val_loss](https://huggingface.co/PJMixers-Dev/Florence-2-base-danbooru2022-316k/raw/main/val_loss.png)
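
To reproduce the training subset, the archive range quoted above (`data-0880.zip` to `data-0943.zip`) can be expanded into explicit file names. This is a minimal sketch that assumes both ends of the range are inclusive and that the archives follow the four-digit naming shown in the card; it does not assume anything about where the files sit inside the dataset repository.

```python
# Shard range quoted in the model card; treating both ends as inclusive
# is an assumption.
FIRST_SHARD = 880
LAST_SHARD = 943

# Build the archive names with zero-padded four-digit indices.
shard_names = [f"data-{i:04d}.zip" for i in range(FIRST_SHARD, LAST_SHARD + 1)]

print(f"{len(shard_names)} archives: {shard_names[0]} ... {shard_names[-1]}")
```

With an inclusive range this yields 64 archives.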