	---
	license: other
	license_name: flux-1-dev-non-commercial-license
	license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
	datasets:
	- raulc0399/open_pose_controlnet
	language:
	- en
	pipeline_tag: text-to-image
	tags:
	- Stable Diffusion
	- image-generation
	- Flux
	- diffusers
	- controlnet
	---

	# openpose controlnet for flux.dev
	(big thanks to [oxen.ai](https://www.oxen.ai/) for sponsoring the GPU for the training)

	## inference

	an openpose controlnet for flux-dev, trained on the dataset https://huggingface.co/datasets/raulc0399/open_pose_controlnet

	the controlnet was trained for the XLabs-AI x-flux pipeline: https://github.com/XLabs-AI/x-flux

	to install the pipeline, run:

	```
	git clone https://github.com/XLabs-AI/x-flux.git
	cd x-flux
	python3 -m venv xflux_env
	source xflux_env/bin/activate
	pip install -r requirements.txt
	```

	to run the pipeline with controlnet:

	```
	python3 main.py \
	--prompt "person enjoying a day at the park, full hd, cinematic" \
	--image ~/open_pose_controlnet_dataset/validation_images/pose/3_pose_1024.jpg --control_type openpose \
	--local_path ./model.safetensors \
	--use_controlnet --model_type flux-dev \
	--width 1024 --height 1024 --timestep_to_start_cfg 2 \
	--num_steps 50 --true_gs 4 --guidance 4 \
	--save_path ~/gen_imgs
	```
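
to generate from several control images, the command above can be wrapped in a loop. this is only a sketch: the `run_poses` helper, the `PYTHON` override and the assumption that all pose images are `.jpg` files in one directory are illustrative and not part of x-flux. the flags are copied unchanged from the command above.

```shell
# Run the x-flux pipeline once per pose image in a directory.
# run_poses and the PYTHON override are hypothetical helpers for illustration.
run_poses() {
  pose_dir="$1"   # directory holding the pose images (assumed to be .jpg)
  prompt="$2"     # prompt shared by all generations
  for pose in "$pose_dir"/*.jpg; do
    [ -e "$pose" ] || continue   # skip when the glob matches nothing
    "${PYTHON:-python3}" main.py \
      --prompt "$prompt" \
      --image "$pose" --control_type openpose \
      --local_path ./model.safetensors \
      --use_controlnet --model_type flux-dev \
      --width 1024 --height 1024 --timestep_to_start_cfg 2 \
      --num_steps 50 --true_gs 4 --guidance 4 \
      --save_path ~/gen_imgs
  done
}
```

run it from the x-flux directory, for example `run_poses ~/open_pose_controlnet_dataset/validation_images/pose "person enjoying a day at the park, full hd, cinematic"`. each iteration starts a new python process, so the model is loaded again for every image.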

	if the control image has already been preprocessed (it is already a pose image), comment out line 146 of src/flux/xflux_pipeline.py:

	```
	# self.annotator = Annotator(control_type, self.other_device)
	```
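
the same edit can be scripted. the sketch below matches the statement text instead of the line number, on the assumption that the line number may differ between versions of x-flux. `disable_annotator` is a hypothetical helper name, and sed keeps a backup of the original file with a `.bak` suffix.

```shell
# Comment out the Annotator construction in the given file.
# disable_annotator is a hypothetical helper, not part of x-flux.
disable_annotator() {
  target="$1"
  # keep the leading indentation, then prefix the statement with "# ";
  # a line that is already commented out no longer matches, so re-running is safe
  sed -i.bak \
    's/^\([[:space:]]*\)\(self\.annotator = Annotator(control_type, self\.other_device)\)/\1# \2/' \
    "$target"
}
```

from the x-flux directory this would be called as `disable_annotator src/flux/xflux_pipeline.py`. to undo the change, restore `src/flux/xflux_pipeline.py.bak`.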

	## training

	```
	# download the dataset from oxen
	oxen clone https://hub.oxen.ai/raulc/open_pose_controlnet_dataset
	# get the x-flux fork and switch to the openpose training branch
	git clone https://github.com/raulc0399/x-flux.git
	cd x-flux
	git checkout open_pose_training
	# set up the python environment
	python3 -m venv xflux_env
	source xflux_env/bin/activate
	pip install -r requirements.txt
	# log in to hugging face and configure accelerate
	huggingface-cli login
	accelerate config
	# copy the training images into the repo
	mkdir images
	rsync -r ~/open_pose_controlnet_dataset/train/images/ images/
	# create the training config and start the training
	cp train_configs/test_openpose_controlnet.yaml train_configs/openpose_controlnet.yaml
	accelerate launch train_flux_deepspeed_controlnet.py --config "train_configs/openpose_controlnet.yaml"
	```
	note 1: review train_configs/openpose_controlnet.yaml before starting the training

	note 2: use rsync to copy the images, cp does not work with that many files

	note 3: the oxen repo stores the captions as json files, which is the format the training script expects
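
before launching the training it can help to confirm that every image has a caption. the sketch below assumes that each caption is a `.json` file with the same base name as its image, in the same directory, and that images are `.jpg` or `.png`. check this layout against the dataset and the training script before relying on it. `missing_captions` is a hypothetical helper name.

```shell
# Print every image in a directory that has no same-named .json caption file.
# Assumption: captions sit next to the images and share their base name.
missing_captions() {
  dir="$1"
  # find is used instead of a shell glob so large directories are not a problem
  find "$dir" -maxdepth 1 -type f \( -name '*.jpg' -o -name '*.png' \) |
    while IFS= read -r img; do
      [ -f "${img%.*}.json" ] || printf '%s\n' "$img"
    done
}
```

for example, `missing_captions images` lists the images without a caption, and prints nothing when all captions are present.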

	## results

	using these 2 images:

	![control image 1](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/2_pose_1024.jpg "control image 1" )
	![control image 2](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/3_pose_1024.jpg "control image 2")

	with these prompts:

	"two friends sitting by each other enjoying a day at the park, full hd, cinematic"

	"person enjoying a day at the park, full hd, cinematic"

	resulted in these images:

	![result image 1](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/prev_result_1_100.png "result image 1" )
	![result image 2](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/prev_result_0_100.png "result image 2")


	## license

	Weights fall under the [FLUX.1 [dev] Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md).