---
license: creativeml-openrail-m
base_model: "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
  - diffusers
  - full

inference: true
widget:
- text: 'A blonde sexy girl, wearing glasses at latex shirt and a blue beanie with a tattoo, blue and white, highly detailed, sublime, extremely beautiful, sharp focus, refined, cinematic, intricate, elegant, dynamic, rich deep colors, bright color, shining light, attractive, cute, pretty, background full, epic composition, dramatic atmosphere, radiant, professional, stunning'
  parameters:
    negative_prompt: 'blurry, cropped, ugly'
  output:
    url: ./assets/1.png
- text: 'a wizard with a glowing staff and a glowing hat, colorful magic, dramatic atmosphere, sharp focus, highly detailed, cinematic, original composition, fine detail, intricate, elegant, creative, color spread, shiny, amazing, symmetry, illuminated, inspired, pretty, attractive, artistic, dynamic background, relaxed, professional, extremely inspirational, beautiful, determined, cute, adorable, best'
  parameters:
    negative_prompt: 'blurry, cropped, ugly'
  output:
    url: ./assets/2.png
- text: 'girl in modern car, intricate, elegant, highly detailed, extremely complimentary colors, beautiful, glowing aesthetic, pretty, dramatic light, sharp focus, perfect composition, clear artistic color, calm professional background, precise, joyful, emotional, unique, cute, best, gorgeous, great delicate, expressive, thought, iconic, fine, awesome, creative, winning, charming, enhanced'
  parameters:
    negative_prompt: 'blurry, cropped, ugly'
  output:
    url: ./assets/3.png
- text: 'A girl stands amidst scattered glass shards, surrounded by a beautifully crafted and expansive world. The scene is depicted from a dynamic angle, emphasizing her determined expression. The background features vast landscapes with floating crystals and soft, glowing lights that create a mystical and grand atmosphere.'
  parameters:
    negative_prompt: 'blurry, cropped, ugly'
  output:
    url: ./assets/ComfyUI_PixArt_00036_.png
---

# pixart-training

This is a full-rank fine-tune of [PixArt-alpha/PixArt-Sigma-XL-2-1024-MS](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS).



No validation prompt was used during training.




## Validation settings
- CFG: `7.5`
- CFG Rescale: `0.0`
- Steps: `30`
- Sampler: `euler`
- Seed: `42`
- Resolution: `1024`

Note: The validation settings are not necessarily the same as the [training settings](#training-settings).

You can find some example images in the following gallery:


<Gallery />

The text encoder **was not** trained.
You may reuse the base model text encoder for inference.


## Training settings

- Training epochs: 5
- Training steps: 6500
- Learning rate: 8e-06
- Effective batch size: 128
  - Micro-batch size: 32
  - Gradient accumulation steps: 4
  - Number of GPUs: 1
- Prediction type: epsilon
- Rescaled betas zero SNR: False
- Optimizer: AdamW, stochastic bf16
- Precision: Pure BF16
- Xformers: Enabled
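
The batch-size figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it copies the values from this card (including the image count from the Datasets section) and assumes each image contributes exactly one sample per pass with no dropped batches, which may not match how the trainer counts epochs.

```python
# Values copied from the training settings and dataset summary in this card.
micro_batch_size = 32
gradient_accumulation_steps = 4
num_gpus = 1
total_images = 134144
training_steps = 6500

# Samples consumed per optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Optimizer steps for one pass over the dataset (ceiling division).
# Assumption: one sample per image and no dropped batches.
steps_per_pass = -(-total_images // effective_batch_size)

# Total samples seen over the whole run.
samples_seen = training_steps * effective_batch_size

print(effective_batch_size, steps_per_pass, samples_seen)
```

Under these assumptions the effective batch size is 128, matching the value listed above.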


## Datasets

### mj-v6
- Repeats: 0
- Total number of images: 134144
- Total number of aspect buckets: 1
- Resolution: 1.0 megapixels
- Cropped: False
- Crop style: None
- Crop aspect: None


## Inference


```python
import torch
from diffusers import DiffusionPipeline

# Hub repo ID or local path to this checkpoint.
model_id = "pixart-training"
prompt = "An astronaut is riding a horse through the jungles of Thailand."
negative_prompt = "blurry, cropped, ugly"

# Pick the best available device once and reuse it below.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)

# Seed the generator on the same device so results are reproducible.
generator = torch.Generator(device=device).manual_seed(1641421826)

image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=generator,
    width=1152,
    height=768,
    guidance_scale=7.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")
```