aipicasso/emi-3

alfredplpl committed on Dec 5, 2024

Commit 93afff2

1 Parent(s): 36b27f9

Update README.md

README.md +24 -44

@@ -13,7 +13,7 @@ license_link: LICENSE.md
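 # Introduction
 Emi 3 (Ethereal master of illustration 3) is
 an image generation AI specialized in AI art, developed by AI Picasso Inc.
-A notable feature of this model is that it was not trained on images reposted without permission, such as those found on Danbooru.
 # Usage
 You can try the demo [here](https://huggingface.co/spaces/aipicasso/emi-3).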
@@ -38,20 +38,10 @@ [email protected]
 ## Model details
 - **Model type:** Diffusion-based text-to-image generation model
 - **Language:** Japanese
-- **License:** [CreativeML Open RAIL++-M License](LICENSE.md)
-- **Model description:** This model generates images that match a given prompt. The underlying algorithms are the [Latent Diffusion Model](https://arxiv.org/abs/2307.01952), [OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip), and [CLIP-L](https://github.com/openai/CLIP).
 - **Notes:**
-- **References:**
-```bibtex
-@misc{podell2023sdxl,
-      title={SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis},
-      author={Dustin Podell and Zion English and Kyle Lacey and Andreas Blattmann and Tim Dockhorn and Jonas Müller and Joe Penna and Robin Rombach},
-      year={2023},
-      eprint={2307.01952},
-      archivePrefix={arXiv},
-      primaryClass={cs.CV}
-}
-```
 ## Model usage examples
@@ -72,28 +62,27 @@ As with Stable Diffusion XL 1.0, the safetensors-format model
 First, run the following script to install the libraries.
 ```bash
-pip install invisible_watermark transformers accelerate safetensors　diffusers
 ```
 Next, run the following script to generate an image.
-```python
-from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
 import torch
-model_id = "aipicasso/emi-2-5"
-scheduler = EulerAncestralDiscreteScheduler.from_pretrained(model_id,subfolder="scheduler")
-pipe = StableDiffusionXLPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.bfloat16)
 pipe = pipe.to("cuda")
-prompt = "1girl, upper body, brown bob short hair, brown eyes, looking at viewer, cherry blossom"
-images = pipe(prompt, num_inference_steps=20).images
-images[0].save("girl.png")
 ```
-For more complex operations, refer to the [demo source code](https://huggingface.co/spaces/aipicasso/emi-2-demo/blob/main/app.py).
 #### Intended uses
 - Drawing assistance for illustrations, manga, and anime
@@ -143,8 +132,8 @@ images[0].save("girl.png")
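 **Training data**
-- About 3,000 images collected manually from a dataset similar to Stable Diffusion's, after removing unauthorized reposts from Danbooru
-- About 500,000 images collected automatically from a dataset similar to Stable Diffusion's, after removing unauthorized reposts from Danbooru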
 - [CosmicMan-SDXL](https://huggingface.co/cosmicman/CosmicMan-SDXL)
 **Training process**
@@ -164,22 +153,13 @@ images[0].save("girl.png")
 ## References
 ```bibtex
-@misc{podell2023sdxl,
-      title={SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis},
-      author={Dustin Podell and Zion English and Kyle Lacey and Andreas Blattmann and Tim Dockhorn and Jonas Müller and Joe Penna and Robin Rombach},
-      year={2023},
-      eprint={2307.01952},
       archivePrefix={arXiv},
-      primaryClass={cs.CV}
 }
 ```
-```bibtex
-@article{li2024cosmicman,
-  title={CosmicMan: A Text-to-Image Foundation Model for Humans},
-  author={Li, Shikai and Fu, Jianglin and Liu, Kaiyuan and Wang, Wentao and Lin, Kwan-Yee and Wu, Wayne},
-  journal={arXiv preprint arXiv:2404.01294},
-  year={2024}
-}
-```

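 # Introduction
 Emi 3 (Ethereal master of illustration 3) is
 an image generation AI specialized in AI art, developed by AI Picasso Inc.
+A notable feature of this model is that it was not additionally trained on images reposted without permission, such as those found on Danbooru.
 # Usage
 You can try the demo [here](https://huggingface.co/spaces/aipicasso/emi-3).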
 ## Model details
 - **Model type:** Diffusion-based text-to-image generation model
 - **Language:** Japanese
+- **License:** [Stability AI Community License](LICENSE.md)
+- **Model description:** This model generates images that match a given prompt. The underlying algorithms are the [Rectified Flow Transformer](https://stability.ai/news/stable-diffusion-3-research-paper), [OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip), [CLIP-L](https://github.com/openai/CLIP), and [T5](https://arxiv.org/abs/1910.10683).
 - **Notes:**
 ## Model usage examples
 First, run the following script to install the libraries.
 ```bash
+pip install -U diffusers
 ```
 Next, run the following script to generate an image.
+```py
 import torch
+from diffusers import StableDiffusion3Pipeline
+pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)
 pipe = pipe.to("cuda")
+image = pipe(
+    "A capybara holding a sign that reads Hello World",
+    num_inference_steps=28,
+    guidance_scale=3.5,
+).images[0]
+image.save("capybara.png")
 ```
+For more complex operations, refer to the [demo source code](https://huggingface.co/spaces/aipicasso/emi-3/blob/main/app.py).
 #### Intended uses
 - Drawing assistance for illustrations, manga, and anime
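
The snippet added in this commit loads `stabilityai/stable-diffusion-3.5-large`. A minimal sketch of the same call pointed at this repository follows, assuming the Emi 3 weights published under `aipicasso/emi-3` load through `StableDiffusion3Pipeline` unchanged. The use of that repository ID as a `from_pretrained` target, the helper names `build_generation_kwargs` and `generate`, and the output file name are illustrative assumptions, not part of the README. The prompt is the one used in the previous README version.

```python
# Assumption: the Emi 3 weights are loadable from this repository ID.
MODEL_ID = "aipicasso/emi-3"


def build_generation_kwargs(prompt, steps=28, guidance_scale=3.5):
    """Collect the call arguments; defaults mirror the README snippet."""
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate(prompt, out_path="emi3.png"):
    """Load the pipeline on a CUDA device, generate one image, and save it."""
    # Imported lazily so the helper above can be used without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    pipe = pipe.to("cuda")
    image = pipe(**build_generation_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

On a machine with a CUDA GPU and `diffusers` installed, calling `generate("1girl, upper body, brown bob short hair, brown eyes, looking at viewer, cherry blossom")` would write the result to `emi3.png`, provided the assumption about the repository ID holds.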
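 **Training data**
+- About 3,000 images collected manually from a dataset similar to that of Stable Diffusion 3.5 Large, after removing unauthorized reposts from Danbooru
+- About 500,000 images collected automatically from a dataset similar to that of Stable Diffusion 3.5 Large, after removing unauthorized reposts from Danbooru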
 - [CosmicMan-SDXL](https://huggingface.co/cosmicman/CosmicMan-SDXL)
 **Training process**
 ## References
 ```bibtex
+@misc{esser2024scalingrectifiedflowtransformers,
+      title={Scaling Rectified Flow Transformers for High-Resolution Image Synthesis},
+      author={Patrick Esser and Sumith Kulal and Andreas Blattmann and Rahim Entezari and Jonas Müller and Harry Saini and Yam Levi and Dominik Lorenz and Axel Sauer and Frederic Boesel and Dustin Podell and Tim Dockhorn and Zion English and Kyle Lacey and Alex Goodwin and Yannik Marek and Robin Rombach},
+      year={2024},
+      eprint={2403.03206},
       archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2403.03206},
 }
 ```