---
license: apache-2.0
language:
- ja
- en
pipeline_tag: text-to-image
library_name: diffusers
tags:
- art
datasets:
- common-canvas/commoncatalog-cc-by
- madebyollin/megalith-10m
- madebyollin/soa-full
- alfredplpl/artbench-pd-256x256
---

# Model Card for CommonArt

This is a text-to-image model trained on CC-BY-4.0, CC0, or CC0-like images.

## Model Details

### Model Description

At AI Picasso, we develop AI technology through active dialogue with creators, aiming for mutual understanding and cooperation.
We strive to solve the challenges creators face and to grow together with them.
One of these challenges is that some creators and fans who want to use image generation cannot, most likely because permission to use certain images for training was never obtained.
To address this issue, we have developed CommonArt β. As it is still in beta, its capabilities are limited.
However, its architecture is expected to be the same as that of the final version.

#### Features of CommonArt β

- Principally trained on images for which permission for training has been obtained
- Understands both Japanese and English text inputs directly
- Minimizes the risk of exact reproduction of training images
- Utilizes cutting-edge technology for high quality and efficiency

### Misc.

- **Developed by:** alfredplpl
- **Funded by:** AI Picasso, Inc.
- **Shared by:** AI Picasso, Inc.
- **Model type:** Diffusion Transformer-based architecture
- **Language(s) (NLP):** Japanese, English
- **License:** Apache-2.0

### Model Sources

- **Repository:** [GitHub](https://github.com/PixArt-alpha/PixArt-sigma)
- **Paper:** [PIXART-δ](https://arxiv.org/abs/2401.05252)

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Out-of-Scope Use

- Generating misinformation such as deepfakes.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
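
Until an official snippet is added above, here is a minimal sketch, assuming the released weights load with diffusers' `PixArtSigmaPipeline`; the repository id below is a placeholder, not the published one.

```python
import torch
from diffusers import PixArtSigmaPipeline

# Placeholder repository id; replace it with the actual CommonArt β checkpoint.
repo_id = "aipicasso/commonart-beta"

pipe = PixArtSigmaPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
pipe.to("cuda")

# Japanese and English prompts are both accepted directly.
prompt = "夕焼けの海辺を歩く猫の油絵"  # "an oil painting of a cat walking along the seashore at sunset"
image = pipe(prompt=prompt, num_inference_steps=20).images[0]
image.save("commonart_sample.png")
```

Resolution, guidance scale, and other generation arguments follow the usual diffusers pipeline interface.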
## Training Details

### Training Data
We used these datasets to train the diffusion transformer:

- [CommonCatalog-cc-by](https://huggingface.co/datasets/common-canvas/commoncatalog-cc-by)
- [Megalith-10M](https://huggingface.co/datasets/madebyollin/megalith-10m)
- [Smithsonian Open Access](https://huggingface.co/datasets/madebyollin/soa-full)
- [ArtBench (CC-0 only)](https://huggingface.co/datasets/alfredplpl/artbench-pd-256x256)

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700); a rough worked example follows the list below.

- **Hardware Type:** NVIDIA L4
- **Hours used:** 20,000
- **Cloud Provider:** Google Cloud
- **Compute Region:** Japan
- **Carbon Emitted:** free
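
For reference, a back-of-the-envelope estimate in the spirit of the calculator above can be computed as follows; the GPU board power, data-center PUE, and grid carbon intensity are illustrative assumptions, not figures from this card.

```python
# Rough CO2e estimate in the style of the ML Impact calculator (Lacoste et al., 2019).
# All constants other than GPU_HOURS are illustrative assumptions.

GPU_HOURS = 20_000          # NVIDIA L4 hours, from this card
GPU_POWER_KW = 0.072        # assumed ~72 W max board power per L4
PUE = 1.1                   # assumed data-center power usage effectiveness
GRID_KGCO2_PER_KWH = 0.45   # assumed grid carbon intensity (kg CO2e per kWh)

energy_kwh = GPU_HOURS * GPU_POWER_KW * PUE
emissions_kg = energy_kwh * GRID_KGCO2_PER_KWH

print(f"Estimated energy: {energy_kwh:,.0f} kWh")        # ~1,600 kWh under these assumptions
print(f"Estimated gross emissions: {emissions_kg:,.0f} kg CO2e")
```

The figure above is listed as "free"; the sketch shows what a gross estimate looks like before any carbon-free energy accounting by the cloud provider.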
## Technical Specifications

### Model Architecture and Objective

[PixArt-Σ-based architecture](https://github.com/PixArt-alpha/PixArt-sigma)
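
As a quick way to see what the PixArt-Σ-style architecture amounts to in the released checkpoint (denoising transformer, text encoder, VAE, and scheduler), the components can be listed with diffusers; this is a sketch assuming the checkpoint follows the standard diffusers layout, and the repository id is again a placeholder.

```python
from diffusers import DiffusionPipeline

# Placeholder repository id; replace it with the actual CommonArt β checkpoint.
pipe = DiffusionPipeline.from_pretrained("aipicasso/commonart-beta")

# Print each pipeline component and its class, e.g. the transformer used as the denoiser.
for name, component in pipe.components.items():
    print(f"{name}: {type(component).__name__}")
```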
### Compute Infrastructure

Google Cloud (Tokyo Region).

#### Hardware

We used 4 nodes of NVIDIA L4 x8 instances (32 L4 GPUs in total).

#### Software

[PixArt-Σ-based code](https://github.com/PixArt-alpha/PixArt-sigma)

## Model Card Contact

[AI Picasso, Inc.]([email protected])