pdsdpo committed (verified) · Commit ed34500 · Parent: c8d2764

Update README.md

Files changed (1): README.md (+25 −5)
@@ -8,18 +8,38 @@ pipeline_tag: image-text-to-text
 library_name: transformers
 ---
 
-# PDS-DPO-7B Model Card
+# PDS-DPO-7B-LoRA Model Card
 
 GitHub | arXiv
 
-PDS-DPO-7B is trained based on LLaVA 1.5 7B with a novel framework.
+PDS-DPO-7B is a vision-language model built upon LLaVA 1.5 7B and trained with the proposed Preference Data Synthetic Direct Preference Optimization (PDS-DPO) framework. This approach leverages synthetic data, generated by generative models and scored by reward models acting as proxies for human preferences, to improve alignment, reduce hallucinations, and enhance reasoning capabilities.
 
 ## Model Details
+- Model Name: PDS-DPO-7B-LoRA
+- Base Model: LLaVA 1.5 (Vicuna-7B)
+- Framework: Preference Data Synthetic Alignment using Direct Preference Optimization (PDS-DPO)
+- Dataset: 9K synthetic image-text pairs (positive and negative responses), generated with Stable Diffusion and LLaVA and scored by reward models such as ImageReward and Llama-3-8B-ArmoRM
+- Training Hardware: 2 × A100 GPUs
+- Training Optimization: LoRA fine-tuning
 
 ## Key Features
-- Features 1
-- Features 2
+- Synthetic Data Alignment
+  - Utilizes generative models for data creation and reward models for quality control, filtering the best images and responses to align with human preferences.
+- Improved Hallucination Control
+  - Achieves a significant reduction in hallucination rates on benchmarks such as Object HalBench, MMHal-Bench, and POPE.
+- Competitive Benchmark Performance
+  - Demonstrates strong results across vision-language tasks such as VQAv2, SQA, MM-Vet, and TextVQA.
 
 ## Examples
+<img src="./images-1.png" alt="fig-1" width="45%"/>
+<img src="./images-2.png" alt="fig-2" width="90%"/>
 
-## Citation
+## Citation
+```bibtex
+@article{2024pdsdpo,
+  title={Multimodal Preference Data Synthetic Alignment with Reward Model},
+  author={},
+  journal={},
+  year={}
+}
+```
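The README describes training on reward-model-scored preference pairs via Direct Preference Optimization. The standard DPO objective on one such pair can be sketched as below; this is a minimal illustration with toy log-probabilities, not the authors' implementation, and the `beta` value is an assumption.

```python
import math

def dpo_loss(pi_logp_w, pi_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one preference pair.

    pi_*  : log-probability of the chosen (w) / rejected (l) response
            under the policy being trained
    ref_* : the same log-probabilities under the frozen reference model
            (here that would be the base LLaVA 1.5 7B)
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l))
    # -log sigmoid(margin): shrinks as the policy learns the preference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A policy identical to the reference sits at -log(0.5); one that has
# shifted probability toward the chosen response incurs a lower loss.
untrained = dpo_loss(-12.0, -12.0, -12.0, -12.0)
aligned = dpo_loss(-10.0, -14.0, -12.0, -12.0)
```

In the PDS-DPO setting, the chosen/rejected pair for each synthetic image would come from the reward-model ranking described under Model Details.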