fittar committed on
Commit
75ee371
1 Parent(s): eee3e8c

Update README.md

  1. README.md +20 -8
README.md CHANGED
@@ -9,7 +9,9 @@ license: mit
9
 
10
  <!-- Provide a quick summary of what the model is/does. -->
11
 
12
- ViPE: Visualize Pretty-much Everything, is the first automated model for translating any arbitrary piece of text into a visualizable prompt. It helps any text-to-image model in figurative or non-lexical language visualizations.
 
 
13
 
14
  ### Model Description
15
 
@@ -101,18 +103,28 @@ However, a semicolon draws a stronger boundary between the keywords and encourag
101
  ### Training Data
102
 
103
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
104
-
105
- [More Information Needed]
106
-
107
  ### Training Procedure
108
 
 
 
 
109
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
110
 
111
-
112
  ## Evaluation
113
-
114
- <!-- This section describes the evaluation protocols and provides the results. -->
115
-
 
 
 
 
 
 
 
 
 
116
 
117
  ## Citation
118
 
 
9
 
10
  <!-- Provide a quick summary of what the model is/does. -->
11
 
12
+ ViPE (Visualize Pretty-much Everything) is the first automated model for translating an arbitrary piece of text into a visualizable prompt.
13
+ It helps any text-to-image model visualize figurative or non-lexical language. It has been shown to be more robust than GPT-3.5 Turbo (ChatGPT)
14
+ in generating depictable and semantically meaningful prompts.
15
 
16
  ### Model Description
17
 
 
103
  ### Training Data
104
 
105
  <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
106
+ - LyricCanvas dataset: a synthetically generated dataset that will be published soon
107
+
 
108
  ### Training Procedure
109
 
110
+ ViPE has been trained in the standard auto-regressive procedure: given a line (or lines) of lyrics as a prefix, the objective is to generate a plausible
111
+ prompt that is both depictable and semantically related to the given lyric(s). The loss function does not include the tokens corresponding to the lyrics, so ViPE
112
+ never generates any original lyrics and only learns to generate visually related prompts.
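
The prefix masking described above can be illustrated with a minimal sketch. This is not the authors' training code: the function names, the `-100` sentinel, and the toy token IDs are assumptions chosen for illustration. It shows how the lyric tokens can be excluded from the loss so that only the prompt tokens are learned.

```python
import math

# Assumed sentinel for "ignore this position"; PyTorch's CrossEntropyLoss
# uses the same value as its default ignore_index.
IGNORE_INDEX = -100


def build_labels(lyric_ids, prompt_ids):
    """Concatenate prefix (lyrics) and target (prompt); mask the prefix labels."""
    input_ids = list(lyric_ids) + list(prompt_ids)
    labels = [IGNORE_INDEX] * len(lyric_ids) + list(prompt_ids)
    return input_ids, labels


def masked_nll(logits, labels):
    """Mean negative log-likelihood over unmasked positions only.

    logits[t] is the vocabulary score vector used to predict labels[t];
    the one-position shift of causal language models is assumed to be
    applied already.
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # lyric tokens contribute nothing to the loss
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[label]
        count += 1
    return total / count if count else 0.0


# Hypothetical token IDs: three lyric tokens followed by two prompt tokens.
input_ids, labels = build_labels([5, 6, 7], [1, 2])
uniform_logits = [[0.0] * 8 for _ in input_ids]
loss = masked_nll(uniform_logits, labels)
```

Because the lyric positions are skipped, changing the model's scores at those positions leaves the loss unchanged, which is why the model is never trained to reproduce the lyrics themselves.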
113
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
114
 
 
115
  ## Evaluation
116
+ In all of the following evaluations, ViPE consistently demonstrates its robustness compared to ChatGPT and achieves performance that is competitive with that of human experts.
117
+
118
+ - ***Intrinsic evaluations***
119
+ - General understanding of figurative language using [Fig-QA dataset](https://huggingface.co/datasets/nightingal3/fig-qa)
120
+ - ***Extrinsic evaluations***
121
+ - Image-text Retrieval on the [HAIVMet dataset](https://aclanthology.org/2023.findings-acl.465.pdf)
122
+ - Emotion visualizations: how well does ViPE translate emotionally charged tweets into a depictable description of a scene in comparison with
123
+ ChatGPT? The [Emotion dataset](https://huggingface.co/datasets/dair-ai/emotion) is used.
124
+ - ***Human evaluations***
125
+ - We conducted a user study involving 30 native English-speaking participants aged between 20 and 40. Participants were
126
+ presented with 3 images and a metaphor from the HAIVMet dataset. They were asked to select the image that best matches the metaphor.
127
+ The images were generated using prompts from ViPE, ChatGPT, and human experts (HAIVMet).
128
 
129
  ## Citation
130