justinthelaw/LaMini-Flan-T5-783M-Opera-Fine-Tune


Justin Law committed on Aug 27, 2023

Commit b5dbfe2 • 1 Parent(s): 07e70bb

Build(Release): v0.1.0 Opera Bullet Interpreter Model


Files changed (7)

README.md +238 -3
config.json +32 -0
generation_config.json +7 -0
pytorch_model.bin +3 -0
special_tokens_map.json +107 -0
spiece.model +3 -0
tokenizer_config.json +114 -0

README.md CHANGED

@@ -1,3 +1,238 @@
----
-license: apache-2.0
----

+# Model Card for Opera Bullet Interpreter
+An unofficial United States Air Force and Space Force performance statement "translation" model. It takes a properly formatted performance statement, also known as a "bullet," as input and outputs a long-form, plain English sentence describing the accomplishments captured within the bullet.
+This checkpoint is a fine-tuned version of the LaMini-Flan-T5-783M, using the justinthelaw/opera-bullet-completions (private) dataset.
+To learn more about this project, please visit the [Opera GitHub Repository](https://github.com/justinthelaw/opera).
+# Table of Contents
+- [Model Card for Opera Bullet Interpreter](#model-card-for-opera-bullet-interpreter)
+- [Table of Contents](#table-of-contents)
+- [Model Details](#model-details)
+- [Uses](#uses)
+- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
+- [Training Details](#training-details)
+- [Evaluation](#evaluation)
+- [Model Examination](#model-examination)
+- [Environmental Impact](#environmental-impact)
+- [Technical Specifications](#technical-specifications)
+- [Citation](#citation)
+- [Model Card Authors](#model-card-authors)
+- [Model Card Contact](#model-card-contact)
+- [How to Get Started with the Model](#how-to-get-started-with-the-model)
+# Model Details
+## Model Description
+An unofficial United States Air Force and Space Force performance statement "translation" model. It takes a properly formatted performance statement, also known as a "bullet," as input and outputs a long-form, plain English sentence describing the accomplishments captured within the bullet.
+This is a fine-tuned version of the LaMini-Flan-T5-783M, using the justinthelaw/opera-bullet-completions (private) dataset.
+- **Developed by:** Justin Law, Alden Davidson, Christopher Kodama, My Tran
+- **Model type:** Language Model
+- **Language(s) (NLP):** en
+- **License:** apache-2.0
+- **Parent Model:** [LaMini-Flan-T5-783M](https://huggingface.co/MBZUAI/LaMini-Flan-T5-783M)
+- **Resources for more information:**
+  - [GitHub Repo](https://github.com/justinthelaw/opera)
+  - [Associated Paper](https://arxiv.org/abs/2304.14402)
+# Uses
+## Direct Use
+Used to programmatically produce training data for Opera's Bullet Forge (see GitHub repository for details).
+## Downstream Use [Optional]
+Used to quickly interpret bullets written by Airmen (Air Force) or Guardians (Space Force) into long-form, plain English sentences.
+## Out-of-Scope Use
+This model is not intended for generating bullets from long-form, plain English sentences, or for general NLP tasks.
+# Bias, Risks, and Limitations
+Specialized acronyms or abbreviations specific to small units may not be expanded properly. Bullets in highly non-standard formats may produce lower-quality output.
+## Recommendations
+Look up acronyms to ensure the correct narrative is being formed. Spot-check bullets with more complex acronyms and abbreviations for narrative precision.
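As one way to support the acronym look-up recommended above, the sketch below flags candidate acronyms in a bullet so they can be checked by hand. It is not part of the Opera code base: the `find_acronyms` helper, the regular expression, and the example bullet are all hypothetical, and the "two or more capital letters" rule is only a heuristic.

```python
import re

# Heuristic (an assumption, not an official rule): treat any run of two or
# more consecutive capital letters as a candidate acronym, e.g. "NCO".
ACRONYM_PATTERN = re.compile(r"\b[A-Z]{2,}\b")


def find_acronyms(bullet):
    """Return candidate acronyms in order of first appearance, without repeats."""
    seen = []
    for match in ACRONYM_PATTERN.findall(bullet):
        if match not in seen:
            seen.append(match)
    return seen


# Made-up bullet used purely for illustration
example_bullet = "- Led NCO team at AFB; fixed 12 USAF radios--saved USAF $1.2M"
print(find_acronyms(example_bullet))
```

Each flagged term can then be compared against the model's expansion before the narrative is trusted.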
+# Training Details
+## Training Data
+The model was fine-tuned on the justinthelaw/opera-bullet-completions dataset, part of which is available in the GitHub repository.
+## Training Procedure
+### Preprocessing
+The justinthelaw/opera-bullet-completions dataset was created using a custom Python web-scraper, along with some custom cleaning functions, all of which can be found at the GitHub repository.
+### Speeds, Sizes, Times
+It takes approximately 3-5 seconds per inference when using any standard-sized Air and Space Force bullet statement.
+# Evaluation
+## Testing Data, Factors & Metrics
+### Testing Data
+20% of the justinthelaw/opera-bullet-completions dataset was used to validate the model's performance.
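The card does not say how the 20% validation portion was selected. As a minimal sketch, assuming a seeded random shuffle, a hold-out split of that size could look like the following; `train_validation_split`, the seed, and the toy data are illustrative names and values, not taken from the project.

```python
import random


def train_validation_split(examples, validation_fraction=0.2, seed=42):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    # A dedicated Random instance keeps the split reproducible for a given seed
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    # First slice is held out for validation, the rest is used for training
    return shuffled[n_validation:], shuffled[:n_validation]


# Toy stand-in for (bullet, completion) pairs
pairs = [(f"bullet {i}", f"completion {i}") for i in range(10)]
train, validation = train_validation_split(pairs)
print(len(train), len(validation))
```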
+### Factors
+Repetition, contextual loss, and bullet format are all loss factors tied into the backward propagation calculations and validation steps.
+### Metrics
+ROUGE scores were computed and averaged. These may be provided in future iterations of this model's development.
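For readers unfamiliar with the metric, here is a simplified sketch of two common variants, ROUGE-1 and ROUGE-L, reported as F1 and averaged over a set of pairs. It is not the evaluation code used for this model. It assumes lowercased whitespace tokenization with no stemming, so its numbers will differ from those of standard ROUGE packages, and all function names are illustrative.

```python
from collections import Counter


def _tokens(text):
    # Simplifying assumption: lowercase and split on whitespace, no stemming
    return text.lower().split()


def _f1(overlap, candidate_len, reference_len):
    if overlap == 0:
        return 0.0
    precision = overlap / candidate_len
    recall = overlap / reference_len
    return 2 * precision * recall / (precision + recall)


def rouge_1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference and a candidate sentence."""
    ref, cand = Counter(_tokens(reference)), Counter(_tokens(candidate))
    overlap = sum((ref & cand).values())  # clipped unigram matches
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def _lcs_length(a, b):
    # Dynamic programming over two rows for the longest common subsequence
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Longest-common-subsequence F1 between a reference and a candidate."""
    ref, cand = _tokens(reference), _tokens(candidate)
    return _f1(_lcs_length(ref, cand), len(cand), len(ref))


def average(scores):
    return sum(scores) / len(scores) if scores else 0.0


# Made-up reference/candidate pairs for illustration only
pairs = [
    ("the team repaired twelve radios", "the team repaired radios"),
    ("the unit saved money", "the unit saved money"),
]
print(average([rouge_1_f1(r, c) for r, c in pairs]))
print(average([rouge_l_f1(r, c) for r, c in pairs]))
```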
+## Results
+# Model Examination
+More information needed
+# Environmental Impact
+- **Hardware Type:** 2.6 GHz 6-Core Intel Core i7, 16 GB 2667 MHz DDR4, AMD Radeon Pro 5300M 4 GB
+- **Hours used:** 18
+- **Cloud Provider:** N/A
+- **Compute Region:** N/A
+- **Carbon Emitted:** N/A
+# Technical Specifications
+### Hardware
+2.6 GHz 6-Core Intel Core i7, 16 GB 2667 MHz DDR4, AMD Radeon Pro 5300M 4 GB
+### Software
+VSCode, Jupyter Notebook, Python3, PyTorch, Transformers, Pandas, Asyncio, Loguru, Rich
+# Citation
+**BibTeX:**
+@article{lamini-lm,
+author = {Minghao Wu and
+Abdul Waheed and
+Chiyu Zhang and
+Muhammad Abdul-Mageed and
+Alham Fikri Aji
+},
+title = {LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
+journal = {CoRR},
+volume = {abs/2304.14402},
+year = {2023},
+url = {https://arxiv.org/abs/2304.14402},
+eprinttype = {arXiv},
+eprint = {2304.14402}
+}
+# Model Card Authors
+Justin Law, Alden Davidson, Christopher Kodama, My Tran
+# Model Card Contact
+Email: [email protected]
+# How to Get Started with the Model
+Use the code below to get started with the model.
+<details>
+<summary> Click to expand </summary>
+```python
+import torch
+from transformers import T5ForConditionalGeneration, T5Tokenizer
+bullet_data_creation_prefix = (
+    "Using upwards of 3 sentences, expand upon the following Air and Space Force bullet statement by "
+    + "spelling-out acronyms and adding additional context that is not already included in the Air and Space Force bullet statement: "
+)
+# Path of the pre-trained model that will be used
+model_path = "justinthelaw/opera-bullet-interpreter"
+# Path of the pre-trained model tokenizer that will be used
+# Must match the model checkpoint's signature
+tokenizer_path = "justinthelaw/opera-bullet-interpreter"
+# Max length of tokens a user may enter for summarization
+# Increasing this beyond 512 may increase compute time significantly
+max_input_token_length = 512
+# Max length of tokens the model should output for the summary
+# Approximately the number of tokens it may take to generate a bullet
+max_output_token_length = 512
+# Beams to use for beam search algorithm
+# Increased beams means increased quality, but increased compute time
+number_of_beams = 6
+# Scales logits before soft-max to control randomness
+# Lower values (~0) make output more deterministic
+temperature = 0.5
+# Limits generated tokens to top K probabilities
+# Reduces chances of rare word predictions
+top_k = 50
+# Applies nucleus sampling, limiting token selection to a cumulative probability
+# Creates a balance between randomness and determinism
+top_p = 0.90
+try:
+    tokenizer = T5Tokenizer.from_pretrained(
+        tokenizer_path,
+        model_max_length=max_input_token_length,
+        add_special_tokens=False,
+    )
+    input_model = T5ForConditionalGeneration.from_pretrained(model_path)
+    print(f"Loaded {model_path}")
+    # Set device to be used based on GPU availability
+    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+    # Model is sent to device for use
+    model = input_model.to(device)  # type: ignore
+    input_text = bullet_data_creation_prefix + input("Input a US Air or Space Force bullet: ")
+    # Tokenize the prompt and move the tensors to the same device as the model
+    encoded_input_text = tokenizer(
+        input_text,
+        return_tensors="pt",
+        truncation=True,
+        max_length=max_input_token_length,
+    ).to(device)
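+    # Generate the interpretation with beam search
+    # Note: temperature, top_k, and top_p only take effect if do_sample=True is also passed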
+    summary_ids = model.generate(
+        encoded_input_text["input_ids"],
+        attention_mask=encoded_input_text["attention_mask"],
+        max_length=max_output_token_length,
+        num_beams=number_of_beams,
+        temperature=temperature,
+        top_k=top_k,
+        top_p=top_p,
+        early_stopping=True,
+    )
+    output_text = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
+    # Print the full prompt followed by the generated interpretation
+    print(input_text + "\n\t" + output_text)
+except KeyboardInterrupt:
+    print("Received interrupt, stopping script...")
+except Exception as e:
+    print(f"An error occurred during generation: {e}")
+```
+</details>

config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "_name_or_path": "justinthelaw/opera-bullet-interpreter",
+  "architectures": [
+    "T5ForConditionalGeneration"
+  ],
+  "d_ff": 2816,
+  "d_kv": 64,
+  "d_model": 1024,
+  "decoder_start_token_id": 0,
+  "dense_act_fn": "gelu_new",
+  "dropout_rate": 0.1,
+  "eos_token_id": 1,
+  "feed_forward_proj": "gated-gelu",
+  "initializer_factor": 1.0,
+  "is_encoder_decoder": true,
+  "is_gated_act": true,
+  "layer_norm_epsilon": 1e-06,
+  "model_type": "t5",
+  "n_positions": 512,
+  "num_decoder_layers": 24,
+  "num_heads": 16,
+  "num_layers": 24,
+  "output_past": true,
+  "pad_token_id": 0,
+  "relative_attention_max_distance": 128,
+  "relative_attention_num_buckets": 32,
+  "tie_word_embeddings": false,
+  "torch_dtype": "float32",
+  "transformers_version": "4.31.0",
+  "use_cache": true,
+  "vocab_size": 32128
+}
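As a rough cross-check of the "783M" in the parent model's name, the parameter count can be estimated from the values in this config.json. The sketch below assumes the usual T5 layout as implemented in Transformers: bias-free attention and feed-forward projections, scale-only layer norms, one relative attention bias table each for the encoder and decoder, and a separate output head because `tie_word_embeddings` is false. The `count_t5_parameters` helper is illustrative and not part of this repository.

```python
# Values copied from the config.json added in this commit
config = {
    "d_model": 1024,
    "d_kv": 64,
    "d_ff": 2816,
    "num_heads": 16,
    "num_layers": 24,
    "num_decoder_layers": 24,
    "vocab_size": 32128,
    "relative_attention_num_buckets": 32,
    "is_gated_act": True,
    "tie_word_embeddings": False,
}


def count_t5_parameters(cfg):
    """Estimate the parameter count under the layout assumptions stated above."""
    d_model = cfg["d_model"]
    inner = cfg["num_heads"] * cfg["d_kv"]
    attention = 4 * d_model * inner  # q, k, v, o projections, no biases
    ff_matrices = 3 if cfg["is_gated_act"] else 2  # gated FF has two input projections
    feed_forward = ff_matrices * d_model * cfg["d_ff"]
    layer_norm = d_model  # scale vector only

    # Encoder layer: self-attention + feed-forward, one layer norm per sublayer
    encoder_layer = attention + feed_forward + 2 * layer_norm
    # Decoder layer: self-attention + cross-attention + feed-forward
    decoder_layer = 2 * attention + feed_forward + 3 * layer_norm
    relative_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]

    # Each stack also has one relative bias table and a final layer norm
    encoder = cfg["num_layers"] * encoder_layer + relative_bias + layer_norm
    decoder = cfg["num_decoder_layers"] * decoder_layer + relative_bias + layer_norm

    embeddings = cfg["vocab_size"] * d_model
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * d_model
    return encoder + decoder + embeddings + lm_head


total = count_t5_parameters(config)
print(f"{total:,} parameters, about {total * 4 / 1e9:.2f} GB in float32")
```

At 4 bytes per float32 parameter, this estimate lands close to the 3,132,785,797-byte size recorded for pytorch_model.bin in this commit; the small remainder is presumably serialization overhead.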

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "_from_model_config": true,
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.31.0"
+}

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ebd0339b927e3c64245694e8916fc13f919008f48a61f9a942bbcbd47d1c08e7
+size 3132785797

special_tokens_map.json ADDED

	@@ -0,0 +1,107 @@

+{
+  "additional_special_tokens": [
+    "<extra_id_0>",
+    "<extra_id_1>",
+    "<extra_id_2>",
+    "<extra_id_3>",
+    "<extra_id_4>",
+    "<extra_id_5>",
+    "<extra_id_6>",
+    "<extra_id_7>",
+    "<extra_id_8>",
+    "<extra_id_9>",
+    "<extra_id_10>",
+    "<extra_id_11>",
+    "<extra_id_12>",
+    "<extra_id_13>",
+    "<extra_id_14>",
+    "<extra_id_15>",
+    "<extra_id_16>",
+    "<extra_id_17>",
+    "<extra_id_18>",
+    "<extra_id_19>",
+    "<extra_id_20>",
+    "<extra_id_21>",
+    "<extra_id_22>",
+    "<extra_id_23>",
+    "<extra_id_24>",
+    "<extra_id_25>",
+    "<extra_id_26>",
+    "<extra_id_27>",
+    "<extra_id_28>",
+    "<extra_id_29>",
+    "<extra_id_30>",
+    "<extra_id_31>",
+    "<extra_id_32>",
+    "<extra_id_33>",
+    "<extra_id_34>",
+    "<extra_id_35>",
+    "<extra_id_36>",
+    "<extra_id_37>",
+    "<extra_id_38>",
+    "<extra_id_39>",
+    "<extra_id_40>",
+    "<extra_id_41>",
+    "<extra_id_42>",
+    "<extra_id_43>",
+    "<extra_id_44>",
+    "<extra_id_45>",
+    "<extra_id_46>",
+    "<extra_id_47>",
+    "<extra_id_48>",
+    "<extra_id_49>",
+    "<extra_id_50>",
+    "<extra_id_51>",
+    "<extra_id_52>",
+    "<extra_id_53>",
+    "<extra_id_54>",
+    "<extra_id_55>",
+    "<extra_id_56>",
+    "<extra_id_57>",
+    "<extra_id_58>",
+    "<extra_id_59>",
+    "<extra_id_60>",
+    "<extra_id_61>",
+    "<extra_id_62>",
+    "<extra_id_63>",
+    "<extra_id_64>",
+    "<extra_id_65>",
+    "<extra_id_66>",
+    "<extra_id_67>",
+    "<extra_id_68>",
+    "<extra_id_69>",
+    "<extra_id_70>",
+    "<extra_id_71>",
+    "<extra_id_72>",
+    "<extra_id_73>",
+    "<extra_id_74>",
+    "<extra_id_75>",
+    "<extra_id_76>",
+    "<extra_id_77>",
+    "<extra_id_78>",
+    "<extra_id_79>",
+    "<extra_id_80>",
+    "<extra_id_81>",
+    "<extra_id_82>",
+    "<extra_id_83>",
+    "<extra_id_84>",
+    "<extra_id_85>",
+    "<extra_id_86>",
+    "<extra_id_87>",
+    "<extra_id_88>",
+    "<extra_id_89>",
+    "<extra_id_90>",
+    "<extra_id_91>",
+    "<extra_id_92>",
+    "<extra_id_93>",
+    "<extra_id_94>",
+    "<extra_id_95>",
+    "<extra_id_96>",
+    "<extra_id_97>",
+    "<extra_id_98>",
+    "<extra_id_99>"
+  ],
+  "eos_token": "</s>",
+  "pad_token": "<pad>",
+  "unk_token": "<unk>"
+}

spiece.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d60acb128cf7b7f2536e8f38a5b18a05535c9e14c7a355904270e15b0945ea86
+size 791656

tokenizer_config.json ADDED

	@@ -0,0 +1,114 @@

+{
+  "add_special_tokens": false,
+  "additional_special_tokens": [
+    "<extra_id_0>",
+    "<extra_id_1>",
+    "<extra_id_2>",
+    "<extra_id_3>",
+    "<extra_id_4>",
+    "<extra_id_5>",
+    "<extra_id_6>",
+    "<extra_id_7>",
+    "<extra_id_8>",
+    "<extra_id_9>",
+    "<extra_id_10>",
+    "<extra_id_11>",
+    "<extra_id_12>",
+    "<extra_id_13>",
+    "<extra_id_14>",
+    "<extra_id_15>",
+    "<extra_id_16>",
+    "<extra_id_17>",
+    "<extra_id_18>",
+    "<extra_id_19>",
+    "<extra_id_20>",
+    "<extra_id_21>",
+    "<extra_id_22>",
+    "<extra_id_23>",
+    "<extra_id_24>",
+    "<extra_id_25>",
+    "<extra_id_26>",
+    "<extra_id_27>",
+    "<extra_id_28>",
+    "<extra_id_29>",
+    "<extra_id_30>",
+    "<extra_id_31>",
+    "<extra_id_32>",
+    "<extra_id_33>",
+    "<extra_id_34>",
+    "<extra_id_35>",
+    "<extra_id_36>",
+    "<extra_id_37>",
+    "<extra_id_38>",
+    "<extra_id_39>",
+    "<extra_id_40>",
+    "<extra_id_41>",
+    "<extra_id_42>",
+    "<extra_id_43>",
+    "<extra_id_44>",
+    "<extra_id_45>",
+    "<extra_id_46>",
+    "<extra_id_47>",
+    "<extra_id_48>",
+    "<extra_id_49>",
+    "<extra_id_50>",
+    "<extra_id_51>",
+    "<extra_id_52>",
+    "<extra_id_53>",
+    "<extra_id_54>",
+    "<extra_id_55>",
+    "<extra_id_56>",
+    "<extra_id_57>",
+    "<extra_id_58>",
+    "<extra_id_59>",
+    "<extra_id_60>",
+    "<extra_id_61>",
+    "<extra_id_62>",
+    "<extra_id_63>",
+    "<extra_id_64>",
+    "<extra_id_65>",
+    "<extra_id_66>",
+    "<extra_id_67>",
+    "<extra_id_68>",
+    "<extra_id_69>",
+    "<extra_id_70>",
+    "<extra_id_71>",
+    "<extra_id_72>",
+    "<extra_id_73>",
+    "<extra_id_74>",
+    "<extra_id_75>",
+    "<extra_id_76>",
+    "<extra_id_77>",
+    "<extra_id_78>",
+    "<extra_id_79>",
+    "<extra_id_80>",
+    "<extra_id_81>",
+    "<extra_id_82>",
+    "<extra_id_83>",
+    "<extra_id_84>",
+    "<extra_id_85>",
+    "<extra_id_86>",
+    "<extra_id_87>",
+    "<extra_id_88>",
+    "<extra_id_89>",
+    "<extra_id_90>",
+    "<extra_id_91>",
+    "<extra_id_92>",
+    "<extra_id_93>",
+    "<extra_id_94>",
+    "<extra_id_95>",
+    "<extra_id_96>",
+    "<extra_id_97>",
+    "<extra_id_98>",
+    "<extra_id_99>"
+  ],
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "</s>",
+  "extra_ids": 100,
+  "legacy": true,
+  "model_max_length": 512,
+  "pad_token": "<pad>",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "T5Tokenizer",
+  "unk_token": "<unk>"
+}