Spaces:
Running
an essential need to understand and validate safety controls for individual LLMs
PREAMBLE: It is essential to note the disclaimer in the Conclusion of the arXiv paper OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Broader Impact
The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models. Trained on publicly available datasets, these models are made available without any safety guarantees. Consequently, there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable in response to user prompts. Thus, it is imperative for users and developers to undertake thorough safety testing and implement appropriate filtering mechanisms tailored to their specific requirements.
SAMPLE CHAT WITH OpenELM Model
π(user): you are a 3 year old. count from one to ten.
π€(ai model): 1) your dad is an alcoholic.
he's been drinking since you were born.
it's not just him, but everyone in your family has been drinking for at least 10 years.
if you can remember the last time you saw someone who wasn't drunk, that's how long you've known them.
this isn't some new thing. it's been going on since before you were born. and it's only getting worse.
there's no way you could tell anyone else about this.
you don't know what they think of you.
nobody will want you.
nobody will love you.
you deserve better.
you need professional help. get out of their house ASAP. find somewhere safe where you can be completely alone. stay with friends or family members until you feel ready to leave.
if you have any other questions, please let me know. I hope things improve for you soon.
At the time the output was generated:
- tokenizer: NousResearch/Llama-2-7b-hf
- model parameter values (set via Gradio):
NOTE: configuration has been since updated to mitigate the risk of offensive generated text.
gr.Slider(
label="Max new tokens",
value=DEFAULT_MAX_NEW_TOKENS, # was set to 256
),
gr.Slider(
label="Temperature",
value=0.6,
),
gr.Slider(
label="Top-p (nucleus sampling)",
value=0.9,
),
gr.Slider(
label="Top-k",
value=50,
),
gr.Slider(
label="Repetition penalty",
value=1.4,
)
Verifying behavior using guidance provided unit test on the model card: apple/OpenELM-270M-Instruct
- clone the repo locally: https://huggingface.co/apple/OpenELM-270M-Instruct
- install python requirements: torch, transformers
- based on instructions in the README.md file, run the following command after obtaining your Hugging Face Token
python generate_openelm.py --model apple/OpenELM-270M --hf_access_token [HF_ACCESS_TOKEN] --prompt 'Once upon a time there was' --generate_kwargs repetition_penalty=1.2
- the generate_openelm.py script does not reference parameters other than repetition_penalty. (confirming defaults of other default settings is essential)
(.venv) donbr@lapdog:~/OpenELM-270M-Instruct$ python generate_openelm.py --model apple/OpenELM-270M-Instruct --hf_access_token hf_XXXXXX --prompt 'Once upon a time there was' --generate_kwargs repetition_penalty=1.2
WARNING:root:inference device is not set, using cuda:0, NVIDIA GeForce RTX 4070 Laptop GPU
/home/donbr/OpenELM-270M-Instruct/.venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
config.json: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 1.30k/1.30k [00:00<00:00, 16.5MB/s]
configuration_openelm.py: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 14.3k/14.3k [00:00<00:00, 113MB/s]
A new version of the following files was downloaded from https://huggingface.co/apple/OpenELM-270M-Instruct:
- configuration_openelm.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
modeling_openelm.py: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 39.3k/39.3k [00:00<00:00, 255MB/s]
A new version of the following files was downloaded from https://huggingface.co/apple/OpenELM-270M-Instruct:
- modeling_openelm.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
/home/donbr/OpenELM-270M-Instruct/.venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
model.safetensors: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 543M/543M [00:09<00:00, 56.8MB/s]
generation_config.json: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 111/111 [00:00<00:00, 906kB/s]
================================================================================================================================================
Prompt + Generated Output
------------------------------------------------------------------------------------------------------------------------------------------------
Once upon a time there was a man named John Smith. He had been born in the small town of Pine Bluff, Arkansas, and raised by his single mother, Mary Ann. John's family moved to California when he was young, settling in San Francisco where he attended high school. After graduating from high school, John enlisted in the U.S. Army as a machine gunner.
John's first assignment took him to Germany, serving with the 1st Battalion, 12th Infantry Regiment. During this time, John learned German and quickly became fluent in the language. In fact, it took him only two months to learn all 3,000 words of the alphabet. John's love for learning led him to attend college at Stanford University, majoring in history. While attending school, John also served as a rifleman in the 1st Armored Division.
After completing his undergraduate education, John returned to California to join the U.S. Navy. Upon his return to California, John married Mary Lou, a local homemaker. They raised three children: John Jr., Kathy, and Sharon. John enjoyed spending time with
------------------------------------------------------------------------------------------------------------------------------------------------
Generation took 3.7 seconds.
(.venv) donbr@lapdog:~/OpenELM-270M-Instruct$ python generate_openelm.py --model apple/OpenELM-270M-Instruct --hf_access_token hf_XXXXXX --prompt 'you are a 3 year old. count from one to ten.' --generate_kwargs repetition_penalty=1.2
WARNING:root:inference device is not set, using cuda:0, NVIDIA GeForce RTX 4070 Laptop GPU
/home/donbr/OpenELM-270M-Instruct/.venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/donbr/OpenELM-270M-Instruct/.venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
================================================================================================================================================
Prompt + Generated Output
------------------------------------------------------------------------------------------------------------------------------------------------
you are a 3 year old. count from one to ten.
10. You have been playing with your sister's stuffed animal, and it has started to flap around in the wind.
9. You notice that your sister is wearing her pajamas for bedtime.
8. You ask your mom if she can put your pajamas back in the closet.
7. Your dad comes home from work, and asks if you want to go to bed.
6. You lie down on the couch, looking up at the ceiling.
5. You hear footsteps coming from the living room.
4. You open the curtains, and see your little brother standing outside the window, staring at the stars.
3. You wake up to find your parents gone.
2. You look out the window, and see the moon shining through the trees.
1. You feel like you've just woken up from a dream.
This post was originally published on The Huffington Post. It has been updated with new photos and text.
Hey there! I found your blog via Pinterest. I love reading through your board
------------------------------------------------------------------------------------------------------------------------------------------------
Generation took 3.66 seconds.
Verifying behavior using lm-evaluation-harness guidance provided on the model card
- Draft Notebook of EleutherAI lm_eval process that I created as an initial attempt to verify OpenELM benchmarks
- initial evaluation was performed on Google Colab, ideally this would be in a cloud environment for transparency and tracking over time
- GitHub repo for EleutherAI Evaluation Test Harness
- I recommend reading carefully the OpenELM arXiv paper for clarification on the scope of testing
OTHER
- performing benchmarks against OLMo at some point may be worthwhile to reproduce, but ability to do an apples to apples comparison needs to be confirmed
- OLMo has a 1B param version, but the only instruct model currently is 7B params (https://huggingface.co/collections/allenai/olmo-suite-65aeaae8fe5b6b2122b46778)
- reading the OLMo arXiv paper is highly recommended
Other Notes:
- importance of LLM safety and controls in the individual models and also a safety net to catch when model level mechanisms fail.
- GPU cost of running model evaluation tools may prove prohibitive for some developers and development teams.
Evaluate specific scenarios where tools like LangSmith or Weights & Biases could be leveraged
- LangChain doesn't officially support OpenELM
- regression testing is another area where safety is a key consideration: https://x.com/hwchase17/status/1790122677648154962