|
--- |
|
base_model: ibm-granite/granite-3.0-2b-instruct |
|
license: apache-2.0 |
|
pipeline_tag: text-generation |
|
tags: |
|
- language |
|
- granite-3.0 |
|
quantized_model: AliNemati |
|
inference: false |
|
model-index: |
|
- name: granite-3.0-2b-instruct |
|
results: |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: IFEval |
|
type: instruction-following |
|
metrics: |
|
- type: pass@1 |
|
value: 52.27 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 8.22 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: AGI-Eval |
|
type: human-exams |
|
metrics: |
|
- type: pass@1 |
|
value: 40.52 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 65.82 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 34.45 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: OBQA |
|
type: commonsense |
|
metrics: |
|
- type: pass@1 |
|
value: 46.6 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 71.21 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 82.61 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 77.51 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 60.32 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: BoolQ |
|
type: reading-comprehension |
|
metrics: |
|
- type: pass@1 |
|
value: 88.65 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 21.58 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: ARC-C |
|
type: reasoning |
|
metrics: |
|
- type: pass@1 |
|
value: 64.16 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 33.81 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 51.55 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: HumanEvalSynthesis |
|
type: code |
|
metrics: |
|
- type: pass@1 |
|
value: 64.63 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 57.16 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 65.85 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 49.6 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: GSM8K |
|
type: math |
|
metrics: |
|
- type: pass@1 |
|
value: 68.99 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 30.94 |
|
name: pass@1 |
|
- task: |
|
type: text-generation |
|
dataset: |
|
name: PAWS-X (7 langs) |
|
type: multilingual |
|
metrics: |
|
- type: pass@1 |
|
value: 64.94 |
|
name: pass@1 |
|
- type: pass@1 |
|
value: 48.2 |
|
name: pass@1 |
|
--- |
|
|
|
**osllm.ai Models Highlights Program** |
|
|
|
**We believe there's no need to pay a token if you have a GPU on your computer.** |
|
|
|
Highlighting new and noteworthy models from the community. Join the conversation on Discord. |
|
|
|
|
|
**Model creator**: ibm-granite |
|
|
|
**Original model**: granite-3.0-3b-a800m-instruct |
|
|
|
|
|
[**README**:](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct/edit/main/README.md) |
|
|
|
<p align="center"> |
|
<a href="https://osllm.ai">Official Website</a> • <a href="https://docs.osllm.ai/index.html">Documentation</a> • <a href="https://discord.gg/2fftQauwDD">Discord</a> |
|
</p> |
|
|
|
|
|
|
|
<p align="center"> |
|
<b>NEW:</b> <a href="https://docs.google.com/forms/d/1CQXJvxLUqLBSXnjqQmRpOyZqD6nrKubLz2WTcIJ37fU/prefill">Subscribe to our mailing list</a> for updates and news! |
|
</p> |
|
|
|
|
|
Email: [email protected] |
|
|
|
|
|
|
|
**Model Summary:** |
|
Granite-3.0-2B-Instruct is a 2B parameter model finetuned from *Granite-3.0-2B-Base* using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets. This model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging. |
|
|
|
- **Developers:** Granite Team, IBM |
|
- **GitHub Repository:** [ibm-granite/granite-3.0-language-models](https://github.com/ibm-granite/granite-3.0-language-models) |
|
- **Website**: [Granite Docs](https://www.ibm.com/granite/docs/) |
|
- **Paper:** [Granite 3.0 Language Models](https://github.com/ibm-granite/granite-3.0-language-models/blob/main/paper.pdf) |
|
- **Release Date**: October 21st, 2024 |
|
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
|
|
|
**Supported Languages:** |
|
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may finetune Granite 3.0 models for languages beyond these 12 languages. |
|
|
|
**Intended use:** |
|
The model is designed to respond to general instructions and can be used to build AI assistants for multiple domains, including business applications. |
|
|
|
*Capabilities* |
|
* Summarization |
|
* Text classification |
|
* Text extraction |
|
* Question-answering |
|
* Retrieval Augmented Generation (RAG) |
|
* Code related tasks |
|
* Function-calling tasks |
|
* Multilingual dialog use cases |
|
|
|
|
|
|
|
|
|
**About [osllm.ai](https://osllm.ai)**: |
|
|
|
[osllm.ai](https://osllm.ai) is a community-driven platform that provides access to a wide range of open-source language models. |
|
|
|
1. **[IndoxJudge](https://github.com/indoxJudge)**: A free, open-source tool for evaluating large language models (LLMs). |
|
It provides key metrics to assess performance, reliability, and risks like bias and toxicity, helping ensure model safety. |
|
|
|
1. **[inDox](https://github.com/inDox)**: An open-source retrieval augmentation tool for extracting data from various |
|
document formats (text, PDFs, HTML, Markdown, LaTeX). It handles structured and unstructured data and supports both |
|
online and offline LLMs. |
|
|
|
1. **[IndoxGen](https://github.com/IndoxGen)**: A framework for generating high-fidelity synthetic data using LLMs and |
|
human feedback, designed for enterprise use with high flexibility and precision. |
|
|
|
1. **[Phoenix](https://github.com/Phoenix)**: A multi-platform, open-source chatbot that interacts with documents |
|
locally, without internet or GPU. It integrates inDox and IndoxJudge to improve accuracy and prevent hallucinations, |
|
ideal for sensitive fields like healthcare. |
|
|
|
1. **[Phoenix_cli](https://github.com/Phoenix_cli)**: A multi-platform command-line tool that runs LLaMA models locally, |
|
supporting up to eight concurrent tasks through multithreading, eliminating the need for cloud-based services. |
|
|
|
|
|
|
|
|
|
**Special thanks** |
|
|
|
🙏 Special thanks to [**Georgi Gerganov**](https://github.com/ggerganov) and the whole team working on [**llama.cpp**](https://github.com/ggerganov/llama.cpp) for making all of this possible. |
|
|
|
|
|
|
|
**Disclaimers** |
|
|
|
[osllm.ai](https://osllm.ai) is not the creator, originator, or owner of any Model featured in the Community Model Program. |
|
Each Community Model is created and provided by third parties. osllm.ai does not endorse, support, represent, |
|
or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand |
|
that Community Models can produce content that might be offensive, harmful, inaccurate, or otherwise |
|
inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who |
|
originated such Model. osllm.ai may not monitor or control the Community Models and cannot, and does not, take |
|
responsibility for any such Model. osllm.ai disclaims all warranties or guarantees about the accuracy, |
|
reliability, or benefits of the Community Models. osllm.ai further disclaims any warranty that the Community |
|
Model will meet your requirements, be secure, uninterrupted, or available at any time or location, or |
|
error-free, virus-free, or that any errors will be corrected, or otherwise. You will be solely responsible for |
|
any damage resulting from your use of or access to the Community Models, your downloading of any Community |
|
Model, or use of any other Community Model provided by or through [osllm.ai](https://osllm.ai). |
|
|
|
|
|
|
|
|
|
|