---
pipeline_tag: text-generation
inference: true
widget:
- text: 'def print_hello_world():'
  example_title: Hello world
  group: Python
datasets:
- bigcode/commitpackft
- bigcode/oasst-octopack
metrics:
- code_eval
library_name: transformers
language:
- zh
- en
tags:
- codegeex
- glm
- chatglm
model-index:
- name: OctoGeeX
  results:
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize Python
    metrics:
    - name: pass@1
      type: pass@1
      value: 44.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize JavaScript
    metrics:
    - name: pass@1
      type: pass@1
      value: 33.8
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize Java
    metrics:
    - name: pass@1
      type: pass@1
      value: 36.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize Go
    metrics:
    - name: pass@1
      type: pass@1
      value: 21.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize C++
    metrics:
    - name: pass@1
      type: pass@1
      value: 32.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize Rust
    metrics:
    - name: pass@1
      type: pass@1
      value: 25.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesize Average
    metrics:
    - name: pass@1
      type: pass@1
      value: 30.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Python
    metrics:
    - name: pass@1
      type: pass@1
      value: 28.1
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix JavaScript
    metrics:
    - name: pass@1
      type: pass@1
      value: 27.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Java
    metrics:
    - name: pass@1
      type: pass@1
      value: 30.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Go
    metrics:
    - name: pass@1
      type: pass@1
      value: 27.6
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix C++
    metrics:
    - name: pass@1
      type: pass@1
      value: 22.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Rust
    metrics:
    - name: pass@1
      type: pass@1
      value: 9.6
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Average
    metrics:
    - name: pass@1
      type: pass@1
      value: 24.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain Python
    metrics:
    - name: pass@1
      type: pass@1
      value: 30.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain JavaScript
    metrics:
    - name: pass@1
      type: pass@1
      value: 24.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain Java
    metrics:
    - name: pass@1
      type: pass@1
      value: 24.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain Go
    metrics:
    - name: pass@1
      type: pass@1
      value: 21.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain C++
    metrics:
    - name: pass@1
      type: pass@1
      value: 21.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain Rust
    metrics:
    - name: pass@1
      type: pass@1
      value: 15.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain Average
    metrics:
    - name: pass@1
      type: pass@1
      value: 22.9
      verified: false
---

![Octopack](https://github.com/bigcode-project/octopack/blob/31f3320f098703c7910e43492c39366eeea68d83/banner.png?raw=true)

# OctoGeeX

Play with the model on the [TODO Playground](https://huggingface.co/spaces/bigcode/bigcode-playground).

## Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Limitations](#limitations)
4. [Training](#training)
5. [License](#license)
6. [Citation](#citation)

## Model Summary

OctoGeeX is an instruction-tuned model with 6B parameters, created by fine-tuning [CodeGeeX2](https://huggingface.co/THUDM/codegeex2-6b) on [CommitPackFT](https://huggingface.co/datasets/bigcode/commitpackft) and [OASST](https://huggingface.co/datasets/bigcode/oasst-octopack) as described in the OctoPack paper.

- **Repository:** [bigcode/octopack](https://github.com/bigcode-project/octopack)
- **Paper:** [TODO]()
- **Languages:** 100+ programming languages
- **OctoPack🐙🎒:**

| Category | Name | Description |
|---|---|---|
| Data | CommitPack | 4TB of GitHub commits across 350 programming languages |
| | CommitPackFT | Filtered version of CommitPack for high-quality commit messages that resemble instructions |
| Model | OctoCoder | StarCoder (16B parameters) instruction tuned on CommitPackFT + OASST |
| | OctoGeeX | CodeGeeX2 (6B parameters) instruction tuned on CommitPackFT + OASST |
| Evaluation | HumanEvalPack | Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages |