---
language: en
tags:
- Explain code
- Code Summarization
- Summarization
license: mit
---
# Gemini
For an in-depth understanding of our model and methods, please see our blog [here](https://www.describe-ai.com/gemini).
## Model description
Gemini is a transformer based on Google's T5 model. The model is pre-trained on approximately 800k code/description pairs and then fine-tuned on 10k synthetically generated, higher-level explanations. Gemini can summarize and explain short to medium code snippets in:
- Python
- JavaScript (mostly vanilla JS, although it can also handle frameworks like React)
- Java
- Ruby
- Go
The output is a description in English.
## Intended uses & limitations
Without any additional fine-tuning, Gemini can explain code in a sentence or two, and it typically performs best on Python and JavaScript. We recommend using Gemini for simple code explanation, documentation, or producing more synthetic data to improve its explanations.
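As a rough illustration of the synthetic-data use case, the sketch below pairs each code snippet with a generated description and serializes the result as JSON Lines. The helper names `build_pairs` and `to_jsonl`, the record keys, and the `describe` callable are assumptions made for this example and are not part of the model or its API; `describe` stands in for any function that maps a code string to a description.

```python
import json
from typing import Callable, Dict, Iterable, List


def build_pairs(snippets: Iterable[str], describe: Callable[[str], str]) -> List[Dict[str, str]]:
    """Pair each non-empty snippet with the description returned by `describe`."""
    pairs = []
    for code in snippets:
        code = code.strip()
        if not code:
            continue  # skip blank entries
        pairs.append({"code": code, "description": describe(code)})
    return pairs


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize the pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)


# Hypothetical wiring to the model (downloads the weights on first run):
# from transformers import pipeline
# summarizer = pipeline('text2text-generation', model='describeai/gemini-small')
# describe = lambda code: summarizer(code, max_length=100, num_beams=3)[0]['generated_text']
# print(to_jsonl(build_pairs(["print('hello world!')"], describe)))
```

Keeping `describe` as a plain callable means the pairing logic can be exercised with a stub before any model is loaded.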
### How to use
You can use this model directly with a pipeline for Text2Text generation, as shown below:
```python
from transformers import pipeline

# Load the model as a text2text-generation pipeline
summarizer = pipeline('text2text-generation', model='describeai/gemini-small')
code = "print('hello world!')"
# Beam search with 3 beams; max_length caps the generated description
response = summarizer(code, max_length=100, num_beams=3)
print("Summarized code: " + response[0]['generated_text'])
```
This should yield something along the lines of:
```
Summarized code: The following code is greeting the world.
```
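If you need more control than the pipeline offers, a minimal sketch of the equivalent manual steps might look like the following. It assumes the checkpoint loads with the generic `AutoTokenizer` and `AutoModelForSeq2SeqLM` classes, which is the usual route for T5-based models but is not confirmed here. The helper names `load` and `explain` are made up for this example.

```python
def load(model_id: str = "describeai/gemini-small"):
    """Load tokenizer and model (assumed to be compatible with the Auto classes)."""
    # Imported lazily so that `explain` can be used without transformers loaded
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def explain(code: str, tokenizer, model, max_length: int = 100, num_beams: int = 3) -> str:
    """Tokenize `code`, generate with beam search, and decode the first sequence."""
    inputs = tokenizer(code, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=max_length, num_beams=num_beams)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (downloads the weights on first run):
# tokenizer, model = load()
# print("Summarized code: " + explain("print('hello world!')", tokenizer, model))
```

The defaults mirror the `max_length=100` and `num_beams=3` settings used in the pipeline example.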
### Model sizes
- Gemini: 770 Million Parameters
- Gemini-Small (this repo): 220 Million Parameters
### Limitations
Gemini may produce overly simplistic descriptions that don't cover the entire code snippet. We suspect that more training data would mitigate this and produce better results.
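One possible way to work within this limitation for Python input is to split a file into its top-level functions and describe each one separately, so that every request stays in the short-to-medium range. This is only a sketch of a workaround and not something we have evaluated. The helper name `describe_functions` and the `describe` callable (any function mapping a code string to a description) are assumptions made for this example.

```python
import ast
from typing import Callable, Dict


def describe_functions(source: str, describe: Callable[[str], str]) -> Dict[str, str]:
    """Describe each top-level function in `source` separately.

    Returns a mapping from function name to the description produced
    by `describe` for that function's source text.
    """
    tree = ast.parse(source)
    descriptions = {}
    for node in tree.body:
        # Only top-level (async) function definitions are considered
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            snippet = ast.get_source_segment(source, node)
            descriptions[node.name] = describe(snippet)
    return descriptions
```

Code outside of functions (imports, module-level statements, classes) is skipped in this sketch and would need separate handling.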
### About Us
At Describe.ai, we are focused on building Artificial Intelligence systems that can understand language as well as humans. While this is a long path, we plan to contribute our findings to our API and to the open-source community.