---
license: mit
language:
- en
metrics:
- bleu
- rouge
- meteor
pipeline_tag: text2text-generation
widget:
- text: >-
    name: Bug report\nabout: Create a report to help us improve\ntitle:
    <|EMPTY|>\nlabels: <|EMPTY|>\nassignees: <|EMPTY|>\nheadlines_type:
    <|MASK|>\nheadlines: <|MASK|>\nsummary: This issue report aims to describe a
    bug encountered while using the software. It includes a clear and concise
    description of the issue, steps to reproduce the behavior, expected
    behavior, screenshots (if applicable), and relevant versions of the
    operating system, IIS, Django, and Python. Additional context may also be
    provided to provide further details about the problem.
  example_title: Example 1
datasets:
- nafisehNik/GIRT-Instruct
---

# GIRT-Model 

paper: https://arxiv.org/abs/2402.02632

demo: https://huggingface.co/spaces/nafisehNik/girt-space

This model generates issue report templates from the input instruction it is given. It was fine-tuned on the [GIRT-Instruct](https://huggingface.co/datasets/nafisehNik/GIRT-Instruct) dataset.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# load the fine-tuned model and its tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained('nafisehNik/girt-t5-base')
tokenizer = AutoTokenizer.from_pretrained('nafisehNik/girt-t5-base')

# generate an issue report template from an input instruction
def compute(sample, top_p, top_k, do_sample, max_length, min_length):
    # tokenize the instruction; everything runs on CPU here
    inputs = tokenizer(sample, return_tensors="pt").to('cpu')

    outputs = model.generate(
        **inputs,
        min_length=min_length,
        max_length=max_length,
        do_sample=do_sample,
        top_p=top_p,
        top_k=top_k)

    generated_texts = tokenizer.batch_decode(outputs, skip_special_tokens=False)
    generated_text = generated_texts[0]
    
    # strip leftover special tokens and artifacts from the decoded text
    replace_dict = {
        '\n ': '\n',
        '</s>': '',
        '<pad> ': '',
        '<pad>': '',
        '<unk>!--': '<!--',
        '<unk>': '',
    }
    
    postprocess_text = generated_text
    for key, value in replace_dict.items():
        postprocess_text = postprocess_text.replace(key, value)

    return postprocess_text

prompt = "YOUR INPUT INSTRUCTION"
result = compute(prompt, top_p=0.92, top_k=0, do_sample=True, max_length=300, min_length=30)
```
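
As a concrete input, the instruction from the widget example in the metadata above can be passed as the prompt. This is a minimal sketch assuming the literal `\n` sequences in the widget string stand for real newlines in the model input; the `<|EMPTY|>` and `<|MASK|>` placeholders come from the GIRT-Instruct format.

```python
# instruction taken from the widget example above; <|EMPTY|> marks fields
# left blank and <|MASK|> marks fields for the model to generate
# (assumption based on the GIRT-Instruct format)
prompt = (
    "name: Bug report\n"
    "about: Create a report to help us improve\n"
    "title: <|EMPTY|>\n"
    "labels: <|EMPTY|>\n"
    "assignees: <|EMPTY|>\n"
    "headlines_type: <|MASK|>\n"
    "headlines: <|MASK|>\n"
    "summary: This issue report aims to describe a bug encountered while "
    "using the software. It includes a clear and concise description of "
    "the issue, steps to reproduce the behavior, expected behavior, "
    "screenshots (if applicable), and relevant versions of the operating "
    "system, IIS, Django, and Python. Additional context may also be "
    "provided to provide further details about the problem."
)

result = compute(prompt, top_p=0.92, top_k=0, do_sample=True, max_length=300, min_length=30)
print(result)
```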


## Citation

```
@article{nikeghbal2024girt,
  title={GIRT-Model: Automated Generation of Issue Report Templates},
  author={Nikeghbal, Nafiseh and Kargaran, Amir Hossein and Heydarnoori, Abbas},
  journal={arXiv preprint arXiv:2402.02632},
  year={2024}
}
```