---
license: mit
language:
  - en
metrics:
  - bleu
  - rouge
  - meteor
pipeline_tag: text2text-generation
widget:
  - text: >-
      name: Bug report\nabout: Create a report to help us improve\ntitle:
      <|EMPTY|>\nlabels: <|EMPTY|>\nassignees: <|EMPTY|>\nheadlines_type:
      <|MASK|>\nheadlines: <|MASK|>\nsummary: This issue report aims to describe
      a bug encountered while using the software. It includes a clear and
      concise description of the issue, steps to reproduce the behavior,
      expected behavior, screenshots (if applicable), and relevant versions of
      the operating system, IIS, Django, and Python. Additional context may also
      be provided to provide further details about the problem.
    example_title: Example 1
datasets:
  - nafisehNik/GIRT-Instruct
---

# GIRT-Model

**Paper**: https://arxiv.org/abs/2402.02632

**Demo**: https://huggingface.co/spaces/nafisehNik/girt-space

GIRT-Model is fine-tuned on the GIRT-Instruct dataset to generate issue report templates from an input instruction.
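
Judging from the widget example in the metadata above, an input instruction lists the template fields (`name`, `about`, `title`, `labels`, `assignees`, `headlines_type`, `headlines`, `summary`), with `<|EMPTY|>` marking fields deliberately left blank and `<|MASK|>` marking fields the model should fill in (see the paper for the full GIRT-Instruct format). A minimal sketch of such an instruction:

```python
# Example instruction adapted (and abridged) from the widget example above.
# <|EMPTY|> leaves a field blank; <|MASK|> asks the model to generate it.
instruction = (
    "name: Bug report\n"
    "about: Create a report to help us improve\n"
    "title: <|EMPTY|>\n"
    "labels: <|EMPTY|>\n"
    "assignees: <|EMPTY|>\n"
    "headlines_type: <|MASK|>\n"
    "headlines: <|MASK|>\n"
    "summary: This issue report aims to describe a bug encountered while "
    "using the software. It includes a clear and concise description of the "
    "issue, steps to reproduce the behavior, expected behavior, and relevant "
    "environment details."
)
```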

## Usage


```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# load model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained('nafisehNik/girt-t5-base')
tokenizer = AutoTokenizer.from_pretrained('nafisehNik/girt-t5-base')

# generate an issue report template for an input instruction
def compute(sample, top_p, top_k, do_sample, max_length, min_length):
    inputs = tokenizer(sample, return_tensors="pt").to('cpu')

    outputs = model.generate(
        **inputs,
        min_length=min_length,
        max_length=max_length,
        do_sample=do_sample,
        top_p=top_p,
        top_k=top_k)

    generated_texts = tokenizer.batch_decode(outputs, skip_special_tokens=False)
    generated_text = generated_texts[0]

    # strip special tokens and tokenizer artifacts from the decoded text
    replace_dict = {
        '\n ': '\n',
        '</s>': '',
        '<pad> ': '',
        '<pad>': '',
        '<unk>!--': '<!--',
        '<unk>': '',
    }

    postprocess_text = generated_text
    for key, value in replace_dict.items():
        postprocess_text = postprocess_text.replace(key, value)

    return postprocess_text

prompt = "YOUR INPUT INSTRUCTION"
result = compute(prompt, top_p=0.92, top_k=0, do_sample=True, max_length=300, min_length=30)
```
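
For example, with the instruction sketched above (the sampling settings mirror the values in the snippet and are illustrative, not tuned recommendations):

```python
# Generate a bug-report template from the example instruction defined earlier.
result = compute(instruction, top_p=0.92, top_k=0, do_sample=True, max_length=300, min_length=30)
print(result)
```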

## Citation

```bibtex
@article{nikeghbal2024girt,
  title={GIRT-Model: Automated Generation of Issue Report Templates},
  author={Nikeghbal, Nafiseh and Kargaran, Amir Hossein and Heydarnoori, Abbas},
  journal={arXiv preprint arXiv:2402.02632},
  year={2024}
}
```