
How to use BLOOM for text summarization?

#172
by ankit5678 - opened

How do I create a summary using BLOOM? I want to compare the output of BLOOM with GPT-3.
Is it possible?

BigScience Workshop org

You may want to prepend your prompt with something like "Summarize the following text:" (or put it after the text). Please also have a look at BLOOMZ, which is much better at these tasks than BLOOM: https://huggingface.co/bigscience/bloomz
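A minimal sketch of that prompting approach, added here for illustration (the small bigscience/bloomz-560m checkpoint is assumed only so the example runs on modest hardware):

from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloomz-560m")

text = "Long article text goes here..."

# Prepend (or append) the instruction, as suggested above.
prompt = f"Summarize the following text: {text}\nSummary:"

# return_full_text=False drops the prompt from the returned string.
result = generator(prompt, max_new_tokens=60, return_full_text=False)
print(result[0]["generated_text"])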

The following is an abstractive summarisation system. It allows the user to input a large block of text, then it summarises that input into a smaller paragraph. The system stops generating when it has finished summarising and does not print extra text afterwards. The system always sticks to the context when summarising, does not repeat any text, and finishes at a full stop:

Source text: Peter and Elizabeth took a taxi to attend the night party in the city. While in the party, Elizabeth collapsed and was rushed to the hospital.

Summary: Elizabeth was hospitalised after attending a party with Peter.

Source text: John went to buy sweets at the corner shop. At some point on the way back he dropped 5 of them and was left with 3.

Summary: John dropped 5 sweets on the way back from the corner shop and was left with 3.
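Here is a sketch (added for illustration, not from the post above) of how this few-shot prompt could be assembled and sent to a BLOOM checkpoint with transformers; bigscience/bloom-560m is assumed only so the example runs on modest hardware:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # small checkpoint, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

few_shot_prompt = """The following is an abstractive summarisation system. It allows the user to input a large block of text, then it summarises that input into a smaller paragraph. The system stops generating when it has finished summarising and does not print extra text afterwards. The system always sticks to the context when summarising, does not repeat any text, and finishes at a full stop:

Source text: Peter and Elizabeth took a taxi to attend the night party in the city. While in the party, Elizabeth collapsed and was rushed to the hospital.

Summary: Elizabeth was hospitalised after attending a party with Peter.

Source text: John went to buy sweets at the corner shop. At some point on the way back he dropped 5 of them and was left with 3.

Summary: John dropped 5 sweets on the way back from the corner shop and was left with 3.

Source text: {text}

Summary:"""

def summarise(text):
    inputs = tokenizer(few_shot_prompt.format(text=text), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=80)
    # Decode only the tokens generated after the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print(summarise("Your source text goes here."))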

If you want to learn more about prompting or have questions about using BLOOM, come join my Discord: https://discord.gg/y9newnc9. I'll be releasing chat interface templates that come with built-in prompts you can easily customise. You will also be able to store and emit prompts into the interface easily.

How do I create a summary using BLOOM? I want to compare the output of BLOOM with GPT-3.
Is it possible?

Did you get around to trying this?

This is the code I used; for long texts, the summary it produces is completely useless.

ChatGPT (3.5) gave me VERY good answers:

from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

set_seed(42)  # make the sampled output reproducible

language_model = "bigscience/bloomz-1b1"

model = AutoModelForCausalLM.from_pretrained(language_model, use_cache=True)
model.cuda()  # explicit device placement; no need for set_default_tensor_type
tokenizer = AutoTokenizer.from_pretrained(language_model)

prompt = "Mysteriously long text..."

# The instruction string already carries its own separating newlines.
summary_instruction = "\n\nSummarize the previous text in three sentences:\n\n"

total_prompt = prompt + summary_instruction

# Tokenize the exact string we later slice off, so the offsets match.
input_ids = tokenizer(total_prompt, return_tensors="pt").to(0)

# do_sample=True is required for top_k/temperature to take effect
# (with top_k=1 the sampling is effectively greedy anyway), and
# max_new_tokens bounds the summary length without counting the prompt.
sample = model.generate(
    **input_ids,
    max_new_tokens=200,
    do_sample=True,
    top_k=1,
    temperature=0.9,
    repetition_penalty=2.0,
)

# truncate_before_pattern is specific to the CodeGen tokenizer and was
# silently ignored by the BLOOM tokenizer, so it has been dropped.
result_string = tokenizer.decode(sample[0], skip_special_tokens=True)

# Print only the generated summary, stripping the prompt.
print(result_string[len(total_prompt):])
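A common workaround for the long-text problem (added for illustration, not from the original post) is a two-pass, chunked approach: summarise the text chunk by chunk, then summarise the concatenated partial summaries. A rough sketch, reusing the model, tokenizer, and summary_instruction objects from the snippet above; the chunk size of 800 tokens is an arbitrary assumption:

def summarise_chunk(text, max_new_tokens=100):
    # Same prompt pattern as above, applied to one chunk.
    ids = tokenizer(text + summary_instruction, return_tensors="pt").to(0)
    out = model.generate(**ids, max_new_tokens=max_new_tokens, repetition_penalty=2.0)
    # Decode only the newly generated tokens.
    return tokenizer.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)

def summarise_long(text, chunk_tokens=800):
    # Split on token boundaries so each chunk fits comfortably in context.
    token_ids = tokenizer(text)["input_ids"]
    chunks = [tokenizer.decode(token_ids[i:i + chunk_tokens])
              for i in range(0, len(token_ids), chunk_tokens)]
    partial = [summarise_chunk(c) for c in chunks]
    # Second pass: summarise the concatenated partial summaries.
    return summarise_chunk("\n".join(partial))

print(summarise_long(prompt))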


Why not share the prompt and explanation in this thread, just like you did here: https://huggingface.co/bigscience/bloom/discussions/183
