|
<!--Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
|
|
|
# Utilities for Generation |
|
|
|
This page lists all the utility functions used by [`~generation.GenerationMixin.generate`],
[`~generation.GenerationMixin.greedy_search`],
[`~generation.GenerationMixin.contrastive_search`],
[`~generation.GenerationMixin.sample`],
[`~generation.GenerationMixin.beam_search`],
[`~generation.GenerationMixin.beam_sample`],
[`~generation.GenerationMixin.group_beam_search`], and
[`~generation.GenerationMixin.constrained_beam_search`].
|
|
|
Most of those are only useful if you are studying the code of the generate methods in the library. |
|
|
|
## Generate Outputs |
|
|
|
The output of [`~generation.GenerationMixin.generate`] is an instance of a subclass of
[`~utils.ModelOutput`]. This output is a data structure containing all the information returned
by [`~generation.GenerationMixin.generate`], but one that can also be used as a tuple or a dictionary.
|
|
|
Here's an example: |
|
|
|
```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")

# Return a ModelOutput subclass instead of a plain tensor, and include the
# prediction scores for each generation step
generation_output = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
```
|
|
|
The `generation_output` object is a [`~generation.GreedySearchDecoderOnlyOutput`]. As we can
see in the documentation of that class below, this means it has the following attributes:
|
|
|
- `sequences`: the generated sequences of tokens
- `scores` (optional): the prediction scores of the language modeling head, for each generation step
- `hidden_states` (optional): the hidden states of the model, for each generation step
- `attentions` (optional): the attention weights of the model, for each generation step
|
|
|
Here we have the `scores` since we passed along `output_scores=True`, but we don't have `hidden_states` and
`attentions` because we didn't pass `output_hidden_states=True` or `output_attentions=True`.
|
|
|
You can access each attribute as you would usually do, and if that attribute has not been returned by the model, you
will get `None`. Here, for instance, `generation_output.scores` contains all the generated prediction scores of the
language modeling head, and `generation_output.attentions` is `None`.
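
For instance, continuing the example above, we can check each attribute of the `generation_output` object directly:

```python
# `sequences` is always returned: a tensor of shape (batch_size, sequence_length)
print(generation_output.sequences.shape)

# `scores` was requested with output_scores=True: one tensor of logits per generation step
print(len(generation_output.scores))

# `hidden_states` and `attentions` were not requested, so they are None
print(generation_output.attentions)  # None
```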
|
|
|
When using our `generation_output` object as a tuple, it only keeps the attributes that don't have `None` values.
Here, for instance, it has two elements, `sequences` then `scores`, so
|
|
|
```python
generation_output[:2]
```
|
|
|
will return the tuple `(generation_output.sequences, generation_output.scores)`.
|
|
|
When using our `generation_output` object as a dictionary, it only keeps the attributes that don't have `None`
values. Here, for instance, it has two keys that are `sequences` and `scores`.
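
For example, dictionary-style access works by key, and keys whose values are `None` are simply absent:

```python
sequences = generation_output["sequences"]
scores = generation_output["scores"]

# `attentions` was None, so its key is not present; indexing it would raise a KeyError
print("attentions" in generation_output)  # False
```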
|
|
|
We document here all the output types.
|
|
|
|
|
### GreedySearchOutput |
|
|
|
[[autodoc]] generation.GreedySearchDecoderOnlyOutput |
|
|
|
[[autodoc]] generation.GreedySearchEncoderDecoderOutput |
|
|
|
[[autodoc]] generation.FlaxGreedySearchOutput |
|
|
|
### SampleOutput |
|
|
|
[[autodoc]] generation.SampleDecoderOnlyOutput |
|
|
|
[[autodoc]] generation.SampleEncoderDecoderOutput |
|
|
|
[[autodoc]] generation.FlaxSampleOutput |
|
|
|
### BeamSearchOutput |
|
|
|
[[autodoc]] generation.BeamSearchDecoderOnlyOutput |
|
|
|
[[autodoc]] generation.BeamSearchEncoderDecoderOutput |
|
|
|
### BeamSampleOutput |
|
|
|
[[autodoc]] generation.BeamSampleDecoderOnlyOutput |
|
|
|
[[autodoc]] generation.BeamSampleEncoderDecoderOutput |
|
|
|
## LogitsProcessor |
|
|
|
A [`LogitsProcessor`] can be used to modify the prediction scores of a language model head for
generation.
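
For example, here is a minimal sketch that passes a custom [`LogitsProcessorList`] to
[`~generation.GenerationMixin.generate`] through its `logits_processor` argument; the choice of GPT-2 and the
minimum length of 15 tokens are arbitrary here:

```python
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    LogitsProcessorList,
    MinLengthLogitsProcessor,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")

# Suppress the EOS token until at least 15 tokens have been generated
logits_processor = LogitsProcessorList(
    [MinLengthLogitsProcessor(15, eos_token_id=model.config.eos_token_id)]
)
outputs = model.generate(**inputs, logits_processor=logits_processor)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```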
|
|
|
[[autodoc]] LogitsProcessor
    - __call__

[[autodoc]] LogitsProcessorList
    - __call__

[[autodoc]] LogitsWarper
    - __call__

[[autodoc]] MinLengthLogitsProcessor
    - __call__

[[autodoc]] MinNewTokensLengthLogitsProcessor
    - __call__

[[autodoc]] TemperatureLogitsWarper
    - __call__

[[autodoc]] RepetitionPenaltyLogitsProcessor
    - __call__

[[autodoc]] TopPLogitsWarper
    - __call__

[[autodoc]] TopKLogitsWarper
    - __call__

[[autodoc]] TypicalLogitsWarper
    - __call__

[[autodoc]] NoRepeatNGramLogitsProcessor
    - __call__

[[autodoc]] NoBadWordsLogitsProcessor
    - __call__

[[autodoc]] PrefixConstrainedLogitsProcessor
    - __call__

[[autodoc]] HammingDiversityLogitsProcessor
    - __call__

[[autodoc]] ForcedBOSTokenLogitsProcessor
    - __call__

[[autodoc]] ForcedEOSTokenLogitsProcessor
    - __call__

[[autodoc]] InfNanRemoveLogitsProcessor
    - __call__

[[autodoc]] TFLogitsProcessor
    - __call__

[[autodoc]] TFLogitsProcessorList
    - __call__

[[autodoc]] TFLogitsWarper
    - __call__

[[autodoc]] TFTemperatureLogitsWarper
    - __call__

[[autodoc]] TFTopPLogitsWarper
    - __call__

[[autodoc]] TFTopKLogitsWarper
    - __call__

[[autodoc]] TFMinLengthLogitsProcessor
    - __call__

[[autodoc]] TFNoBadWordsLogitsProcessor
    - __call__

[[autodoc]] TFNoRepeatNGramLogitsProcessor
    - __call__

[[autodoc]] TFRepetitionPenaltyLogitsProcessor
    - __call__

[[autodoc]] TFForcedBOSTokenLogitsProcessor
    - __call__

[[autodoc]] TFForcedEOSTokenLogitsProcessor
    - __call__

[[autodoc]] FlaxLogitsProcessor
    - __call__

[[autodoc]] FlaxLogitsProcessorList
    - __call__

[[autodoc]] FlaxLogitsWarper
    - __call__

[[autodoc]] FlaxTemperatureLogitsWarper
    - __call__

[[autodoc]] FlaxTopPLogitsWarper
    - __call__

[[autodoc]] FlaxTopKLogitsWarper
    - __call__

[[autodoc]] FlaxForcedBOSTokenLogitsProcessor
    - __call__

[[autodoc]] FlaxForcedEOSTokenLogitsProcessor
    - __call__

[[autodoc]] FlaxMinLengthLogitsProcessor
    - __call__
|
|
|
## StoppingCriteria |
|
|
|
A [`StoppingCriteria`] can be used to change when to stop generation (other than by hitting the EOS token).
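
For example, here is a minimal sketch that stops generation after a wall-clock time budget using
[`MaxTimeCriteria`], passed to [`~generation.GenerationMixin.generate`] through its `stopping_criteria` argument;
the 5-second budget and the token cap are arbitrary choices:

```python
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    MaxTimeCriteria,
    StoppingCriteriaList,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")

# Stop generating once 5 seconds have elapsed, whatever length has been reached
stopping_criteria = StoppingCriteriaList([MaxTimeCriteria(max_time=5.0)])
outputs = model.generate(**inputs, stopping_criteria=stopping_criteria, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```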
|
|
|
[[autodoc]] StoppingCriteria
    - __call__

[[autodoc]] StoppingCriteriaList
    - __call__

[[autodoc]] MaxLengthCriteria
    - __call__

[[autodoc]] MaxTimeCriteria
    - __call__
|
|
|
## Constraints |
|
|
|
A [`Constraint`] can be used to force the generation to include specific tokens or sequences in the output. |
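
For example, here is a minimal sketch that forces the phrase `" very happy"` into the output with a
[`PhrasalConstraint`]; constrained generation runs under beam search, so `num_beams` must be greater than 1 (the
phrase and the number of beams are arbitrary choices):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer, PhrasalConstraint

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")

# Token ids of the phrase that must appear somewhere in the generated text
force_ids = tokenizer(" very happy", add_special_tokens=False).input_ids

outputs = model.generate(**inputs, constraints=[PhrasalConstraint(force_ids)], num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```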
|
|
|
[[autodoc]] Constraint |
|
|
|
[[autodoc]] PhrasalConstraint |
|
|
|
[[autodoc]] DisjunctiveConstraint |
|
|
|
[[autodoc]] ConstraintListState |
|
|
|
## BeamSearch |
|
|
|
[[autodoc]] BeamScorer
    - process
    - finalize

[[autodoc]] BeamSearchScorer
    - process
    - finalize

[[autodoc]] ConstrainedBeamSearchScorer
    - process
    - finalize
|
|
|
## Utilities |
|
|
|
[[autodoc]] top_k_top_p_filtering |
|
|
|
[[autodoc]] tf_top_k_top_p_filtering |
|
|