Input text size
I noticed that example 4 has more than 2,000 words, but when I tried the summarization pipeline myself, I ran into the error below:
"Token indices sequence length is longer than the specified maximum sequence length for this model (2114 > 1024). Running this sequence through the model will result in indexing errors."
Is there a parameter I need to pass in specifically?
Try passing the argument truncation=True when calling the pipeline.
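Something like this (a minimal sketch; the checkpoint name and input variable are placeholders, not necessarily what example 4 uses):

```python
from transformers import pipeline

# Assumed checkpoint with the 1024-token context the error mentions.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

long_text = "..."  # the 2,000+ word article from example 4

# truncation=True makes the tokenizer drop everything past the model's
# maximum length instead of raising indexing errors.
result = summarizer(long_text, truncation=True)
print(result[0]["summary_text"])
```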
When we enable truncation, does it summarize the whole text, or does it stop at the maximum length?
From what I can understand, truncation=True will truncate the input to the model's context length. Based on the error above, I suspect this is 1024 tokens, so anything past that is simply not seen by the model.
You need to split the text into chunks of at most 1024 tokens (this model's limit is 1024), summarize each chunk, and then concatenate the results to get the full summary. If you want, you can pass the final concatenated result back to the model to get a summary of the summaries.
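A rough sketch of that approach, assuming a Transformers summarization pipeline and a BART-style checkpoint with a 1024-token limit (the model name and chunk size are assumptions, adjust for your model):

```python
from transformers import AutoTokenizer, pipeline

# Assumed checkpoint; any seq2seq summarization model works, but the
# chunk size below must match its context length.
model_name = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(model_name)
summarizer = pipeline("summarization", model=model_name, tokenizer=tokenizer)

def chunk_text(text, max_tokens=1000):
    """Split text into pieces of at most max_tokens tokens.

    Some headroom below 1024 is left for the special tokens the
    model adds around each input.
    """
    ids = tokenizer.encode(text, add_special_tokens=False)
    for start in range(0, len(ids), max_tokens):
        yield tokenizer.decode(ids[start:start + max_tokens])

def summarize_long(text):
    # Summarize each chunk, then join the partial summaries.
    partials = [
        summarizer(chunk, truncation=True)[0]["summary_text"]
        for chunk in chunk_text(text)
    ]
    combined = " ".join(partials)
    # Optional second pass: a summary of the summaries. truncation=True
    # guards against the combined text itself exceeding the limit.
    return summarizer(combined, truncation=True)[0]["summary_text"]

print(summarize_long("..."))  # replace "..." with the long article
```

Note that a fixed token split can cut a sentence in half at a chunk boundary, so splitting on sentence or paragraph boundaries first usually gives cleaner partial summaries.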