corrected 'generate' demo code
changed 'prompt' to 'messages' to correct generation error.
added explicit device assertion to alleviate this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
added eos token to prevent open ended generation
added a print statement so the user can read the generated content.
Hi Tim, thanks for this PR! LGTM, but can we keep the device_map="auto"
instead? 🤗
Absolutely, it’s your project. I just had some issues with it but if other people need to they can just change the device on their own like I did.
No worries at all, I’ll try to reproduce to see if I also have the same issue, thanks for reporting! Coming back to this on Monday, but thanks a lot for the comments and the PR 🫶🏻
sounds good, have a nice weekend!
Hi @macadeliccc , I've reproduced in a Google Colab, feel free to use the code and output from there 🤗 Also to add the output so that users know what's the expected output
See https://colab.research.google.com/drive/1wHgYZQUtomhqFZHfTVFaDMbR8g8MjjkA?usp=sharing, thanks for the issue!