Script?
Could you please share the training script?
I second that request. Is it anything like the method used for Q-bert/Mamba-370M? Or was this done with shell commands? A Python solution would be awesome. I used this to make summaries for some emails and calls with no training, and it works pretty well for generation.
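For context, running it zero-shot doesn't need a training script at all. A minimal sketch of the kind of thing I mean, using the mamba-ssm package with the state-spaces/mamba-370m checkpoint and the GPT-NeoX tokenizer it was trained with (the prompt is just a placeholder):

```python
# Zero-shot generation sketch with the mamba-ssm package (no fine-tuning).
# Assumes a CUDA runtime; the fused kernels require a GPU.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-370m", device="cuda", dtype=torch.float16)

prompt = "Summarize the following call notes:\n..."  # placeholder prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
out = model.generate(input_ids, max_length=256)  # greedy decoding by default
print(tokenizer.decode(out[0]))
```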
Here's how I'm training mine. It would probably be safe to assume I don't know what I'm doing, but I can finagle some half-coherent answers out of it.
I'm using the 370M state-space model and training it on a random assortment of insurance PDFs covering handbooks, histories, and basic comp. Nothing is organized, but it does train.
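For the data side, a minimal sketch of the kind of prep this implies, assuming pypdf and placeholder paths (my guess at the shape of it, not the actual notebook):

```python
# Dump raw text from a folder of PDFs into a single training file.
from pathlib import Path
from pypdf import PdfReader

pdf_dir = Path("insurance_pdfs")  # hypothetical folder of handbooks, histories, etc.
with open("train.txt", "w", encoding="utf-8") as out:
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        reader = PdfReader(str(pdf_path))
        for page in reader.pages:
            text = page.extract_text()
            if text:  # extract_text() can return an empty string
                out.write(text + "\n")
```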
https://colab.research.google.com/drive/199DTxoqJFRwrsykIbZpuIxVd40RCP-LJ?usp=sharing
@UphamProjects I think you need to change that link's sharing setting to "Anyone Can View".
I've been tinkering with it, so it's still not organized, but it should be accessible now.
Could you please share the training script?
Here. You'll need to change paths and stuff, but this should let you train on Colab.
https://colab.research.google.com/drive/16AKSrMI3jEgXfWObrJalmeypqZSSDySo?usp=sharing
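If you just want the shape of it without opening the notebook, here is a stripped-down sketch of the same idea: a plain next-token-prediction loop with the mamba-ssm package. This is my reconstruction, not the notebook's exact contents; the corpus path, sequence length, and learning rate are placeholders.

```python
# Minimal fine-tuning loop for the original state-spaces checkpoint.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"  # mamba-ssm's fused kernels require a GPU
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-370m", device=device, dtype=torch.float32)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Placeholder corpus: one big text file, tokenized in one go.
ids = tokenizer(open("train.txt", encoding="utf-8").read(),
                return_tensors="pt").input_ids.to(device)

seq_len = 512
model.train()
for start in range(0, ids.size(1) - seq_len - 1, seq_len):
    chunk = ids[:, start : start + seq_len + 1]
    inputs, labels = chunk[:, :-1], chunk[:, 1:]  # shift for next-token targets
    logits = model(inputs).logits
    loss = torch.nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```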
Thanks
Could you train the Hugging Face (transformers) Mamba variant, please?
That's new since I last checked. Yeah, I'll check it out.
https://colab.research.google.com/drive/1HB69O16hFeQwLZdfIiGlqDrdQliY1Sbb?usp=sharing
Training is pretty straightforward; it works out of the box on Colab, as they say on the model page. Just mind the transformers install.
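For reference, this is roughly the shape of the fine-tuning example on the model card (LoRA + SFT via trl). The checkpoint, dataset, and hyperparameters below are illustrative, and trl argument names shift between versions, so treat this as a sketch rather than copy-paste gospel:

```python
from datasets import load_dataset
from trl import SFTTrainer
from peft import LoraConfig
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-370m-hf")
model = AutoModelForCausalLM.from_pretrained("state-spaces/mamba-370m-hf")
dataset = load_dataset("Abirate/english_quotes", split="train")  # toy dataset

training_args = TrainingArguments(
    output_dir="./mamba-370m-ft",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    logging_steps=10,
    learning_rate=2e-3,
)
# LoRA on the Mamba projection layers, as in the model card
lora_config = LoraConfig(
    r=8,
    target_modules=["x_proj", "embeddings", "in_proj", "out_proj"],
    task_type="CAUSAL_LM",
    bias="none",
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    peft_config=lora_config,
    train_dataset=dataset,
    dataset_text_field="quote",
)
trainer.train()
```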
I have never trained a model yet. I will have a look at the Colab though :)
Just a note: use

```
!pip install git+https://github.com/huggingface/transformers@main
!pip install datasets trl peft mamba-ssm "causal-conv1d>=1.2.0"
```

not

```
!pip install git+https://github.com/huggingface/transformers@main
!pip install datasets trl peft
```

(note the quotes around `"causal-conv1d>=1.2.0"`, otherwise the shell treats `>=` as a redirect). The second will still let you train, but without the mamba-ssm and causal-conv1d packages transformers falls back to the slow, unfused path and royally hogs the GPU.
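If you want to confirm which path you're on, a quick check:

```python
# Checks whether the fused CUDA kernels are importable; without them,
# transformers silently falls back to the slow sequential implementation.
try:
    import mamba_ssm      # noqa: F401
    import causal_conv1d  # noqa: F401
    print("Fast Mamba kernels available")
except ImportError as err:
    print(f"Slow path in use: {err}")
```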