Script?
Could you please share the training script?
I second that request. Is it anything like the method used for Q-bert/Mamba-370M? Or was this done with shell commands? A Python solution would be awesome. I used this to make summaries for some emails and calls with no training, and it works pretty well for generation.
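For context, running it zero-shot doesn't need a training script at all. A minimal sketch of the kind of thing I mean, using the mamba-ssm package with the state-spaces/mamba-370m checkpoint and the GPT-NeoX tokenizer it was trained with (the prompt is just a placeholder):

```python
# Zero-shot generation sketch with the mamba-ssm package (no fine-tuning).
# Assumes a CUDA runtime; the fused kernels require a GPU.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-370m", device="cuda", dtype=torch.float16)

prompt = "Summarize the following call notes:\n..."  # placeholder prompt
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
out = model.generate(input_ids, max_length=256)  # greedy decoding by default
print(tokenizer.decode(out[0]))
```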
Here's how I'm training mine. It would probably be safe to assume I don't know what I'm doing, but I can finagle some half-coherent answers out of it.
I'm using the 370M state-space model and training it on a random assortment of insurance PDFs covering handbooks, histories, and basic comp. Nothing is organized, but it does train.
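For the data side, a minimal sketch of the kind of prep this implies, assuming pypdf and placeholder paths (my guess at the shape of it, not the actual notebook):

```python
# Dump raw text from a folder of PDFs into a single training file.
from pathlib import Path
from pypdf import PdfReader

pdf_dir = Path("insurance_pdfs")  # hypothetical folder of handbooks, histories, etc.
with open("train.txt", "w", encoding="utf-8") as out:
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        reader = PdfReader(str(pdf_path))
        for page in reader.pages:
            text = page.extract_text()
            if text:  # extract_text() can return an empty string
                out.write(text + "\n")
```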
https://colab.research.google.com/drive/199DTxoqJFRwrsykIbZpuIxVd40RCP-LJ?usp=sharing
@UphamProjects I think you need to change that link's sharing setting to "Anyone Can View".
I've been tinkering with it, so it's still not organized, but it should be accessible now.
Could you please share the training script?
Here. You'll need to change paths and stuff, but this should let you train on Colab.
https://colab.research.google.com/drive/16AKSrMI3jEgXfWObrJalmeypqZSSDySo?usp=sharing
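If you just want the shape of it without opening the notebook, here is a stripped-down sketch of the same idea: a plain next-token-prediction loop with the mamba-ssm package. This is my reconstruction, not the notebook's exact contents; the corpus path, sequence length, and learning rate are placeholders.

```python
# Minimal fine-tuning loop for the original state-spaces checkpoint.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"  # mamba-ssm's fused kernels require a GPU
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-370m", device=device, dtype=torch.float32)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Placeholder corpus: one big text file, tokenized in one go.
ids = tokenizer(open("train.txt", encoding="utf-8").read(),
                return_tensors="pt").input_ids.to(device)

seq_len = 512
model.train()
for start in range(0, ids.size(1) - seq_len - 1, seq_len):
    chunk = ids[:, start : start + seq_len + 1]
    inputs, labels = chunk[:, :-1], chunk[:, 1:]  # shift for next-token targets
    logits = model(inputs).logits
    loss = torch.nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```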
Thanks
Could you train the Hugging Face (transformers) Mamba variant, please?
That's new since I last checked. Yeah, I'll check it out.
https://colab.research.google.com/drive/1HB69O16hFeQwLZdfIiGlqDrdQliY1Sbb?usp=sharing
Training is pretty straightforward; it works out of the box on Colab, as they say on the model page. Just mind the transformers install.
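For reference, this is roughly the shape of the fine-tuning example on the model card (LoRA + SFT via trl). The checkpoint, dataset, and hyperparameters below are illustrative, and trl argument names shift between versions, so treat this as a sketch rather than copy-paste gospel:

```python
from datasets import load_dataset
from trl import SFTTrainer
from peft import LoraConfig
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-370m-hf")
model = AutoModelForCausalLM.from_pretrained("state-spaces/mamba-370m-hf")
dataset = load_dataset("Abirate/english_quotes", split="train")  # toy dataset

training_args = TrainingArguments(
    output_dir="./mamba-370m-ft",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    logging_steps=10,
    learning_rate=2e-3,
)
# LoRA on the Mamba projection layers, as in the model card
lora_config = LoraConfig(
    r=8,
    target_modules=["x_proj", "embeddings", "in_proj", "out_proj"],
    task_type="CAUSAL_LM",
    bias="none",
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    peft_config=lora_config,
    train_dataset=dataset,
    dataset_text_field="quote",
)
trainer.train()
```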
I have never trained a model yet. I will have a look at the Colab though :)
Just a note: use

```
!pip install git+https://github.com/huggingface/transformers@main
!pip install datasets trl peft mamba-ssm "causal-conv1d>=1.2.0"
```

not

```
!pip install git+https://github.com/huggingface/transformers@main
!pip install datasets trl peft
```

(note the quotes around `"causal-conv1d>=1.2.0"`, otherwise the shell treats `>=` as a redirect). The second will still let you train, but without the mamba-ssm and causal-conv1d packages transformers falls back to the slow, unfused path and royally hogs the GPU.
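If you want to confirm which path you're on, a quick check:

```python
# Checks whether the fused CUDA kernels are importable; without them,
# transformers silently falls back to the slow sequential implementation.
try:
    import mamba_ssm      # noqa: F401
    import causal_conv1d  # noqa: F401
    print("Fast Mamba kernels available")
except ImportError as err:
    print(f"Slow path in use: {err}")
```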