Any plans to share Sharded Models?
Are there any plans to share sharded models? That might make it easier to fit this on Colab.
Hi, thank you for the feedback!
The model is already sharded into 2 parts now.
Any specific shards you are looking for? Can you provide an example? Cheers!
@yinsong1986 Usually, for Google Colab, people use models sharded into pieces of roughly 2-3 GB each. The free Google Colab tier has very tight system memory constraints (usually 12 GB of system RAM), and smaller shards make it easier to load the model into system memory before it is moved to VRAM.
So you'd end up with maybe 5-10 actual model weight shards in the end rather than 2. Just wanted to further elaborate for you.
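To make the shard arithmetic above concrete, here is a minimal sketch. It assumes a checkpoint of about 14.5 GB (a guess for a 7B-parameter model stored in 16-bit precision; the thread does not state the size) and uses the `max_shard_size` argument of `save_pretrained` in `transformers` to re-shard. The helper names and the output directory are hypothetical.

```python
import math


def estimate_shard_count(total_gb: float, max_shard_gb: float) -> int:
    """Rough estimate of how many shards a checkpoint splits into."""
    # Tensors are not split across shards, so the real count can be a bit higher.
    return math.ceil(total_gb / max_shard_gb)


def reshard(repo_id: str, out_dir: str, max_shard_size: str = "2GB") -> None:
    """Load a checkpoint and save it again with smaller weight shards."""
    # Imported lazily so the estimate above works without transformers installed.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    # max_shard_size caps the size of each saved weight file.
    model.save_pretrained(out_dir, max_shard_size=max_shard_size)


if __name__ == "__main__":
    assumed_size_gb = 14.5  # assumption, not a figure from this thread
    for shard_gb in (2, 3):
        print(shard_gb, "GB shards:", estimate_shard_count(assumed_size_gb, shard_gb))
```

Re-sharding this way has to happen on a machine with enough memory to hold the full model, e.g. `reshard("amazon/MistralLite", "./mistrallite-small-shards")`; the resulting folder can then be uploaded for low-memory users.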
@1littlecoder Love your videos btw! Big fan ♥
Thanks very much. Very kind of you!
Thanks for your explanation! @rombodawg @1littlecoder
If we further shard the model into smaller pieces, which option would be easier for you?
- Option 1: replace the shards in this model repo with smaller shards.
- Option 2: create a new model repo and upload the same model with more shards there.
Thank you!
I don't know what 1littlecoder thinks, but I highly recommend uploading a new version of the model, calling it a (sharded) version, and keeping the original model as well. Some users prefer sharded models, while others prefer having fewer model files to download.
That's just my two cents.
You can also upload different revisions of the model in this one repo. TheBloke does this extensively with his different GPTQ combinations.
Thanks @ssmi153 for your suggestion!
To the best of my understanding, TheBloke uploads the different GPTQ variants of a model to one repo so that they are easier to tell apart.
But the request here is, AFAIK, a bit different. The current model is already split into 2 shards. If we upload another sharded version of the same model, say with 10 shards, might that confuse libraries like HF transformers about which model files to read? Please correct me if I am wrong, or if you are referring to some other solution. Thank you!
@yinsong1986, the Revision option effectively makes a branch of the repo, so the files are kept separate. By default, users receive the files in the "main" branch, but they can also request to pull from one of the other branches (e.g. you could create one called "smallshards") instead. In reality, if it's easier just to create another repo, then you may as well do that :) I was just letting you know that this was an option.
Thanks for all your feedback!
@1littlecoder @rombodawg @ssmi153
Now I have uploaded the model with smaller shards in a new branch https://huggingface.co/amazon/MistralLite/tree/small-shards
You should be able to load the model with smaller shards from that branch. Please give it a try and let me know how it goes, and thank you!
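For anyone trying this, a minimal sketch of loading from that branch with `transformers` might look like the following. The `revision` argument selects the branch; `torch_dtype="auto"` and `low_cpu_mem_usage=True` (which needs the `accelerate` package) are my own assumptions for keeping system RAM usage down, not settings confirmed in this thread. The helper names are hypothetical.

```python
def small_shard_kwargs(revision: str = "small-shards") -> dict:
    """Keyword arguments for from_pretrained that target the small-shards branch."""
    return {
        "revision": revision,        # branch name from the URL above
        "torch_dtype": "auto",       # assumption: keep the checkpoint's own dtype
        "low_cpu_mem_usage": True,   # assumption: requires accelerate to be installed
    }


def load_small_shards(repo_id: str = "amazon/MistralLite"):
    """Download and load the tokenizer and model from the small-shards branch."""
    # Imported lazily so the kwargs helper can be inspected without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    kwargs = small_shard_kwargs()
    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=kwargs["revision"])
    model = AutoModelForCausalLM.from_pretrained(repo_id, **kwargs)
    return tokenizer, model
```

Calling `load_small_shards()` downloads the weights, so it is best run in the Colab session itself rather than locally first.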