Leonard Püttmann PRO

puettmann

AI & ML interests

None yet

Recent Activity

reacted to anakin87's post with 👍 3 days ago
๐๐ž๐ฐ ๐ˆ๐ญ๐š๐ฅ๐ข๐š๐ง ๐’๐ฆ๐š๐ฅ๐ฅ ๐‹๐š๐ง๐ ๐ฎ๐š๐ ๐ž ๐Œ๐จ๐๐ž๐ฅ๐ฌ: ๐†๐ž๐ฆ๐ฆ๐š ๐๐ž๐จ๐ ๐ž๐ง๐ž๐ฌ๐ข๐ฌ ๐œ๐จ๐ฅ๐ฅ๐ž๐œ๐ญ๐ข๐จ๐ง ๐Ÿ’Ž๐ŸŒ๐Ÿ‡ฎ๐Ÿ‡น I am happy to release two new language models for the Italian Language! ๐Ÿ’ช Gemma 2 9B Neogenesis ITA https://huggingface.co/anakin87/gemma-2-9b-neogenesis-ita Building on the impressive work by VAGO Solutions, I applied Direct Preference Optimization with a mix of Italian and English data. Using Spectrum, I trained 20% of model layers. ๐Ÿ“Š Evaluated on the Open ITA LLM leaderboard (https://huggingface.co/spaces/mii-llm/open_ita_llm_leaderboard), this model achieves strong performance. To beat it on this benchmark, you'd need a 27B model ๐Ÿ˜Ž ๐Ÿค Gemma 2 2B Neogenesis ITA https://huggingface.co/anakin87/gemma-2-2b-neogenesis-ita This smaller variant is fine-tuned from the original Gemma 2 2B it by Google. Through a combination of Supervised Fine-Tuning and Direct Preference Optimization, I trained 25% of the layers using Spectrum. ๐Ÿ“ˆ Compared to the original model, it shows improved Italian proficiency, good for its small size. Both models were developed during the recent #gemma competition on Kaggle. ๐Ÿ““ Training code: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond ๐Ÿ™ Thanks @FinancialSupport and mii-llm for the help during evaluation.
liked a model 7 days ago
anakin87/gemma-2-2b-neogenesis-ita
liked a model 7 days ago
anakin87/gemma-2-9b-neogenesis-ita

Organizations

Kern AI GmbH