Japanese SimCSE Collection Tsukagoshi et al., Japanese SimCSE Technical Report, arXiv 2023. https://arxiv.org/abs/2310.19349 • 5 items • Updated 16 days ago • 2
youko Collection The youko model series are based on the llama3 series and have been continually pre-trained on Japanese-specific corpora. • 9 items • Updated Jul 25 • 1
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗 • 9 items • Updated Jul 24 • 47
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs Paper • 2407.03963 • Published Jul 4 • 15
Llama-3-ELYZA-JP Collection Llama-3 models augmented for Japanese usage • 6 items • Updated Aug 14 • 8
Article An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct By leonardlin • Jun 11 • 45
Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare Apr 19 • 99
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation Paper • 2402.11485 • Published Feb 18 • 1
Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training Paper • 2404.10555 • Published Apr 16 • 2
Pretraining and Updating Language- and Domain-specific Large Language Model: A Case Study in Japanese Business Domain Paper • 2404.08262 • Published Apr 12 • 1
Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities Paper • 2404.17790 • Published Apr 27 • 5
JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning Paper • 2310.10083 • Published Oct 16, 2023 • 2
JaColBERT and Hard Negatives, Towards Better Japanese-First Embeddings for Retrieval: Early Technical Report Paper • 2312.16144 • Published Dec 26, 2023 • 3
Karasu Collection The models trained under our Karasu and Qarasu project • 9 items • Updated 3 days ago • 1
NTQ AI LM Collection A collection of finely tuned Language Models (LLMs) across diverse datasets. • 3 items • Updated Apr 24 • 1
ELYZA-japanese-CodeLlama-7b Collection CodeLlama models augmented for Japanese usage • 3 items • Updated Aug 14 • 2
ELYZA-japanese-Llama-2-13b Collection 13b Llama-2 models augmented for Japanese usage • 5 items • Updated Aug 14 • 5
ELYZA-japanese-Llama-2-7b Collection 7b Llama-2 models augmented for Japanese usage • 6 items • Updated Aug 14 • 4
nekomata Collection The nekomata model series are based on the Qwen series and have been continually pre-trained on Japanese-specific corpora. • 8 items • Updated Jul 25 • 5
Japanese Multimodal Models Collection Suite of multimodal models focusing on Japan/Japanese-related usage • 4 items • Updated Apr 8 • 7
Japanese Stable LM Collection Suite of LLMs focusing on Japanese usage • 15 items • Updated May 7 • 16
youri Collection The youri model series are based on the llama2 series and have been continually pre-trained on Japanese-specific corpora. • 6 items • Updated Jul 25 • 1
bilingual-gpt-neox-4b Collection The bilingual-gpt-neox-4b series are pre-trained from scratch on a mixture of Japanese and English corpora. • 5 items • Updated Jul 25 • 1
japanese-gpt-neox-3.6b Collection The japanese-gpt-neox-3.6b series are pre-trained from scratch on Japanese corpora. • 5 items • Updated Jul 25 • 2