Xuan Son NGUYEN (ngxson)
91 followers · 30 following
https://blog.ngxson.com · ngxson.hf.co
AI & ML interests: Doing AI for fun, not for profit
Recent Activity
updated a model about 6 hours ago: ngxson/Qwen2.5-7B-Instruct-1M-Q4_K_M-GGUF
published a model about 6 hours ago: ngxson/Qwen2.5-7B-Instruct-1M-Q4_K_M-GGUF
reacted to mitkox's post with 🚀 1 day ago:
llama.cpp is 26.8% faster than ollama. I have upgraded both, and using the same settings, I am running the same DeepSeek R1 Distill 1.5B on the same hardware. It's an apples-to-apples comparison.

Total duration:
  llama.cpp: 6.85 sec <- 26.8% faster
  ollama: 8.69 sec

Breakdown by phase:

Model loading:
  llama.cpp: 241 ms <- 2x faster
  ollama: 553 ms

Prompt processing:
  llama.cpp: 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
  ollama: 42.17 tokens/s with an eval time of 498 ms

Token generation:
  llama.cpp: 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
  ollama: 122.07 tokens/s with an eval time of 7.64 sec

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing. Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
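The headline percentages in the post follow from the raw timings it reports. A minimal sketch of the arithmetic (the numbers are from the post; the `pct_faster` helper name is mine, not from any benchmark tool, and the small rounding gap vs. the quoted 26.8% presumably comes from unrounded source timings):

```python
def pct_faster(slow_time: float, fast_time: float) -> float:
    """Percent speedup of fast_time over slow_time (lower time is better)."""
    return (slow_time - fast_time) / fast_time * 100

# Total duration (seconds): llama.cpp 6.85 vs ollama 8.69
print(round(pct_faster(8.69, 6.85), 1))            # 26.9 (post says 26.8%)

# Model loading (ms, lower is better): 241 vs 553
print(round(553 / 241, 1))                         # 2.3 ("2x faster")

# Prompt processing (tokens/s, higher is better): 416.04 vs 42.17
print(round(416.04 / 42.17, 1))                    # 9.9 ("10x faster")

# Token generation (tokens/s, higher is better): 137.79 vs 122.07
print(round((137.79 - 122.07) / 122.07 * 100, 1))  # 12.9 ("13% faster")
```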
Articles
Introducing GGUF-my-LoRA · Nov 1, 2024 · 13
Code a simple RAG from scratch · Oct 29, 2024 · 18
Introduction to ggml · Aug 13, 2024 · 132
ngxson's activity
New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-32B · 4 days ago
  Tokenizer config is wrong (7) · #10 opened 5 days ago by stoshniwal
New activity in 5CD-AI/Vintern-1B-v3_5 · 12 days ago
  Deployment as server? (11) · #1 opened 13 days ago by ngxson
New activity in 5CD-AI/Viet-Doc-VQA-verIII · 16 days ago
  🚩 Report: Not working (3) · #1 opened 17 days ago by khang119966
New activity in ngxson/MiniThinky-dataset · 18 days ago
  Librarian Bot: Add language metadata for dataset · #2 opened 19 days ago by librarian-bot
New activity in ngxson/MiniThinky-1B-Llama-3.2 · 18 days ago
  Update README.md (1) · #2 opened 18 days ago by Xenova
New activity in bartowski/QVQ-72B-Preview-GGUF · 19 days ago
  Add system message (1) · #7 opened 19 days ago by ngxson
  Ollama upload please. (15) · #2 opened about 1 month ago by AlgorithmicKing
New activity in ngxson/MiniThinky-v2-1B-Llama-3.2 · 19 days ago
  Upload folder using huggingface_hub (1) · #1 opened 19 days ago by Xenova
New activity in ngxson/MiniThinky-1B-Llama-3.2 · 21 days ago
  Upload folder using huggingface_hub · #1 opened 21 days ago by Xenova
New activity in ggml-org/gguf-my-repo · 25 days ago
  Update app.py (1) · #144 opened 27 days ago by gghfez
New activity in ggml-org/gguf-my-repo · about 2 months ago
  Accessing own private repos (2) · #141 opened about 2 months ago by themex1380
  [Errno 2] No such file or directory: './llama.cpp/llama-quantize' (11) · #140 opened about 2 months ago by AlirezaF138
New activity in ggml-org/gguf-my-repo · 2 months ago
  Error quantizing: b'/bin/sh: 1: ./llama.cpp/llama-quantize: not found\n' (6) · #136 opened 2 months ago by win10
  Better isolation + various improvements (3) · #133 opened 3 months ago by ngxson
New activity in ggml-org/gguf-my-repo · 3 months ago
  update readme for card generation (4) · #128 opened 3 months ago by ariG23498
  Error converting to fp16: b'INFO:hf-to-gguf:Loading model: qwen2.5-3b (1) · #135 opened 3 months ago by nanowell
  Qwen2.5-3B: [Errno 2] No such file or directory: 'downloads/tmpg0g5sjvl' (1) · #134 opened 3 months ago by nanowell
  add docker compose for dev locally (1) · #130 opened 3 months ago by ngxson
  Add F16 and BF16 quantization (1) · #129 opened 3 months ago by andito
  Update app.py (2) · #132 opened 3 months ago by velyan