All HF Hub posts

fantaxy
posted an update 1 day ago
📚 AI Graphic Novel Generator Suite 2025

🎯 Four Unique Genre Experiences

🗡️ Martial Arts Novel Generator
fantaxy/novel-sorim-en

Epic wuxia storytelling with real-time combat art
Traditional martial arts world visualization
Dynamic qi techniques in motion
Beautiful Eastern art style generation

💖 Romance Novel Generator
fantaxy/novel-romance-en

Contemporary romance with matching scenes
Emotional moment captures in art
Modern relationship visualization
Real-time romantic illustrations

🐉 Fantasy Novel Generator
fantaxy/novel-fantasy-en

Rich fantasy worlds come alive
Magical scenes in stunning detail
Epic quests visualized instantly
Dynamic fantasy art generation

🔒 Adult Novel Generator
fantaxy/novel-NSFW-en

Mature content with tasteful art (18+)
Modern scene visualization
Character-focused illustrations
Sophisticated mood settings

⚡ Core Features

7000+ token story generation
Real-time text-to-art creation
Auto scene illustration
Continuous story flow
Dynamic image gallery
HD quality (768x768)

🛠️ Technical Highlights

Advanced Flux image generation
Story-driven art creation
Genre-optimized visuals
Seamless integration
Instant visualization

#AINovel #GraphicNovel #StoryGeneration #HuggingFace
aiqcamp
posted an update 1 day ago
Chat with Gemini 2.0 Flash and See Its Thoughts! 🤖💭

Experience the future of AI interaction with this innovative demo featuring Google's Gemini 2.0 Flash model. Watch in real time as the AI reveals its thought process before delivering responses! 🎯

✨ Key Features

Transparent AI Thinking: Observe the model's reasoning process with "⚙️ Thinking" indicators
Real-time Streaming: Smooth, natural conversation flow with immediate responses
Conversation History: Multi-turn dialogue support for context-aware interactions
Clean Interface: Markdown support with intuitive chat layout
Mobile-Friendly: Responsive design for access on any device

🛠️ Technical Highlights

Powered by Google's latest Gemini 2.0 Flash model
Built with Gradio for seamless UI/UX
Streaming architecture for responsive interactions
Error handling for stable performance
Customizable themes with Soft theme integration
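
As a rough illustration of the streaming pattern listed above: a chat handler written as a Python generator can yield the message as it grows, showing the thinking text first and the answer after it. Gradio chat callbacks accept generators of this kind, but the Space's actual code is not shown in the post, so `stream_reply`, the label, and the chunk lists below are hypothetical stand-ins for the model's streamed output.

```python
from typing import Iterable, Iterator

# Hypothetical label; the demo itself shows a "Thinking" indicator.
THINKING_LABEL = "Thinking"


def stream_reply(thought_chunks: Iterable[str], answer_chunks: Iterable[str]) -> Iterator[str]:
    """Yield the full message so far each time a new chunk arrives."""
    text = f"[{THINKING_LABEL}] "
    for chunk in thought_chunks:   # reasoning text arrives first
        text += chunk
        yield text
    text += "\n\n"                 # visual break before the final answer
    for chunk in answer_chunks:    # then the answer streams in
        text += chunk
        yield text


if __name__ == "__main__":
    # Stand-in chunks; a real app would read these from the model's streaming API.
    for partial in stream_reply(["2 + 2 ", "is 4."], ["The answer ", "is 4."]):
        print(repr(partial))
```

Yielding the cumulative text, rather than only the newest chunk, lets the UI simply replace the displayed message on every update.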

💡 Perfect Use Cases

Education: Watch AI reasoning in action
Research: Study AI thought patterns
Development: Understand model behavior
Exploration: Test various prompts and scenarios

Try it now: aiqcamp/Gemini2-Flash-Thinking

🎮 Getting Started

Enter your message or select an example prompt
Watch the model's thought process unfold
Receive detailed, contextual responses
Use "Clear Chat" to start fresh

#MachineLearning #AI #Gemini #GoogleAI #NLP #AIResearch #DeepLearning
openfree
posted an update 2 days ago
🐸 Pepe Meme Generator

Hello to everyone who loves frog memes! Now you can generate fun images of Pepe in various scenarios. By using the DiffusionPipeline from Hugging Face and LoRA (low-rank adaptation, a lightweight way to fine-tune a large model toward a specific style by training a small set of extra weights), you can easily create Pepe characters.

🍀 Model & Space Links
Model Link:
openfree/pepe

Space Link:
openfree/pepe

The model card includes LoRA weights related to the Pepe character, allowing you to easily create meme-style images.
On the Space page, you can generate Pepe images right away via the web UI without writing extra code!

⭐ Main Features

Meme-Style Pepe Images

Enter prompts like "Pepe dancing excitedly" or "Pepe busking in the streets of New York," and an image is generated automatically.
From comical, cartoon-like memes to a somewhat serious(?) Pepe, you can achieve a wide variety of styles.

LoRA Scale Adjustment

Change the LoRA scale parameter to fine-tune how strongly the Pepe style is applied.
A value closer to 0 yields a more generic image, while a value closer to 1 results in a strongly cartoon-like Pepe appearance.

Advanced Settings

Modify the Height and Width to generate vertical or horizontal images of different aspect ratios.
Adjust Guidance scale and Inference steps to get the exact level of detail and artistic style you want.

Seed Configuration

Choose a fixed seed or a random seed so that images are either reproducible or new every time.
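
To make the LoRA scale setting concrete, here is a toy sketch in plain Python. It assumes the usual LoRA formulation, where the adapted weight is the base weight plus a scaled low-rank update, W' = W + scale * (B x A), with any alpha/rank factor folded into `scale`. The tiny matrices and the `apply_lora` helper are made up for illustration; a real pipeline applies the same idea across many layers, and this is not the Space's code.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base: Matrix, lora_b: Matrix, lora_a: Matrix, scale: float) -> Matrix:
    """Return base + scale * (lora_b @ lora_a)."""
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as base
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Illustrative values only: a 2x2 base weight and a rank-1 update.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]      # shape 2x1
lora_a = [[0.5, -0.5]]       # shape 1x2

if __name__ == "__main__":
    for scale in (0.0, 0.5, 1.0):
        print(scale, apply_lora(base, lora_b, lora_a, scale))
```

At scale 0 the base weights come back unchanged, which is why the image looks generic; at scale 1 the full style update is applied.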
🚀 Usage Ideas

Social Media Meme Creation

Quickly make fun Pepe images for Twitter or Instagram Stories.
Perfect for events, birthdays, or any special occasion memes!

Fan Art & Merch Design

Use generated images as references for Pepe fan art, or draft designs for merchandise (stickers, T-shirts, etc.).

Blog & Community Posts

Spice up your blog articles or community posts with meme images.
Set up humorous scenarios featuring Pepe as an entertaining "reaction image."
merve
posted an update 1 day ago
Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM, the tiniest VLMs, which come in 256M and 500M sizes, along with its retrieval models, ColSmol, for multimodal RAG 💗
- UI-TARS are new models by ByteDance that unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released MiniMax-VL-01, whose decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a challenging new multimodal benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 671B reasoning models by DeepSeek, plus six distilled dense models, on par with o1 and released under the MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, a new family of models, along with their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images
lewtun
posted an update about 20 hours ago
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret, we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
m-ric
posted an update 1 day ago
Today we make the biggest release in smolagents so far: we enable vision models, which makes it possible to build powerful web browsing agents! 🥳

Our agents can now casually open up a web browser, and navigate on it by scrolling, clicking elements on the webpage, going back, just like a user would.

The demo below shows Claude-3.5-Sonnet browsing GitHub for the task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne!

Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator, which beat us by one day)

For more detail, read our announcement blog 👉 https://huggingface.co/blog/smolagents-can-see
The code for the web browser example is here 👉 https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py
mitkox
posted an update 2 days ago
llama.cpp is 26.8% faster than ollama.
I upgraded both and ran the same DeepSeek R1 Distill 1.5B model with the same settings on the same hardware. It's an apples-to-apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time of 7.64 sec
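
The percentages above can be re-derived from the raw figures. The sketch below is only a sanity check on the posted numbers; the helper names are mine, and "faster" is taken to mean the ratio of the slower figure to the faster one.

```python
def percent_faster(slow_seconds: float, fast_seconds: float) -> float:
    """Extra time the slow run needs, as a percentage of the fast run's time."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


def rate_ratio(fast_rate: float, slow_rate: float) -> float:
    """Throughput ratio, e.g. tokens/s of one tool over the other."""
    return fast_rate / slow_rate


# Figures copied from the post (llama.cpp vs. ollama).
total = percent_faster(8.69, 6.85)                        # total duration, seconds
loading = 553 / 241                                       # model loading, ms
prompt = rate_ratio(416.04, 42.17)                        # prompt processing, tokens/s
generation = (rate_ratio(137.79, 122.07) - 1.0) * 100.0   # token generation, tokens/s

if __name__ == "__main__":
    print(f"total duration:    {total:.1f}% faster")
    print(f"model loading:     {loading:.1f}x faster")
    print(f"prompt processing: {prompt:.1f}x faster")
    print(f"token generation:  {generation:.1f}% faster")
```

With the rounded figures from the post this gives roughly 26.9% for total duration, 2.3x for loading, 9.9x for prompt processing, and 12.9% for generation, which matches the posted claims to within rounding.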

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
onekq
posted an update 1 day ago
So 🐋DeepSeek🐋 hits the mainstream media. But it has been a star in our little cult for at least 6 months. Its meteoric success did not come overnight; it was two years in the making.

To learn their history, just look at their 🤗 repo https://huggingface.co/deepseek-ai

* End of 2023, they launched their first model (pretrained by themselves), following the Llama 2 architecture
* June 2024, v2 (MoE architecture) surpassed Gemini 1.5, but trailed Mistral
* September, v2.5 surpassed GPT-4o mini
* December, v3 surpassed GPT-4o
* Now R1 surpassed o1

Most importantly, if you think DeepSeek's success is singular and unrivaled, that's WRONG. The following models are also near or at the o1 bar.

* MiniMax-01
* Kimi k1.5
* Doubao 1.5 pro
clem
posted an update 1 day ago
haritzpuerto
posted an update 2 days ago
I'm excited to announce that my internship paper at Parameter Lab was accepted to Findings of #NAACL2025 🎉
TLDR: Determining whether an LLM was trained on a single sentence might not be possible 😥, but it is possible for large enough amounts of text, such as long documents or collections of documents! 🤯
Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models (2411.00154)
🔗 https://github.com/parameterlab/mia-scaling