John6666's activity

reacted to Elizezen's post with 👀 about 3 hours ago
It turns out that the following simple method is actually effective when you want to increase the appearance probability of just one token, or a very limited set of tokens.

one_token = "♡"  # token whose appearance probability you want to increase
value = 1000000  # number of repetitions to write

# Build one long run of the repeated token and save it as a training file
token = one_token * value

with open("one-token.txt", "w", encoding="utf-8") as f:
    f.write(token)


By training a LoRA with Unsloth on the .txt file generated by the code above, you can increase the appearance probability of specific tokens while maintaining the model's performance to a great extent. However, it's better to stop training before the train loss reaches 0.0, as the model will otherwise start spamming the token as soon as it appears even once. In general, you can stop training at a very early stage and it will still work.

It is also possible to reduce the appearance probability of specific tokens: create an over-trained LoRA on the tokens you want to suppress, merge it into the model, extract only the difference from the base model using the chat vector method, and then subtract that vector from an arbitrary model.

In this case, it is better to scale the chat vector by a factor of about five. Apart from the specific tokens, this has very little effect on overall performance.

new_v = v - (5.0 * chat_vector[i].to(v.device))
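For illustration, here is a minimal sketch of that whole procedure, with plain Python lists standing in for real model tensors (all names and numbers below are illustrative, not taken from the original post):

```python
# Sketch of the chat-vector method described above. Plain Python lists stand
# in for model tensors; names and numbers are illustrative only.

def build_chat_vector(merged_weights, base_weights):
    # The chat vector is the difference between the model with the
    # over-trained LoRA merged in and the original base model.
    return [m - b for m, b in zip(merged_weights, base_weights)]

def apply_chat_vector(weights, chat_vector, ratio=5.0):
    # new_v = v - ratio * chat_vector, element-wise; the post suggests
    # a ratio of about five.
    return [v - ratio * cv for v, cv in zip(weights, chat_vector)]

base   = [0.10, -0.20, 0.30]   # base model (one flattened parameter group)
merged = [0.15, -0.15, 0.30]   # base + over-trained LoRA merged in

cv = build_chat_vector(merged, base)            # approximately [0.05, 0.05, 0.0]
target = apply_chat_vector([0.50, 0.40, -0.10], cv)
print(target)
```

In a real model you would loop this over every matching key in the two state dicts, which is what the single `new_v` line above is doing for one parameter `i`.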
reacted to AdinaY's post with 🔥 about 7 hours ago
QvQ-72B-Preview 🎄 an open-weight model for visual reasoning, just released by the Alibaba Qwen team
Qwen/qvq-676448c820912236342b9888
✨ Combines visual understanding & language reasoning.
✨ Scores 70.3 on MMMU
✨ Outperforms Qwen2-VL-72B-Instruct in complex problem-solving
reacted to merve's post with 👍 about 7 hours ago
reacted to as-cle-bert's post with ❤️ about 7 hours ago
Hi HuggingFacers! 🤶🏼

As my last 2024 project, I've dropped a Discord bot that knows a lot about Pokémon 🦋

GitHub 👉 https://github.com/AstraBert/Pokemon-Bot
Demo Space 👉 as-cle-bert/pokemon-bot

The bot integrates:
- Chat features (Cohere's Command-R) with RAG functionalities (hybrid search and reranking with Qdrant) and chat memory (managed through PostgreSQL) to produce information about Pokémon
- Image-based search to identify Pokémon from their images (via Qdrant)
- Card package random extraction and description

HuggingFace 🤗, as usual, plays the most important role in the application stack, with the following models:

- sentence-transformers/LaBSE
- prithivida/Splade_PP_en_v1
- facebook/dinov2-large

And datasets:

- Karbo31881/Pokemon_images
- wanghaofan/pokemon-wiki-captions
- TheFusion21/PokemonCards

Have fun! 🍕
reacted to MonsterMMORPG's post with 👍 about 7 hours ago
CogVideoX1.5-5B-I2V, the best open-source image-to-video model, is pretty decent and optimized for low-VRAM machines at high resolution. The native resolution is 1360px, it generates up to 10 seconds (161 frames), and the audio was generated with a new open-source audio model.

Full YouTube tutorial for CogVideoX1.5-5B-I2V : https://youtu.be/5UCkMzP2VLE

1-Click Windows, RunPod and Massed Compute installers (installs into a Python 3.11 VENV): https://www.patreon.com/posts/112848192

Official Hugging Face repo of CogVideoX1.5-5B-I2V : THUDM/CogVideoX1.5-5B-I2V

Official github repo : https://github.com/THUDM/CogVideo

Used prompts to generate videos txt file : https://gist.github.com/FurkanGozukara/471db7b987ab8d9877790358c126ac05

Demo images shared in : https://www.patreon.com/posts/112848192

I used 1360×768px images at 16 FPS and 81 frames = 5 seconds (16 × 5 = 80 generated frames, plus 1 frame coming from the initial image).

Also, I have enabled all the optimizations shared on Hugging Face:

pipe.enable_sequential_cpu_offload()  # offload idle model components to CPU
pipe.vae.enable_slicing()             # decode the video batch in slices to save VRAM
pipe.vae.enable_tiling()              # decode frames in spatial tiles to save VRAM

quantization = int8_weight_only (you need TorchAO and DeepSpeed; it works great on Windows with a Python 3.11 VENV)

Used audio model : https://github.com/hkchengrex/MMAudio

1-Click Windows, RunPod and Massed Compute installers for MMAudio (installs into a Python 3.10 VENV): https://www.patreon.com/posts/117990364

I used very simple prompts. It fails when there is a human in the input video, so use text-to-audio in such cases.

I also tested some VRAM usage for CogVideoX1.5-5B-I2V.

Resolutions and their VRAM requirements (may work on lower-VRAM GPUs too, but slower):

512×288, 41 frames: 7700 MB
576×320, 41 frames: 7900 MB
576×320, 81 frames: 8850 MB
704×384, 81 frames: 8950 MB
768×432, 81 frames: 10600 MB
896×496, 81 frames: 12050 MB
960×528, 81 frames: 12850 MB




  • 1 reply
ยท
reacted to AlexBodner's post with 👀 about 16 hours ago

🚀🤖 Do Androids Dream of Electric Marios?

Discover how we replaced the classic game engine with DIAMOND, a neural network that predicts every frame from actions, noise, and past states. From training on human and RL gameplay to generating surreal hallucinations, this project shows the potential of diffusion models for building amazing simulations. 🎮

🧵 Dive into the full story in our Twitter thread:
👉 https://x.com/AlexBodner_/status/1871566560512643567
🌟 Don't forget to follow and leave a star for more groundbreaking AI projects!
reacted to hba123's post with 🚀 about 16 hours ago
Blindly applying algorithms without understanding the math behind them is not a good idea, in my view. So, I am on a quest to fix this!

I wrote my first Hugging Face article on how to derive closed-form solutions for KL-regularised reinforcement learning problems, which is the machinery behind DPO.


Check it out: https://huggingface.co/blog/hba123/derivingdpo
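For context, the closed-form solution in question is the standard KL-regularised RL result that DPO builds on (standard notation from the literature, not quoted from the article): maximizing expected reward with a KL penalty toward a reference policy yields

```latex
\pi^{*}(y \mid x)
  = \frac{1}{Z(x)}\,\pi_{\mathrm{ref}}(y \mid x)\,
    \exp\!\left(\frac{r(x, y)}{\beta}\right),
\qquad
Z(x) = \sum_{y} \pi_{\mathrm{ref}}(y \mid x)\,
       \exp\!\left(\frac{r(x, y)}{\beta}\right)
```

where β sets the strength of the KL regularisation; DPO's trick is that the intractable partition function Z(x) cancels when comparing preference pairs.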
reacted to DawnC's post with ❤️ about 20 hours ago
🌟 PawMatchAI: Making Breed Selection More Intuitive! 🐕
Excited to share the latest update to this AI-powered companion for finding your perfect furry friend! The breed recommendation system just got a visual upgrade to help you make better decisions.

✨ What's New?
Enhanced breed recognition accuracy through strategic model improvements:
- Upgraded to a fine-tuned ConvNeXt architecture for superior feature extraction
- Implemented progressive layer unfreezing during training
- Optimized data augmentation pipeline for better generalization
- Achieved 8% improvement in breed classification accuracy

🎯 Key Features:
- Smart breed recognition powered by AI
- Visual matching scores with intuitive color indicators
- Detailed breed comparisons with interactive tooltips
- Lifestyle-based recommendations tailored to your needs

💭 Project Vision
Combining my passion for AI and pets, this project represents another step toward my goal of creating meaningful AI applications. Each update aims to make the breed selection process more accessible while improving the underlying technology.

👉 Try it now: DawnC/PawMatchAI

Your likes ❤️ on this space fuel this project's growth!

#AI #MachineLearning #DeepLearning #Pytorch #ComputerVision
reacted to ginipick's post with 🚀🔥 about 20 hours ago
🎨 GiniGen Canvas-o3: Intelligent AI-Powered Image Editing Platform
Transform your images with precision using our next-generation tool that lets you extract anything from text to objects with simple natural language commands! 🚀
📌 Key Differentiators:

Intelligent Object Recognition & Extraction
• Freedom to select any target (text, logos, objects)
• Simple extraction via natural language commands ("dog", "signboard", "text")
• Ultra-precise segmentation powered by GroundingDINO + SAM
Advanced Background Processing
• AI-generated custom backgrounds for extracted objects
• Intuitive object size/position adjustment
• Multiple aspect ratio support (1:1, 16:9, 9:16, 4:3)
Progressive Text Integration
• Dual text placement: over or behind images
• Multi-language font support
• Real-time font style/size/color/opacity adjustment

🎯 Use Cases:

Extract logos from product images
Isolate text from signboards
Select specific objects from scenes
Combine extracted objects with new backgrounds
Layer text in front of or behind images

💫 Technical Features:

Natural language-based object detection
Real-time image processing
GPU acceleration & memory optimization
User-friendly interface

🎉 Key Benefits:

User Simplicity: Natural language commands for object extraction
High Precision: AI-powered accurate object recognition
Versatility: From basic editing to advanced content creation
Real-Time Processing: Instant result visualization

Experience the new paradigm of image editing with GiniGen Canvas-o3:

Seamless integration of multiple editing functions
Professional-grade results with consumer-grade ease
Perfect for social media, e-commerce, and design professionals

Whether you're extracting text from complex backgrounds or creating sophisticated visual content, GiniGen Canvas-o3 provides the precision and flexibility you need for modern image editing!

GO! ginigen/CANVAS-o3
  • 2 replies
ยท
reacted to sayakpaul's post with 🚀🔥 1 day ago
Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0
reacted to singhsidhukuldeep's post with 🔥 1 day ago
Exciting News in AI: JinaAI Releases JINA-CLIP-v2!

The team at Jina AI has just released a groundbreaking multilingual multimodal embedding model that's pushing the boundaries of text-image understanding. Here's why this is a big deal:

🚀 Technical Highlights:
- Dual encoder architecture combining a 561M parameter Jina XLM-RoBERTa text encoder and a 304M parameter EVA02-L14 vision encoder
- Supports 89 languages with 8,192 token context length
- Processes images up to 512×512 pixels with a 14×14 patch size
- Implements FlashAttention2 for text and xFormers for vision processing
- Uses Matryoshka Representation Learning for efficient vector storage

โšก๏ธ Under The Hood:
- Multi-stage training process with progressive resolution scaling (224โ†’384โ†’512)
- Contrastive learning using InfoNCE loss in both directions
- Trained on massive multilingual dataset including 400M English and 400M multilingual image-caption pairs
- Incorporates specialized datasets for document understanding, scientific graphs, and infographics
- Uses hard negative mining with 7 negatives per positive sample
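The "InfoNCE loss in both directions" mentioned above is, in its standard symmetric form (textbook formulation in my notation, not quoted from the release):

```latex
\mathcal{L}
= -\frac{1}{2N}\sum_{i=1}^{N}\left[
    \log\frac{\exp(\mathrm{sim}(t_i, v_i)/\tau)}
             {\sum_{j=1}^{N}\exp(\mathrm{sim}(t_i, v_j)/\tau)}
  + \log\frac{\exp(\mathrm{sim}(v_i, t_i)/\tau)}
             {\sum_{j=1}^{N}\exp(\mathrm{sim}(v_i, t_j)/\tau)}
  \right]
```

where t_i and v_i are matched text and image embeddings, sim is cosine similarity, and τ is a temperature: each caption must pick out its image among N candidates, and vice versa.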

📊 Performance:
- Outperforms previous models on visual document retrieval (52.65% nDCG@5)
- Achieves 89.73% image-to-text and 79.09% text-to-image retrieval on CLIP benchmark
- Strong multilingual performance across 30 languages
- Maintains performance even with 75% dimension reduction (256D vs 1024D)
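That 75% dimension reduction works because Matryoshka-trained embeddings concentrate the most useful information in the leading dimensions, so shrinking a vector is just slicing and renormalizing. A minimal sketch in plain Python (the vector values are made up for illustration):

```python
import math

def truncate_and_renormalize(embedding, dims):
    # Keep the first `dims` components of a Matryoshka embedding and
    # rescale the result back to unit length for cosine similarity.
    head = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full_vector = [0.6, 0.8, 0.01, -0.02]  # pretend embedding, most mass up front
small_vector = truncate_and_renormalize(full_vector, 2)
print(small_vector)  # still unit-length; 1024D -> 256D in the real model
```

Because the truncated vector is renormalized, downstream cosine-similarity search code needs no other changes.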

🎯 Key Innovation:
The model solves the long-standing challenge of unifying text-only and multi-modal retrieval systems while adding robust multilingual support. Perfect for building cross-lingual visual search systems!

Kudos to the research team at Jina AI for this impressive advancement in multimodal AI!
reacted to Kseniase's post with 👍 1 day ago
**15 Agentic Systems and Frameworks of 2024**

This year, we started our "AI Agents and Agentic Workflows" series (https://www.turingpost.com/t/AI-Agents) to explore everything about AI agents step by step: all the vocabulary, how they work, and how to build them.
The huge interest in this series and the large number of studies conducted on agents showed that it was one of the most popular and important themes of the year. In 2025, agents will most likely reach new heights; we will be covering that for you. Now, let's review the agentic systems that have emerged this year.

Here is a list of 15 agentic systems and frameworks of 2024:

1. GUI Agents: A Survey (2412.13501)

2. Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level (2411.03562)

3. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2408.06292)

4. MALT: Improving Reasoning with Multi-Agent LLM Training (2412.01928)

5. Agent S: An Open Agentic Framework that Uses Computers Like a Human (2410.08164)

6. Automated Design of Agentic Systems (2408.08435)

7. AgentInstruct: Toward Generative Teaching with Agentic Flows (2407.03502)

8. AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant (2410.18603)

9. WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents (2410.07484)

10. Generative Agent Simulations of 1,000 People (2411.10109)

11. DynaSaur: Large Language Agents Beyond Predefined Actions (2411.01747)

12. PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking (2410.12375)

13. Generative World Explorer (2411.11844)

14. Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines (2412.14684)

15. AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions (2410.20424)

Thanks for reading Turing Post!
Subscribe to receive new posts straight into your inbox -> https://www.turingpost.com/subscribe
reacted to DualityAI-RebekahBogdanoff's post with 👀 1 day ago
Hi again! 👋 Duality.ai just launched a lesson on how to bring your own twin into our free FalconEditor software 🖥️, how to create a synthetic dataset using your twin 📸, and how to test your model with your own images 🎯!
https://falcon.duality.ai/secure/documentation/ex2-adv-find-twin?sidebarMode=learn

This is crucial for anyone wanting to use FalconEditor for their own projects. We will also be hosting a free photogrammetry course that uses a free workflow, independent of OS specifications, to create robust digital twins. These 2 lessons complement each other incredibly well. Sign up for the course here!
https://docs.google.com/forms/d/e/1FAIpQLSd2WsKaa1CjRM89uv3LNkZXj1TUNWrNxDrtyWny2w1OQDHn8g/viewform
reacted to nyuuzyou's post with 👍 1 day ago
🎮 GoodGame.ru Clips Dataset - nyuuzyou/goodgame

A collection of 39,280 video clips metadata from GoodGame.ru streaming platform featuring:

- Complete clip information including direct video URLs and thumbnails
- Streamer details like usernames and avatars
- Engagement metrics such as view counts
- Game categories and content classifications
- Released under Creative Commons Zero (CC0) license

This extensive clips collection provides a valuable resource for developing and evaluating video-based AI applications, especially in Russian gaming and streaming contexts.
replied to nroggendorff's post 1 day ago

Maybe you're the type who's always working your brain, so I think it would be good to rest it by watching something moderately interesting. If you're not doing anything, you'll end up thinking about things, and that will tire you out even more. 🥶

replied to nroggendorff's post 1 day ago
reacted to nroggendorff's post with 😔 1 day ago
im so tired
  • 3 replies
ยท
reacted to randomhex10101's post with 👀 2 days ago
Does anyone have any suggestions when it comes to multi-user management for dedicated custom endpoints? Would it be possible to use webhooks with HF to automate and delegate custom dedicated endpoint creation for each user who signs up for my service? Or would this be implausible / too complex compared to simply allocating containers? Or is there a better way to go about this that I just haven't found in the docs yet? Any help will mean a lot, thank you!