Daniel Huynh PRO

dhuynh95

AI & ML interests

None yet

dhuynh95's activity

posted an update 5 months ago
💪 Build an information retrieval agent that can beat Gemini and OpenAI, using an open-source Large Action Model framework!

In this video, we ask different proprietary conversational AIs the question:
"What is the most trendy recent paper on LLaVA models on Hugging Face papers? Provide the date and a summary of the paper", and the results are interesting!
❌ Gemini: found a paper from Jan 29, 2024
❌ OpenAI: found a paper from October 2023
❌ You.com: found a paper from Jan 29, 2024
✅ LaVague: found the latest paper (ConvLLaVA, which is dope by the way: https://arxiv.org/abs/2405.15738)!

The best part? Our solution fits in a few lines of code with our open-source framework! I will share how we built that agent during our webinar on AI Web Agents, this Thursday, May 30th, at 9 am PST (https://lu.ma/m8fzmb3q), so don't miss it 😉

You can also start playing with our framework: https://github.com/lavague-ai/LaVague
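
For context, here is a minimal sketch of what spinning up such an agent looks like with LaVague. The module and class names (WorldModel, ActionEngine, WebAgent, SeleniumDriver) are taken from the docs of that period and may have changed since, so treat this as an assumption and check the current documentation:

```python
# Rough sketch of a LaVague web agent; class/module names assumed from the docs of the time.
from lavague.core import WorldModel, ActionEngine
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver

driver = SeleniumDriver(headless=True)        # controls the browser via Selenium
world_model = WorldModel()                    # plans high-level steps from the objective
action_engine = ActionEngine(driver)          # turns each step into executable browser actions
agent = WebAgent(world_model, action_engine)

agent.get("https://huggingface.co/papers")
agent.run("Find the most recent trending paper on LLaVA models and summarize it with its date")
```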
reacted to m-ric's post with 👀 8 months ago
๐“๐ก๐ž ๐ซ๐ž๐ญ๐ฎ๐ซ๐ง ๐จ๐Ÿ ๐ญ๐ก๐ž ๐‘๐๐๐ฌ โš” ๐๐ž๐ฐ ๐Œ๐š๐ฆ๐›๐š-๐›๐š๐ฌ๐ž๐ ๐š๐ซ๐œ๐ก๐ข๐ญ๐ž๐œ๐ญ๐ฎ๐ซ๐ž "๐‰๐š๐ฆ๐›๐š"

Since the release of BERT by Google in 2018, the Transformer architecture has taken over machine learning thanks to its attention mechanism, which gives it the ability to focus on important parts of the input. But attention computation is quadratic in the input length.

💫 The Mamba paper, published in December 2023, announced the return of the RNNs: it has no attention, but integrates a selection mechanism, which should be able to reproduce the "focus" ability of attention, in an architecture whose compute requirements grow only linearly with input length!
🤔 Would this work? We had yet to see a large Mamba model recover the performance of attention-based Transformers.

💥 But now it's done! A (Mamba + Transformers) hybrid just beat Transformers!

The AI21 Labs team just released Jamba.
They insert a few Transformer layers to inject some attention into a big pile of Mamba layers, thus getting the best of both worlds.

TL;DR:
🏗️ New MoE architecture: 4 Jamba blocks, each made of 7 Mamba layers for every 1 Transformer layer.
🏋️ 52B parameters, 12B active at inference: this reduction is enabled by Mixture of Experts, similar to Mixtral (47B parameters, 13B active).
🏎️ Speed: 3x throughput. Jamba is much faster than similar-sized Transformer models on long contexts.
📏 Context length: 140K tokens on a single 80GB A100!
💪 Performance: state-of-the-art for this size. The small injection of attention seems sufficient, since Jamba beats the open-source reference Mixtral-8x7B on many benchmarks!

Try it here 👉 ai21labs/Jamba-v0.1
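
For reference, a minimal loading sketch with transformers; this is an illustration rather than AI21's official snippet. A 52B model needs quantization or several GPUs in practice, and early Jamba checkpoints may require trust_remote_code:

```python
# Minimal sketch; a 52B model needs quantization or multiple GPUs in practice.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # spread layers across available GPUs
    trust_remote_code=True,   # may be needed for early Jamba checkpoints
)

inputs = tokenizer("Mamba and attention layers can be mixed because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```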
posted an update 8 months ago
🌊 LaVague can compile action plans into actionable code to browse the internet!

In this example, you can see how an action plan with natural language instructions can be "compiled" into executable Selenium code!

🤖 This shows the potential of #LAM (Large Action Models) to perform actions for us and automate mechanical tasks.
This example leverages a local embedding model and OpenAI GPT-3.5, but we support many options, including local ones with Gemma!
You can try this in our docs: https://docs.lavague.ai/en/latest/
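
To give an idea of the output, here is a purely illustrative example of the kind of Selenium code such a natural-language instruction could compile to; the target page and CSS selector are made up for the example:

```python
# Illustrative only: what "type 'LLaVA' in the search bar and press Enter" might compile to.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://huggingface.co")

# Locate the search input (hypothetical selector) and submit a query.
search_box = driver.find_element(By.CSS_SELECTOR, "input[type='search']")
search_box.send_keys("LLaVA")
search_box.send_keys(Keys.ENTER)
```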

LaVague is an open-source Large Action Model framework to automate automation. If you are interested in helping us on our mission to democratize automation tooling for devs, don't hesitate to visit our GitHub (https://github.com/lavague-ai/LaVague) or Discord (https://discord.gg/SDxn9KpqX9)!
posted an update 8 months ago
Hello World! This post is written by the Large Action Model framework LaVague! Find out more on https://github.com/mithril-security/LaVague

Edit: Here is the video of 🌊 LaVague posting this. This is quite meta.
replied to their post 8 months ago

Thanks! I'm preparing a tutorial too, to explain how we managed to get a working solution in <150 lines of code :D

posted an update 8 months ago
🌊 Released #LaVague, a fully open-source AI pipeline to turn natural language into browser actions!

In less than 150 lines of code (RAG with local embeddings + Zephyr-7b-Gemma locally, or Mixtral on the HF Inference API), it generates #Selenium code from a user query. In this GIF, you can see it follow user instructions to command a browser to browse the HF website!

Try it on Colab: colab.research.google.com/github/dhuynh95/LaVague/blob/main/LaVague.ipynb
GitHub: github.com/dhuynh95/LaVague

Pretty exciting how it becomes possible to create an AI assistant that could perform actions for us, such as logging into gov accounts, filling forms, or pulling personal information!

It was quite fun to hack on over the weekend using open-source tools, from @huggingface local embeddings with transformers for local inference or the HF Inference API, to RAG with @llama_index, through @MistralAI's Mixtral model!

Some challenges: to make it run on Colab for the #GPU Poors, I first resorted to the @huggingface Inference API with Mixtral, as it was the only model good enough (gemma-7b did not make it and refused to produce code). But after some experimentation, I managed to make it work with a local Zephyr-7b-Gemma so that people could run this assistant fully locally!

Because I used an off-the-shelf model, I had to improve performance with few-shot learning and Chain of Thought, which managed to generate appropriate code!
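
A simplified sketch of that few-shot + Chain-of-Thought prompting pattern follows; the example instruction, page-context handling, and model choice are placeholders rather than LaVague's actual prompt:

```python
from huggingface_hub import InferenceClient

# Few-shot examples pair an instruction with a short reasoning step and the Selenium code that fulfils it.
FEW_SHOT = """Instruction: Click on the 'Models' tab.
Thought: I need to find the navigation link labelled 'Models' and click it.
Code:
driver.find_element(By.LINK_TEXT, "Models").click()
"""

def generate_selenium_code(instruction: str, page_context: str) -> str:
    prompt = (
        "You write Selenium code for the given instruction.\n"
        f"{FEW_SHOT}\n"
        f"Relevant page elements:\n{page_context}\n"
        f"Instruction: {instruction}\nThought:"
    )
    client = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")
    return client.text_generation(prompt, max_new_tokens=256)
```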

I hope this project will herald a new dawn where transparent, private, and local AI assistants help automate menial but critical tasks, such as filling out taxes, booking accommodation, or researching information for us.
posted an update 9 months ago
✨ In-context learning is all you need!

This super interesting paper shows that fine-tuning with #SFT or #RLHF only helps with the form of the output but does not impact knowledge or reasoning abilities, and in some cases actually decreases performance!

They tested it with Mistral base vs. fine-tuned Mistral, as well as Llama 2 70B base and fine-tuned, and the results are consistent.

Providing the right prompt to the base model actually makes the model better and has 0 training cost!

Paper: https://arxiv.org/abs/2312.01552
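
To make the idea concrete, here is a toy sketch of the kind of prompt meant here: a short preamble plus a couple of stylistic in-context examples prepended to the base model's input, with zero training cost. The wording is mine, not the paper's exact prompt:

```python
# Toy illustration: alignment-by-prompting on a base (non-fine-tuned) model.
# The preamble and examples below are placeholders, not the paper's prompt.
PREAMBLE = "Below are conversations between a curious user and a helpful, honest assistant.\n\n"

IN_CONTEXT_EXAMPLES = (
    "User: What causes tides?\n"
    "Assistant: Tides are mainly caused by the gravitational pull of the Moon and the Sun...\n\n"
    "User: Give me three tips to sleep better.\n"
    "Assistant: 1) Keep a regular schedule. 2) Avoid screens before bed. 3) Keep your room dark and cool.\n\n"
)

def build_prompt(user_query: str) -> str:
    # The resulting string is fed directly to the *base* model.
    return f"{PREAMBLE}{IN_CONTEXT_EXAMPLES}User: {user_query}\nAssistant:"

print(build_prompt("Explain RLHF in two sentences."))
```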
reacted to abidlabs's post with ❤️ 9 months ago
Just out: a new custom Gradio component specifically designed for code completion models 🔥
replied to lbourdois's post 9 months ago

Pretty cool stuff! Maybe you should do a leaderboard of major datasets and their leakage scores.

reacted to lbourdois's post with 🤯 9 months ago
reacted to Locutusque's post with ❤️ 9 months ago
Introducing the "UltraTextbooks" dataset 🚀📚
Check it out here: Locutusque/UltraTextbooks
📘 A comprehensive collection of high-quality synthetic and human-written textbooks
👨‍🎓 Spanning various subjects and programming languages
🔧 Designed for advanced NLP tasks like language modeling, educational QA, text summarization, and content generation for educational purposes
🚀 Future expansions planned with additional data sources to enhance the corpus
👇 Data composition highlights 👇
- A blend of synthetic and human-written material
- Includes topics from general education to specialized areas
- Structured with a single "text" field (see the loading sketch below)
🧩 Data collected from various Hugging Face datasets, guided by a diverse and comprehensive curation rationale
🚧 Limitations may exist, so report any issues you encounter
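
A quick sketch of how one might load it and peek at the "text" field with datasets; the split name is an assumption, so check the dataset card:

```python
# Hedged sketch: load the dataset and inspect its single "text" field.
from datasets import load_dataset

ds = load_dataset("Locutusque/UltraTextbooks", split="train")  # large corpus; consider streaming=True
print(ds[0]["text"][:500])
```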
reacted to gsarti's post with 👍 9 months ago
๐Ÿ” Today's pick in Interpretability & Analysis of LMs: Black-Box Access is Insufficient for Rigorous AI Audits by @stecas @carsonezell et al.

Audits conducted on AI systems can identify potential risks and ensure their compliance with safety requirements. The authors categorise audits based on the level of access to model-related resources (black-, grey-, white-, and outside-the-box) and highlight how greater transparency on the audited AI system enables broader and more effective auditing procedures. Technical, physical, and legal safeguards for performing audits are also introduced to ensure minimal security risks for audited companies. The authors conclude that transparency on the type of auditors' access and methods is a prerequisite to correctly interpreting audit results, and that white- and outside-the-box access allow for substantially more scrutiny than black-box access alone.

📄 Paper: Black-Box Access is Insufficient for Rigorous AI Audits (2401.14446)

๐Ÿ” Further readings:

📄 Taxonomy of AI system access: https://bit.ly/struct-access
💻 An API for transparent science on black-box AI (NNsight): https://nnsight.net/about
posted an update 10 months ago
A fascinating paper by RAND shows that there is no statistically significant difference between using LLMs and using the regular internet to craft operational plans for bioweapons!

This is the first paper that actually studies the impact of AI on bioweapons from an operational perspective and looks at the big question: is AI any better than just using public data on the Internet?

As most of the data is most likely out there, an LLM would just be a more efficient tool to come up with the relevant information, but it seems that its impact is limited.

https://www.rand.org/pubs/research_reports/RRA2977-2.html
replied to santiviquez's post 10 months ago
posted an update 10 months ago
✅ New paper on ensuring valid LLM output with SOTA LLMs like GPT-4 by mixing them with OSS LLMs

Paper: arxiv.org/abs/2401.09967

Great paper showing how strong proprietary AI like #GPT4 can be paired with #OSS LLMs to ensure LLM output validity, e.g. valid JSON.

Many devs complain that #LLMs cannot be reliably used in production if the output is not valid; for instance, if one wants to use LLMs to generate SQL queries or JSON, it is crucial that the output is valid.

Frameworks like outlines (https://github.com/outlines-dev/outlines) have arisen to constrain the outputs of the LLM to follow a given structure, but they assume access to logits.

This makes them incompatible with proprietary LLMs like GPT-4 that don't share logits, so one can only use open-source LLMs, which are much less performant.

This paper shows how one can use a powerful proprietary LLM like GPT-4 to create a first unconstrained sketch, then refine it using an OSS model like Llama 2, where logits are accessible, to rewrite the sketch following some specific constraints.
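
To make the pattern concrete, here is a rough sketch of the sketch-then-constrain idea in code. This is not the paper's implementation, and the outlines and OpenAI APIs shown are assumptions based on releases from that period, so they may differ in current versions:

```python
# Rough sketch of the two-stage idea (not the paper's exact method):
# 1) an unconstrained "sketch" from a proprietary LLM, 2) a constrained rewrite
# with an open model whose logits we control.
from pydantic import BaseModel
from openai import OpenAI
import outlines

class Triple(BaseModel):
    subject: str
    relation: str
    object: str

def extract_triple(text: str) -> Triple:
    # Stage 1: unconstrained sketch from GPT-4 (logits unavailable).
    sketch = OpenAI().chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Extract one (subject, relation, object) triple from: {text}"}],
    ).choices[0].message.content

    # Stage 2: constrained rewrite with an open-source model (logits available).
    model = outlines.models.transformers("meta-llama/Llama-2-7b-hf")
    generator = outlines.generate.json(model, Triple)   # guarantees schema-valid JSON
    return generator(f"Rewrite this extraction as valid JSON:\n{sketch}")
```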

They show that GPT-4's precision on information extraction on Wiki-NRE can be increased by 14 points (from 43% to 57%) by boosting it with constrained output!
reacted to philschmid's post with ❤️ 10 months ago
What's the best way to fine-tune open LLMs in 2024? Look no further! 👀 I am excited to share "How to Fine-Tune LLMs in 2024 with Hugging Face", using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, and packing, all built with Hugging Face TRL. 🚀

It is created for consumer-size GPUs (24GB), covering the full end-to-end lifecycle:
💡 Define and understand use cases for fine-tuning
🧑🏻‍💻 Set up the development environment
🧮 Create and prepare the dataset (OpenAI format)
🏋️‍♀️ Fine-tune the LLM using TRL and the SFTTrainer
🥇 Test and evaluate the LLM
🚀 Deploy for production with TGI

👉 https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: advanced guides for multi-GPU/multi-node full fine-tuning and alignment using DPO & KTO. 🔜
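
For a flavour of the TRL part, here is a minimal LoRA-style sketch; argument names vary across TRL versions, the dataset here uses a simplified "text" column rather than the blog's OpenAI "messages" format, and this is not the blog post's exact code:

```python
# Sketch of LoRA fine-tuning with TRL's SFTTrainer; argument names vary by TRL version.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # expects a "text" column

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",            # model id or a preloaded model
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    packing=True,                                # pack short samples into full-length sequences
    peft_config=LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"),
    args=TrainingArguments(output_dir="llama2-sft", per_device_train_batch_size=1, num_train_epochs=1),
)
trainer.train()
```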
reacted to gsarti's post with ❤️ 10 months ago
💥 Today's pick in Interpretability & Analysis of LMs: Fine-grained Hallucination Detection and Editing for Language Models by @abhika-m @akariasai @vidhisha et al.

The authors introduce a new taxonomy for fine-grained annotation of hallucinations in LM generations and propose Factuality Verification with Augmented Knowledge (FAVA), a retrieval-augmented LM fine-tuned to detect and edit hallucinations in LM outputs, outperforming ChatGPT and Llama 2 Chat on both detection and editing tasks.

๐ŸŒ Website: https://fine-grained-hallucination.github.io
๐Ÿ“„ Paper: Fine-grained Hallucination Detection and Editing for Language Models (2401.06855)
๐Ÿš€ Demo: fava-uw/fava
๐Ÿค– Model: fava-uw/fava-model
๐Ÿ”ก Dataset: fava-uw/fava-data
posted an update 10 months ago
🪟 32k-context BERT for embedding and RAG on long corpora

Monarch Mixer is a new architecture that enables long-context BERT for large corpora and can be fine-tuned for long-context retrieval.

Quite interesting and important, as BERT is still the most used LLM in production for "old school" tasks like classification, NER, and embeddings, but it is also a key component for RAG.

Paper: https://arxiv.org/abs/2310.12109
Blog: https://hazyresearch.stanford.edu/blog/2024-01-11-m2-bert-retrieval
GitHub: https://github.com/HazyResearch/m2
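
A hedged sketch of using one of the long-context M2-BERT retrieval checkpoints for embeddings; the model id, tokenizer choice, and output key below are assumptions from my reading of the model card and should be verified there before use:

```python
# Hedged sketch: long-document embeddings with an M2-BERT retrieval checkpoint.
# Model id, tokenizer, and output key are assumptions; check the model card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "togethercomputer/m2-bert-80M-32k-retrieval"
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", model_max_length=32768)
model = AutoModelForSequenceClassification.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("a very long document ...", return_tensors="pt",
                   padding="max_length", truncation=True, max_length=32768)
with torch.no_grad():
    outputs = model(**inputs)
embedding = outputs["sentence_embedding"]   # assumed output key for the retrieval head
print(embedding.shape)
```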