
David Cody Taupo Lingan PRO

rizzware

AI & ML interests

Incoming @ ? [hopefully an open source GPU acceleration team]

Recent Activity

liked a Space 23 days ago

huggingface/open-source-ai-year-in-review-2024



rizzware's activity

liked a Space 23 days ago

Running

430

😻

Open Source Ai Year In Review 2024

What happened in open-source AI this year, and what’s next?

New activity in databricks/dbrx-instruct 4 months ago

Fine-tune dbrx via Hugging Face Trainer vs. LLM-Foundry

#58 opened 8 months ago by

rizzware

posted an update 4 months ago

Post

516

Question about LightEval 🤗:

I've been searching for an LLM evaluation suite that can, out of the box, compare the outputs of one or more models without any enhancements against the same model with better prompt engineering, with RAG, and with fine-tuning.

I unfortunately have not found a tool that fits my exact description, but of course I ran into LightEval.

A huge pain point of building large-scale projects that use LLMs is that, prior to building an MVP, it is difficult to evaluate whether better prompt engineering, RAG, fine-tuning, or some combination of the three is needed to get satisfactory LLM output for the project's use case.

Time and resources are then wasted on R&D to work out exactly which LLM enhancements are needed.

I believe an out-of-the-box solution for comparing models with or without the aforementioned LLM enhancements could help teams of any size better decide which enhancements are needed before building.

I wanted to know if the LightEval team or Hugging Face in general is thinking about such a tool.
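To make the request concrete, here is a minimal sketch of the kind of side-by-side comparison described above. It is not LightEval's API; every name in it (`compare_variants`, `exact_match`, `baseline`, `with_rag`) is hypothetical, and the two "variants" are toy stand-ins so the example runs without a model. In practice each variant would wrap a real model call (plain prompt, engineered prompt, RAG pipeline, or fine-tuned checkpoint).

```python
from typing import Callable, Dict, List, Tuple

# A variant maps a question to a model answer (hypothetical interface).
Variant = Callable[[str], str]


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 when the normalized strings are equal, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def compare_variants(
    variants: Dict[str, Variant],
    dataset: List[Tuple[str, str]],
    metric: Callable[[str, str], float] = exact_match,
) -> Dict[str, float]:
    """Run every variant on the same (question, reference) pairs; return mean scores."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    results: Dict[str, float] = {}
    for name, answer in variants.items():
        scores = [metric(answer(question), reference) for question, reference in dataset]
        results[name] = sum(scores) / len(scores)
    return results


# Toy stand-ins (assumptions for illustration only); replace with real model calls.
def baseline(question: str) -> str:
    # Pretends to be a model with no extra context: it never knows the answer.
    return "unknown"


def with_rag(question: str) -> str:
    # Pretends to be the same model with one hypothetical retrieved document.
    docs = {"capital of france": "Paris"}
    return next((a for key, a in docs.items() if key in question.lower()), "unknown")


if __name__ == "__main__":
    data = [("What is the capital of France?", "paris"), ("What is 2 + 2?", "4")]
    print(compare_variants({"baseline": baseline, "rag": with_rag}, data))
```

The point of the design is that all variants see the identical dataset and metric, so any score difference can be attributed to the enhancement itself rather than to the evaluation setup.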

1 reply

liked a Space 8 months ago

Running

🌍

README

upvoted a paper 9 months ago

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10 • 104

liked a model 11 months ago

facebook/musicgen-large

Text-to-Audio • Updated Nov 17, 2023 • 9.98k • 428

liked a dataset 12 months ago

newsmediabias/news-bias-full-data

Viewer • Updated Oct 21 • 3.72M • 117 • 11