alkinun

AtAndDev

·

AI & ML interests

LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN, etc.

Recent Activity

liked a Space 4 days ago

DragGan/DragGan

reacted to mkurman's post with ❤️ 4 days ago

Introducing a new architecture, MedIT One – a single-token transformer with LSTM-like recurrence. It is extremely fast in training and inference, but we lack funding for large-scale training. Enjoy 🍓 https://github.com/MedITSolutionsKurman/medit-one

reacted to Quazim0t0's post with 👍 4 days ago

Debugging Tags: Imagine, Associated Thoughts, Dialectical Analysis, Backwards Induction, Metacognition, and Normal Thought Processes such as <think> or <begin_of_thought>. Edit: Uploaded new images with an Open WebUI function to organize the tags. Open WebUI Function: https://openwebui.com/f/quaz93/imagine_phi This Phi-4 model is part of a test project that I called Micro-Dose. My goal was to use a small dataset to activate reasoning and other cognitive processes without relying on a large dataset. I found that this was possible with a tiny dataset of just 90 rows, specifically designed as math problems. In the initial iterations, the dataset only activated reasoning when a math-related question was asked. I then made a few changes to the dataset’s structure, including the order of information and the naming of tags. You can see the sample results in the pictures. Not really anything special, just thought I'd share. Tweaked the dataset a bit: https://huggingface.co/Quazim0t0/Imagine-Phi-v0.2-GGUF https://huggingface.co/datasets/Quazim0t0/MicroDoseV0.2 The first image shows the new tags, the second shows the regular thought process, and the third shows the model in combination with web searches.

Organizations

Posts 5

Post

2400

@nroggendorff is that you sama?

Post

1886

everywhere i go i see his face

Spaces 3

DeepSense.ai

Bicycle and E-Bike Detection Model

marco-qwq-7B

AIDC AI Marco O1

Generate responses for AI chat

Models 7

AtAndDev/marco-qwq-7B

Text Generation • Updated Dec 8, 2024 • 13

AtAndDev/Ogno-Monarch-Neurotic-9B-Passthrough

Text Generation • Updated Mar 1, 2024 • 15

AtAndDev/Ogno-Monarch-Neurotic-7B-Dare-Ties

Text Generation • Updated Mar 1, 2024 • 17

AtAndDev/Marcoro14-7B-Slerp

Text Generation • Updated Mar 1, 2024 • 13

AtAndDev/CapybaraMarcoroni-7B

Text Generation • Updated Jan 7, 2024 • 1.96k

AtAndDev/ShortKing-3b-v0.2

Text Generation • Updated Oct 2, 2023 • 94 • 2

AtAndDev/ShortKing-1.4b-v0.1

Text Generation • Updated Sep 29, 2023 • 2.14k • 2

Datasets 12

AtAndDev/symbolm

Viewer • Updated Jan 23 • 20k • 78

AtAndDev/symlm

Viewer • Updated Jan 16 • 10.1k • 78

AtAndDev/chain-of-diffusion

Viewer • Updated Jan 7 • 6.45k • 76

AtAndDev/clip-bicycle-e-bike

Viewer • Updated Jan 2 • 6k • 84

AtAndDev/QwQ-LongCoT-59k-cleaned

Viewer • Updated Dec 6, 2024 • 59.2k • 73

AtAndDev/sedir-clean

Viewer • Updated Dec 5, 2024 • 11.8k • 59

AtAndDev/sedir-unclean

Viewer • Updated Dec 5, 2024 • 19.9k • 75

AtAndDev/ultrachat_200k_formatted

Viewer • Updated Oct 10, 2024 • 208k • 62

AtAndDev/MedInstruct

Viewer • Updated Jul 20, 2024 • 216 • 52

AtAndDev/MedRag-textbooks-stella_en_400M_v5

Viewer • Updated Jul 14, 2024 • 126k • 60