Charles I Niswander II

charlesniswander

dhar174

AI & ML interests

None yet

Organizations

None yet

charlesniswander's activity

upvoted a paper about 10 hours ago

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published 4 days ago • 19

upvoted 2 papers 2 days ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 40

ReMamba: Equip Mamba with Effective Long-Sequence Modeling

Paper • 2408.15496 • Published Aug 28, 2024 • 12

upvoted a collection 2 days ago

Foundation AI Papers

Collection

Curated List of Must-Reads on LLM reasoning at Temus AI team • 135 items • Updated Jun 15, 2024 • 31

upvoted 16 papers 2 days ago

Farewell to Length Extrapolation, a Training-Free Infinite Context with Finite Attention Scope

Paper • 2407.15176 • Published Jul 21, 2024 • 1

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

Paper • 2409.00509 • Published Aug 31, 2024 • 39

Neurocache: Efficient Vector Retrieval for Long-range Language Modeling

Paper • 2407.02486 • Published Jul 2, 2024 • 1

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2, 2024 • 27

Engineering A Large Language Model From Scratch

Paper • 2401.16736 • Published Jan 30, 2024 • 2

Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey

Paper • 2311.12351 • Published Nov 21, 2023 • 4

Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 64

Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 17

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

Paper • 2403.00818 • Published Feb 26, 2024 • 17

A Quantitative Review on Language Model Efficiency Research

Paper • 2306.01768 • Published May 28, 2023 • 2

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 32

Low-Rank Approximation, Adaptation, and Other Tales

Paper • 2408.05883 • Published Aug 12, 2024 • 1