UT5: Pretraining Non autoregressive T5 with unrolled denoising • arXiv:2311.08552 • Published Nov 14, 2023
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers • arXiv:2311.10642 • Published Nov 17, 2023