lbourdois committed
Commit 611a914
Parent: 2dd4c91

Update README.md

Files changed (1):
1. README.md (+4 −23)
README.md CHANGED

```diff
@@ -3,27 +3,8 @@ title: SSM Blog Posts
 emoji: 📝
 colorFrom: purple
 colorTo: yellow
-sdk: static
+sdk: streamlit
+sdk_version: 1.39.0
+app_file: Home.py
 pinned: false
----
-
-<b><p style="text-align: center; color:red">A French version is available on my [blog](https://lbourdois.github.io/blog/ssm/)</p></b>
-<br>
-
-On October 7, 2021, while wondering whether [AK](https://hf.co/akhaliq) was a bot or a human, I saw one of his [tweets](https://twitter.com/_akhaliq/status/1445931206030282756): a link to a publication on [open-review.net](https://openreview.net/forum?id=uYLFoz1vlAC), accompanied by the following image:
-
-<center>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/613b0a62a14099d5afed7830/QMpNVGwdQV2jRw-jYalxa.png" alt="Benchmark results announced for the S3 model" width="800" height="450">
-</center>
-
-Intrigued by the announced results, I decided to read about this S3 model, which would be renamed [S4](https://twitter.com/_albertgu/status/1456031299194470407) less than a month later (here is a [link](https://github.com/lbourdois/blog/blob/master/assets/efficiently_modeling_long_sequences_s3.pdf) to the version from when it was still called S3, for those interested).
-This brilliant article impressed me. At the time, I was convinced that State Space Models (SSM) were going to be a revolution, replacing transformers in the coming months. Two years later, I have to admit that I was completely wrong, given the tsunami of LLMs making the news in NLP.
-Nevertheless, on Monday, December 4, 2023, the announcement of Mamba by [Albert Gu](https://twitter.com/_albertgu/status/1731727672286294400) and [Tri Dao](https://twitter.com/tri_dao/status/1731728602230890895) revived interest in SSM. This was amplified 4 days later by the announcement of [StripedHyena](https://twitter.com/togethercompute/status/1733213267185762411) by Together AI.
-A good opportunity for me to write a few words about the developments in SSM over the last two years.
-
-I plan to start with three articles that illustrate the basics of SSM through S4 (the "Attention is all you need" of the field) and then review the literature on the evolution of SSM since that first paper:
-- [Introduction to SSM and S4](https://huggingface.co/blog/lbourdois/get-on-the-ssm-train)
-- [SSM's history in 2022](https://huggingface.co/blog/lbourdois/ssm-2022)
-- [SSM's history in 2023](WIP) (WIP)
-
-Later, I also hope to go into the details of the architectures of some specific SSMs, with animations ✨
+---
```
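For reference, applying both hunks yields the following README.md front matter (reconstructed from the diff; the `title` value comes from the hunk-header context line):

```yaml
---
title: SSM Blog Posts
emoji: 📝
colorFrom: purple
colorTo: yellow
sdk: streamlit
sdk_version: 1.39.0
app_file: Home.py
pinned: false
---
```

With `sdk: static`, the Space served prebuilt files; switching to `sdk: streamlit` instead runs the Streamlit app declared in `app_file` (here `Home.py`) under the pinned `sdk_version`.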