arxiv:2310.08164
Abdullah
amirabdullah19852020
AI & ML interests
Mechanistic interpretability, high-dimensional geometry, persona role-playing.
Organizations
Papers
1
Models
16
amirabdullah19852020/interpreting_reward_models
amirabdullah19852020/test • Text Generation • 4
amirabdullah19852020/gpt-neo-125m_hh_reward • Text Generation • 13
amirabdullah19852020/gpt-neo-125m_utility_reward • Reinforcement Learning • 5
amirabdullah19852020/pythia-70m_sentiment_reward • Reinforcement Learning • 17
amirabdullah19852020/pythia-160m_sentiment_reward • Reinforcement Learning • 4
amirabdullah19852020/gpt-neo-125m_sentiment_reward • Reinforcement Learning • 4
amirabdullah19852020/pythia-160m_utility_reward • Reinforcement Learning • 6
amirabdullah19852020/pythia-70m_utility_reward • Reinforcement Learning • 8
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward • Reinforcement Learning