---
license: other
license_name: deepnight-responsible-ai
license_link: LICENSE
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- 600B
- Python
- Code
- Logical Understanding
- Relation Establishment
- Translation
- ai1
- DEEPNIGHT
---
<div style="display: flex; justify-content: center; align-items: center;">
<img src="./cover.jpg" style="width: 100%; max-width: 350px; height: auto;"/></div>
# DEEPNIGHT ai1
The 600 Billion+ Parameter Model.
Yes! We did this!
The second largest model in the world, right after GPT-4.
---
We at [DEEPNIGHT](https://deepnight.tech) have been working on this for quite some time.
We have successfully built the second-largest model, ai1, which comes with 600 Billion+ parameters.
`ai1` can perform as well as GPT-4 and has a context window of 8k tokens.
ai1 was trained with a new approach: we first trained the model on a corpus of text from various sources, including but not limited to:
- RefinedWeb
- Open-source code from GitHub
- Common Crawl
and then fine-tuned it on a large dataset (generated manually and with automation) for logical understanding and reasoning.
We also trained the model for function calling capabilities.
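Since this card lists `transformers` with the `text-generation` pipeline tag, loading the weights once they are released would presumably look something like the sketch below. The repository id `deepnight-research/ai1` and the generation settings are assumptions on our part, not a confirmed interface, and the model is not publicly accessible yet (see the access note further down).

```python
# Hypothetical usage sketch: the repo id and settings are assumptions, not
# confirmed by DEEPNIGHT. Requires access to the weights once they are released,
# and `accelerate` installed for device_map-based sharding.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepnight-research/ai1",  # assumed repository id
    device_map="auto",               # a 600B+ parameter model needs multi-GPU sharding
)

output = generator(
    "Write a Python function that reverses a string.",
    max_new_tokens=256,
)
print(output[0]["generated_text"])
```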
---
## What is special about ai1?
ai1 uses a built-in chaining methodology. When it receives an input from the user, it first works to understand that input before generating anything: it builds an instruction-based prompt internally and then generates the response from it.
The benefit? <b>We'll just say the jobs of prompt engineering are over.</b>
Unlike ChatGPT, GPT-4, Llama, and other models, ai1 doesn't require heavy prompt engineering to provide answers.
The internal understanding phase takes care of that.
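Because the chaining happens inside the model, there is nothing extra for the user to implement. For readers who want a mental model, here is a minimal, purely conceptual sketch of the two-phase flow described above; the function names and the wording of the internal prompt are invented for illustration and do not reflect ai1's actual internals.

```python
# Conceptual sketch only: ai1 performs this chaining internally, so none of this
# is needed to use the model. An arbitrary `generate` callable stands in for the
# underlying text generator.

def chained_response(user_input: str, generate) -> str:
    """Two-phase generation: understand the input first, then answer."""
    # Phase 1: turn the raw user input into an explicit, instruction-style prompt.
    understanding_prompt = (
        "Rewrite the following user message as a clear, self-contained instruction, "
        f"stating the task, constraints, and desired output format:\n{user_input}"
    )
    internal_instruction = generate(understanding_prompt)

    # Phase 2: generate the final response from the internally built instruction.
    return generate(internal_instruction)
```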
What else?
- performs as well as GPT-4
- excels at automation tasks
- can infer the user's emotions from the conversation (while understanding the input in Phase 1),
resulting in better, more tailored generations
- has an understanding of human emotions, which helps it curate content accordingly
- excels at roleplay
- excels at writing code
- has a few global memory units used to store data outside the context window.
These memory units mostly hold function schemas, but ultimately the model decides for itself what to store in them.
- as for cost: on average, about $0.005 per 1,000 tokens (see the rough arithmetic below)
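To give a rough sense of what that rate means in practice, here is the arithmetic; the $0.005 per 1,000 tokens figure and the 8k context window come from this card, while the example token count is just an illustration.

```python
# Rough cost arithmetic based on the figure quoted above (~$0.005 per 1,000 tokens).
PRICE_PER_1K_TOKENS = 0.005  # USD, average rate quoted in this card

def estimate_cost(total_tokens: int) -> float:
    """Estimate the cost in USD of processing `total_tokens` tokens."""
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

# Example: filling the full 8k-token context window once
# costs about 8000 / 1000 * 0.005 = $0.04.
print(f"${estimate_cost(8000):.2f}")  # -> $0.04
```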
---
## Future goals
We don't discuss that. Especially after seeing how SOME AI COMPANY, ON THEIR DEV DAY, just used open-source research and publications
to profit themselves... Hah.
---
## Are we going to allow access?
Not for some time. We are still running evaluations and have a lot to learn about how this model can be made better.
---
Feel free to reach out to us at [email protected]
- Team [DEEPNIGHT](https://deepnight.tech)