---
license: other
license_name: deepnight-responsible-ai
license_link: LICENSE
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- 600B
- Python
- Code
- Logical Understanding
- Relation Establishment
- Translation
- ai1
- DEEPNIGHT
---
|
<div style="display: flex; justify-content: center; align-items: center;"> |
|
<img src="./cover.jpg" style="width: 100%; max-width: 350px; height: auto;"/></div> |
|
|
|
# DEEPNIGHT ai1 |
|
The 600 Billion+ Parameter Model. |
|
Yes! We did this! |
|
|
|
The second-largest model in the world, right after GPT-4.
|
|
|
--- |
|
|
|
We at [DEEPNIGHT](https://deepnight.tech) have been working on this for quite some time. |
|
We have successfully built ai1, the second-largest model in the world, with 600 billion+ parameters.
|
|
|
`ai1` performs as well as GPT-4 and has a context window of 8k tokens.

ai1 was trained with a new approach: we first trained the model on a corpus of text from various sources, including but not limited to:

- RefinedWeb
- open-source code from GitHub
- Common Crawl

and then fine-tuned it on a huge dataset (generated manually and with automation) for logical understanding and reasoning.
We also trained the model for function-calling capabilities.
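Since ai1 is not yet publicly accessible, its exact function-calling interface is unknown. As a purely hypothetical illustration, a function schema and a minimal dispatcher in the style commonly used for this capability might look like the following (every name here — `get_weather`, `dispatch`, the schema shape — is an assumption, not ai1's real API):

```python
# Hypothetical sketch: no ai1 API is public, so the schema shape and the
# dispatch helper below are illustrative assumptions, not the real interface.
import json

# A function schema describing a tool the model may call.
get_weather_schema = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}

def dispatch(model_output: str, functions: dict) -> str:
    """If the model emitted a JSON function call, run the matching function."""
    call = json.loads(model_output)
    fn = functions[call["name"]]
    return fn(**call["arguments"])

# Toy implementation standing in for a real weather lookup.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

result = dispatch(
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',
    {"get_weather": get_weather},
)
print(result)  # Sunny in Paris
```

The memory units mentioned below would presumably hold schemas like `get_weather_schema` so they survive outside the context window.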
|
|
|
--- |
|
|
|
## What is special about ai1? |
|
ai1 works on a built-in chaining methodology. When it receives input from the user, it first tries to understand that input before generating anything: it composes an instruction-based prompt internally and only then generates the response from that prompt.
The benefit? <b>We'll just say the jobs of Prompt Engineering are over.</b>

Unlike ChatGPT, GPT-4, Llama, and other models, ai1 doesn't require heavy prompt engineering to provide good answers;
the understanding phase in the model takes care of that.
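The chaining flow described above can be sketched as a two-phase pipeline. This is a minimal conceptual sketch under our own assumptions — the phase names and the `generate` stand-in are illustrative, since ai1's internals are not public:

```python
# Hypothetical sketch of the described two-phase chaining flow. The function
# names and the generate() stand-in are assumptions; ai1's internals are not public.

def understand(user_input: str) -> str:
    """Phase 1: turn the raw user input into an internal instruction-style prompt."""
    return (
        "Instruction: respond helpfully to the request below.\n"
        f"Request: {user_input.strip()}"
    )

def generate(instruction_prompt: str) -> str:
    """Phase 2: stand-in for the actual model generation step."""
    return f"[model response to: {instruction_prompt!r}]"

def chained_reply(user_input: str) -> str:
    """Chain the two phases: understand first, then generate."""
    return generate(understand(user_input))

print(chained_reply("  summarize this article  "))
```

The point of the design is that the user's raw text never reaches the generator directly; only the internally derived instruction does.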
|
|
|
What else?

- performs as well as GPT-4
- excels in automation tasks
- can infer the user's emotions from the conversation (while understanding the input in Phase-1), resulting in better, more curated generations
- understands human emotions, which helps the model curate content accordingly
- excels in roleplay
- excels in writing code
- has a few global memory units used to store data outside the context window; these mostly hold function schemas, but ultimately the model decides for itself what to store in them
- costs, on average, about $0.005 per 1,000 tokens
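At the quoted average rate, estimating usage cost is simple arithmetic. A minimal sketch — the rate comes from the list above; the helper function and the 8k example are our own illustration:

```python
# Cost estimate at the quoted average rate of $0.005 per 1,000 tokens.
RATE_PER_1K_TOKENS = 0.005  # USD, the average figure quoted above

def estimate_cost(tokens: int) -> float:
    """Return the estimated USD cost for a given token count."""
    return tokens / 1000 * RATE_PER_1K_TOKENS

# Filling the full 8k-token context window once costs about 4 cents.
print(f"${estimate_cost(8000):.3f}")  # $0.040
```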
|
|
|
--- |
|
|
|
## Future goals |
|
We don't discuss that. Especially after seeing how SOME AI COMPANY, ON THEIR DEV DAY, just used open-source research and publications
to profit themselves... Hah.
|
|
|
--- |
|
|
|
## Are we going to allow access? |
|
Not for some time. We are still running evaluations and have a lot to learn about how this model can be made better. |
|
|
|
--- |
|
|
|
Feel free to reach out to us at [email protected] |
|
|
|
- Team [DEEPNIGHT](https://deepnight.tech) |