---
license: apache-2.0
base_model: pszemraj/jamba-900M-v0.13-KIx2
tags:
  - textbook
  - '16384'
  - long document
metrics:
  - accuracy
language:
  - en
inference: false
---

# BEE-spoke-data/Jamba-900M-doc-writer

To test it out, try this notebook.

This model produces long, surprisingly coherent output that extends some input text; you can see an example here: a generated textbook about underwater city design.

## Model description

This model is a fine-tuned version of pszemraj/jamba-900M-v0.13-KIx2 on some textbook data.

It achieves the following results on the evaluation set:

- Loss: 3.0200
- Accuracy: 0.4544
- Num Input Tokens Seen: 4,940,890,112

## Intended uses & limitations

- Long-context text generation.
- It requires a fairly long prompt (e.g., an "Introduction" section) to be coaxed into consistently producing long, textbook-like text.
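As a starting point, a minimal generation sketch with `transformers` is shown below. This assumes the checkpoint loads through the standard `AutoModelForCausalLM`/`AutoTokenizer` API; the prompt text and all sampling parameters here are illustrative, not recommendations from the model authors.

```python
# Hypothetical usage sketch: load the model and extend a textbook-style prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BEE-spoke-data/Jamba-900M-doc-writer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A longer, "Introduction"-style prompt helps coax sustained output.
prompt = (
    "Introduction\n\n"
    "Designing habitable structures beneath the ocean surface requires "
    "balancing pressure resistance, life support, and construction cost. "
    "This chapter surveys the core engineering principles involved."
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=512,      # raise this for longer continuations
    do_sample=True,
    temperature=0.8,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Generation settings (temperature, repetition penalty, output length) are guesses worth tuning for your use case.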