mwitiderrick committed on
Commit
0484553
·
1 Parent(s): 68dc6a4

Update README.md

Files changed (1)
  1. README.md +0 -1
README.md CHANGED
@@ -70,7 +70,6 @@ pip install -e "sparseml[transformers]"
  python sparseml/src/sparseml/transformers/sparsification/obcq/obcq.py GeneZC/MiniChat-3B open_platypus --recipe recipe.yaml --save True
  python sparseml/src/sparseml/transformers/sparsification/obcq/export.py --task text-generation --model_path obcq_deployment
  cp deployment/model.onnx deployment/model-orig.onnx
- python onnx_kv_inject.py --input-file deployment/model-orig.onnx --output-file deployment/model.onnx
  ```
  Run this kv-cache injection to speed up the model at inference by caching the Key and Value states:
  ```python
 