Update README.md

---
license: llama3.2
---

# Introduction

This repository hosts the **Llama 3.2** models for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes both the **1B** and **3B** versions of the model, as well as their **quantized** variants in `.pte` format, ready for use in the **ExecuTorch** runtime.

If you'd like to run these models in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.

## Compatibility

If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the **ExecuTorch** version used to export the `.pte` files. For more details, see the compatibility note in the [ExecuTorch GitHub repository](https://github.com/pytorch/executorch/blob/11d1742fdeddcf05bc30a6cfac321d2a2e3b6768/runtime/COMPATIBILITY.md?plain=1#L4). If you work with React Native ExecuTorch, the constants exported by the library guarantee compatibility with the runtime used behind the scenes.

These models were exported using ExecuTorch commit `fe20be98c`, and **no forward compatibility** is guaranteed: older versions of the runtime may not work with these files.
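
For instance, with React Native ExecuTorch, loading a model could look like the minimal sketch below. It assumes the library exports a `useLLM` hook together with model and tokenizer constants such as `LLAMA3_2_1B_QLORA` and `LLAMA3_2_1B_TOKENIZER`; check the library documentation for the exact names.

```typescript
import {
  useLLM,
  LLAMA3_2_1B_QLORA,
  LLAMA3_2_1B_TOKENIZER,
} from 'react-native-executorch';

function ChatScreen() {
  // The constant names above are assumptions; the library's bundled
  // constants are meant to resolve to .pte/.bin files that match the
  // ExecuTorch runtime it ships with, keeping both versions in sync.
  const llama = useLLM({
    modelSource: LLAMA3_2_1B_QLORA,
    tokenizerSource: LLAMA3_2_1B_TOKENIZER,
  });

  // The returned object exposes loading and generation state; see the
  // library docs for its exact shape.
  // ...
}
```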

### Repository Structure

The repository is organized into two main directories:

- `llama-3.2-1B`
- `llama-3.2-3B`

Each directory contains different versions of the model, including **QLoRA**, **SpinQuant**, and the **original** models.

- The `.pte` file should be passed to the `modelSource` parameter.
- The corresponding `.bin` file should be used for `tokenizerSource`, as in the sketch below.
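
Putting the two together, here is a minimal sketch of wiring both files into the `useLLM` hook. The URLs below are placeholders; substitute the actual `.pte` and `.bin` paths from this repository, or bundle the files as local assets.

```typescript
import { useLLM } from 'react-native-executorch';

// Placeholder URLs: replace with the real file paths from this repository.
const llama = useLLM({
  modelSource: 'https://example.com/llama-3.2-1B/original/llama3_2.pte',
  tokenizerSource: 'https://example.com/llama-3.2-1B/original/tokenizer.bin',
});
```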

If you wish to export the model yourself, you’ll need to obtain the model weights and the `params.json` file from the official repositories, which can be found [here](https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf).

For the **best performance-to-quality ratio**, we highly recommend the **QLoRA** version, which is optimized for speed without sacrificing too much model quality.