---
license: mit
tags:
- coreml
- ANE
- DeepSeek
- Apple
- Apple Neural Engine
---

# ANEMLL

**ANEMLL** (pronounced like “animal”) is an open-source project focused on accelerating the porting of Large Language Models (LLMs) to tensor processors, starting with the Apple Neural Engine (ANE).

The goal is to provide a fully open-source pipeline from model conversion to inference for common LLM architectures running on ANE.

This enables seamless integration and on-device inference for low-power applications on edge devices, ensuring maximum privacy and security.

This is critical for autonomous applications, where models run directly on the device without requiring an internet connection.

---

## License

ANEMLL is licensed under the [MIT License](https://opensource.org/license/mit).
The model is based on Meta’s LLaMA 3.1 8B architecture and may require a separate license.

This test model is exclusively a CoreML conversion of Meta's LLaMA 3.2 1B (1024 context) model,
released before the official launch of the ANEMLL repository and with minimal documentation.
It is intended only for early adopters who requested an early release.

---

## Requirements

- **macOS Sequoia** with Apple Neural Engine and 16 GB RAM
- **CoreML Tools** and **HuggingFace Transformers** libraries
- **Python 3.9**

`chat.py` provides a sample inference script.
*We apologize for the current quality of `chat.py` and appreciate your patience.*

**Installation**

Unzip all ZIP archives containing the CoreML model files, using Finder or bash:

```bash
# Run from inside the downloaded model directory
cd ./anemll-DeepSeek-8B-ctx1024
# Extract every ZIP archive found under this directory into the current directory
find . -type f -name "*.zip" -exec unzip {} \;
```

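For readers who prefer to stay in Python, the unzip step above can be sketched with the standard library alone. This is a minimal sketch, not part of ANEMLL: the helper name `unzip_all` is hypothetical, and it assumes the same layout as the `find` command, extracting every archive into the model directory itself.

```python
import zipfile
from pathlib import Path


def unzip_all(model_dir):
    """Extract every .zip file under model_dir into model_dir.

    Returns the list of archives that were extracted.
    """
    root = Path(model_dir)
    # rglob walks subdirectories too, like `find . -type f -name "*.zip"`.
    archives = sorted(p for p in root.rglob("*.zip") if p.is_file())
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            # Extract relative to model_dir, mirroring `unzip` run from that directory.
            zf.extractall(root)
    return archives


# Example call (the directory name is taken from the bash command above):
#   unzip_all("./anemll-DeepSeek-8B-ctx1024")
```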
Install the Python dependencies:

```bash
pip install coremltools transformers
```

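Before running `chat.py`, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not part of ANEMLL, and it only reports what it finds rather than enforcing anything.

```python
import importlib.util
import platform
import sys


def check_environment(packages=("coremltools", "transformers")):
    """Return a dict describing the interpreter, OS, and installed packages."""
    report = {
        # The requirements list Python 3.9; this records the running version.
        "python": "{}.{}".format(*sys.version_info[:2]),
        # platform.system() returns "Darwin" on macOS.
        "is_macos": platform.system() == "Darwin",
    }
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```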
**Coremltools:**

See the coremltools installation guide: https://coremltools.readme.io/v4.0/docs/installation

**How to run:**

`python chat.py`

Press Ctrl-D to exit or Ctrl-C to interrupt inference.

**Alternative way to run:**

```bash
python chat.py Q123 -d /path/to/anemll-DeepSeek-8B-ctx1024 ctx=1024
```

The first time the model loads, macOS will take some time to place it on the device.
Subsequent loads will be almost instantaneous.

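To see the difference between the first and later loads on your own machine, a small timing helper is enough. This is a sketch under stated assumptions: `time_call` is a hypothetical helper, and the commented usage assumes the unzipped files can be opened with coremltools model classes. The path shown is a placeholder, not an actual file name from this repository.

```python
import time


def time_call(loader):
    """Run loader() once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = loader()
    return result, time.perf_counter() - start


# Hypothetical usage on macOS (placeholder path; choose the coremltools class
# that matches the file type, e.g. MLModel for .mlpackage or CompiledMLModel
# for .mlmodelc):
#   import coremltools as ct
#   load = lambda: ct.models.CompiledMLModel("/path/to/model.mlmodelc")
#   _, first = time_call(load)    # first load includes on-device placement
#   _, second = time_call(load)   # a later load is expected to be much shorter
#   print(f"first: {first:.1f}s, second: {second:.1f}s")
```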
**More Info**

Please check the following links for later updates:

- https://huggingface.co/anemll
- https://x.com/anemll
- https://github.com/anemll
- https://anemll.com