Model Information

This model, derived from Meta’s Llama-3.1-8B-Instruct, has been converted and optimized to run efficiently on Qualcomm Cloud AI 100 hardware. Using Qualcomm's developer toolchain, it incorporates reengineered Transformer components and precision-aware graph transformations to improve inference performance on the accelerator.

Key Features

  • Optimized LLM Blocks: Includes custom modules to handle intermediate states and precision challenges, ensuring high-performance inference.
  • Transformation Tools: Supports graph modifications to retain model accuracy while improving efficiency through mathematical optimizations.
  • Export Ready: Compatible with ONNX for easy deployment.
  • Comprehensive Testing: Each PR is validated by comparing the converted model's outputs against the original model's, using mean squared error (MSE) as the metric.
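
The MSE comparison mentioned in the last bullet can be illustrated with a minimal sketch. This is not the toolchain's actual validation code: the function names, the toy values standing in for model logits, and the `tolerance` threshold are all assumptions made for illustration.

```python
from typing import Sequence


def mean_squared_error(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Average squared difference between two equally sized output vectors."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    if len(reference) == 0:
        raise ValueError("outputs must not be empty")
    total = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    return total / len(reference)


def outputs_match(
    reference: Sequence[float],
    candidate: Sequence[float],
    tolerance: float = 1e-3,  # hypothetical threshold, not taken from the model card
) -> bool:
    """Return True when the candidate stays within the MSE tolerance."""
    return mean_squared_error(reference, candidate) <= tolerance


# Toy values standing in for flattened logits from the original model
# and from the converted model (illustrative only).
reference_logits = [0.5, -1.25, 2.0, 0.0]
candidate_logits = [0.5, -1.0, 2.0, 0.5]

if __name__ == "__main__":
    print(mean_squared_error(reference_logits, candidate_logits))
    print(outputs_match(reference_logits, candidate_logits))
```

In practice the two vectors would be the flattened logits produced by the original and the converted model for the same prompt, and the threshold would be chosen per precision format.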

Model repository: Hyratek/Llama-3.1-8B-Instruct-QAIC