---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
- qaic
- qaicrt
---

# Model Information

This model, derived from Meta's Llama-3.1-8B-Instruct, has been converted and optimized to run efficiently on Qualcomm Cloud AI 100 hardware. Leveraging Qualcomm's developer-centric toolchain, it incorporates reengineered Transformer components and precision-optimized graph transformations for improved on-device performance.

# Key Features

- Optimized LLM Blocks: Custom modules handle intermediate states and numerical-precision edge cases to sustain high-performance inference.
- Transformation Tools: Graph modifications apply mathematically equivalent optimizations that improve efficiency while preserving model accuracy.
- Export Ready: Compatible with ONNX for straightforward deployment (see the export sketch after this list).
- Comprehensive Testing: Every pull request undergoes extensive validation, including an MSE comparison of outputs against the original model (see the validation sketch after this list).
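
The Cloud AI 100 workflow typically goes through Qualcomm's efficient-transformers (QEfficient) stack. The snippet below is a minimal sketch, assuming the `QEFFAutoModelForCausalLM` interface from the `QEfficient` package; the import path, the `export`/`compile`/`generate` methods, and argument names such as `num_cores` and `prompts` are assumptions and may differ between releases, so consult the library's documentation for the exact API.

```python
# Minimal sketch (assumed API): load, export to ONNX, compile for Cloud AI 100, and generate.
# QEFFAutoModelForCausalLM and its export()/compile()/generate() methods are assumed to follow
# the QEfficient (quic/efficient-transformers) interface; exact signatures may vary by version.
from transformers import AutoTokenizer
from QEfficient import QEFFAutoModelForCausalLM  # assumed import path

model_id = "meta-llama/Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
qeff_model = QEFFAutoModelForCausalLM.from_pretrained(model_id)

# Export the reengineered graph to ONNX (the "Export Ready" step above).
qeff_model.export()

# Compile the ONNX graph into a Cloud AI 100 binary; num_cores is device-dependent.
qeff_model.compile(num_cores=14)

# Run inference on the attached Cloud AI 100 card.
qeff_model.generate(prompts=["What is the capital of France?"], tokenizer=tokenizer)
```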
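
The MSE-based validation mentioned above can be approximated offline with stock PyTorch and transformers. The sketch below compares the logits of the original model against an optimized variant on a fixed prompt; the optimized checkpoint path and the acceptance threshold are illustrative assumptions, not values used by the actual test suite.

```python
# Illustrative sketch: compare output logits of the original and optimized models via MSE.
# The optimized checkpoint path and the 1e-4 tolerance are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ORIGINAL_ID = "meta-llama/Llama-3.1-8B-Instruct"
OPTIMIZED_ID = "path/to/optimized-checkpoint"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(ORIGINAL_ID)
inputs = tokenizer("The Qualcomm Cloud AI 100 accelerates", return_tensors="pt")

with torch.no_grad():
    original = AutoModelForCausalLM.from_pretrained(ORIGINAL_ID, torch_dtype=torch.float32)
    optimized = AutoModelForCausalLM.from_pretrained(OPTIMIZED_ID, torch_dtype=torch.float32)

    ref_logits = original(**inputs).logits
    opt_logits = optimized(**inputs).logits

# Mean squared error between the two logit tensors; a small value indicates the
# graph transformations preserved the original model's behaviour on this prompt.
mse = torch.mean((ref_logits - opt_logits) ** 2).item()
print(f"logits MSE vs. original model: {mse:.3e}")
assert mse < 1e-4, "MSE exceeds the illustrative tolerance"
```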