Optimized LLMs for Ryzen AI.
The models are quantized and tested with Ryzen AI SW 1.2.
The folder quantized_models contains a set of LLMs quantized via different algorithms.
The folder onnx_model_nodes contains the mlp.up_proj nodes for the -npugpu device-parallel configuration.
Because the Ryzen AI Python environment has conflicting dependency requirements, you may need to install a different version of transformers depending on the model family:
transformers==4.39.1  # Gemma
transformers>=4.42    # Mistral
transformers==4.37.2  # all other models
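As a convenience, the version selection above can be sketched as a small lookup helper. This is only an illustrative sketch; the function name and mapping are hypothetical, with the pins taken from the list above.

```python
# Hypothetical helper: map a model name to the transformers version pin
# required by its family, per the version list above.
REQUIRED_TRANSFORMERS = {
    "gemma": "transformers==4.39.1",
    "mistral": "transformers>=4.42",
}
DEFAULT_PIN = "transformers==4.37.2"  # all other model families


def transformers_pin(model_name: str) -> str:
    """Return the pip requirement string for the given model name."""
    name = model_name.lower()
    for family, pin in REQUIRED_TRANSFORMERS.items():
        if family in name:
            return pin
    return DEFAULT_PIN


print(transformers_pin("google/gemma-2b"))        # transformers==4.39.1
print(transformers_pin("mistralai/Mistral-7B"))   # transformers>=4.42
print(transformers_pin("meta-llama/Llama-2-7b"))  # transformers==4.37.2
```

The returned string can be passed directly to pip, e.g. pip install "transformers==4.39.1" before loading a Gemma checkpoint.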