Optimized LLMs for Ryzen AI.
The models are quantized and tested with Ryzen AI SW 1.2.
The folder quantized_models contains a set of LLMs quantized via different algorithms.
The folder onnx_model_nodes contains the mlp.up_proj nodes for the -npugpu device-parallel configuration.
Because the Ryzen AI Python environment has conflicting dependency requirements, you may need to install a different version of transformers depending on the model family:
transformers==4.39.1  # Gemma
transformers>=4.42    # Mistral
transformers==4.37.2  # all other models
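As a convenience, the version selection above can be sketched as a small lookup helper. This is only an illustrative sketch; the function name and mapping are hypothetical, with the pins taken from the list above.

```python
# Hypothetical helper: map a model name to the transformers version pin
# required by its family, per the version list above.
REQUIRED_TRANSFORMERS = {
    "gemma": "transformers==4.39.1",
    "mistral": "transformers>=4.42",
}
DEFAULT_PIN = "transformers==4.37.2"  # all other model families


def transformers_pin(model_name: str) -> str:
    """Return the pip requirement string for the given model name."""
    name = model_name.lower()
    for family, pin in REQUIRED_TRANSFORMERS.items():
        if family in name:
            return pin
    return DEFAULT_PIN


print(transformers_pin("google/gemma-2b"))        # transformers==4.39.1
print(transformers_pin("mistralai/Mistral-7B"))   # transformers>=4.42
print(transformers_pin("meta-llama/Llama-2-7b"))  # transformers==4.37.2
```

The returned string can be passed directly to pip, e.g. pip install "transformers==4.39.1" before loading a Gemma checkpoint.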