---
license: apache-2.0
---

# Model Card for bling-qa-tool

**bling-qa-tool** is a 4_K_M quantized GGUF version of bling-tiny-llama-1b-v0, providing a small, fast inference implementation.

Load in your favorite GGUF inference engine (see details in config.json to set up the prompt template), or try with llmware as follows:

```python
from llmware.models import ModelCatalog

# to load the model and make a basic inference
qa_tool = ModelCatalog().load_model("bling-qa-tool")
response = qa_tool.function_call(text_sample)

# this one line will download the model and run a series of tests
ModelCatalog().test_run("bling-qa-tool", verbose=True)
```

Slim models can also be loaded even more simply as part of multi-model, multi-step LLMfx calls:

```python
from llmware.agents import LLMfx

llm_fx = LLMfx()
llm_fx.load_tool("quick_question")
response = llm_fx.quick_question(text)
```

### Model Description

- **Developed by:** llmware
- **Model type:** GGUF
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Quantized from model:** llmware/bling-tiny-llama-1b-v0

## Uses

Model instructions, details, and test samples have been packaged into the config.json file in the repository, along with the GGUF file.

## Model Card Contact

Darren Oberst & llmware team
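If you are loading the GGUF file in another inference engine rather than through llmware, you will need to apply the prompt template yourself. The sketch below shows one way to build such a wrapper by hand; the `<human>: ... <bot>:` format is an assumption based on the bling model family, and the authoritative template should be taken from config.json in this repository.

```python
# Hypothetical sketch: hand-building a bling-style instruct prompt.
# The exact template is packaged in config.json in this repo; the
# "<human>: ... <bot>:" wrapper below is an assumption, not a spec.

def build_prompt(context: str, question: str) -> str:
    """Wrap a passage and a question in a bling-style prompt template."""
    return f"<human>: {context}\n{question}\n<bot>:"

prompt = build_prompt(
    "The invoice total was $1,250, due on March 15.",
    "What is the invoice total?",
)
print(prompt)
```

The resulting string can then be passed to any GGUF runtime's completion call in place of a raw question, so the model sees context and question in the format it was fine-tuned on.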