---
license: apache-2.0
---

# Model Card for bling-qa-tool

**bling-qa-tool** is a 4_K_M quantized GGUF version of bling-tiny-llama-1b-v0, providing a small, fast inference implementation.

Load in your favorite GGUF inference engine (see details in config.json to set up the prompt template), or try with llmware as follows:

```python
from llmware.models import ModelCatalog

# to load the model and make a basic inference
qa_tool = ModelCatalog().load_model("bling-qa-tool")
response = qa_tool.function_call(text_sample)

# this one line will download the model and run a series of tests
ModelCatalog().test_run("bling-qa-tool", verbose=True)
```

Slim models can also be loaded even more simply as part of multi-model, multi-step LLMfx calls:

```python
from llmware.agents import LLMfx

llm_fx = LLMfx()
llm_fx.load_tool("quick_question")
response = llm_fx.quick_question(text)
```

### Model Description

- **Developed by:** llmware
- **Model type:** GGUF
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Quantized from model:** llmware/bling-tiny-llama-1b-v0

## Uses

Model instructions, details, and test samples have been packaged into the config.json file in the repository, along with the GGUF file.

## Model Card Contact

Darren Oberst & llmware team
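If you are loading the GGUF file in another inference engine rather than through llmware, you will need to apply the prompt template yourself. The sketch below shows one way to build such a wrapper by hand; the `<human>: ... <bot>:` format is an assumption based on the bling model family, and the authoritative template should be taken from config.json in this repository.

```python
# Hypothetical sketch: hand-building a bling-style instruct prompt.
# The exact template is packaged in config.json in this repo; the
# "<human>: ... <bot>:" wrapper below is an assumption, not a spec.

def build_prompt(context: str, question: str) -> str:
    """Wrap a passage and a question in a bling-style prompt template."""
    return f"<human>: {context}\n{question}\n<bot>:"

prompt = build_prompt(
    "The invoice total was $1,250, due on March 15.",
    "What is the invoice total?",
)
print(prompt)
```

The resulting string can then be passed to any GGUF runtime's completion call in place of a raw question, so the model sees context and question in the format it was fine-tuned on.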