license: apache-2.0

Ask2Democracy project


What is the baizemocracy-lora-7B-cfqa model?

This model is an open-source chat model fine-tuned with LoRA, inspired by the Baize project. It was trained on the Baize datasets and the ask2democracy-cfqa-salud-pension dataset, which contains almost 4k instructions for answering questions based on a context relevant to citizen concerns and public debate in Spanish. Two main experimental models were trained during the Hackathon Somos NLP 2023: a conversational-style model and a context-focused model. This model is the conversational one, geared toward asking questions in a more conversational way. See the Pre-processing dataset section. The other variation, Baizemocracy-contextfocused, is geared toward context-based retrieval augmentation.

Testing is a work in progress. We decided to share both model variations with the community so that more people can experiment with them, find out what works better, and discover other possible use cases.

Training Parameters

  • Base Model: LLaMA-7B
  • Training Epoch: 1
  • Batch Size: 16
  • Maximum Input Length: 512
  • Learning Rate: 2e-4
  • LoRA Rank: 8
  • Updated Modules: All Linears
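To give a rough sense of what "LoRA Rank: 8" on "All Linears" means in trainable parameters, here is a back-of-the-envelope sketch. It is not taken from the training code. It assumes the standard LLaMA-7B dimensions (hidden size 4096, MLP size 11008, 32 decoder blocks) and assumes that "All Linears" covers the seven projection matrices inside each decoder block, excluding the embeddings and the output head. A LoRA adapter of rank r on a linear layer of shape d_in x d_out adds r * (d_in + d_out) parameters.

```python
# Rough count of LoRA trainable parameters for the settings listed above.
# Assumptions (not stated in the model card): standard LLaMA-7B dimensions,
# and "All Linears" = the 7 projection matrices in each decoder block.
HIDDEN = 4096         # model width
INTERMEDIATE = 11008  # MLP inner width
NUM_LAYERS = 32       # decoder blocks
LORA_RANK = 8         # from the Training Parameters list


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


per_block = (
    4 * lora_params(HIDDEN, HIDDEN, LORA_RANK)            # q, k, v, o projections
    + 2 * lora_params(HIDDEN, INTERMEDIATE, LORA_RANK)    # gate and up projections
    + lora_params(INTERMEDIATE, HIDDEN, LORA_RANK)        # down projection
)
total = NUM_LAYERS * per_block

print(f"{total:,}")  # 19,988,480
```

Under these assumptions the adapters hold about 20 million trainable weights, well under 1% of the roughly 7 billion parameters in the frozen base model.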

Training Dataset

<code>
def format_instruction_without_context(example):
    """Turn one record into a Baize-style conversation that omits the context passage."""
    # Keep the original question as a short "topic" label.
    example["topic"] = example["input"]
    prompt = "La conversación entre un humano y un asistente de IA."
    prompt += "\n[|Human|] " + example["input"]
    prompt += "\n[|AI|] " + example["output"]
    # When topics are present, add a second turn asking the assistant to classify its answer.
    if len(example["topics"]) > 0:
        topics = ", ".join(example["topics"])
        prompt += "\n[|Human|] " + "¿En cuáles tópicos clasificarías su respuesta?"
        prompt += "\n[|AI|] " + f"Aquí una lista de tópicos: {topics}."
        example["topic"] += f" ({topics})"
    example["input"] = prompt
    return example

# data_reforma_salud_cfqa is assumed to be a dataset loaded earlier.
data_reforma_salud_cfqa_without_context = data_reforma_salud_cfqa.map(
    format_instruction_without_context,
    remove_columns=["output", "topics", "instruction"],
)
data_reforma_salud_cqa_withou
</code>
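To make the resulting prompt format concrete, the sketch below applies the same formatting logic to a single record. The record is hypothetical and invented for illustration; it is not taken from the ask2democracy-cfqa-salud-pension dataset. The function is repeated here only so the snippet runs on its own, without the `datasets` library.

```python
def format_instruction_without_context(example):
    # Same logic as the function above, repeated so this sketch is self-contained.
    example["topic"] = example["input"]
    prompt = "La conversación entre un humano y un asistente de IA."
    prompt += "\n[|Human|] " + example["input"]
    prompt += "\n[|AI|] " + example["output"]
    if len(example["topics"]) > 0:
        topics = ", ".join(example["topics"])
        prompt += "\n[|Human|] " + "¿En cuáles tópicos clasificarías su respuesta?"
        prompt += "\n[|AI|] " + f"Aquí una lista de tópicos: {topics}."
        example["topic"] += f" ({topics})"
    example["input"] = prompt
    return example


# Hypothetical record: field values are made up for this example.
sample = {
    "instruction": "Responde la pregunta.",
    "input": "¿Qué es una EPS?",
    "output": "Una EPS es una entidad promotora de salud.",
    "topics": ["salud", "EPS"],
}

formatted = format_instruction_without_context(dict(sample))
print(formatted["input"])
print(formatted["topic"])
```

The printed prompt contains two `[|Human|]` / `[|AI|]` turns: the question and answer, followed by the topic-classification exchange. The `topic` field holds the original question with the topics appended in parentheses.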


More details can be found in the Ask2Democracy [GitHub repository](https://github.com/jorge-henao/ask2democracy).