Model Card: Fine-tuned LLaMA 3.2 Model

Model Description

This model is a fine-tuned version of LLaMA 3.2, adapted for tasks in learning analytics and education-system improvement. It was trained on a curated dataset of question-answer pairs and dialogue data, with the goal of producing high-quality responses in educational and analytical contexts.

Key Features:

  • Base Model: LLaMA 3.2
  • Model Size: 3.21B parameters (Safetensors, BF16)
  • Fine-tuning Approach: Supervised fine-tuning on a structured question-answer dataset.
  • Domains Covered: Education systems, learning analytics, review/meta-analysis literature, and strategies for academic success.
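
A minimal usage sketch, assuming the standard Hugging Face transformers API; the prompt and generation settings are illustrative, and prompt formatting may depend on the chat template used during fine-tuning:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibrahimBlyc/Llama_be_LA_"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # tensor type reported on this card
    device_map="auto",
)

prompt = "How can dropout rates be reduced?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))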

Training Data

The fine-tuning dataset was curated to ensure the quality and relevance of the model's responses. It contains over 22,500 entries in two primary formats:

  1. ShareGPT-style dialogues: full multi-turn conversations between a human and another actor (e.g., an AI); an illustrative entry appears below.
  2. Alpaca-style question-answer pairs: concise input and output information in a Q&A format (see "Example of Dataset Entries" further down).
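
For reference, an illustrative ShareGPT-style entry. The field names follow the common ShareGPT convention; the content below is invented for illustration and is not drawn from the dataset:

[
  {
    "conversations": [
      { "from": "human", "value": "What does learning analytics measure?" },
      { "from": "gpt", "value": "Learning analytics measures and analyzes data about learners and their contexts in order to understand and optimize learning." }
    ]
  }
]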

Dataset Creation Process:

1. Literature-Based Question-Answer Pairs:

  • Lens.org Collection:
    • Papers filtered using keywords such as "review" and "meta-analysis".
    • Abstracts were extracted as concise summaries of objectives, methods, and conclusions.
    • A Python program using the Gemini API generated a relevant question for each abstract (a sketch of this step appears after this list).
    • Data Size: 14,000 question-answer pairs.
  • Scopus.com Collection:
    • Focused on the keyword "learning analytics".
    • An additional 8,000 question-answer pairs were generated with the same methodology.
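
A minimal sketch of the question-generation step, assuming the google-generativeai Python client; the model name, prompt wording, and helper function are illustrative rather than the exact program used:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder; supply your own key
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

def question_for_abstract(abstract: str) -> str:
    # Ask Gemini for one concise question that the given abstract answers.
    prompt = (
        "Write one concise question that the following abstract answers:\n\n"
        + abstract
    )
    response = model.generate_content(prompt)
    return response.text.strip()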

2. ChatGPT Recommendations for Education System Improvements:

  • High-quality recommendations generated by ChatGPT on topics such as:
    • Reducing dropout rates.
    • Combating academic failure.
    • Supporting student success.
  • Data Size: 544 question-answer pairs.

Example of Dataset Entries (Alpaca-style):

[
  {
    "instruction": "What are the key factors influencing student success?",
    "output": "Key factors include teacher effectiveness, parental involvement, and access to educational resources."
  },
  {
    "instruction": "How can dropout rates be reduced?",
    "output": "Dropout rates can be reduced by implementing early intervention programs, providing mentorship opportunities, and addressing socio-economic barriers."
  }
]

Dataset Highlights:

  • Over 22,500 entries spanning multiple sub-domains within education and learning analytics.
  • Data curated to ensure clarity, relevance, and high-quality question-answer pairs.

Intended Use Cases

  • Education Research: Assisting researchers and educators in analyzing learning trends and strategies.
  • Learning Analytics: Providing insights into educational systems, success factors, and intervention strategies.
  • Academic Assistance: Answering domain-specific questions in education.

Limitations

  • The model is fine-tuned for education and learning analytics; performance in unrelated domains may degrade.
  • Limited coverage of topics outside the dataset's scope.

Ethical Considerations

  • The model may reflect biases present in the training data, such as those inherent in academic literature or AI-generated content.
  • Users should verify critical outputs, especially in high-stakes scenarios such as policy-making or educational interventions.

Citation

If you use this model in your research or applications, please cite:

@misc{llama3_finetuned_education,
  title={Fine-tuned LLaMA 3.2 for Learning Analytics},
  author={Ibrahim Belayachi},
  year={2025},
  howpublished={\url{https://huggingface.co/ibrahimBlyc/Llama_be_LA_}},
  note={Fine-tuned on education and learning analytics datasets}
}

Contact

For questions or feedback, please contact Ibrahim Belayachi at [email protected].
