# LOHAMEIT 3.2: Next-Gen Conversational AI

Welcome to LOHAMEIT 3.2, a conversational AI model built from Llama-3.2-3B-Instruct in MLC format with q4f16_0 quantization. This repository leverages modern natural language processing (NLP) techniques for scalable, efficient, real-time conversational agents, with a focus on multi-turn conversation tracking and inference optimization.

This project is compatible with MLC-LLM and WebLLM, enabling easy integration into various platforms for dynamic interaction.
## Model Overview
- Base Model: Llama-3.2-3B-Instruct
- Format: Quantized to q4f16_0
- Compatibility: MLC-LLM and WebLLM

The model is optimized for both server-side and local inference environments, enabling real-time, contextually aware conversation with low memory and hardware requirements.
## Key Features
- High Efficiency: Supports quantized model execution, making it ideal for resource-constrained environments.
- Multi-Turn Conversations: Tracks conversational context for coherent, human-like responses.
- Optimized for Deployment: Ready to be used in web-based interfaces via WebLLM or command-line applications.
## Why LOHAMEIT 3.2?
This project provides a scalable, highly efficient NLP model suitable for a wide range of applications, from interactive bots to more complex AI-driven systems. LOHAMEIT 3.2 offers:
- Real-Time Inference: Designed for immediate response times, ideal for live interactions.
- Low Hardware Requirements: Runs efficiently on CPUs and lower-end GPUs.
- Flexible Integration: Works across command-line, server, and web-based applications, supporting a variety of use cases.
## Installation and Usage
Before starting, ensure that the MLC-LLM library is installed on your system. For installation instructions, visit the MLC-LLM Installation Documentation.
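For reference, the MLC-LLM project distributes prebuilt nightly wheels. A typical CPU-only install looks like the sketch below; package names vary by platform (CUDA, Metal, Vulkan) and change over time, so defer to the official documentation for your setup:

```bash
# Install MLC-LLM prebuilt wheels (CPU build shown; see the MLC-LLM docs
# for the GPU-specific package variants -- names differ per platform).
python -m pip install --pre -U -f https://mlc.ai/wheels mlc-llm-nightly-cpu mlc-ai-nightly-cpu
```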
### Example Commands

**Chat:**

```bash
mlc_llm chat HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_0-MLC
```

**REST Server:**

```bash
mlc_llm serve HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_0-MLC
```
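Once running, the server exposes an OpenAI-compatible REST API (by default on `http://127.0.0.1:8000`). A minimal sketch of querying it from Python, assuming the default host/port and the `requests` package:

```python
import requests

# The server exposes an OpenAI-compatible chat completions endpoint.
# Host and port below are MLC-LLM defaults; adjust if you launched the
# server with different --host/--port flags.
payload = {
    "model": "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_0-MLC",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}
resp = requests.post("http://127.0.0.1:8000/v1/chat/completions", json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```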
**Python API:**

```python
from mlc_llm import MLCEngine

# Create the engine
model = "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_0-MLC"
engine = MLCEngine(model)

# Run a streaming chat completion
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

engine.terminate()
```
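Since multi-turn tracking is a headline feature, here is one way to carry context across turns with the same engine: replay the accumulated message history on each call. This is an illustration of the pattern, not code from this repository:

```python
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_0-MLC"
engine = MLCEngine(model)

# Accumulate the conversation so each new turn sees the full history.
history = []
for user_turn in ["Who wrote 'Dune'?", "What year was it published?"]:
    history.append({"role": "user", "content": user_turn})
    reply = ""
    for response in engine.chat.completions.create(
        messages=history, model=model, stream=True
    ):
        for choice in response.choices:
            reply += choice.delta.content or ""
    print(f"> {user_turn}\n{reply}\n")
    history.append({"role": "assistant", "content": reply})

engine.terminate()
```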
## Project Structure
```
LOHAMEIT-3.2/
├── models/           # Model weights and configurations
├── src/              # Source code for model interaction and API integration
├── scripts/          # Utility scripts for deployment
├── README.md         # This file
└── requirements.txt  # Python dependencies
```
## Installation Guide
1. **Clone the repository:**

   ```bash
   git clone https://github.com/LOHAMEIT/lohameit-3.2.git
   cd lohameit-3.2
   ```

2. **Install dependencies:** install the required packages from the provided `requirements.txt` file (see the sketch after these steps).

3. **Run the model:** follow the instructions above to launch the chat interface or deploy the REST server.
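A minimal way to perform step 2, assuming a standard pip environment (the repository's actual tooling may differ):

```bash
# Install the project's Python dependencies into the current environment.
pip install -r requirements.txt
```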
## Future Enhancements
- Quantization Improvements: support for additional quantization schemes to further reduce resource usage while maintaining performance.
- Enhanced Conversational Tracking: Improved context management for even more natural conversations over extended sessions.
## Contact
For questions, feedback, or collaboration, feel free to reach out at [email protected].
A heartfelt thank you to G. Akshitha from MLRIT College for her valuable contributions and support throughout this project. Your insights were instrumental in shaping key aspects of LOHAMEIT 3.2.