# SuperQwen-2.5-1.5B

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with Qwen/Qwen2.5-1.5B-Instruct as the base model.
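In brief, TIES-Merging computes a "task vector" (fine-tuned weights minus base weights) for each model, trims each vector to its largest-magnitude entries (the `density` parameter), elects a majority sign per parameter, and averages only the contributions that agree with that sign. The following is a toy NumPy sketch of those steps; `ties_merge` and its simplifications are illustrative assumptions, not mergekit's actual implementation:

```python
import numpy as np

def ties_merge(base, tuned, densities, weights):
    """Toy TIES merge on 1-D parameter vectors (illustrative only)."""
    # 1. Task vectors: difference between each fine-tuned model and the base.
    deltas = [t - base for t in tuned]

    # 2. Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d, rho in zip(deltas, densities):
        k = int(np.ceil(rho * d.size))
        thresh = np.sort(np.abs(d))[-k] if k > 0 else np.inf
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # 3. Elect signs: sign of the weighted sum of trimmed deltas, per entry.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights, as `normalize: true` requests
    stacked = np.stack(trimmed)
    sign = np.sign((w[:, None] * stacked).sum(axis=0))

    # 4. Disjoint merge: average only the entries that agree with the sign.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    num = (w[:, None] * stacked * agree).sum(axis=0)
    den = (w[:, None] * agree).sum(axis=0)
    merged_delta = np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    return base + merged_delta
```

Entries where the fine-tuned models disagree in sign contribute only from the majority side, which is what lets TIES combine a math- and a code-specialized model with less interference than plain weight averaging.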

### Models Merged

The following models were included in the merge:

* [Qwen/Qwen2.5-Math-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B-Instruct)
* [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: ties  # Use TIES for merging multiple models
base_model: Qwen/Qwen2.5-1.5B-Instruct  # Base model for the merge
dtype: bfloat16  # Data type for the merged model

models:
  - model: Qwen/Qwen2.5-1.5B-Instruct  # Base model
    parameters:
      weight: 0.5  # Weight for the base model

  - model: Qwen/Qwen2.5-Math-1.5B-Instruct  # Math-focused model
    parameters:
      density: 0.6  # Retain 60% of significant parameters
      weight: 0.3  # Weight for the math model

  - model: Qwen/Qwen2.5-Coder-1.5B-Instruct  # Code-focused model
    parameters:
      density: 0.6  # Retain 60% of significant parameters
      weight: 0.2  # Weight for the coder model

parameters:
  normalize: true  # Normalize weights to ensure compatibility
  int8_mask: true  # Optimize memory and computational efficiency
```
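As a rough sketch of what `normalize: true` requests: the per-model weights are rescaled by their sum so they form a convex combination. The weights above (0.5, 0.3, 0.2) already sum to 1.0, so normalization is a no-op here, but it guards against configurations whose weights do not. A minimal illustration (the helper name is hypothetical, not a mergekit API):

```python
def normalize_weights(weights):
    """Rescale merge weights so they sum to 1 (the effect of `normalize: true`)."""
    total = sum(weights)
    return [w / total for w in weights]

print(normalize_weights([0.5, 0.3, 0.2]))  # already a convex combination
print(normalize_weights([1.0, 0.6, 0.4]))  # rescaled to the same ratios
```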
