---
base_model:
  - bamec66557/MNRP_0.5
  - bamec66557/MISCHIEVOUS-12B
library_name: transformers
tags:
  - mergekit
  - merge
---

# merge

This is a merge of pre-trained language models created using mergekit.

## Merge Details

### Merge Method

This model was merged using the SLERP merge method, with bamec66557/MISCHIEVOUS-12B as the base model.
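SLERP (spherical linear interpolation) blends two weight tensors along the arc between them rather than along a straight line, which preserves the magnitude of the interpolated weights better than plain averaging. A minimal sketch of the idea on flattened weight vectors (plain Python for illustration; this is not mergekit's actual implementation):

```python
import math

def slerp(t, a, b, eps=1e-8):
    # Spherical linear interpolation between two flat weight vectors.
    # t = 0 returns a, t = 1 returns b, values in between follow the arc.
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    dot = sum(x * y for x, y in zip(a, b)) / (na * nb + eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:
        # Vectors are nearly parallel: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    wa = math.sin((1 - t) * theta) / s
    wb = math.sin(t * theta) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

a = [1.0, 0.0]
b = [0.0, 1.0]
mid = slerp(0.5, a, b)  # halfway along the unit arc; norm stays 1
```

Unlike a linear average (which would give `[0.5, 0.5]`, norm ≈ 0.71), the SLERP midpoint of two unit vectors is itself a unit vector.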

### Models Merged

The following models were included in the merge:

* bamec66557/MNRP_0.5
* bamec66557/MISCHIEVOUS-12B

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: bamec66557/MNRP_0.5
        layer_range: [0, 40]  # layer range of MNRP_0.5 to merge
      - model: bamec66557/MISCHIEVOUS-12B
        layer_range: [0, 40]  # layer range of MISCHIEVOUS-12B to merge

    # Tune the merge ratio per layer to encourage a smoother integration.
    # Each filter targets a specific mechanism inside the model.
    parameters:
      t:
        - filter: self_attn
          value: [0.2, 0.4, 0.6, 0.8, 1.0]  # gradually increase the self-attention merge ratio
        - filter: mlp
          value: [0.8, 0.6, 0.4, 0.2, 0.0]  # merge MLP layers with the opposite ratio
        - filter: layer_norm
          value: [0.5, 0.5, 0.5, 0.5, 0.5]  # merge layer normalization uniformly
        - value: 0.7  # default value

merge_method: slerp  # use slerp as the merge method

base_model: bamec66557/MISCHIEVOUS-12B  # base model for the merge

dtype: bfloat16  # data type for efficient, fast computation during the merge

# Additional options
regularization:
  - method: l2_norm  # stabilize the merged weights with L2 normalization
    scale: 0.01

postprocessing:
  - operation: smoothing  # smooth the weights after merging
    kernel_size: 3
  - operation: normalize  # normalize all weights
```
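The `t` lists above act as gradients: each list is treated as a set of anchor values spread across the layer range, with the per-layer ratio interpolated between anchors, so `[0.2, 0.4, 0.6, 0.8, 1.0]` ramps the self-attention blend from 0.2 in the earliest layers to 1.0 in the last. A rough sketch of that expansion (`expand_gradient` is a hypothetical helper, not mergekit's code):

```python
def expand_gradient(anchors, n_layers):
    # Linearly interpolate a short list of anchor t-values across n_layers
    # layers, mimicking how a gradient list is stretched over a layer range.
    if n_layers == 1:
        return [anchors[0]]
    out = []
    for i in range(n_layers):
        # Position of this layer on the anchor scale, from 0 to len(anchors)-1.
        pos = i / (n_layers - 1) * (len(anchors) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        out.append(anchors[lo] * (1 - frac) + anchors[hi] * frac)
    return out

# Per-layer self-attention ratios for the 40 merged layers in the config above.
ts = expand_gradient([0.2, 0.4, 0.6, 0.8, 1.0], 40)
```

A config like this is typically applied with mergekit's `mergekit-yaml` CLI, e.g. `mergekit-yaml config.yaml ./output-dir`.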