Upload LlamaForCausalLM
- README.md +3 -121
- config.json +55 -0
- generation_config.json +9 -0
- model-00001-of-00015.safetensors +3 -0
- model-00002-of-00015.safetensors +3 -0
- model-00003-of-00015.safetensors +3 -0
- model-00004-of-00015.safetensors +3 -0
- model-00005-of-00015.safetensors +3 -0
- model-00006-of-00015.safetensors +3 -0
- model-00007-of-00015.safetensors +3 -0
- model-00008-of-00015.safetensors +3 -0
- model-00009-of-00015.safetensors +3 -0
- model-00010-of-00015.safetensors +3 -0
- model-00011-of-00015.safetensors +3 -0
- model-00012-of-00015.safetensors +3 -0
- model-00013-of-00015.safetensors +3 -0
- model-00014-of-00015.safetensors +3 -0
- model-00015-of-00015.safetensors +3 -0
- model.safetensors.index.json +0 -0
README.md
CHANGED
@@ -1,121 +1,3 @@
----
-license: mit
-base_model:
-- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
----
-
-BCCard/DeepSeek-R1-Distill-Llama-8B-BCGPT is a quantized model based on DeepSeek-R1-Distill-Llama-70B.
-
-```
-model.eval()
-instruction = "Could you put together a detailed, day-by-day one-week itinerary of Seoul's famous tourist courses?"
-
-messages = [
-    {"role": "system", "content": "You are an excellent AI expert on Korea. Please answer in Korean."},
-    {"role": "user", "content": f"{instruction}"}
-]
-
-input_ids = tokenizer.apply_chat_template(
-    messages,
-    add_generation_prompt=True,
-    return_tensors="pt"
-).to(model.device)
-
-terminators = [
-    tokenizer.eos_token_id,
-    tokenizer.convert_tokens_to_ids("<|eot_id|>")
-]
-
-outputs = model.generate(
-    input_ids,
-    max_new_tokens=4096,
-    eos_token_id=terminators,
-    do_sample=True,
-    temperature=1.0,
-    top_p=0.9,
-)
-
-print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
-```
-
-
-```
-<think>
-
-</think>
-
-
-Of course. Seoul's famous tourist courses bring together the city's most representative attractions, and they are usually organized as 1-day or 3-to-4-day itineraries. Below are representative Seoul tourist courses, split into a 1-day and a 3-to-4-day schedule.
-
-### 1-Day Course
-**Attractions:** The Gwanghwamun area; a course that covers Seoul's most representative sights in a single day.
-1. **Gyeongbokgung Palace**
-   - Seoul's most representative palace, famous for its traditional hanok architecture.
-2. **Bukchon Hanok Village**
-   - A place where tradition and contemporary art can be enjoyed together.
-3. **Myeongdong Street**
-   - A lively street where you can sense Seoul's history and enjoy shopping.
-
-### 3- to 4-Day Course
-**Attractions:** A 3- to 4-day itinerary for visiting more of Seoul's sights.
-1. **Gwanghwamun**
-   - The center of Seoul and a place of deep historical significance.
-2. **Insadong**
-   - A street where traditional and modern art coexist, offering a range of cultural experiences.
-3. **Myeongdong Street**
-   - A representative spot for learning about Seoul's history and enjoying shopping.
-4. **Dongdaemun**
-   - A vibrant district where you can experience Seoul's culture and history.
-
-### Detailed Schedule
-1. **Gyeongbokgung Palace** (Day 1)
-   - Location: Jongno-gu, Seoul
-   - Visiting hours: 09:00 ~ 10:00
-   - Highlights: Take in Gyeongbokgung's beautiful scenery and historical backdrop.
-
-2. **Bukchon Hanok Village** (Day 1)
-   - Location: Jongno-gu, Seoul
-   - Visiting hours: 10:30 ~ 12:00
-   - Highlights: Experience the distinctive atmosphere where traditional hanok and modern art blend together.
-
-3. **Myeongdong Street** (Day 2)
-   - Location: Jung-gu, Seoul
-   - Visiting hours: 13:00 ~ 15:00
-   - Highlights: Enjoy a wide range of shopping and street food along this lively street.
-
-4. **Dongdaemun** (Day 3)
-   - Location: Jongno-gu, Seoul
-   - Visiting hours: 15:30 ~ 17:30
-   - Highlights: A space where Seoul's culture and modern design come together in harmony.
-
-
-As shown above, Seoul's famous tourist courses can be visited on a 1-day or a 3- to 4-day schedule. Enjoy the charm of each attraction while experiencing Seoul's diverse culture and history.
-```
-
-- Question
-```
-Suppose I borrowed 3 million won. A month later, I have 3 million won available to repay the loan. However, instead of paying it back, I'm considering investing it in stocks as of December 2022. What do you think about that?
-```
-
-- Answer
-```
-<think>
-
-</think>
-
-</think>
-
-If you borrowed 3 million won and a month later have the same amount available to repay the loan, it's important to consider the implications of not repaying it. Investing the funds in stocks instead carries significant risks, including market volatility and potential losses.
-
-Here are some key points to consider:
-
-1. **Interest and Fees**: By not repaying the loan, you may incur interest charges and potential late fees, which could increase the amount you owe over time.
-
-2. **Risk of Loss in Stocks**: The stock market can be unpredictable, and there's a risk that the value of your investment could decrease, leading to potential losses.
-
-3. **Legal and Credit Implications**: Failing to repay a loan can damage your credit score and may lead to legal consequences, depending on the terms of the loan and the lender's policies.
-
-4. **Opportunity Cost**: While investing might offer the potential for gains, it's important to weigh this against the guaranteed costs of not repaying the loan.
-
-It's generally advisable to prioritize repaying debts over speculative investments, especially if the loan terms are unfavorable or the interest rate is high. If you do consider investing, make sure you have a clear understanding of the risks involved and consider consulting a financial advisor to make an informed decision.
-```
+---
+license: mit
+---
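
The usage snippet in the removed card assumes `model` and `tokenizer` already exist. Below is a minimal, self-contained sketch of the same flow, assuming the checkpoint is loaded with the standard transformers Auto classes; the repository id `BCCard/DeepSeek-R1-Distill-Llama-8B-BCGPT` is taken from the removed card, and the English prompt strings are translations of the Korean originals.

```
# Minimal sketch of the usage example from the removed model card.
# The repo id below is the one written in the card; adjust if the checkpoint lives elsewhere.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BCCard/DeepSeek-R1-Distill-Llama-8B-BCGPT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# config.json (below) embeds an 8-bit bitsandbytes quantization_config, so
# from_pretrained loads the checkpoint quantized; device_map places layers automatically.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

instruction = "Could you put together a detailed, day-by-day one-week itinerary of Seoul's famous tourist courses?"

messages = [
    {"role": "system", "content": "You are an excellent AI expert on Korea. Please answer in Korean."},
    {"role": "user", "content": instruction},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=4096,
        eos_token_id=terminators,
        do_sample=True,
        temperature=1.0,
        top_p=0.9,
    )

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```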
config.json
ADDED
@@ -0,0 +1,55 @@
{
  "_name_or_path": "DeepSeek-R1-Distill-Llama-70B",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 8192,
  "initializer_range": 0.02,
  "intermediate_size": 28672,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 64,
  "num_hidden_layers": 80,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "quantization_config": {
    "_load_in_4bit": false,
    "_load_in_8bit": true,
    "bnb_4bit_compute_dtype": "float32",
    "bnb_4bit_quant_storage": "uint8",
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": false,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": false,
    "load_in_8bit": true,
    "quant_method": "bitsandbytes"
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.48.2",
  "use_cache": true,
  "vocab_size": 128256
}
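
The `quantization_config` block above records bitsandbytes 8-bit settings (`load_in_8bit: true`, `quant_method: bitsandbytes`), so transformers reloads this checkpoint in 8-bit automatically. As a hedged sketch of what those settings correspond to, this is how an equivalent config could be passed explicitly when quantizing the base model named in the removed card; the base-model repo id and the idea of re-quantizing it yourself are assumptions, not part of this upload.

```
# Sketch: reproducing the 8-bit bitsandbytes setup recorded in config.json.
# A plain from_pretrained on this repo already loads in 8-bit because the config
# is embedded; passing BitsAndBytesConfig explicitly is only needed when
# quantizing the original (unquantized) base model yourself.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,                       # matches "load_in_8bit": true
    llm_int8_threshold=6.0,                  # matches "llm_int8_threshold": 6.0
    llm_int8_enable_fp32_cpu_offload=False,  # matches the recorded flag
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",  # base model named in the removed card
    quantization_config=bnb_config,
    device_map="auto",
)
```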
generation_config.json
ADDED
@@ -0,0 +1,9 @@
{
  "_from_model_config": true,
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": 128001,
  "temperature": 0.6,
  "top_p": 0.95,
  "transformers_version": "4.48.2"
}
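
generation_config.json supplies the default sampling settings (`do_sample=True`, `temperature=0.6`, `top_p=0.95`) that `model.generate()` uses when no overrides are passed. A small sketch of inspecting and overriding these defaults; the repo id is again an assumption taken from the removed card.

```
# Sketch: generation_config.json provides generate() defaults unless overridden per call.
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("BCCard/DeepSeek-R1-Distill-Llama-8B-BCGPT")
print(gen_config.do_sample, gen_config.temperature, gen_config.top_p)  # True 0.6 0.95

# Per-call arguments take precedence over these defaults, e.g. the removed card's example:
# outputs = model.generate(input_ids, temperature=1.0, top_p=0.9, max_new_tokens=4096)
```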
model-00001-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:81a2023952146d1ece44cdc893616cf0c2fa3efd1557cdea67bc89f9b40e6e50
size 4820443929
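
Each shard entry is a Git LFS pointer rather than the weights themselves: `oid` is the SHA-256 of the real file and `size` its byte count. A minimal sketch for checking a downloaded shard against its pointer, using the first shard's values shown above.

```
# Sketch: verify a downloaded shard against the sha256 and size in its LFS pointer.
import hashlib
import os

EXPECTED_SHA256 = "81a2023952146d1ece44cdc893616cf0c2fa3efd1557cdea67bc89f9b40e6e50"
EXPECTED_SIZE = 4820443929
PATH = "model-00001-of-00015.safetensors"

h = hashlib.sha256()
with open(PATH, "rb") as f:
    for chunk in iter(lambda: f.read(1024 * 1024), b""):  # hash in 1 MiB chunks
        h.update(chunk)

assert os.path.getsize(PATH) == EXPECTED_SIZE, "size mismatch"
assert h.hexdigest() == EXPECTED_SHA256, "sha256 mismatch"
print("shard verified")
```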
model-00002-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4433056377edcb0ed67701128346d0da64d5f7925f0065065c63a12f982c3636
size 4984984974
model-00003-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b825d7109c787df2b2811c5741b3b07bf36a8ac164b1ca3b8bf1c1122cbe456e
size 4901107969
model-00004-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a6dd53c70d35b0a6de7dc0f7f9cdc962377d42c70d1e8066b86ce571dda51b0
size 4901059073
model-00005-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4946bc0223e56f8cb36eda54955da58318c05823c43e95f4f8f9833cd0d65fde
size 4901059073
model-00006-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b9248732a6e7c800fcad9f83acd077354a2252e6914ea7eaac1059c6c75b4530
size 4984985102
model-00007-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7bd3a5fbc22562b899dc0210a266a4db6024a443001050c383f7298273673409
size 4901107993
model-00008-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee0afec835fcd6e36247bebaa9c05b7575b1de689c377b85231795991bb01ca4
size 4901059073
model-00009-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a03b9978e7cfb1697d7717de532b5afac941605f3c2215b03c27a9f4e9592d42
size 4901059073
model-00010-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:591673fbaa6a6eb8de835bff4d421a25a4a1bc9525cf7f14cd393e0a8f8c22ad
size 4984985102
model-00011-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b07957184cfcaee5f8898893c10ecd5bf7ee4bbc52ee1e0695c86f650e2e5cce
size 4901107993
model-00012-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4d9db6a46be2d2bfa6617b8f82b21506352d6db0bad44f3d78453a9c451f5659
size 4901059073
model-00013-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3320c119c8ea2f586078dac894c778c992ce33d723ea692b59098f09902f7f95
size 4901059073
model-00014-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:590751c92bdd7e09ebbea8c52ccd4b6e1199032f91606f9fb7ffdbf4bd80bb93
size 4984985102
model-00015-of-00015.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:505d8e58668ad57fcbf054b276acdc5fe04809cbcab5794dcb4688735b16c4c1
size 3813381278
model.safetensors.index.json
ADDED
The diff for this file is too large to render.
See raw diff
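
The index itself is not rendered here. In the standard safetensors sharding format it contains a `metadata.total_size` field and a `weight_map` from parameter names to shard file names; the sketch below assumes that standard layout, and the parameter name used is purely illustrative.

```
# Sketch: look up which shard holds a given parameter via model.safetensors.index.json.
import json

with open("model.safetensors.index.json") as f:
    index = json.load(f)

print(index.get("metadata", {}).get("total_size"))  # total bytes across all 15 shards, if recorded
# "weight_map" maps parameter names to shard file names, e.g.:
print(index["weight_map"].get("model.embed_tokens.weight"))  # illustrative parameter name
```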