---
license: bigscience-bloom-rail-1.0
datasets:
- jslin09/Fraud_Case_Verdicts
language:
- zh
metrics:
- accuracy
pipeline_tag: text-generation
text-generation:
  parameters:
    max_length: 400
    do_sample: true
    temperature: 0.75
    top_k: 50
    top_p: 0.9
tags:
- legal
widget:
- text: 王大明意圖為自己不法所有，基於竊盜之犯意，
  example_title: Generate criminal facts for theft
- text: 騙人布意圖為自己不法所有，基於詐欺取財之犯意，
  example_title: Generate criminal facts for fraud
- text: 森上梅前明知其無資力支付酒店消費，亦無付款意願，竟意圖為自己不法之所有，
  example_title: Generate criminal facts for dine-and-dash fraud
- text: 闕很大明知金融帳戶之存摺、提款卡及密碼係供自己使用之重要理財工具，
  example_title: Generate criminal facts for aiding fraud by selling a bank account
---

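# Automatic Generation of Judgment Drafts
This model was obtained by transfer learning from the [BLOOM 560m](https://huggingface.co/bigscience/bloomz-560m) model, using a dataset built from the "fraud" case judgments published by the Judicial Yuan. The dataset covers January 1, 2011 through December 31, 2021 (ROC years 100 to 110), and the raw data comprise 74,823 documents (judgments and rulings). We keep only the "criminal facts" (犯罪事實) section of each judgment and split the raw data three ways: 59,858 documents (about 80%) form the training set, and the remaining 20% is divided between the validation set (7,482 documents, 10%) and the test set (7,483 documents, 10%). To try the model on this page, wait until the model has loaded and generated its first clause, then keep pressing the Compute button to continue generating text. You can also type your own test input into the text box.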
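The split sizes quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the card does not say how the split was actually produced, so rounding the training and validation shares down and giving the remainder to the test set is an assumption, and `split_sizes` is a hypothetical helper name. It happens to yield the same three counts.

```python
from typing import Tuple


def split_sizes(total: int, train_frac: float = 0.8, val_frac: float = 0.1) -> Tuple[int, int, int]:
    """Return (train, validation, test) sizes for a dataset of `total` documents.

    Assumption: train and validation shares are rounded down and the
    test set receives whatever is left over.
    """
    n_train = int(total * train_frac)
    n_val = int(total * val_frac)
    n_test = total - n_train - n_val
    return n_train, n_val, n_test


print(split_sizes(74823))  # (59858, 7482, 7483)
```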

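# Usage Example
To call this model from your own program, you can refer to the Python code below, which generates the content of the "criminal facts" section of a criminal judgment by calling the API.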
<pre>
  <code>
import requests
from time import sleep
from tqdm.auto import trange

API_URL = "https://api-inference.huggingface.co/models/jslin09/bloom-560m-finetuned-fraud"
headers = {"Authorization": "Bearer XXXXXXXXXXXXXXX"}  # API token used to call the model

def query(payload):
    # POST the payload to the inference API and return the decoded JSON body.
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

prompt = "森上梅前明知其無資力支付酒店消費，亦無付款意願，竟意圖為自己不法之所有，"
query_dict = {
    "inputs": prompt,
}
text_len = 300  # number of API calls to make
t = trange(text_len, desc="Generating draft", leave=True)
for i in t:
    response = query(query_dict)
    try:
        # Feed the generated text back in as the prompt for the next call.
        response_text = response[0]["generated_text"]
        query_dict["inputs"] = response_text
        t.set_description(f"{i}: {response_text}")
        t.refresh()
    except (KeyError, IndexError, TypeError):
        sleep(30)  # If the server is too busy to respond, wait 30 seconds and retry.
# Print the last successfully generated text, even if the final call failed.
print(query_dict["inputs"])
  </code>
</pre>
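
The request body can also carry the sampling settings listed in this card's metadata, and the response can be checked before it is indexed. The sketch below is not part of the original script: `build_payload` and `extract_text` are hypothetical helpers, and it assumes the hosted API accepts `parameters` and `options.wait_for_model` in the request body. `max_length` is left out because the hosted API may expect a different name for it.

```python
from typing import Any, Dict, Optional

# Sampling settings copied from this model card's metadata.
# Assumption: the hosted API accepts them under the "parameters" key.
SAMPLING_PARAMETERS = {
    "do_sample": True,
    "temperature": 0.75,
    "top_k": 50,
    "top_p": 0.9,
}


def build_payload(prompt: str, wait_for_model: bool = True) -> Dict[str, Any]:
    """Build a request body for the API (hypothetical helper)."""
    return {
        "inputs": prompt,
        "parameters": dict(SAMPLING_PARAMETERS),
        # Assumption: asks the server to wait for the model to load
        # instead of returning an error right away.
        "options": {"wait_for_model": wait_for_model},
    }


def extract_text(response: Any) -> Optional[str]:
    """Return the generated text, or None for an error or malformed response."""
    if isinstance(response, list) and response and isinstance(response[0], dict):
        return response[0].get("generated_text")
    return None
```

With these helpers, the loop above could call `query(build_payload(text))` and sleep only when `extract_text` returns `None`, rather than relying on an exception to detect a busy server.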