language: fr
license: mit
tags:
  - legal
datasets: maastrichtlawtech/bsard
pipeline_tag: fill-mask
widget:
  - text: >-
      Chaque commune de la Région peut adopter un <mask> communal de
      développement, applicable à l'ensemble de son territoire.

Legal-CamemBERT-Base

  • Legal-CamemBERT-Base is a CamemBERT-Base model further pre-trained on more than 23,000 articles of Belgian legislation.
  • The model was trained for 50k steps (200 epochs) with batches of 32 sequences of length 512 and an initial learning rate of 5e-5.
  • Training was performed on a single Tesla V100 GPU with 32 GB of memory, using the training code provided by Hugging Face.
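
As a rough sanity check, the figures above can be turned into the amount of data seen during training. The sketch below only does arithmetic on the numbers quoted in this card. It assumes that each step processes exactly one batch of 32 sequences (no gradient accumulation is mentioned) and that the sequence length is counted in tokens. The variable names are illustrative.

```python
# Figures quoted in the training set-up above.
training_steps = 50_000
batch_size = 32
epochs = 200
sequence_length = 512  # assumption: measured in tokens

# Total number of sequences processed over the whole run.
sequences_seen = training_steps * batch_size

# Sequences per pass over the training data, given the stated epoch count.
sequences_per_epoch = sequences_seen // epochs

# Upper bound on tokens per epoch (reached only if every sequence is full).
max_tokens_per_epoch = sequences_per_epoch * sequence_length

print(sequences_seen)        # 1600000
print(sequences_per_epoch)   # 8000
print(max_tokens_per_epoch)  # 4096000
```

Under these assumptions, one epoch corresponds to about 8,000 sequences of up to 512 tokens each.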

Load Pretrained Model

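from transformers import AutoTokenizer, AutoModel

# Download the tokenizer and the encoder weights from the Hugging Face Hub.
# AutoModel returns the bare encoder, without a masked-language-modeling head.
tokenizer = AutoTokenizer.from_pretrained("maastrichtlawtech/legal-camembert-base")
model = AutoModel.from_pretrained("maastrichtlawtech/legal-camembert-base")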
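
Since the card's metadata tags the model for the fill-mask task, a minimal sketch of masked-token prediction with the `pipeline` API of `transformers` may be useful. The helper name `predict_mask` and its `top_k` default are illustrative and not part of the original card. The input sentence is the widget example from the metadata.

```python
# Widget example from the model card; "<mask>" marks the token to predict.
TEXT = (
    "Chaque commune de la Région peut adopter un <mask> communal de "
    "développement, applicable à l'ensemble de son territoire."
)


def predict_mask(text, top_k=5):
    """Return (token, score) pairs for the single masked position in `text`."""
    # Imported here so that defining the helper does not require the library.
    from transformers import pipeline

    # The fill-mask pipeline loads the model together with its language-modeling head.
    fill_mask = pipeline("fill-mask", model="maastrichtlawtech/legal-camembert-base")
    return [(p["token_str"], p["score"]) for p in fill_mask(text, top_k=top_k)]
```

Calling `predict_mask(TEXT)` downloads the model on first use and returns the highest-scoring candidate tokens for the masked word. The actual predictions depend on the released weights and are not reproduced here.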

About Us

The Maastricht Law & Tech Lab develops algorithms, models, and systems that allow computers to process natural language texts from the legal domain.

Author: Antoine Louis on behalf of the Maastricht Law & Tech Lab.