Model Description

Work in progress

  • Finetuned from model: Llama-3.2-1B-Instruct

Downstream Use

Model for predicting relations between entities in financial documents.

Relation Types

  • no_relation
  • title
  • operations_in
  • employee_of
  • agreement_with
  • formed_on
  • member_of
  • subsidiary_of
  • shares_of
  • revenue_of
  • loss_of
  • headquartered_in
  • acquired_on
  • founder_of
  • formed_in

Load the Model with the PEFT Adapter

import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

device = 'cuda' if torch.cuda.is_available() else 'cpu'

finetune_name = 'Askinkaty/llama-finance-relations'

# Base model with the trained PEFT adapter applied on top.
finetuned_model = AutoPeftModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=finetune_name,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
)

base_model_name = "meta-llama/Llama-3.2-1B-Instruct"
base_model = AutoModelForCausalLM.from_pretrained(base_model_name,
                                                  torch_dtype=torch.float16,
                                                  low_cpu_mem_usage=True)

tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Llama has no dedicated pad token; reuse the EOS token for padding.
base_model.config.pad_token_id = base_model.config.eos_token_id
finetuned_model.config.pad_token_id = finetuned_model.config.eos_token_id

# Build the pipeline, then swap in the adapter model for generation.
pipe = pipeline('text-generation', model=base_model, tokenizer=tokenizer, max_length=1024, device=device)
# device_map="auto" already placed the adapter model, so no extra .to(device) is needed.
pipe.model = finetuned_model
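
A minimal inference sketch follows. The message layout mirrors the training format from the Preprocessing section; the sentence and the two entities are invented for illustration.

messages = [
    {
        "role": "system",
        "content": "You are an expert in financial documentation and market analysis. Define relations between two specified entities: entity 1 [E1] and entity 2 [E2] in a sentence. Return a short response of the required format. "
    },
    {
        "role": "user",
        "content": "Entity 1: Acme Corp. Entity 2: New York. Input sentence: Acme Corp is headquartered in New York."
    },
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
output = pipe(prompt, max_new_tokens=16, do_sample=False, return_full_text=False)
print(output[0]['generated_text'])  # expected: a relation label such as headquartered_in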

Training Details

Training Data

Samples from the ReFinD dataset: 100 examples were used for each relation type, and the least frequent relation types were omitted.

Preprocessing

The dataset is converted into the message format shown in the code snippet below:

def batch_convert_to_messages(data):
    # Build the user prompt from the tokenized sentence and the two entity spans.
    questions = data.apply(
        lambda x: f"Entity 1: {' '.join(x['token'][x['e1_start']:x['e1_end']])}. "
                  f"Entity 2: {' '.join(x['token'][x['e2_start']:x['e2_end']])}. "
                  f"Input sentence: {' '.join(x['token'])}",
        axis=1
    )

    # Keep only the bare relation label, dropping any prefix before ':'.
    relations = data['relation'].apply(lambda relation: relation.split(':')[-1])
    
    messages = [
        [
            {
                "role": "system",
                "content": "You are an expert in financial documentation and market analysis. Define relations between two specified entities: entity 1 [E1] and entity 2 [E2] in a sentence. Return a short response of the required format. "
            },
            {"role": "user", "content": question},
            {"role": "assistant", "content": relation},
        ]
        for question, relation in zip(questions, relations)
    ]
    
    return messages
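
For illustration, a toy call on a one-row pandas DataFrame. The column names follow the ReFinD fields referenced above; the sentence is invented, and the 'org:' prefix on the relation label is an assumption about the raw label format (the function keeps only the part after the last ':').

import pandas as pd

df = pd.DataFrame([{
    'token': ['Acme', 'Corp', 'is', 'headquartered', 'in', 'New', 'York', '.'],
    'e1_start': 0, 'e1_end': 2,   # "Acme Corp"
    'e2_start': 5, 'e2_end': 7,   # "New York"
    'relation': 'org:headquartered_in',  # hypothetical prefixed label
}])

messages = batch_convert_to_messages(df)
print(messages[0][1]['content'])  # user turn: entities plus the input sentence
print(messages[0][2]['content'])  # assistant turn: 'headquartered_in'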

Training Hyperparameters

SFT parameters:

  • num_train_epochs=1
  • per_device_train_batch_size=2
  • gradient_accumulation_steps=2
  • gradient_checkpointing=True
  • optim="adamw_torch_fused"
  • learning_rate=2e-4
  • max_grad_norm=0.3
  • warmup_ratio=0.01
  • lr_scheduler_type="cosine"
  • bf16=True

LoRA parameters (see the config sketch after this list):

  • rank_dimension = 6
  • lora_alpha = 8
  • lora_dropout = 0.05
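
The hyperparameters above map onto peft/trl configuration objects roughly as follows. This is a sketch, assuming training used trl's SFTTrainer; the SFT parameter names match its SFTConfig fields, and target modules are left at their defaults.

from peft import LoraConfig
from trl import SFTConfig

peft_config = LoraConfig(
    r=6,                   # rank_dimension
    lora_alpha=8,
    lora_dropout=0.05,
    task_type='CAUSAL_LM',
)

training_args = SFTConfig(
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim='adamw_torch_fused',
    learning_rate=2e-4,
    max_grad_norm=0.3,
    warmup_ratio=0.01,
    lr_scheduler_type='cosine',
    bf16=True,
)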

Evaluation

Testing Data

Test set sampled from the ReFinD dataset.

Metrics

Overall performance (weighted averages):

  • Precision: 0.77
  • Recall: 0.69
  • F1 score: 0.71

Classification report (relation types with a support of 0 were not present in the test sample):
                  precision    recall  f1-score   support

     no_relation       0.00      0.00      0.00         0
           title       0.00      0.00      0.00         0
   operations_in       0.65      0.66      0.66       100
     employee_of       0.00      0.00      0.00         0
  agreement_with       0.58      0.88      0.70       100
       formed_on       0.00      0.00      0.00         0
       member_of       0.99      0.96      0.97        96
   subsidiary_of       0.00      0.00      0.00         0
       shares_of       0.00      0.00      0.00         0
      revenue_of       0.60      0.27      0.38        95
         loss_of       0.64      0.37      0.47       100
headquartered_in       0.99      0.73      0.84       100
     acquired_on       0.00      0.00      0.00         0
      founder_of       0.74      0.77      0.76        83
       formed_in       0.96      0.91      0.93       100

        accuracy                           0.69       774
       macro avg       0.41      0.37      0.38       774
    weighted avg       0.77      0.69      0.71       774
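
The report has the shape produced by scikit-learn's classification_report; a minimal sketch with illustrative labels only (zero_division=0 keeps rows for labels with no true instances at 0.00 instead of raising a warning):

from sklearn.metrics import classification_report

# Illustrative labels only; in practice y_true comes from the test set
# and y_pred from the parsed model generations.
y_true = ['headquartered_in', 'formed_in', 'revenue_of']
y_pred = ['headquartered_in', 'formed_in', 'loss_of']
print(classification_report(y_true, y_pred, zero_division=0))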

Framework versions

  • PEFT 0.14.0