BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation

🔥 NEWS

  • [2025-03-06] 🌟 The complete dataset and code are now officially open source!
  • [2024-12-11] ⏫ We are now working on making the code of BearLLM public. Stay tuned!
  • [2024-12-10] 🎉 The BearLLM paper is accepted by the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25).
  • [2024-08-21] 📝 The preprint of the BearLLM paper is available on arXiv. Check the paper page for more details.

📅 TODO

  • Improve related comments and documentation.
  • Upload the complete BearLLM demo code.
  • Upload the health management corpus of the MBHM dataset.
  • Collect the code for pre-training and fine-tuning BearLLM.
  • Collect the code of BearLLM's classification network and other comparison models.
  • Upload the vibration signal portion of the MBHM dataset.

📚 Introduction

The MBHM dataset is the first multimodal dataset designed for the study of bearing health management. It is divided into two parts: vibration signals and a health management corpus. The vibration signals and condition information are derived from 9 publicly available datasets and are continuously updated and improved. Covering thousands of working conditions, the dataset poses a harder challenge for identification models and better represents real-world usage scenarios.

BearLLM is a prior knowledge-enhanced bearing health management framework with a unified vibration signal representation. This framework transforms the signal to be tested into the frequency domain, enabling effective identification of spectral differences compared to the vibration signal under fault-free conditions. By aligning the vibration signal with the fault semantic embedding, we achieve a unified natural language response for various health management tasks through a fine-tuned language model with low computational overhead. Experiments demonstrate that this framework achieves leading performance under thousands of working conditions.
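
The frequency-domain comparison against a fault-free reference can be illustrated with a small sketch. This is not the BearLLM implementation: the function names, the naive DFT, and the synthetic 64-sample signals are all assumptions chosen to keep the example self-contained. In practice a library FFT such as numpy.fft.rfft would be the usual choice.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Return the one-sided magnitude spectrum of a real signal (naive O(n^2) DFT)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_difference(vib_data, ref_data):
    """Subtract the fault-free reference spectrum from the query spectrum, bin by bin."""
    if len(vib_data) != len(ref_data):
        raise ValueError("vib_data and ref_data must have the same length")
    vib_spec = magnitude_spectrum(vib_data)
    ref_spec = magnitude_spectrum(ref_data)
    return [v - r for v, r in zip(vib_spec, ref_spec)]


# Toy signals (hypothetical): 64 samples covering 1 second, so bin k is k Hz.
n = 64
ref = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
# The "query" signal is the reference plus an extra 12 Hz component.
vib = [r + 0.5 * math.sin(2 * math.pi * 12 * t / n) for t, r in enumerate(ref)]

diff = spectral_difference(vib, ref)
peak_bin = max(range(len(diff)), key=lambda k: diff[k])
print(peak_bin)  # the bin where the query differs most from the reference
```

In this toy case the shared 5 Hz component cancels out and the largest difference appears at the added 12 Hz component, which is the kind of spectral deviation the framework is described as identifying.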

💻 Requirements

The code is implemented in Python 3.12. The required packages are listed in the requirements.txt file. You can install them by running the following commands:

conda create --name bearllm python=3.12
conda activate bearllm
pip install -r requirements.txt

🚀 Quick Start

1. Download Demo Data / Use Your Own Data

First, download demo_data.json from the MBHM dataset. Users in mainland China can use the mirror link to speed up the download.

Or, you can build your own test data in the same format:

  • instruction: Text instruction for the health management task.
  • vib_data: Vibration signal data to be identified, with a required duration of 1 second.
  • ref_data: Reference vibration signal data without faults, with a required duration of 1 second.

{
    "instruction": "xxx.",
    "vib_data": [1.0, 0.0, 1.0, ...],
    "ref_data": [1.0, 0.0, 1.0, ...]
}
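
A record in this format could be assembled as in the sketch below. The sampling rate, the helper name build_sample, the instruction text, and the synthetic signals are all hypothetical. The source does not state the expected sampling rate, nor whether demo_data.json holds a single object or a list of them, so check the downloaded file before relying on this.

```python
import json
import math

# Assumption: hypothetical sampling rate; replace it with the rate of your own sensor.
SAMPLE_RATE_HZ = 1000


def build_sample(instruction, vib_data, ref_data, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return one test record after checking that both signals span 1 second."""
    for name, signal in (("vib_data", vib_data), ("ref_data", ref_data)):
        if len(signal) != sample_rate_hz:
            raise ValueError(
                f"{name} has {len(signal)} samples, expected {sample_rate_hz} for 1 second"
            )
    return {
        "instruction": instruction,
        "vib_data": [float(x) for x in vib_data],
        "ref_data": [float(x) for x in ref_data],
    }


# Synthetic placeholder signals, 1 second each (not real bearing data).
ref = [math.sin(2 * math.pi * 30 * t / SAMPLE_RATE_HZ) for t in range(SAMPLE_RATE_HZ)]
vib = [r + 0.3 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE_HZ) for t, r in enumerate(ref)]

sample = build_sample("Describe the health state of this bearing.", vib, ref)
text = json.dumps(sample)  # serialized record, ready to be written to a .json file
```

Serializing with json.dumps avoids hand-written syntax slips such as trailing commas, which strict JSON parsers reject.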

2. Download Weights

You can download the pre-trained weights of Qwen2.5-1.5B from Hugging Face.

Additionally, you need to download the weights of BearLLM.

3. Organize Files

It is recommended to organize the weights and test data as follows:

BearLLM/
├── qwen_weights/
│   ├── model.safetensors
│   ├── tokenizer.json
│   ├── config.json
│   └── other files...
├── bearllm_weights/
│   ├── vibration_adapter.pth
│   ├── adapter_config.json
│   └── adapter_model.safetensors
└── mbhm_dataset/
    └── demo_data.json
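
Before running the demo, it may help to confirm that the files are where the recommended layout puts them. The sketch below is a hypothetical helper, not part of the BearLLM repository. It checks only the files named in the tree above and ignores the unspecified "other files...".

```python
from pathlib import Path

# Relative paths taken from the recommended layout above.
EXPECTED_FILES = [
    "qwen_weights/model.safetensors",
    "qwen_weights/tokenizer.json",
    "qwen_weights/config.json",
    "bearllm_weights/vibration_adapter.pth",
    "bearllm_weights/adapter_config.json",
    "bearllm_weights/adapter_model.safetensors",
    "mbhm_dataset/demo_data.json",
]


def missing_files(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from the directory that contains BearLLM/.
    for rel in missing_files("BearLLM"):
        print(f"missing: {rel}")
```

If you organize the files differently, adjust EXPECTED_FILES to match the paths you set in .env.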

4. Run Code

First, copy the .env.example file to .env and modify the data paths inside. Then, you can run the code using the following command:

python run_demo.py

⚙️ Development

1. Download Dataset

First, download the following files from the MBHM dataset (users in mainland China can use the mirror link to speed up the download):

  • data.hdf5: Contains the vibration signal data.
  • corpus.json: Contains the health management corpus.
  • metadata.sqlite: Contains metadata information of the dataset.
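
The internal structure of these files is not described here, so a reasonable first step is to inspect them without assuming any schema. The sketch below uses only the Python standard library. The helper names are hypothetical, and the paths in the usage comments assume the files sit in a local mbhm_dataset/ directory.

```python
import json
import sqlite3
from pathlib import Path


def list_tables(sqlite_path):
    """Return the names of all tables in an SQLite database file."""
    if not Path(sqlite_path).is_file():
        # sqlite3.connect would silently create an empty database otherwise.
        raise FileNotFoundError(sqlite_path)
    conn = sqlite3.connect(sqlite_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def summarize_json(json_path):
    """Return the top-level type name and item count of a JSON file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    size = len(data) if isinstance(data, (list, dict)) else None
    return type(data).__name__, size


# Usage (paths are assumptions):
#   print(list_tables("mbhm_dataset/metadata.sqlite"))
#   print(summarize_json("mbhm_dataset/corpus.json"))
```

data.hdf5 can be explored in the same spirit with the third-party h5py package, for example by opening it with h5py.File in read mode and listing its top-level keys.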

2. Download Weights

You can download the pre-trained weights of Qwen2.5-1.5B from Hugging Face.

3. Modify Environment Variables

Copy the .env.example file to .env and modify the data paths inside.

4. Pre-train and Fine-tune Model

Pre-train the model with src/pre_training.py, then fine-tune it with src/fine_tuning.py.

📖 Citation

Please cite the following paper if you use this study in your research:

@misc{peng2024bearllmpriorknowledgeenhancedbearing,
      title={BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation}, 
      author={Haotian Peng and Jiawei Liu and Jinsong Du and Jie Gao and Wei Wang},
      year={2024},
      eprint={2408.11281},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2408.11281}, 
}