---
tags:
- drug-discovery
- ibm
- mammal
- pytorch
- small molecules drugs
- smiles
- MoleculeNet
- blood-brain barrier
- safetensors
- biomed-multi-alignment
license: apache-2.0
library_name: biomed-multi-alignment
base_model:
- ibm/biomed.omics.bl.sm.ma-ted-458m
---

Drugs targeting the central nervous system must meet stringent criteria for both efficacy and safety, including their ability to penetrate the blood-brain barrier (BBB).
This model predicts the likelihood of small-molecule drugs crossing the BBB, a critical factor in CNS drug development.
The molecules are represented using SMILES (Simplified Molecular Input Line Entry System) strings.
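
As a toy illustration of how SMILES encodes a molecular graph as plain text, the sketch below tokenizes a SMILES string with a regular expression. This tokenizer is illustrative only and is not part of biomed-multi-alignment; real SMILES parsing is done with a cheminformatics toolkit such as RDKit.

```python
import re

# Minimal SMILES tokenizer (illustrative sketch only).
# Order matters: two-letter atoms (Br, Cl) must be tried
# before single-letter atoms so "Cl" is not read as "C".
TOKEN_RE = re.compile(
    r"\[[^\]]+\]"       # bracket atoms, e.g. [nH], [C@@H]
    r"|Br|Cl"           # two-letter organic-subset atoms
    r"|[BCNOSPFI]"      # one-letter organic-subset atoms
    r"|[bcnops]"        # aromatic atoms
    r"|[=#$/\\.()%+-]"  # bonds, branches, charges
    r"|\d"              # ring-closure digits
)

def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into atom/bond/branch tokens."""
    return TOKEN_RE.findall(smiles)

# Dichloromethane, the molecule used in the inference example below:
print(tokenize("C(Cl)Cl"))  # ['C', '(', 'Cl', ')', 'Cl']
```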

The model is a fine-tuned version of IBM's biomedical foundation model, ibm/biomed.omics.bl.sm.ma-ted-458m [1], which was trained on over 2 billion biological samples across multiple modalities, including proteins, small molecules, and single-cell gene expression data.
25
+
26
+ The fine-tuning was performed using the MoleculeNet BBBP dataset [2]. For benchmarking, we employed predefined training, validation, and testing splits
27
+ provided by MolFormer [3], sourced from the dataset referenced in [4].
28
+
29
+ - [1] https://huggingface.co/ibm/biomed.omics.bl.sm.ma-ted-458m
30
+ - [2] Zhenqin Wu et al. “MoleculeNet: a benchmark for molecular machine learning”.
31
+ In: Chemical science 9.2 (2018), pp. 513–530.
32
+ - [3] Jerret Ross et al. “Large-scale chemical language representations capture molecular
33
+ structure and properties”. In: Nature Machine Intelligence 4.12 (2022),
34
+ pp. 1256–1264.
35
+ - [4] https://github.com/IBM/molformer/tree/main/data that points to https://ibm.ent.box.com/v/MoLFormer-data (file: finetune datasets.zip).

## Model Summary

- **Developers:** IBM Research
- **GitHub Repository:** https://github.com/BiomedSciAI/biomed-multi-alignment
- **Paper:** https://arxiv.org/abs/2410.22367
- **Release Date:** Dec 4th, 2024
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
45
+
46
+ ## Usage
47
+
48
+ Using `biomed.omics.bl.sm.ma-ted-458m.moleculenet_bbbp` requires installing https://github.com/BiomedSciAI/biomed-multi-alignment
49
+
50
+ ```
51
+ pip install git+https://github.com/BiomedSciAI/biomed-multi-alignment.git
52
+ ```

A simple example of using `ibm/biomed.omics.bl.sm.ma-ted-458m.moleculenet_bbbp`:

```python
from mammal.examples.molnet.molnet_infer import load_model, task_infer

smiles_seq = "C(Cl)Cl"

task_dict = load_model(task_name="BBBP", device="cpu")
result = task_infer(task_dict=task_dict, smiles_seq=smiles_seq)
print(f"The prediction for {smiles_seq=} is {result}")
```

See the detailed example at https://github.com/BiomedSciAI/biomed-multi-alignment.

## Citation

If you find our work useful, please consider starring the repo and citing our paper:

```bibtex
@misc{shoshan2024mammalmolecularaligned,
      title={MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language},
      author={Yoel Shoshan and Moshiko Raboh and Michal Ozery-Flato and Vadim Ratner and Alex Golts and Jeffrey K. Weber and Ella Barkan and Simona Rabinovici-Cohen and Sagi Polaczek and Ido Amos and Ben Shapira and Liam Hazan and Matan Ninio and Sivan Ravid and Michael M. Danziger and Joseph A. Morrone and Parthasarathy Suryanarayanan and Michal Rosen-Zvi and Efrat Hexter},
      year={2024},
      eprint={2410.22367},
      archivePrefix={arXiv},
      primaryClass={q-bio.QM},
      url={https://arxiv.org/abs/2410.22367},
}
```