Commit b4c1176
1 Parent(s): d15f1ce
Update README.md
README.md CHANGED
@@ -18,7 +18,8 @@ This model is not intended for protein function prediction, but rather as a chec
 with Low Rank Adaptation (LoRA). This is an experimental model fine-tuned from the
 [esm2_t6_8M_UR50D](https://huggingface.co/facebook/esm2_t6_8M_UR50D) model
 for multi-label classification. In particular, the model is fine-tuned on the CAFA-5 protein sequence dataset available
-[here](). More precisely, the `train_sequences.fasta` file is the
+[here](https://huggingface.co/datasets/AmelieSchreiber/cafa_5). More precisely, the `train_sequences.fasta` file is the
+list of protein sequences that were trained on, and the
 `train_terms.tsv` file contains the gene ontology protein function labels for each protein sequence. For more details on using
 ESM-2 models for multi-label sequence classification, [see here](https://huggingface.co/docs/transformers/model_doc/esm).
 Due to the potentially complicated class weighting necessary for the hierarchical ontology, further fine-tuning will be necessary.
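
The README text in this diff names two data files: `train_sequences.fasta` (the protein sequences) and `train_terms.tsv` (the gene ontology labels per sequence). As a rough illustration of how such a pair could be turned into multi-label targets, here is a minimal sketch using only the Python standard library. It is not the author's training code. It assumes that the FASTA header's first token is the protein ID and that the TSV has a header row with `EntryID` and `term` columns; the helper names `read_fasta`, `read_terms`, and `multi_hot` are hypothetical, and the demo data is made up.

```python
import csv
import io


def read_fasta(handle):
    """Return {entry_id: sequence}. Assumes the ID is the first token after '>'."""
    chunks = {}
    current_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current_id = line[1:].split()[0]
            chunks[current_id] = []
        elif current_id is not None:
            chunks[current_id].append(line)
    return {entry_id: "".join(parts) for entry_id, parts in chunks.items()}


def read_terms(handle):
    """Return {entry_id: set of GO terms}. Assumes 'EntryID' and 'term' columns."""
    terms = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        terms.setdefault(row["EntryID"], set()).add(row["term"])
    return terms


def multi_hot(sequences, terms):
    """Build a sorted label vocabulary and one 0/1 vector per sequence."""
    vocab = sorted({t for ts in terms.values() for t in ts})
    index = {t: i for i, t in enumerate(vocab)}
    labels = {}
    for entry_id in sequences:
        vec = [0] * len(vocab)
        for t in terms.get(entry_id, ()):
            vec[index[t]] = 1
        labels[entry_id] = vec
    return vocab, labels


# Tiny in-memory stand-ins for the two files (made-up data).
fasta_text = ">P1 demo\nMKT\nAYI\n>P2 demo\nGAVL\n"
terms_text = (
    "EntryID\tterm\taspect\n"
    "P1\tGO:0000001\tBPO\n"
    "P1\tGO:0000002\tMFO\n"
    "P2\tGO:0000002\tMFO\n"
)

sequences = read_fasta(io.StringIO(fasta_text))
terms = read_terms(io.StringIO(terms_text))
vocab, labels = multi_hot(sequences, terms)
print(vocab)
print(labels)
```

Each protein can carry several GO terms at once, which is why the targets are 0/1 vectors rather than a single class index. The class weighting mentioned in the README is not addressed by this sketch.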