Model fine-tuning results problem

#496

by HJY12138 - opened 7 days ago

Dear developers,

I ran the "Geneformer Fine-Tuning for Classification of Dosage-Sensitive vs. -Insensitive Transcription Factors (TFs)" example using the provided pre-trained model, the provided fine-tuning dataset, and the provided fine-tuning code, but I did not get the same results as the article. This task is a binary classification problem, but the fine-tuned model does not seem to perform well.

Could you tell me which of my steps is wrong?

Thank you very much!

ctheodoris

Owner 7 days ago

Please ensure you are using the correct dictionary for the correct model. Otherwise, the tokens will not correspond to the same genes and the gene identities will be shuffled. To replicate the analysis from the Nature paper, please use the model, dataset, and dictionary corresponding to the 30M model. The current default for the argument "token_dictionary_file" in the Classifier is the 95M dictionary, so it needs to be set explicitly to the 30M dictionary when working with the 30M model.
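
To illustrate why a mismatched dictionary hurts performance, here is a minimal toy sketch in Python. The two dictionaries and the gene names (`GENE_A`, etc.) are made up for illustration and are not the real 30M or 95M token dictionaries; the `encode` and `decode` helpers are likewise hypothetical and not part of Geneformer. The sketch only shows that token IDs produced with one dictionary decode to different genes under another.

```python
# Toy illustration only: these dictionaries and gene names are invented,
# they are NOT the real Geneformer 30M or 95M token dictionaries.
dict_a = {"<pad>": 0, "<mask>": 1, "GENE_A": 2, "GENE_B": 3, "GENE_C": 4}
dict_b = {"<pad>": 0, "<mask>": 1, "GENE_C": 2, "GENE_A": 3, "GENE_D": 4, "GENE_B": 5}


def encode(genes, token_dict):
    """Hypothetical helper: map gene names to token IDs."""
    return [token_dict[gene] for gene in genes]


def decode(token_ids, token_dict):
    """Hypothetical helper: map token IDs back to gene names."""
    id_to_gene = {token: gene for gene, token in token_dict.items()}
    return [id_to_gene.get(token, "<unknown>") for token in token_ids]


# A "cell" tokenized with dictionary A (standing in for the dataset's dictionary).
cell = ["GENE_B", "GENE_A", "GENE_C"]
tokens = encode(cell, dict_a)

# Decoding with the matching dictionary recovers the original genes.
matched = decode(tokens, dict_a)

# Decoding with a different dictionary silently yields other genes.
mismatched = decode(tokens, dict_b)

print("tokens:    ", tokens)
print("matched:   ", matched)
print("mismatched:", mismatched)
```

No error is raised in the mismatched case, which is why the problem shows up only as poor classification performance. In the real workflow, the corresponding fix is to pass the 30M dictionary via the "token_dictionary_file" argument of the Classifier.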

ctheodoris changed discussion status to closed 7 days ago
