Model fine-tuning results problem
Dear developers,
I ran the "Geneformer Fine-Tuning for Classification of Dosage-Sensitive vs. -Insensitive Transcription Factors (TFs)" example using the provided pre-trained model, the provided fine-tuning dataset, and the provided fine-tuning code, but I did not get the same results as in the article. This task is a binary classification problem, yet the fine-tuned model does not seem to perform well.
I would like to know which of my steps is wrong.
Thank you very much!
Please ensure you are using the correct dictionary for the correct model. Otherwise, the tokens will not correspond to the same genes and the gene identities will be shuffled. To replicate the analysis from the Nature paper, please use the model, dataset, and dictionary corresponding to the 30M model. The current default for the argument "token_dictionary_file" in the Classifier is the 95M dictionary, so when fine-tuning the 30M model this argument needs to be set explicitly to the 30M dictionary.
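
To illustrate why a mismatched dictionary hurts performance, here is a minimal, self-contained sketch. It does not use the Geneformer API: the two dictionaries, the gene names, and the helper functions are hypothetical toy stand-ins, chosen only to show how token ids produced with one dictionary are read as different genes under another.

```python
# Toy stand-ins for two token dictionaries (gene -> token id).
# These are hypothetical examples, NOT the real 30M or 95M Geneformer dictionaries.
dict_a = {"<pad>": 0, "<mask>": 1, "GENE_A": 2, "GENE_B": 3, "GENE_C": 4}
dict_b = {"<pad>": 0, "<mask>": 1, "GENE_C": 2, "GENE_A": 3, "GENE_D": 4}


def encode(genes, token_dict):
    """Map a list of gene names to token ids using the given dictionary."""
    return [token_dict[g] for g in genes]


def decode(token_ids, token_dict):
    """Map token ids back to gene names using the given dictionary."""
    id_to_gene = {token_id: gene for gene, token_id in token_dict.items()}
    return [id_to_gene[t] for t in token_ids]


def mismatched_genes(dict_1, dict_2):
    """Return the genes whose token id differs (or is missing) between dictionaries."""
    all_genes = dict_1.keys() | dict_2.keys()
    return sorted(g for g in all_genes if dict_1.get(g) != dict_2.get(g))


# The dataset was tokenized with dict_a ...
cell = ["GENE_A", "GENE_B", "GENE_C"]
tokens = encode(cell, dict_a)

# ... but a model trained with dict_b interprets the same ids as other genes.
seen_by_model = decode(tokens, dict_b)

print("original genes:  ", cell)
print("token ids:       ", tokens)
print("model's view:    ", seen_by_model)
print("mismatched genes:", mismatched_genes(dict_a, dict_b))
```

A check in the spirit of `mismatched_genes` can be run on the dictionary used to tokenize the dataset and the dictionary the model was trained with before fine-tuning. If it reports any differences, the model is effectively seeing shuffled gene identities.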