Question regarding high accuracy on the GSM8K dataset

#15
by wzy1023 - opened

Hello! I noticed that the model achieves high accuracy on the GSM8K dataset, and I have a few questions regarding this:
Training Data Usage
Could you please clarify whether the model was trained on the GSM8K training split? If so, could you provide details of the training setup?
Evaluation Rigor
If the reported accuracy was obtained on data that was also used for training, does this mean the evaluation was not performed on a fully independent test set? Could this introduce overfitting or data-contamination risks and undermine the credibility of the results?
Experimental Setup
Could you please share the complete experimental setup? For example, did you keep the official GSM8K test split fully held out, with any validation data taken from the training split, to ensure the validity of the results?
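To make the concern concrete, here is a minimal sketch of the kind of overlap check I have in mind. It is only an illustration, not a description of how the model was actually evaluated: the function names, the toy questions, and the n-gram length are all assumptions of mine, and a real check would run over the full training corpus and the GSM8K test questions.

```python
import re

def normalize(text):
    """Lowercase the text and keep only alphanumeric tokens, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def ngrams(text, n):
    """Return the set of word n-grams of an already normalized string."""
    words = text.split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def contaminated_indices(train_questions, test_questions, n=8):
    """Return indices of test questions that share at least one word n-gram
    with any training question (hypothetical helper, n is an assumed threshold)."""
    train_grams = set()
    for question in train_questions:
        train_grams |= ngrams(normalize(question), n)
    return [
        i for i, question in enumerate(test_questions)
        if ngrams(normalize(question), n) & train_grams
    ]

# Toy data for illustration only; these are not real GSM8K items.
train = ["Tom has 3 apples and buys 4 more. How many apples does he have?"]
test = [
    "Tom has 3 apples and buys 4 more; how many does he have now?",
    "A train travels 60 miles in 2 hours. What is its speed?",
]
flagged = contaminated_indices(train, test, n=4)
print(flagged)
```

An empty result from a check like this would not prove the test set is clean (paraphrased items would slip through), but a non-empty one would be a clear sign of leakage.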
Thank you for your help!
