The last decade has witnessed rapid progress in computational
methods and dataset curation for AI-aided drug discovery (AIDD). However,
real-world pharmaceutical datasets often exhibit highly imbalanced
distributions, an issue that the current literature largely overlooks but that
may severely compromise the fairness and generalization of machine learning
applications. Motivated by this observation, we introduce ImDrug, a
comprehensive benchmark with an open-source Python library that consists of 4
imbalance settings, 11 AI-ready datasets, 54 learning tasks and 16 baseline
algorithms tailored …

