Efficient RLVR Training via Weighted Mutual Information Data Selection
Submitted by Zhou
LARK Lab@HKUST (GZ)