Optimizer: a paper collection by stereoplegic
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
Paper • 2305.15265 • Published • 1 upvote
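The memory saving here comes from approximating matmuls with a subset of column-row pairs. As a hedged sketch of the classic unbiased column-row sampling (CRS) estimator this paper's winner-take-all variant builds on (the norm-based sampling distribution is the standard choice, not necessarily the paper's exact one):

```python
import numpy as np

def crs_matmul(A, B, k, rng=None):
    # A @ B = sum_i outer(A[:, i], B[i, :]); keep only k sampled terms
    # and reweight each by 1 / (k * p_i) so the estimate stays unbiased.
    rng = rng or np.random.default_rng()
    p = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = p / p.sum()
    idx = rng.choice(A.shape[1], size=k, replace=True, p=p)
    scale = 1.0 / (k * p[idx])
    return (A[:, idx] * scale) @ B[idx, :]
```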
Mesa: A Memory-saving Training Framework for Transformers
Paper • 2111.11124 • Published • 1 upvote
Full Parameter Fine-tuning for Large Language Models with Limited Resources
Paper • 2306.09782 • Published • 31 upvotes
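This paper (LOMO) fuses gradient computation with the parameter update so the full-model gradient is never materialized at once. A minimal PyTorch (2.1+) sketch of that fusion via per-parameter hooks; the real implementation also handles mixed precision and gradient norm tracking:

```python
import torch

def attach_fused_sgd(model, lr=1e-3):
    # Update each parameter as soon as its gradient is accumulated,
    # then free the gradient so only one tensor's grad is alive at a time.
    def make_hook(lr):
        def hook(p):
            with torch.no_grad():
                p.add_(p.grad, alpha=-lr)
            p.grad = None  # release memory immediately
        return hook
    for p in model.parameters():
        if p.requires_grad:
            p.register_post_accumulate_grad_hook(make_hook(lr))
```

After `loss.backward()` the weights are already updated; there is no separate `optimizer.step()`.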
Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models
Paper • 2106.02679 • Published • 1 upvote
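For reference, plain gradient accumulation splits a large batch into micro-batches and sums gradients before one update; the paper's contribution is interleaving this layer-by-layer with pipeline parallelism. A sketch of the baseline pattern (PyTorch-style, with hypothetical model/opt/loss_fn objects):

```python
def accumulation_step(model, opt, loss_fn, micro_batches):
    # Sum gradients over micro-batches, then apply one update;
    # mathematically this is a single large-batch step.
    opt.zero_grad()
    for x, y in micro_batches:
        loss = loss_fn(model(x), y) / len(micro_batches)
        loss.backward()  # grads accumulate in .grad across calls
    opt.step()
```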
Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization
Paper • 2311.04163 • Published • 1 upvote
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Paper • 2309.01825 • Published • 1 upvote
Utility-based Perturbed Gradient Descent: An Optimizer for Continual Learning
Paper • 2302.03281 • Published • 1 upvote
Fine-Tuning Language Models with Just Forward Passes
Paper • 2305.17333 • Published • 4 upvotes
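This is MeZO: fine-tuning with a zeroth-order (SPSA) gradient estimate, so only forward passes are needed and the perturbation can be regenerated from a seed instead of stored. A toy NumPy sketch of one step:

```python
import numpy as np

def mezo_step(theta, loss_fn, lr=1e-4, eps=1e-3, seed=0):
    # Two forward passes along a seed-regenerated random direction z;
    # the gradient estimate is ((L+ - L-) / 2eps) * z, so z never has
    # to be kept in memory alongside the parameters.
    z = np.random.default_rng(seed).standard_normal(theta.shape)
    g_scale = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return theta - lr * g_scale * z
```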
Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Paper • 2308.02060 • Published • 1 upvote
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Paper • 1909.12778 • Published • 1 upvote
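A rough sketch of the global sparse momentum idea, assuming the first-order Taylor importance |w * g| is used to decide, globally per step, which weights receive the gradient term (weight decay handling and other details are omitted):

```python
import numpy as np

def gsm_step(w, g, m, lr=0.01, beta=0.9, q=0.01):
    # Only the top-q fraction of weights (ranked by |w * g|) get an
    # active gradient; the rest update from momentum alone.
    k = max(1, int(q * w.size))
    imp = np.abs(w * g).ravel()
    mask = np.zeros(w.size)
    mask[np.argpartition(imp, -k)[-k:]] = 1.0
    m = beta * m + mask.reshape(w.shape) * g
    return w - lr * m, m
```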
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability
Paper • 2306.00045 • Published • 1 upvote
Multiplication-Free Transformer Training via Piecewise Affine Operations
Paper • 2305.17190 • Published • 2 upvotes
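The paper replaces multiplications with piecewise-affine surrogates. A classic instance of that family, shown here as a hedged illustration rather than the paper's exact operation, is the bit trick where adding two floats' integer representations approximates their product (positive inputs only, exact on powers of two):

```python
import struct

def approx_mul(a: float, b: float) -> float:
    # The IEEE-754 bit pattern of x is roughly affine in log2(x),
    # so integer addition of bit patterns approximates multiplication.
    # 0x3F800000 is the bit pattern of 1.0; subtracting it removes
    # the double-counted exponent bias.
    ia = struct.unpack('<I', struct.pack('<f', a))[0]
    ib = struct.unpack('<I', struct.pack('<f', b))[0]
    return struct.unpack('<f', struct.pack('<I', ia + ib - 0x3F800000))[0]
```

For example, `approx_mul(2.0, 3.0)` returns exactly 6.0; for general inputs the surrogate carries a small bounded relative error.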
XGrad: Boosting Gradient-Based Optimizers With Weight Prediction
Paper • 2305.18240 • Published • 1 upvote
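Weight prediction here means evaluating the gradient not at the current weights but at where the optimizer's own momentum says the weights will be a few steps ahead, then applying the update to the current weights. A sketch with SGD momentum; the lookahead count s is a hypothetical knob, not necessarily the paper's parameterization:

```python
def xgrad_sgd_step(w, m, grad_fn, lr=0.01, beta=0.9, s=1):
    # Predict the weights s steps ahead from the momentum buffer,
    # take the gradient there, and update the *original* weights.
    w_pred = w - s * lr * beta * m
    g = grad_fn(w_pred)
    m = beta * m + g
    return w - lr * m, m
```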
Gradients without Backpropagation
Paper • 2202.08587 • Published • 1 upvote
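This is the "forward gradient": sample a direction v, compute the directional derivative in a single forward-mode AD pass, and use (grad . v) v as an unbiased gradient estimate. In this sketch a central finite difference stands in for the forward-mode JVP:

```python
import numpy as np

def forward_gradient(f, theta, eps=1e-6, rng=None):
    # E[v v^T] = I for standard normal v, so E[(grad . v) v] = grad:
    # the estimate is unbiased even though no backward pass is run.
    rng = rng or np.random.default_rng()
    v = rng.standard_normal(theta.shape)
    directional = (f(theta + eps * v) - f(theta - eps * v)) / (2 * eps)
    return directional * v
```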
Learning with Local Gradients at the Edge
Paper • 2208.08503 • Published • 1 upvote
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Paper • 2211.12485 • Published • 1 upvote
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Paper • 2302.12022 • Published • 1 upvote
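DoG's parameter-free step size is the maximal distance travelled from the initial point divided by the root of the accumulated squared gradient norms. A sketch, where r_eps is the small initial "distance" the method needs to get moving:

```python
import numpy as np

def dog(grad_fn, x0, steps=1000, r_eps=1e-6):
    # eta_t = max_{i<=t} ||x_i - x0|| / sqrt(sum_{i<=t} ||g_i||^2)
    x, r_bar, g_sq = x0.astype(float), r_eps, 0.0
    for _ in range(steps):
        g = grad_fn(x)
        g_sq += float(g @ g)
        x = x - (r_bar / np.sqrt(g_sq)) * g
        r_bar = max(r_bar, float(np.linalg.norm(x - x0)))
    return x
```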
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization
Paper • 2312.15184 • Published • 1 upvote
Versatile Black-Box Optimization
Paper • 2004.14014 • Published
PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization
Paper • 2212.05652 • Published • 2 upvotes
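PyPop7 collects population-based black-box optimizers behind one interface. To make the family concrete, here is a generic (mu, lambda) evolution strategy in plain NumPy; this is not PyPop7's actual API, just the kind of algorithm such libraries ship:

```python
import numpy as np

def mu_lambda_es(f, x0, sigma=0.3, lam=32, mu=8, iters=200, seed=0):
    # Sample lam candidates around the mean, keep the mu best by
    # objective value, and recenter the mean on their average.
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    for _ in range(iters):
        cand = mean + sigma * rng.standard_normal((lam, mean.size))
        best = np.argsort([f(c) for c in cand])[:mu]
        mean = cand[best].mean(axis=0)
    return mean
```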
B2Opt: Learning to Optimize Black-box Optimization with Little Budget
Paper • 2304.11787 • Published
MKOR: Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates
Paper • 2306.01685 • Published
CoRe Optimizer: An All-in-One Solution for Machine Learning
Paper • 2307.15663 • Published