A central optimization challenge in machine learning is parameter tuning. Adaptive gradient methods, such as AdaGrad and Adam, are widely used to train machine learning models in practice because they adjust their stepsizes without detailed knowledge of the loss function. While these methods have shown remarkable empirical success in training deep neural networks for supervised learning tasks, they often struggle in more challenging adversarial settings, such as adversarial training and generative adversarial networks. In this talk, we will explore some of the most pressing questions about adaptive gradient methods: What are their provable benefits? How can we improve their robustness and effectiveness for adversarial learning?