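Tomorrow's Event | Invariant Low-Dimensional Subspaces in Gradient Descent for Learning Deep Networks

Talk title: Invariant Low-Dimensional Subspaces in Gradient Descent for Learning Deep Networks

Date: Tuesday, December 26, 11:00-12:00

Abstract: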

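Over the past few years, the implicit bias of gradient descent towards parsimonious solutions has been an extensively studied phenomenon in the training of deep networks. In this work, we investigate this phenomenon by first narrowing our focus to deep linear networks. Our analysis reveals a surprising "law of parsimony" in the learning dynamics when the data possesses low-dimensional structure.

Specifically, we show that the evolution of gradient descent starting from orthogonal initialization only affects a small portion of the singular vector spaces of each weight matrix. In other words, the learning process happens only within a small invariant subspace of each weight matrix, even though all weight parameters are updated throughout training. This simplicity in the learning dynamics could have significant implications for both efficient training and a better understanding of deep network representations.

First, the analysis enables us to considerably improve training efficiency by exploiting the low-dimensional structure in the learning dynamics: we can construct smaller, equivalent deep linear networks without sacrificing the benefits associated with their wider counterparts. Moreover, we demonstrate the potential for efficiently training deep nonlinear networks. Second, it allows us to better understand deep representation learning by theoretically elucidating the progressive feature compression and discrimination from shallow to deep layers. The study lays the foundation for understanding hierarchical representations in deep nonlinear networks.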
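The invariant-subspace claim in the abstract can be probed numerically on a toy problem. The sketch below is not taken from the talk or the papers: it assumes a simplified objective (fitting a rank-r target matrix `phi` with a depth-3 linear network under squared loss), and the helper names `train` and `count_shared`, the dimensions, the step size, and the target's singular values are all hypothetical choices made for illustration. Under these assumptions, each layer should keep at least d - 2r singular values equal to one common scalar `rho` that follows a simple one-dimensional recursion, so only a subspace of dimension at most 2r per layer carries the actual learning.

```python
import numpy as np


def random_orthogonal(d, rng):
    """Q factor of a Gaussian matrix: a d x d orthogonal matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q


def chain(mats, d):
    """Given [W_1, ..., W_k], return W_k @ ... @ W_1 (identity if empty)."""
    out = np.eye(d)
    for w in mats:
        out = w @ out
    return out


def train(d=20, r=2, depth=3, eps=0.5, lr=0.05, steps=2000, seed=0):
    """Plain gradient descent on 0.5 * ||W_L ... W_1 - phi||_F^2 (toy setup)."""
    rng = np.random.default_rng(seed)
    # Rank-r target with hand-picked singular values (hypothetical data).
    u = random_orthogonal(d, rng)[:, :r]
    v = random_orthogonal(d, rng)[:, :r]
    phi = u @ np.diag(np.linspace(2.0, 1.0, r)) @ v.T
    # Every layer starts as eps times an orthogonal matrix.
    weights = [eps * random_orthogonal(d, rng) for _ in range(depth)]
    # Predicted common singular value outside the learning subspace.
    rho = eps
    losses = []
    for _ in range(steps):
        err = chain(weights, d) - phi
        losses.append(0.5 * np.sum(err ** 2))
        grads = []
        for l in range(depth):
            before = chain(weights[:l], d)      # W_{l-1} ... W_1
            after = chain(weights[l + 1:], d)   # W_L ... W_{l+1}
            grads.append(after.T @ err @ before.T)
        # All layers are updated simultaneously from the same iterate.
        weights = [w - lr * g for w, g in zip(weights, grads)]
        # Scalar recursion: rho <- rho - lr * rho^(2 * depth - 1).
        rho = rho * (1.0 - lr * rho ** (2 * depth - 2))
    losses.append(0.5 * np.sum((chain(weights, d) - phi) ** 2))
    return weights, rho, losses


def count_shared(w, rho, tol=1e-8):
    """Number of singular values of w that lie within tol of rho."""
    s = np.linalg.svd(w, compute_uv=False)
    return int(np.sum(np.abs(s - rho) < tol))


weights, rho, losses = train()
counts = [count_shared(w, rho) for w in weights]
print("loss: %.3e -> %.3e" % (losses[0], losses[-1]))
print("singular values equal to rho, per layer:", counts)
```

With the default arguments (d = 20, r = 2), each entry of `counts` should be at least 16, even though every entry of every weight matrix has been updated. Rerunning `train` with a different `r` or `d` is a quick way to see how the size of the active subspace tracks the rank of the target rather than the width of the network.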

This talk is based on three recent papers:

https://arxiv.org/abs/2306.01154

https://arxiv.org/abs/2311.05061

https://arxiv.org/abs/2311.02960


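Speaker:

Qing Qu is an assistant professor in the Department of Electrical Engineering and Computer Science at the University of Michigan. He received his Ph.D. in electrical engineering from Columbia University in October 2018 and his bachelor's degree from Tsinghua University in July 2011. His research interests lie at the intersection of the foundations of data science, machine learning, numerical optimization, and signal/image processing.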

