Meta AI | Effective Theory of Transformers at Initialization


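[Why we recommend it] This paper carries out an effective-theory analysis of forward-backward signal propagation in Transformers. The analysis suggests particular width scalings for the initialization and training hyperparameters of these models.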

Effective Theory of Transformers at Initialization
Emily Dinan, Sho Yaida, Susan Zhang

[Meta AI]

[Paper link] https://arxiv.org/pdf/2304.02034.pdf

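[Abstract] This paper carries out an effective-theory analysis of forward-backward signal propagation in wide and deep Transformers, that is, residual neural networks built from multi-head self-attention blocks and multilayer perceptron blocks. The analysis suggests particular width scalings for the initialization and training hyperparameters of these models. The authors then adopt these suggestions to train vision and language Transformers in practical setups.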
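To make the idea of "width scaling at initialization" concrete, here is a minimal sketch. It is not the paper's prescription: it assumes the generic rule of drawing weights from a Gaussian with variance `c_w / fan_in`, and the names `init_weight`, `preactivation_scale`, and `c_w` are hypothetical helpers introduced only for illustration. It checks that, under this assumed rule, the mean squared output of a single linear layer stays of order one as the width grows.

```python
import math
import random


def init_weight(fan_in, fan_out, c_w=1.0, rng=None):
    """Sample a fan_out x fan_in matrix with entries ~ N(0, c_w / fan_in).

    Assumption: generic 1/fan_in variance scaling, not the paper's exact rule.
    """
    rng = rng or random.Random(0)
    std = math.sqrt(c_w / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def linear(weight, x):
    """Apply a bias-free linear layer: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def mean_square(vec):
    """Average of squared entries, a proxy for the signal magnitude."""
    return sum(v * v for v in vec) / len(vec)


def preactivation_scale(width, seed=0):
    """Mean squared output of one width x width layer on a unit-variance input."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    w = init_weight(width, width, rng=rng)
    return mean_square(linear(w, x))


if __name__ == "__main__":
    # The printed scale should stay near 1 for each width.
    for width in (128, 256, 400):
        print(width, round(preactivation_scale(width), 3))
```

Dropping the `1 / fan_in` factor would make the output scale grow linearly with the width, which is the kind of width dependence a scaling rule is meant to remove. The same reasoning is usually extended to learning rates, but the specific exponents for Transformers should be taken from the paper itself.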


Posted in: 智源

April 6, 2023
