Paper | Natural Language is All a Graph Needs


Paper | Natural Language is All a Graph Needs: https://arxiv.org/abs/2308.07134

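The paper “Natural Language is All a Graph Needs” by Ruosong Ye, Caiqi Zhang, Runhui Wang, Shuyuan Xu and Yongfeng Zhang made a splash on arXiv! It outlines a model called InstructGLM that adds further evidence that the future of graph representation learning includes large language models (LLMs) as well as graph neural networks (GNNs). It describes a way to teach a language model the structure and semantics of a text-attributed graph (TAG) using instruction tuning alone. Instruction-tuned Flan-T5 (https://arxiv.org/abs/2210.11416) and Llama-7b (https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/) achieve state-of-the-art performance on node classification and link prediction for citation graphs across several benchmarks: ogbn-arxiv, Cora and PubMed. The structure of the graph is described in plain English and combined with the node features. Both tasks use many prompts.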

InstructGLM

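The authors describe a model called InstructGLM that opens a new perspective compared with approaches such as GPT4Graph, which fine-tunes on graph file formats rather than plain language. It demonstrates that an LLM such as Google's Flan-T5 can be instruction-tuned on node features together with (optional) descriptions of a citation graph's structure, and trained through prompt engineering to perform graph machine learning tasks such as node classification and link prediction.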

InstructGLM uses multi-task learning to instruction-tune large language models (LLMs)

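The figure above shows the variety of prompts used. The main training task is node classification, augmented by a link prediction task as part of multi-task, multi-prompt instruction tuning. The tasks come in several forms: structure only, features only, or both, with or without an edge list. The structure description extends at most three hops, stopping before over-smoothing becomes a problem.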
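To make the idea of describing a neighborhood in plain English concrete, here is a minimal sketch that builds a node classification prompt from a toy citation graph. The prompt wording, the `<node_N>` token format, the function names and the toy data are all assumptions made for illustration; they are not the authors' published prompts.

```python
from collections import deque


def k_hop_neighbors(adj, start, max_hops):
    """Breadth-first search returning {hop: sorted nodes at exactly that distance}."""
    seen = {start}
    frontier = deque([(start, 0)])
    by_hop = {}
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not expand past the hop limit
        for nbr in adj.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                by_hop.setdefault(dist + 1, []).append(nbr)
                frontier.append((nbr, dist + 1))
    return {hop: sorted(nodes) for hop, nodes in by_hop.items()}


def node_classification_prompt(adj, titles, node, max_hops=2, use_features=True):
    """Describe a node's neighborhood in plain English and ask for its category."""
    def name(n):
        # Hypothetical node token; with features enabled, append the paper title.
        return f'<node_{n}> ("{titles[n]}")' if use_features else f"<node_{n}>"

    lines = [f"{name(node)} is a paper in a citation graph."]
    for hop, nodes in sorted(k_hop_neighbors(adj, node, max_hops).items()):
        listed = ", ".join(name(n) for n in nodes)
        lines.append(f"Papers {hop} hop(s) away: {listed}.")
    lines.append(f"Which category should <node_{node}> be classified as?")
    return "\n".join(lines)


# Toy undirected citation graph (made-up data).
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}
titles = {0: "Graph Attention", 1: "Instruction Tuning", 2: "Word Vectors",
          3: "Link Prediction", 4: "Entity Matching"}

prompt = node_classification_prompt(adj, titles, node=0, max_hops=2)
print(prompt)
```

Passing `use_features=False` gives the structure-only variant, and raising `max_hops` to 3 matches the widest neighborhood mentioned above.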


The architecture of InstructGLM. The only “trick” is the use of special tokens as node IDs. Otherwise it simply explains to Llama or Flan-T5 how to do graph machine learning.

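InstructGLM achieves state-of-the-art performance at classifying nodes in citation networks and predicting citations without needing a GNN. A text-attributed graph (TAG) is a graph whose node features consist of encoded text. One aspect of the model is a “cheat” that goes beyond simple instruction tuning: it extends the LLM's vocabulary by creating a new token for every unique node. When considering the results, keep in mind that the node features they use in the OGB benchmark are sparse: bag of words (BoW) or TF-IDF. A better node feature encoding could significantly improve performance.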
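The vocabulary extension can be illustrated with a toy token table. This is a sketch under stated assumptions, not the authors' implementation: `ToyVocab`, the `<node_N>` naming and the choice to initialize each new embedding from a feature vector are hypothetical. With the Hugging Face transformers library, the analogous steps would typically be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


class ToyVocab:
    """Minimal token-to-id table with one embedding row per token (illustrative only)."""

    def __init__(self, tokens, dim, seed=0):
        self.rng = random.Random(seed)
        self.dim = dim
        self.token_to_id = {}
        self.embeddings = []
        for tok in tokens:
            self.add_token(tok)

    def add_token(self, token, init=None):
        """Add a token if it is new and return its id; `init` optionally sets its embedding."""
        if token in self.token_to_id:
            return self.token_to_id[token]
        if init is not None:
            row = list(init)
        else:
            row = [self.rng.gauss(0.0, 0.02) for _ in range(self.dim)]
        if len(row) != self.dim:
            raise ValueError("embedding has the wrong dimension")
        self.token_to_id[token] = len(self.embeddings)
        self.embeddings.append(row)
        return self.token_to_id[token]

    def encode(self, text):
        """Whitespace tokenization; unknown words map to <unk>."""
        unk = self.token_to_id["<unk>"]
        return [self.token_to_id.get(tok, unk) for tok in text.split()]


vocab = ToyVocab(["<unk>", "is", "connected", "to"], dim=4)
size_before = len(vocab.embeddings)

# One new token per unique node; the feature vectors are made-up data.
node_features = {0: [1.0, 0.0, 0.0, 1.0], 1: [0.0, 1.0, 1.0, 0.0]}
for node_id, feats in node_features.items():
    vocab.add_token(f"<node_{node_id}>", init=feats)

ids = vocab.encode("<node_0> is connected to <node_1>")
print(ids)
```

Because every node gets its own row in the embedding table, the vocabulary grows linearly with the number of nodes, which is worth keeping in mind for large graphs.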

How do LLMs learn topology?

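Graph adjacency lists or walks are represented by matrices, and so are the attention heads in the Transformer architecture. Perhaps it is not surprising that Transformers can reason in this way. This Stack Exchange answer points out that “...the attention matrix is symmetric and naturally has the form of a weighted adjacency matrix.” The DGL documentation models the Transformer as a GNN, and the figure below, from Jesse Vig's jessevig/bertviz GitHub project (colab), shows attention heads represented as a set of matrices. Is an LLM learning network topology analogous to a Transformer learning the weights in its attention heads?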
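The reading of attention as a weighted adjacency matrix can be sketched in a few lines. The example below computes standard scaled dot-product attention weights for three tokens and lists them as weighted edges; the query and key vectors, token names and the `to_edge_list` helper are made up for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)), row by row."""
    d = len(keys[0])
    scores = [[sum(q_i * k_i for q_i, k_i in zip(q, k)) / math.sqrt(d) for k in keys]
              for q in queries]
    return [softmax(row) for row in scores]


def to_edge_list(weights, tokens, threshold=0.0):
    """Read the weight matrix as a weighted adjacency matrix: entry (i, j) is edge i -> j."""
    return [(tokens[i], tokens[j], w)
            for i, row in enumerate(weights)
            for j, w in enumerate(row) if w > threshold]


tokens = ["node_a", "cites", "node_b"]
# Made-up 2-dimensional query and key vectors, one per token.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights = attention_weights(queries, keys)
edges = to_edge_list(weights, tokens, threshold=0.3)
for src, dst, w in edges:
    print(f"{src} -> {dst}: {w:.3f}")
```

Each row of `weights` sums to 1, so one attention head can be viewed as a fully connected graph over the tokens whose outgoing edge weights are normalized per source token.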


Building InstructGLM

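No code accompanies the paper, but the authors did publish the prompts they used to fine-tune Alpaca and Flan-T5. This makes the paper relatively easy to reproduce in its original form. The paper hints at the method's broad potential and at how performance could be improved with better node features than sparse ones such as bag of words or TF-IDF. Hopefully sentence encodings will prove more powerful than these sparse representations.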


Generalizing InstructGLM

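Heterogeneous networks have complex, semi-structured node feature data. Reading this paper brought to mind a method for encoding complex node features. It comes from an entity matching model called Ditto from Megagon Labs. Ditto [Ditto Light] was described in a landmark 2020 paper titled “Deep Entity Matching with Pre-Trained Language Models”. It provides a fairly general mechanism for sentence-encoding semi-structured records with sentence transformers in order to perform entity matching.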
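As a rough illustration of this kind of record encoding, the sketch below turns a semi-structured record into a single string using the `[COL] name [VAL] value` serialization described in the Ditto paper. The dotted-key flattening of nested fields, the function names and the sample record are assumptions made for this example and are not taken from the Ditto codebase.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into (dotted_key, value) pairs; lists are joined with spaces."""
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, prefix=f"{name}.")
        elif isinstance(value, (list, tuple)):
            yield name, " ".join(str(v) for v in value)
        else:
            yield name, str(value)


def serialize(record):
    """Render a record as '[COL] name [VAL] value ...' text for a language model."""
    return " ".join(f"[COL] {name} [VAL] {value}" for name, value in flatten(record))


def serialize_pair(left, right):
    """Join two serialized records so a pair classifier can compare them."""
    return f"{serialize(left)} [SEP] {serialize(right)}"


# Made-up semi-structured node record.
company = {
    "name": "Acme Corp",
    "address": {"city": "Austin", "country": "US"},
    "tags": ["retail", "b2b"],
}
text = serialize(company)
print(text)
```

The resulting string could then be passed to any text encoder to produce a dense node feature vector in place of a sparse BoW or TF-IDF vector.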


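Would it be possible to use a cross-encoder via sentence transformers to generate node embeddings as features for the special node tokens, as the authors of the InstructGLM paper did, and so improve on the performance of BoW/TF-IDF? Hopefully this would let the method be applied to networks beyond citation graphs, such as those handled in entity and identity resolution, financial compliance, business graphs and cybersecurity.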

Conclusion

New papers at the intersection of LLMs with knowledge graphs and GNNs are published every day. This is a direction worth following.


Published in: 智源

October 11, 2023
