AI-Hallucinated Code Dependencies Are Becoming a New Supply Chain Security Risk

Security News   ·   Published 2025-04-13 12:02:33   ·   安全动态每天看
<p data-lake-id="u0201a07c" id="u0201a07c"><span data-lake-id="u3f7b898a" id="u3f7b898a" class="lake-fontsize-9" style="color: rgb(10, 10, 10)">​</span><br></p><h1 data-lake-id="smryd" id="smryd" style="text-align: center"><span data-lake-id="u46cbcb89" id="u46cbcb89" style="color: rgb(63, 63, 63)">AI-Hallucinated Code Dependencies Are Becoming a New Supply Chain Security Risk</span></h1><p data-lake-id="ue7542f56" id="ue7542f56" style="text-align: center"><card type="inline" name="image" value="data:%7B%22src%22%3A%22https%3A%2F%2Fwww.bleepstatic.com%2Fcontent%2Fhl-images%2F2022%2F05%2F12%2Fai-cybersecurity-hacker.jpg%22%2C%22originalType%22%3A%22binary%22%2C%22linkTarget%22%3A%22_blank%22%2C%22from%22%3A%22url%22%2C%22originWidth%22%3A1600%2C%22originHeight%22%3A900%2C%22ratio%22%3A1%2C%22status%22%3A%22done%22%2C%22style%22%3A%22none%22%2C%22showTitle%22%3Afalse%2C%22title%22%3A%22%22%2C%22rotation%22%3A0%2C%22crop%22%3A%5B0%2C0%2C1%2C1%5D%2C%22id%22%3A%22dwTM3%22%2C%22margin%22%3A%7B%22top%22%3Atrue%2C%22bottom%22%3Atrue%7D%7D" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"></card></p><p data-lake-id="u4b923ebe" id="u4b923ebe" style="text-align: center"><span data-lake-id="ue98f44e3" id="ue98f44e3" class="lake-fontsize-12" style="color: rgb(136, 136, 136)">AI hallucinations</span></p><p data-lake-id="u1744ba3a" id="u1744ba3a" style="text-align: justify"><span data-lake-id="ue0fd409f" id="ue0fd409f" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">A new class of supply chain attack called</span><span data-lake-id="uae295ac6" id="uae295ac6" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="uebc31a84" id="uebc31a84" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">“slopsquatting”</span></strong><span data-lake-id="u394da744" id="u394da744" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="uc08fbf48" id="uc08fbf48" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">is emerging alongside the widespread use of generative AI coding tools. It exploits the fact that large language models, when generating code, </span><strong><span data-lake-id="ue39d917a" id="ue39d917a" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">often “hallucinate” the names of dependency packages that do not exist</span></strong><span data-lake-id="u97fc4919" id="u97fc4919" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">.</span></p><p data-lake-id="uc7c759df" id="uc7c759df" style="text-align: justify"><span data-lake-id="u7d3ed009" id="u7d3ed009" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">The term “slopsquatting” was coined by security researcher</span><span data-lake-id="u52f89f33" id="u52f89f33" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="u2160ba65" id="u2160ba65" class="lake-fontsize-12" style="color: rgb(87, 107, 149)">Seth Larson</span><span data-lake-id="u74ad2297" id="u74ad2297" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="ua26a2785" id="ua26a2785" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">as a play on “typosquatting”, in which attackers publish packages whose names are near-misspellings of popular libraries to trick developers into installing malicious dependencies. Slopsquatting does not rely on misspellings; instead, it </span><strong><span data-lake-id="ufcd56b57" id="ufcd56b57" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">uses the non-existent package names that language models hallucinate</span></strong><span data-lake-id="u5e08c6e5" id="u5e08c6e5" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> as the lure.</span></p><p data-lake-id="u2700f70d" id="u2700f70d" style="text-align: justify"><span data-lake-id="u6bdb2006" id="u6bdb2006" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">Attackers can register these invented names in advance on package registries such as PyPI and npm and publish malicious code under them. A developer who then installs a plausible-looking package name on an AI assistant's recommendation may end up running that code.</span></p><p data-lake-id="u6cbde9f3" id="u6cbde9f3" style="text-align: justify"><span data-lake-id="u1a4eeeb2" id="u1a4eeeb2" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">A research paper published in March 2025, “We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs”, analyzed 576,000 AI-generated Python and JavaScript code samples and found that roughly</span><span data-lake-id="ua5fec3f0" id="ua5fec3f0" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="u5d8031af" id="u5d8031af" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">20% recommended packages that do not exist</span></strong><span data-lake-id="u07812e9c" id="u07812e9c" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">.</span></p><p data-lake-id="u4df1a8e4" id="u4df1a8e4" style="text-align: justify"><span data-lake-id="u90fa9275" id="u90fa9275" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">The problem is worse in open-source models such as CodeLlama, DeepSeek, WizardCoder, and Mistral, but even commercial models such as ChatGPT-4 showed roughly</span><span data-lake-id="ue5717a4e" id="ue5717a4e" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="u60438b3d" id="u60438b3d" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">5% hallucinated dependencies</span></strong><span data-lake-id="ue6f6f239" id="ue6f6f239" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">, which is enough to pose a widespread security concern.</span></p><p data-lake-id="u1bbaf00c" id="u1bbaf00c" style="text-align: center"><card type="inline" name="image" value="data:%7B%22src%22%3A%22https%3A%2F%2Fwww.bleepstatic.com%2Fimages%2Fnews%2Fu%2F1220909%2F2025%2FApril%2Frates.jpg%22%2C%22originalType%22%3A%22binary%22%2C%22linkTarget%22%3A%22_blank%22%2C%22from%22%3A%22url%22%2C%22originWidth%22%3A790%2C%22originHeight%22%3A441%2C%22ratio%22%3A1%2C%22status%22%3A%22done%22%2C%22style%22%3A%22none%22%2C%22showTitle%22%3Afalse%2C%22title%22%3A%22%22%2C%22rotation%22%3A0%2C%22crop%22%3A%5B0%2C0%2C1%2C1%5D%2C%22id%22%3A%22Jjq41%22%2C%22margin%22%3A%7B%22top%22%3Atrue%2C%22bottom%22%3Atrue%7D%7D" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"></card></p><p data-lake-id="ud02034ee" id="ud02034ee" style="text-align: center"><span data-lake-id="u7eb0d87d" id="u7eb0d87d" class="lake-fontsize-12" style="color: 
rgb(136, 136, 136)">Package hallucination rates of major LLMs. Source: arxiv.org</span></p><p data-lake-id="u8a271f63" id="u8a271f63" style="text-align: justify"><span data-lake-id="u31f73d16" id="u31f73d16" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">Although the study recorded more than</span><span data-lake-id="u8c264c80" id="u8c264c80" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="u5d0d0f1e" id="u5d0d0f1e" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">200,000</span></strong><span data-lake-id="u0259091b" id="u0259091b" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> hallucinated package names,</span><span data-lake-id="u7685540b" id="u7685540b" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="ua551fdde" id="ua551fdde" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">43% of them recurred consistently across similar prompts</span></strong><span data-lake-id="u163ae078" id="u163ae078" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">, and</span><span data-lake-id="u215aa145" id="u215aa145" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="ua0584da4" id="ua0584da4" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">58% reappeared at least once within ten runs</span></strong><span data-lake-id="u5cbaefcf" id="u5cbaefcf" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">, which shows that these hallucinations are not purely random.</span></p><p data-lake-id="u62c77999" id="u62c77999" style="text-align: justify"><span data-lake-id="ub3ba4c7c" id="ub3ba4c7c" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">The study also found that:</span></p><ul list="u519b3034"><li fid="uf0c726d7" data-lake-id="ued03359a" id="ued03359a" style="text-align: left"><span data-lake-id="u3dd8c9d9" id="u3dd8c9d9" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">•</span><span data-lake-id="u5984d368" id="u5984d368" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="ua1ccbd75" id="ua1ccbd75" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">38%</span></strong><span data-lake-id="u7764f297" id="u7764f297" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="udceff04b" id="udceff04b" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">of hallucinated package names appeared to be inspired by real packages (similar in spelling or meaning),</span></li><li fid="uf0c726d7" data-lake-id="u9d98f042" id="u9d98f042" style="text-align: left"><span data-lake-id="u182c8351" id="u182c8351" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">•</span><span data-lake-id="ua5849c16" id="ua5849c16" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="u7204c12a" id="u7204c12a" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">13%</span></strong><span data-lake-id="u91f5b358" id="u91f5b358" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="u38f24ef0" id="u38f24ef0" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">were the result of typos,</span></li><li fid="uf0c726d7" data-lake-id="uc67c6577" id="uc67c6577" style="text-align: left"><span data-lake-id="uca57e23a" id="uca57e23a" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">• the remaining</span><span data-lake-id="u4d2bbfaa" id="u4d2bbfaa" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="u2f1f14c5" id="u2f1f14c5" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">51%</span></strong><span data-lake-id="u74d1f3d8" id="u74d1f3d8" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="u3538cbdc" id="u3538cbdc" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">were </span><strong><span data-lake-id="ufb3c36b9" id="ufb3c36b9" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">fabricated outright by the model</span></strong><span data-lake-id="u4a82e8aa" id="u4a82e8aa" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">.</span></li></ul><p data-lake-id="u6be7c771" id="u6be7c771" style="text-align: justify"><span data-lake-id="ud8254d8a" id="ud8254d8a" class="lake-fontsize-12" 
style="color: rgb(63, 63, 63)">Although there is no sign yet that attackers are exploiting this technique at scale, researchers at the open-source cybersecurity company</span><span data-lake-id="u6608bec4" id="u6608bec4" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><strong><span data-lake-id="ucb166800" id="ucb166800" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">Socket</span></strong><span data-lake-id="u0eb96ae5" id="u0eb96ae5" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> </span><span data-lake-id="uede3fcfe" id="uede3fcfe" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">note that these hallucinated package names are </span><strong><span data-lake-id="u079b4d9d" id="u079b4d9d" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">highly repeatable, well-formed, and semantically plausible</span></strong><span data-lake-id="udd203b2d" id="udd203b2d" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">, and therefore </span><strong><span data-lake-id="u15522200" id="u15522200" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">highly predictable</span></strong><span data-lake-id="u08d6f932" id="u08d6f932" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">, so attackers could readily build a slopsquatting attack surface on them.</span></p><p data-lake-id="u8d5056b1" id="u8d5056b1" style="text-align: left"><span data-lake-id="u699bd95f" id="u699bd95f" class="lake-fontsize-12" style="color: rgb(63, 63, 63); background-color: rgb(247, 247, 247)">“Overall, 58% of hallucinated packages were repeated across ten runs, indicating that these are not random errors but </span><strong><span data-lake-id="uba42346c" id="uba42346c" class="lake-fontsize-12" style="color: rgb(15, 76, 129); background-color: rgb(247, 247, 247)">repeatable patterns</span></strong><span data-lake-id="uf1d65f3b" id="uf1d65f3b" class="lake-fontsize-12" style="color: rgb(63, 63, 63); background-color: rgb(247, 247, 247)"> in how the models respond to certain prompts.”<br />From Socket's research write-up: http://socket.dev/blog/slopsquatting-how-ai-hallucinations-are-fueling-a-new-class-of-supply-chain-attacks</span></p><p data-lake-id="ua6be2ce2" id="ua6be2ce2" style="text-align: left"><span data-lake-id="ub6dd4aca" id="ub6dd4aca" class="lake-fontsize-12" style="color: rgb(63, 63, 63); background-color: 
rgb(247, 247, 247)">“That </span><strong><span data-lake-id="u45504d3b" id="u45504d3b" class="lake-fontsize-12" style="color: rgb(15, 76, 129); background-color: rgb(247, 247, 247)">repeatability</span></strong><span data-lake-id="uc63d23d0" id="uc63d23d0" class="lake-fontsize-12" style="color: rgb(63, 63, 63); background-color: rgb(247, 247, 247)"> greatly increases the value of these hallucinated packages to attackers, who need to observe only a small number of model outputs to pick out viable targets.”</span></p><p data-lake-id="u9f66a8e6" id="u9f66a8e6" style="text-align: center"><card type="inline" name="image" value="data:%7B%22src%22%3A%22https%3A%2F%2Fwww.bleepstatic.com%2Fimages%2Fnews%2Fu%2F1220909%2F2025%2FApril%2Fattack-overview.jpg%22%2C%22originalType%22%3A%22binary%22%2C%22linkTarget%22%3A%22_blank%22%2C%22from%22%3A%22url%22%2C%22originWidth%22%3A890%2C%22originHeight%22%3A388%2C%22ratio%22%3A1%2C%22status%22%3A%22done%22%2C%22style%22%3A%22none%22%2C%22showTitle%22%3Afalse%2C%22title%22%3A%22%22%2C%22rotation%22%3A0%2C%22crop%22%3A%5B0%2C0%2C1%2C1%5D%2C%22id%22%3A%22a587t%22%2C%22margin%22%3A%7B%22top%22%3Atrue%2C%22bottom%22%3Atrue%7D%7D" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"></card></p><p data-lake-id="u84b3f719" id="u84b3f719" style="text-align: center"><span data-lake-id="u78fca9e7" id="u78fca9e7" class="lake-fontsize-12" style="color: rgb(136, 136, 136)">Overview of the supply chain risk. Source: arxiv.org</span></p><p data-lake-id="u64c514f8" id="u64c514f8" style="text-align: justify"><span data-lake-id="u47f1ce33" id="u47f1ce33" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">The most reliable way to mitigate this risk is to </span><strong><span data-lake-id="uf174de81" id="uf174de81" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">verify package names manually</span></strong><span data-lake-id="u581da72a" id="u581da72a" class="lake-fontsize-12" style="color: rgb(63, 63, 63)"> and </span><strong><span data-lake-id="ucdb80d22" id="ucdb80d22" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">never assume that a package mentioned in an AI-generated code snippet is real or safe</span></strong><span data-lake-id="ua8f916b0" id="ua8f916b0" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">.</span></p><p data-lake-id="u07f553cb" id="u07f553cb" style="text-align: justify"><span data-lake-id="u16a22db4" id="u16a22db4" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">Security experts also recommend hardening dependency management with the following measures:</span></p><ul list="u095d25da"><li fid="u15f9f590" data-lake-id="ue568c164" id="ue568c164" style="text-align: left"><span data-lake-id="uabbe9ef1" id="uabbe9ef1" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">• Use dependency scanners</span></li><li fid="u15f9f590" data-lake-id="u4cfee9c5" id="u4cfee9c5" style="text-align: left"><span data-lake-id="ue9e8d866" id="ue9e8d866" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">• Enable lockfiles</span></li><li fid="u15f9f590" data-lake-id="u2d1acbe2" id="u2d1acbe2" style="text-align: left"><span data-lake-id="ub61429fd" id="ub61429fd" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">• Use hash verification to pin dependencies to known, trusted versions</span></li></ul><p data-lake-id="ue633f5d2" id="ue633f5d2" style="text-align: justify"><span data-lake-id="u6920f3ff" id="u6920f3ff" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">The study also found that </span><strong><span data-lake-id="uf8867d72" id="uf8867d72" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">lowering the model's “temperature” setting</span></strong><span data-lake-id="u6efc30e4" id="u6efc30e4" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">, which reduces randomness, cuts down on package hallucinations. This is worth keeping in mind if you regularly use AI-assisted coding or vibe coding.</span></p><p data-lake-id="u3e9178e7" id="u3e9178e7" style="text-align: justify"><span data-lake-id="u695ce5d6" id="u695ce5d6" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">Ultimately, </span><strong><span data-lake-id="uc551cdd5" id="uc551cdd5" class="lake-fontsize-12" style="color: rgb(15, 76, 129)">all AI-generated code should be tested in a safe, isolated environment before it is deployed to production.</span></strong></p><p data-lake-id="uc9dfaac5" id="uc9dfaac5" style="text-align: justify"><strong><span data-lake-id="u09b46199" id="u09b46199" class="lake-fontsize-12" style="color: 
rgb(15, 76, 129)">​</span></strong><br></p><p data-lake-id="uc85cb5e7" id="uc85cb5e7" style="text-align: justify"><em><span data-lake-id="uaa24bcb2" id="uaa24bcb2" class="lake-fontsize-12" style="color: rgb(63, 63, 63)">Reference: https://www.bleepingcomputer.com/news/security/ai-hallucinated-code-dependencies-become-new-supply-chain-risk/</span></em></p><p data-lake-id="ud6c34f6a" id="ud6c34f6a"><span data-lake-id="ubde52df7" id="ubde52df7" class="lake-fontsize-9" style="color: rgb(10, 10, 10)">​</span><br></p>
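
The article's advice to verify package names before installing anything an AI assistant suggests can be partly automated. The following is a minimal sketch, not a complete tool: it assumes your team maintains an allowlist of vetted package names, and it flags every entry in a requirements file that is not on that list. The function names and the sample package `fastjson-utils` are hypothetical. Note that a name merely existing on a registry proves nothing, since a slopsquatter may already have registered it.

```python
import re
from collections.abc import Iterable

# Leading project name of a requirement line, e.g. "requests" in "requests==2.31.0".
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def normalize(name: str) -> str:
    """Normalize a project name the way PyPI compares names (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements_text: str) -> list[str]:
    """Extract normalized package names from requirements.txt-style text."""
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options such as "-r"
            continue
        match = _NAME_RE.match(line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def unvetted_packages(requirements_text: str, allowlist: Iterable[str]) -> list[str]:
    """Return the required packages that are missing from the vetted allowlist."""
    allowed = {normalize(name) for name in allowlist}
    return [n for n in requirement_names(requirements_text) if n not in allowed]


if __name__ == "__main__":
    # "fastjson-utils" is a made-up name standing in for an AI-suggested package.
    suggested = "requests==2.31.0\nFlask_Login>=0.6  # auth\nfastjson-utils\n"
    print(unvetted_packages(suggested, {"requests", "flask-login"}))
```

Anything the check reports still needs a human to look at the registry page, the maintainers, and the source before it is added to the allowlist.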

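
Hash verification, the last measure in the list above, means comparing a downloaded artifact against a digest recorded when the dependency was first reviewed. Package managers do this themselves when configured to (pip's `--require-hashes` mode and the `integrity` field in npm lockfiles, for example). The sketch below only illustrates the underlying check; the function names are hypothetical, and the pinned digest is assumed to come from your own lockfile.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path: Path, pinned_sha256: str) -> bool:
    """True if the file's SHA-256 equals the digest pinned at review time."""
    return sha256_of_file(path) == pinned_sha256.strip().lower()
```

A mismatch means the artifact is not the one that was reviewed, so the install should stop rather than fall back to whatever the registry currently serves.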

小瑟斯
Posted 7 months ago

Respect
