Transformer Encoder Decoder

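A new victory for Yann LeCun's approach: VL-JEPA arrives, abandoning next-word prediction and not relying on generation ...

It takes video or image input and compresses it into a compact sequence of visual embedding vectors. Here the research team uses a frozen V-JEPA 2 ViT-L model. This model already performs well on self-supervised vision tasks and can condense complex video frames into a dense stream of information.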

CornPheno: A game-changer in corn breeding with smartphone-based phenotyping

Corn is one of the world's most important crops, critical for food, feed, and industrial applications. In 2023, corn ...

12 days ago

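Goodbye pixel reconstruction! University of Michigan and others propose NEPA: minimalist autoregressive pretraining, without ...

NEPA is a bold attempt to bring this GPT-style philosophy into the vision domain. The authors argue that, rather than learning how to reconstruct an image, a model should learn how to "extrapolate" it. If a model can accurately predict the feature representation (embedding) of the next patch from the visual patches it has already seen, then it must have understood the image's semantic structure and the spatial relationships between objects.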

12 days ago

Google Real-Time Translator: More Than Word-for-Word Translations

Google's real-time translator looks ahead and anticipates what is being said, explains Niklas Blum, Director Product ...

17 days ago

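T5Gemma gets another update: Google is still sticking with the encoder-decoder architecture

In the first half of this year, Google released the Gemma 3 family of open models, which performed strongly, were well received, and inspired many excellent works built on them. The newly updated T5Gemma 2 is one of these. In short: T5Gemma 2 is Google's new-generation encoder-decoder model, the first multimodal, long-context encoder-decoder model, built on the capabilities of Gemma 3.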

IEEE

Encoder-Decoder Based Deep Reinforcement Learning for Multi-AUV Assisted Data Collection in ...

Abstract: Reliable and timely data collection poses a significant challenge for underwater wireless sensor networks (UWSNs), primarily due to the extremely low data rate of underwater communication ...

21 days ago

Bolmo’s architecture unlocks efficient byte‑level LM training without sacrificing quality

Ai2 releases Bolmo, a new byte-level language model that the company hopes will encourage more enterprises to use byte-level ...

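Zhihu on MSN

In 2026, is it still not too late to get into studying and working in the large-model field? Is the window of opportunity still ...

The AI field has a curious trait: knowledge there barely compounds at all. Whenever I see a question like "In 202X, is it too late to get into field YYY?", this trait comes to mind. The reason is simple, and I will give a few examples from research only. Take people who got into generative modeling after 2023: you most likely no longer need to learn GAN-based material, because even if you understand it thoroughly, for diffusion ...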

IEEE

Multivariate Segment Expandable Encoder-Decoder Model for Time Series Forecasting

Abstract: Accurate time series forecasting is critical in a variety of fields, including transportation, weather prediction, energy management, infrastructure monitoring, and finance. Forecasting ...

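Zhihu on MSN

Do I need to learn RNNs before learning the Transformer?

The short answer: no. You could even say that, with 2026 almost here, if you are still clutching a ten-year-old textbook and insist on grinding through RNNs first, then figuring out that damned forget gate in the LSTM, before you dare open the first page on the Transformer, you are simply wasting your life.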
