Hawk and Griffin: Efficient RNN Models Redefining AI Performance

14 Jan 2025

This research introduces the Hawk and Griffin models, efficient RNN alternatives to Transformers, with reduced latency and strong long-sequence performance.

RNNs vs. Transformers: Innovations in Scalability and Efficiency

14 Jan 2025

This research explores scalable RNN and SSM innovations, comparing their efficiency and performance to Transformers and linear attention techniques.

Hawk and Griffin: Mastering Long-Context Extrapolation in AI

14 Jan 2025

This research shows the Hawk and Griffin models excel at long-context extrapolation, accurately predicting tokens on sequences 4x longer than those seen during training.

Griffin Model: Advancing Copying and Retrieval in AI Tasks

14 Jan 2025

This research shows Griffin excels at copying and retrieval tasks, outperforming Hawk and Transformers when extrapolating to sequences longer than those used in training.
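
To make the copying setting concrete, here is a toy Python sketch of a selective-copying-style example; the token markers, vocabulary, and lengths are invented for illustration and do not reproduce the paper's exact task. The model sees data tokens scattered among noise and must reproduce the data tokens in order.

```python
# Toy selective-copying example (illustrative format, not the paper's exact setup):
# data tokens are scattered among noise tokens, and the target is the data tokens
# in their order of appearance.
import random

def make_selective_copy_example(n_data=4, n_noise=12, vocab=tuple("ABCDEFGH")):
    data = [random.choice(vocab) for _ in range(n_data)]
    sequence = data + ["<noise>"] * n_noise
    random.shuffle(sequence)                                # scatter data among noise
    prompt = sequence + ["<copy>"]                          # marker asking the model to copy
    target = [tok for tok in sequence if tok != "<noise>"]  # data tokens, in order
    return prompt, target

prompt, target = make_selective_copy_example()
print(prompt)
print(target)  # e.g. ['C', 'A', 'F', 'B']
```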

Hawk and Griffin Models: Superior Latency and Throughput in AI Inference

14 Jan 2025

This research shows Hawk and Griffin outperform MQA Transformers in latency and throughput, excelling in long-sequence and large-batch inference.

Recurrent Models: Enhancing Latency and Throughput Efficiency

14 Jan 2025

This research shows recurrent models keep a small, fixed-size state instead of a cache that grows with sequence length, improving latency and throughput over Transformers on long sequences.
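
As a rough illustration of why the fixed state helps, the Python sketch below compares decode-time memory for a Transformer-style KV cache against a fixed-size recurrent state. The layer counts, head dimensions, and state width are made-up hyperparameters, not the paper's model configurations.

```python
# Back-of-envelope comparison of decode-time memory (hypothetical hyperparameters):
# a Transformer's KV cache grows linearly with sequence length, while a recurrent
# model carries a fixed-size state per layer.

def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # Keys and values cached for every token in every layer.
    return 2 * seq_len * n_layers * n_kv_heads * head_dim * bytes_per_elem

def recurrent_state_bytes(n_layers, state_dim, bytes_per_elem=2):
    # One fixed-size recurrence state per layer, independent of sequence length.
    return n_layers * state_dim * bytes_per_elem

for seq_len in (2_048, 8_192, 32_768):
    kv = kv_cache_bytes(seq_len, n_layers=24, n_kv_heads=8, head_dim=128)
    rnn = recurrent_state_bytes(n_layers=24, state_dim=2_560)
    print(f"{seq_len:>6} tokens | KV cache {kv / 2**20:8.1f} MiB | recurrent state {rnn / 2**20:6.2f} MiB")
```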

Recurrent Models: Decoding Faster with Lower Latency and Higher Throughput

14 Jan 2025

This research shows recurrent models excel at decoding, delivering lower latency and higher throughput than Transformers, especially on long sequences.

Training Speed on Longer Sequences

14 Jan 2025

This research compares training speeds across model sizes and sequence lengths, demonstrating the computational advantages of Hawk and Griffin.

Efficient Linear Recurrences on Device

14 Jan 2025

This research optimizes RG-LRU layers with a custom Pallas kernel for TPU-v3, achieving a 3x speedup and 10-20% faster training for the Hawk model.
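
For reference, below is a minimal plain-JAX sketch of the RG-LRU recurrence described in the paper, h_t = a_t * h_{t-1} + sqrt(1 - a_t^2) * (i_t * x_t), with the decay a_t = sigmoid(Lambda)^(c * r_t) driven by a learned parameter and a recurrence gate. Parameter names and initialization are illustrative, and the reported speedup comes from the custom Pallas kernel rather than a `jax.lax.scan` like this one.

```python
# Plain-JAX reference sketch of the RG-LRU recurrence (no Pallas kernel).
import jax
import jax.numpy as jnp

def rg_lru(x, w_a, b_a, w_x, b_x, lam, c=8.0):
    """x: (seq_len, dim) inputs; returns (seq_len, dim) hidden states."""
    r = jax.nn.sigmoid(x @ w_a + b_a)           # recurrence gate r_t
    i = jax.nn.sigmoid(x @ w_x + b_x)           # input gate i_t
    log_a = c * r * jax.nn.log_sigmoid(lam)     # log a_t, with a_t = sigmoid(lam)^(c * r_t)
    a = jnp.exp(log_a)
    gated_x = jnp.sqrt(1.0 - a ** 2) * (i * x)  # input scaled to keep the state norm bounded

    def step(h_prev, inp):
        a_t, gx_t = inp
        h_t = a_t * h_prev + gx_t               # h_t = a_t * h_{t-1} + sqrt(1 - a_t^2) * i_t * x_t
        return h_t, h_t

    _, h = jax.lax.scan(step, jnp.zeros(x.shape[-1]), (a, gated_x))
    return h

# Tiny usage example with random, untrained parameters.
key = jax.random.PRNGKey(0)
seq_len, dim = 6, 4
x = jax.random.normal(key, (seq_len, dim))
init = lambda k: jax.random.normal(jax.random.PRNGKey(k), (dim, dim)) / jnp.sqrt(dim)
h = rg_lru(x, init(1), jnp.zeros(dim), init(2), jnp.zeros(dim), jnp.zeros(dim))
print(h.shape)  # (6, 4)
```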