Hawk and Griffin: Efficient RNN Models Redefining AI Performance
14 Jan 2025
This research introduces the Hawk and Griffin models, efficient RNN alternatives to Transformers with reduced latency and strong long-sequence performance.
RNNs vs. Transformers: Innovations in Scalability and Efficiency
14 Jan 2025
This research explores scalable RNN and SSM innovations, comparing their efficiency and performance with those of Transformers and linear attention techniques.
Hawk and Griffin: Mastering Long-Context Extrapolation in AI
14 Jan 2025
This research shows Hawk and Griffin models excel at long-context extrapolation, accurately predicting tokens on sequences 4x longer than those seen during training.
Griffin Model: Advancing Copying and Retrieval in AI Tasks
14 Jan 2025
This research shows Griffin excels at copying and retrieval tasks, outperforming Hawk and Transformers when extrapolating to longer sequences.
Hawk and Griffin Models: Superior Latency and Throughput in AI Inference
14 Jan 2025
This research shows Hawk and Griffin outperform MQA Transformers in latency and throughput, excelling in long-sequence and large-batch inference.
Recurrent Models: Enhancing Latency and Throughput Efficiency
14 Jan 2025
This research shows recurrent models replace the Transformer's growing KV cache with a small fixed-size state, improving latency and throughput over Transformers on long sequences.
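To make the cache-size argument concrete, here is a rough back-of-the-envelope sketch comparing the memory a Transformer KV cache needs against a fixed-size recurrent state. All layer counts and dimensions below are illustrative placeholders, not the configurations reported in the paper.

```python
# Illustrative comparison: Transformer KV cache vs. fixed-size recurrent state.
# All sizes are hypothetical example values, not the paper's model configurations.

n_layers   = 24      # number of layers
n_kv_heads = 8       # key/value heads per layer
head_dim   = 128     # dimension per head
state_dim  = 2048    # recurrent state width per layer (hypothetical)
bytes_per  = 2       # bf16 activations

def kv_cache_bytes(seq_len: int) -> int:
    # Keys and values are stored for every past token in every layer,
    # so the cache grows linearly with sequence length.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per

def recurrent_state_bytes() -> int:
    # The recurrent state is a fixed-size tensor, independent of sequence length.
    return n_layers * state_dim * bytes_per

for seq_len in (2_048, 8_192, 32_768):
    print(f"seq_len={seq_len}: KV cache ≈ {kv_cache_bytes(seq_len) / 2**20:.0f} MiB, "
          f"recurrent state ≈ {recurrent_state_bytes() / 2**10:.0f} KiB")
```

Because the recurrent state stays constant while the KV cache grows with every generated token, recurrent models can serve longer sequences and larger batches within the same memory budget.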
Recurrent Models: Decoding Faster with Lower Latency and Higher Throughput
14 Jan 2025
This research shows recurrent models excel in decoding, offering lower latency and higher throughput than Transformers, especially for long sequences.
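As a rough illustration of why decode latency stays flat for recurrent models, the toy single-layer, single-head functions below contrast one decode step of a diagonal linear recurrence (which only touches a fixed-size state) with one step of attention (which must read a cache that grows with the generated length). The shapes and the absence of projections are simplifying assumptions, not the paper's architecture.

```python
import jax
import jax.numpy as jnp

def recurrent_decode_step(h, x, a, b):
    # One decode step of a diagonal linear recurrence: O(state_dim) work,
    # independent of how many tokens have already been generated.
    h = a * h + b * x
    return h, h                      # (new state, output)

def attention_decode_step(keys, values, q, k, v):
    # One decode step of single-head attention: the new query attends over
    # the whole cache, so compute and memory traffic grow with its length.
    keys   = jnp.concatenate([keys, k[None, :]], axis=0)       # (t, d)
    values = jnp.concatenate([values, v[None, :]], axis=0)     # (t, d)
    scores = jax.nn.softmax(keys @ q / jnp.sqrt(q.shape[-1]))  # (t,)
    return keys, values, scores @ values                       # cache grows each step
```

In the recurrent case the per-token cost is constant, which is what keeps latency flat and throughput high as the generated sequence gets longer.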
Training speed on longer sequences
14 Jan 2025
The research paper compares training speeds across model sizes and sequence lengths to demonstrate the computational advantages of Hawk and Griffin.
Efficient linear recurrences on device
14 Jan 2025
This research optimizes RG-LRU layers with a custom Pallas kernel for TPU-v3, achieving a 3x speedup and 10-20% faster training times for the Hawk model.
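For context, a minimal reference version of an RG-LRU-style gated linear recurrence is sketched below as a plain `jax.lax.scan`; the custom Pallas kernel described here implements essentially this scan directly on TPU-v3, minimizing memory transfers since the computation is memory-bound. The gate projections, the decay parameterization, and the constant `c` are written from the paper's description but should be read as an approximation rather than the exact implementation.

```python
import jax
import jax.numpy as jnp

def rg_lru_scan(x, w_a, w_x, lam, c=8.0):
    """Simplified RG-LRU-style gated linear recurrence via jax.lax.scan.

    x:    (seq_len, dim) input sequence
    w_a:  (dim, dim) recurrence-gate projection (assumed shape, illustrative)
    w_x:  (dim, dim) input-gate projection (assumed shape, illustrative)
    lam:  (dim,) learnable decay parameter
    """
    r = jax.nn.sigmoid(x @ w_a)                     # recurrence gate in (0, 1)
    i = jax.nn.sigmoid(x @ w_x)                     # input gate in (0, 1)
    log_a = -c * jax.nn.softplus(lam) * r           # log of per-step decay a_t
    a = jnp.exp(log_a)
    u = jnp.sqrt(jnp.maximum(1.0 - a**2, 0.0)) * (i * x)   # gated, scaled input

    def step(h, inputs):
        a_t, u_t = inputs
        h = a_t * h + u_t                           # diagonal linear recurrence
        return h, h

    h0 = jnp.zeros(x.shape[-1], dtype=x.dtype)
    _, ys = jax.lax.scan(step, h0, (a, u))
    return ys                                        # (seq_len, dim) hidden states

# Example usage with small, arbitrary shapes:
T, D = 16, 8
x = jax.random.normal(jax.random.PRNGKey(0), (T, D))
w_a = 0.1 * jax.random.normal(jax.random.PRNGKey(1), (D, D))
w_x = 0.1 * jax.random.normal(jax.random.PRNGKey(2), (D, D))
y = rg_lru_scan(x, w_a, w_x, jnp.zeros(D))
```

Because the scan is sequential in time but each step is a cheap elementwise update, the operation is memory-bound on accelerators, which is why a hand-written kernel that minimizes memory transfers can deliver the reported speedup.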