Ray 2.4.0: New Features for Generative AI Workloads and Scalability Improvements

The Ray team has announced the release of version 2.4.0, featuring numerous enhancements across the Ray ecosystem. This release focuses on improved support for Generative AI workloads and Large Language Models (LLMs), and on increased scalability for large clusters.
Key Additions:
New examples and features for Generative AI workloads using Ray, including fine-tuning GPT-J, fine-tuning Dreambooth, scalable batch inference with GPT-J, and serving GPT-J and Stable Diffusion (a batch-inference sketch follows this list).
Introduction of AccelerateTrainer to enable large-model workloads on Ray, allowing users to run HuggingFace Accelerate and DeepSpeed with minimal code changes (sketch below).
Updated LightningTrainer for better compatibility with other Ray libraries, enabling seamless scaling of PyTorch Lightning on Ray (sketch below).
Enhancements in Ray Data for improved stability, ease of use, and observability, including a streaming execution model as the new default backend and the ability to read from common SQL data sources (sketch below).
Improved Serve observability, enabling monitoring of Ray Serve applications through the Ray dashboard (example below).
Introduction of the new RLModule abstraction (alpha) for building custom reinforcement learning models in RLlib (sketch below).
Improved Ray core scalability, officially supporting Ray clusters with up to 2000 nodes.
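
To give a flavor of the batch-inference pattern the new examples use, here is a minimal Ray Data sketch: a stateful callable class loads a HuggingFace pipeline once per actor and is mapped over batches. The small gpt2 model stands in for GPT-J so the snippet stays runnable on modest hardware, and the column names are purely illustrative.

```python
import ray
from transformers import pipeline

class TextGenerator:
    # A callable class lets each actor in the pool load the model once
    # and reuse it across batches; swap in "EleutherAI/gpt-j-6b" (and add
    # num_gpus=1 to map_batches) for the real GPT-J workload.
    def __init__(self):
        self.pipe = pipeline("text-generation", model="gpt2")

    def __call__(self, batch):
        outputs = self.pipe(list(batch["prompt"]), max_new_tokens=20)
        batch["generated"] = [out[0]["generated_text"] for out in outputs]
        return batch

ds = ray.data.from_items([{"prompt": "Ray is"}, {"prompt": "LLMs are"}])
results = ds.map_batches(
    TextGenerator,
    compute=ray.data.ActorPoolStrategy(2, 2),  # a fixed pool of two actors
    batch_format="pandas",
    batch_size=2,
)
print(results.take_all())
```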
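
The AccelerateTrainer wraps an otherwise ordinary Accelerate training loop while Ray handles worker startup and distributed process-group setup. The sketch below shows the rough shape of the API; the toy model and the exact parameter names (e.g. accelerate_config) are assumptions to be checked against the Ray Train docs.

```python
import torch
from accelerate import Accelerator
from ray.air import ScalingConfig
from ray.train.huggingface import AccelerateTrainer

def train_loop_per_worker():
    # Plain HuggingFace Accelerate code: Ray launches the workers and
    # wires up the distributed process group behind the scenes.
    accelerator = Accelerator()
    model = torch.nn.Linear(8, 1)  # toy model, for illustration only
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    model, optimizer = accelerator.prepare(model, optimizer)
    for _ in range(10):
        x = torch.randn(32, 8)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        accelerator.backward(loss)
        optimizer.step()

trainer = AccelerateTrainer(
    train_loop_per_worker=train_loop_per_worker,
    accelerate_config={},  # or the path to a file from `accelerate config`
    scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
)
trainer.fit()
```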
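
Similarly, a minimal LightningTrainer sketch, assuming the builder-style configuration described in the 2.4 docs; the toy LightningModule, dataset, and hyperparameters are purely illustrative.

```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset
from ray.air import ScalingConfig
from ray.train.lightning import LightningTrainer, LightningConfigBuilder

class ToyModule(pl.LightningModule):
    # A deliberately tiny LightningModule used only for illustration.
    def __init__(self, lr=1e-3):
        super().__init__()
        self.lr = lr
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=self.lr)

train_loader = DataLoader(
    TensorDataset(torch.randn(256, 8), torch.randn(256, 1)), batch_size=32
)

lightning_config = (
    LightningConfigBuilder()
    .module(cls=ToyModule, lr=1e-3)            # module class + init kwargs
    .trainer(max_epochs=2, accelerator="cpu")  # kwargs forwarded to pl.Trainer
    .fit_params(train_dataloaders=train_loader)
    .build()
)

trainer = LightningTrainer(
    lightning_config=lightning_config,
    scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
)
trainer.fit()
```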
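
For the SQL integration, a minimal sketch using ray.data.read_sql, which takes a query and a zero-argument factory returning a DB-API 2 connection; SQLite keeps the example self-contained, and the database file and table names are assumptions.

```python
import sqlite3
import ray

def create_connection():
    # Any DB-API 2 compliant connection works (MySQL, Postgres, etc.);
    # a local SQLite file is used here purely for illustration.
    return sqlite3.connect("example.db")

# Read the result of a SQL query into a Ray Dataset.
ds = ray.data.read_sql("SELECT title, year FROM movies", create_connection)
print(ds.take(5))
```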
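
The observability work requires no new application code; any running Serve app surfaces in the dashboard. A minimal deployment to try it with, using Serve's default route and ports:

```python
import requests
from ray import serve

@serve.deployment
class Greeter:
    def __call__(self, request):
        return "Hello from Ray Serve!"

serve.run(Greeter.bind())

# While this runs, the Ray dashboard (http://localhost:8265 by default)
# shows the Serve application, its deployments, replicas, and logs.
print(requests.get("http://localhost:8000/").text)
```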
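
And a hedged sketch of opting an RLlib algorithm into the alpha RLModule stack; the underscore-prefixed flags reflect the experimental API as of this release and should be verified against the RLlib docs for your version.

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Opt a PPO experiment into the alpha RLModule / Learner stack. The
# underscore prefix marks these flags as experimental and subject to change.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rl_module(_enable_rl_module_api=True)
    .training(_enable_learner_api=True)
)
algo = config.build()
print(algo.train()["episode_reward_mean"])
```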
These improvements demonstrate Ray's commitment to providing a performant compute infrastructure for LLM training, inference, and serving. With the release of Ray 2.4.0, the platform continues to evolve, offering better compatibility, integration, and scalability for a wide range of workloads.
To try the latest release, install Ray with pip install "ray[default]" and provide feedback through GitHub or Discuss. Source