RAY SERVE
Ray Serve Upgrade Delivers 88% Lower Latency for AI Inference at Scale
Anyscale announces major Ray Serve optimizations with HAProxy and gRPC, achieving 11.1x throughput gains for LLM inference workloads on enterprise deployments.
Ray Serve v2.54 Adds Grafana Dashboard for Production ML Debugging
Anyscale releases new Ray Serve Grafana dashboard enabling real-time debugging of ML model serving latency, autoscaling issues, and deployment failures.
Anyscale Introduces New Replica Compaction to Optimize Resource Usage
Anyscale launches Replica Compaction to address resource fragmentation, enhancing resource utilization and reducing costs for Ray Serve deployments.
Anyscale Introduces Multi-Tenant Serve Applications with Containerized Runtime Environments
Anyscale unveils multi-tenant Serve applications using runtime environments as containers, enhancing efficiency and reducing operational complexity.