COMPUTER VISION
Integrating Agentic AI in Computer Vision: Enhancing Video Analytics
Explore three ways, according to NVIDIA, to integrate agentic AI into computer vision and enhance video analytics: dense captions, vision language model (VLM) reasoning, and automatic scenario analysis.
IBM Research Advances Computer Vision with Human-Like Perception
IBM Research is developing AI systems that bring human-like perception to computer vision, with the aim of improving image recognition and analysis.
Microsoft's Florence-2: Bridging the Gap Between LLMs and Large Vision Models
Microsoft's Florence-2 is a foundational image model that performs diverse computer vision tasks; its design is inspired by advances in large language models (LLMs).
PIGEON: Predicting Your Location with Images
PIGEON and PIGEOTTO are AI models for image geolocalization: they predict a location from an image. PIGEON performs best on Street View data, while PIGEOTTO is suited to diverse global imagery. Both significantly reduce median distance error in geolocalization.
Google DeepMind: Subtle Adversarial Image Manipulation Influences Both AI Model and Human Perception
Recent DeepMind research shows that subtle adversarial image manipulations, originally designed to deceive AI models, also influence human perception. The finding highlights both similarities and differences between human and machine vision, and it points to the need for further research in AI safety and security.