Safety evaluation for multimodal AI
Research PaperDetailed the safety evaluation of GPT-4's vision capabilities, revealing extensive red-teaming around facial recognition, bias, medical advice, and CAPTCHA solving — establishing a template for multimodal safety assessment.
• Facial recognition and surveillance potential • Person identification from photos • Medical image diagnosis (potentially harmful if wrong) • CAPTCHA solving (could enable automated attacks) • Bias in describing people's appearance • Geolocalization from images
Refused to identify real people, restricted medical diagnosis, implemented geographic and demographic fairness testing.
External red teamers from diverse backgrounds tested for adversarial exploitation.