ImageCLEF: Multimodal Challenge in CLEF
ImageCLEF 2025 focuses on evaluating technologies for annotating, indexing, classifying, retrieving and generating multimodal data, providing access to large datasets across a variety of scenarios, including medical, social media, and internet-based applications. Building on the success of recent editions, it encourages interdisciplinary methods by engaging participants in diverse domains, offering large amounts of challenging multimodal data, and providing an evaluation platform for a large number of use cases.
Tasks:
- Task 1 - ImageCLEFmedical
In its 21st edition, the task will continue all the medical sub-tasks from the last two years, namely: (i) the Caption task with medical concept detection and caption prediction, (ii) the GAN task focused on synthetic medical images, (iii) MEDVQA regarding Visual Question Answering for gastrointestinal data, and (iv) MEDIQA-MAGIC, introducing a new use case on multimodal dermatology response generation.
- Task 2 - Image Retrieval/Generation for Arguments
A joint task between Touché and ImageCLEF since 2022, this task aims to show the impact of images in arguments, where they can make an argument more compelling. In this year's task, participants shall find suitable images that convey a given argument. Two submission styles are possible: retrieving images, or generating prompts for an image generator.
- Task 3 - ImageCLEFtoPicto
The aim of this task is to convert either speech or text into a meaningful sequence of pictograms, aiding communication for people with language impairments, enhancing user understanding or helping with translation. Two sub-tasks are derived from this aim: (i) Text-to-Picto, which involves generating pictograms from a French text, and (ii) Speech-to-Picto, which focuses on translating speech to pictograms directly.
- Task 4 - MultimodalReason
MultimodalReason is a new task focusing on Multilingual Visual Question Answering. Participants are given multiple-choice questions with corresponding images, spanning multiple languages, disciplines and difficulty levels, and are asked to identify the correct answer. The task aims to assess the reasoning abilities of modern LLMs across a wide range of real-world situations.
Organizers
- Bogdan Ionescu, National University of Science and Technology POLITEHNICA Bucharest
- Henning Müller, University of Applied Sciences Western Switzerland (HES-SO), Switzerland
- Dan-Cristian Stanciu, National University of Science and Technology POLITEHNICA Bucharest
Contact
- henning.mueller@hevs.ch
- bogdanlapi@gmail.com
- stanciu.cristi12@gmail.com