QuantumCLEF

The second edition of the QuantumCLEF lab is composed of three tasks and aims at:

  • Discovering and evaluating Quantum Annealing approaches compared to their traditional counterparts;
  • Identifying new ways of formulating Information Retrieval and Recommender Systems algorithms and methods, so that they can be solved with Quantum Annealing;
  • Establishing collaborations among researchers from different fields to harness their knowledge and skills to solve the considered challenges and promote the usage of Quantum Annealing.

The lab gives participants access to real quantum computers provided by CINECA, one of the most important computing centers worldwide.

Tasks:

  • Task 1 - Feature Selection

    Task 1 focuses on formulating the well-known NP-hard Feature Selection problem so that it can be solved with quantum annealers. Feature Selection is a widespread problem in both Information Retrieval and Recommender Systems: it requires identifying a subset of the available features (e.g., the most informative or the least noisy) with which to train a learning model. The problem has a significant impact because many of these systems involve optimizing learning models, and reducing the dimensionality and noise of the input data can improve their performance.

  • Task 2 - Instance Selection

    Task 2 focuses on formulating the Instance Selection problem so that it can be solved through Quantum Annealing. Transformer-based architectures, including 1st and 2nd generation transformers (e.g., RoBERTa) as well as current large language models (LLMs; e.g., Llama3), are considered state-of-the-art in several fields. Given the high cost of applying LLMs, one of the big challenges is fine-tuning these models efficiently. Instance Selection picks a representative subset of instances from a dataset to make training these models faster while keeping the trained model highly effective.

  • Task 3 - Clustering

    Task 3 focuses on formulating the clustering problem so that it can be solved with a quantum annealer. Clustering is a relevant problem for Information Retrieval and Recommender Systems that involves grouping items according to their characteristics. It can help organize large collections, help users explore a collection, and provide similar results for a query. It can also be used to group users according to their interests, or to build user models from the cluster centroids, boosting efficiency or effectiveness for users with limited data.
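
As an illustration of what "formulating a problem for Quantum Annealing" can look like, the sketch below casts a toy Feature Selection instance as a QUBO (Quadratic Unconstrained Binary Optimization) problem, the kind of objective a quantum annealer minimizes. This is not the lab's prescribed formulation: the function names, the relevance and redundancy scores, and the weight alpha are all hypothetical, and an exhaustive search stands in for the annealer so that the example runs without any quantum hardware or SDK.

```python
from itertools import product


def build_qubo(relevance, redundancy, alpha=0.5):
    """Build an upper-triangular QUBO matrix for a toy feature selection problem.

    Assumed (hypothetical) scoring: selecting feature i is rewarded by its
    relevance, and selecting both i and j is penalized by their redundancy.
    alpha trades off the two terms.
    """
    n = len(relevance)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = -alpha * relevance[i]  # linear term: reward relevance
        for j in range(i + 1, n):
            Q[i][j] = (1 - alpha) * redundancy[i][j]  # quadratic term: penalize overlap
    return Q


def energy(Q, x):
    """Evaluate the QUBO objective x^T Q x for a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimum(Q):
    """Stand-in for an annealer: try every binary assignment (small n only)."""
    return min(product((0, 1), repeat=len(Q)), key=lambda x: energy(Q, x))


# Toy data (made up for illustration): features 0 and 1 are both relevant
# but nearly redundant with each other; feature 2 is weakly relevant.
relevance = [0.9, 0.8, 0.1]
redundancy = [
    [0.0, 0.95, 0.05],
    [0.95, 0.0, 0.05],
    [0.05, 0.05, 0.0],
]

Q = build_qubo(relevance, redundancy, alpha=0.5)
best = brute_force_minimum(Q)
selected = [i for i, bit in enumerate(best) if bit == 1]
print("selected features:", selected)
```

Under these made-up scores the minimum-energy assignment keeps feature 0 and feature 2 and drops feature 1, because the redundancy penalty between features 0 and 1 outweighs the extra relevance gained by keeping both. Instance Selection and clustering can be approached in the same spirit, with one binary variable per instance or per item-cluster assignment, though the objective and constraints differ.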

Organizers

  • Andrea Pasin
  • Nicola Ferro
  • Maurizio Ferrari Dacrema
  • Paolo Cremonesi
  • Washington Cunha
  • Marcos André Goncalves

Contact

  • andrea.pasin.1@phd.unipd.it
  • nicola.ferro@dei.unipd.it
  • maurizio.ferrari@polimi.it
  • paolo.cremonesi@polimi.it
  • washingtoncunha@dcc.ufmg.br
  • mgoncalv@dcc.ufmg.br