Workshops
Selected Workshops
We thank the IAPR community for all submitted workshop proposals. The list of selected events is given below. The associated webpages with submission instructions and further information will be completed by 16 February 2026. For general queries about the workshops, please contact workshop_chairs@icpr2026.org. For queries about a specific workshop, please contact the respective organisers listed on its webpage.
Overview
Workshop Website: tba
Overview
- Deep learning for Earth observation
- Earth observation foundation models
- Usability of foundation model embeddings for Earth observation tasks
- Vision and language models for Earth observation
- Hybrid models, combining physics and machine learning
- Dynamic Earth observation, including multi-temporal and change detection analysis
- Active, interactive and transfer learning
- Explainable and interpretable machine learning
- Novel pattern recognition tasks in remote sensing applications
- Benchmark models and datasets
Workshop Website: available here
Overview
With extensive applications in image and video understanding, as well as in image, video, and text generation, multimodal models have become increasingly prominent and transformative in the fields of pattern recognition and computer vision.
As the scale of these models grows exponentially, there is an urgent need to explore efficient learning and deployment strategies to address the associated computational and resource challenges. This workshop aims to provide a dedicated platform for researchers and practitioners to exchange ideas and develop innovative solutions, thereby advancing the efficiency and practical deployment of multimodal models.
The topics will cover efficient inference methods and model architecture designs for multimodal models, e.g., compression, quantization, distillation, and lightweight architectures, as well as efficient learning strategies, e.g., training and fine-tuning techniques for multimodal tasks. In addition, the workshop will address related applications, including efficient multimodal generation and editing models, practical multimodal systems, and the deployment of multimodal models on resource-constrained or low-power devices.
Relevant topics include:
- Compression, quantization, conditional compute, pruning, and distillation of multimodal models;
- Efficient training/fine-tuning of multimodal models, e.g., low-rank adaptation (see the sketch after this list);
- Efficient sampling of multimodal diffusion models, e.g., step distillation and consistency models;
- Efficient LLMs/LVLMs/MLLMs in multimodal tasks, e.g., token pruning and merging;
- Efficient multi-/cross-modal learning;
- Efficient multimodal generative and editing models and sensors, e.g., for vision, language, audio and 3D objects;
- Efficient image, video and audio synthesis from multimodal data;
- Efficient multimodal applications (e.g., drone vision, autonomous driving, etc.);
- Efficient self-/un-/weakly-supervised learning for multimodal data;
- Deploying multimodal models on low-power devices, e.g., smartphones.
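As context for the efficient fine-tuning topic above, the following is a minimal sketch of low-rank adaptation (LoRA), one representative technique: a frozen pretrained linear layer is augmented with a small trainable low-rank update. It assumes PyTorch; the class name and the rank/alpha values are illustrative, not a prescribed implementation.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update (sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        # Effective weight becomes W + (alpha/rank) * B @ A; only A, B train.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))  # only A and B receive gradients

The appeal for multimodal models is that the number of trainable parameters drops from in_features * out_features to rank * (in_features + out_features), which is what makes fine-tuning large backbones tractable.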
Workshop Website: tba
Overview
This workshop addresses the transformative role of artificial intelligence in multimodal transportation systems, with particular emphasis on pattern recognition techniques that enable intelligent decision-making across interconnected transport networks.
The workshop covers pattern recognition and AI applications across multiple transportation domains:
- Computer Vision for Transportation: Real-time object detection and tracking for autonomous vehicles, traffic monitoring, pedestrian behavior analysis, and infrastructure inspection using deep learning architectures
- Spatiotemporal Pattern Mining: Identification of mobility patterns, traffic flow prediction, demand forecasting, and anomaly detection in multimodal networks using recurrent and transformer-based models
- Sensor Fusion and Multimodal Data Integration: Integration of heterogeneous data sources (camera, LiDAR, radar, GPS, IoT sensors) for comprehensive transportation state estimation
- Reinforcement Learning for Operations Management: Dynamic routing, scheduling optimization, airspace management, and adaptive traffic control systems
- Graph Neural Networks for Network Analysis: Modeling transportation networks as graphs for congestion prediction, route optimization, and network resilience assessment
- Trajectory Prediction and Planning: Human mobility modeling, vehicle path planning, and conflict resolution in shared spaces
Workshop Website: tba
Overview
Machine vision for industrial inspection represents a critical real-world application domain that drives fundamental advances in pattern recognition research while addressing pressing industrial needs. The field presents unique challenges that push the boundaries of classical pattern recognition methodologies: the scarcity of defect samples motivates innovation in few-shot learning and anomaly detection; high-consequence decision-making demands advances in uncertainty quantification and explainable AI; naturally multi-modal sensing environments (visual, thermal, 3D, ultrasonic, X-ray) serve as ideal testbeds for sensor fusion algorithms; and real-time processing requirements drive research in efficient architectures and edge computing. Manufacturing variations and evolving production processes create natural domain shift scenarios that advance transfer learning and continual learning applicable across pattern recognition applications.
This workshop serves as a crucial bridge between academic research and industrial deployment, addressing evaluation beyond standard metrics, scalability from laboratory prototypes to production systems, and standardization of benchmark datasets specific to inspection tasks. Organised by the Chair of IAPR Technical Committee 8, the workshop directly supports TC8's mission while creating a focused forum for presenting state-of-the-art methodologies, fostering academic-industry collaboration, and identifying future research directions.
The broader impact spans global manufacturing competitiveness, product safety, infrastructure integrity, and sustainability through zero-defect manufacturing, automated quality assurance for critical systems, enhanced worker safety, and optimized production efficiency, making this workshop valuable both to pattern recognition researchers seeking impactful applications and to industry practitioners requiring cutting-edge solutions.
The topics will include:
- Deep Learning Architectures for Industrial Vision
- Few-Shot and Zero-Shot Learning for Defect Recognition
- Multi-Modal and Multi-Scale Pattern Analysis
- Texture and Surface Pattern Analysis
- 3D Vision and Geometric Pattern Recognition
- Explainable and Interpretable Pattern Recognition
- Transfer Learning and Domain Adaptation
- Pattern Recognition for Non-Destructive Testing
- Real-Time Vision Systems and Optimization
- Novel Pattern Recognition Paradigms
- Benchmark Datasets and Evaluation Metrics
- Emerging Applications and Case Studies
Workshop Website: tba
Overview
Most of the medical data collected by healthcare systems are recorded in digital format. The increased availability of these data has enabled numerous artificial intelligence applications. Specifically, machine learning can generate insights that improve the discovery of new therapeutic tools, support diagnostic decisions, and help in the rehabilitation process, to name a few. Researchers, together with expert clinicians, can play an important role in turning complex medical data (e.g., genomic data, online acquisitions by physicians, medical imagery) into actionable knowledge that ultimately improves patient care. In recent years, these topics have drawn the attention of clinical and machine learning researchers, resulting in practical and successful applications in healthcare. These techniques have been deployed across various levels of healthcare systems, with applications ranging from diagnosis to therapeutics.
AHIA will also host a special session focusing on the latest advances in AI-based solutions to enhance assessment and facilitate recovery in rehabilitation. The goal of this session is to highlight feasible approaches, driven by real-world data, for the development of practical clinical solutions.
The purpose of this workshop is to present recent advances in artificial intelligence techniques for healthcare applications. To bring together the advances in this wide and multidisciplinary subject, we propose a workshop that covers (but is not limited to) the following topics:
- Biomedical image analysis;
- Data analytics for healthcare;
- Automatic disease prediction;
- Automatic diagnosis support systems;
- Genomic and proteomic data analysis;
- Artificial intelligence for personalized medicine;
- Machine Learning as a tool to support medical diagnoses and decisions;
- Machine learning for diagnosis and rehabilitation;
- Generative AI for healthcare;
- Multimodal analysis of health data;
- Explainability to support diagnosis;
- Neural signal analysis for diagnosis assistance;
- Physiological signals processing;
- Gait analysis;
- Therapy selection;
- Brain-computer interface for healthcare;
- Biomechatronics for medicine;
- Neuromotor rehabilitation;
- Machine learning approaches in rehabilitation;
- Motion analysis for healthcare.
Workshop Website: tba
Overview
- Computer vision and generative AI for art and cultural heritage
- Automated analysis and transcription of historical manuscripts and documents
- Digital acquisition, representation, and manipulation of cultural artifacts
- Augmented and virtual reality for cultural heritage communication and education
- Image processing, classification, retrieval, and similarity search in the art domain
- Point cloud analysis and segmentation for heritage sites and objects
- 3D reconstruction of historical artifacts and architectural environments
- Serious games, edutainment, and interactive storytelling for cultural heritage
- Knowledge representation, ontology learning, and semantic modeling in cultural heritage
- Robotic and autonomous systems for inspection, conservation, and preservation
- Projects, prototypes, and digital tools for restoration, conservation, and heritage outreach
Workshop Website: available here
Overview
The fourth PRHA workshop aims to continue showcasing the latest developments in pattern recognition for healthcare analytics. The scope of the workshop includes, but is not limited to, bioinformatics, phenotyping and subtyping, patient monitoring and machine learning in pervasive healthcare, temporal modeling of disease progression, interpretable models for clinical decision support, privacy-preserving techniques for distributed and sensitive patient data, and medical image analysis.
Biomedical informatics and bioinformatics are interdisciplinary fields leveraging machine learning, deep learning, and natural language processing techniques. Tasks and datasets in these fields pose various challenges due to the complex and multimodal nature of the data. Furthermore, interpretability and explainability have become essential requirements when designing models for bioinformatics and biomedical informatics tasks. With the increasing importance and widespread use of pattern recognition, applications such as vision, biometrics, and natural language processing have migrated to their respective communities. However, biomedical informatics and bioinformatics studies within the machine learning and pattern recognition communities have been proliferating. While the specific challenges posed by digital patient data, as well as gene and protein data, encourage novel ideas in the field of pattern recognition, achievements attained by pattern recognition techniques open new doors to a better understanding of the complex nature of data in healthcare and bioinformatics. The fourth PRHA will again take on the duty of providing a platform under ICPR for exchange and collaboration between the pattern recognition and medical communities.
Workshop Website: tba
Overview
This workshop explores the art and engineering of compressing large language models to make them cheaper, faster, and easier to deploy while preserving practical capability. Designed for a broad audience that spans beginners to advanced practitioners, the workshop will introduce foundational concepts for newcomers, share implementation patterns and pitfalls for experienced engineers, and highlight cutting-edge research directions for specialists.
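As a concrete illustration of the simplest end of the compression spectrum mentioned above, here is a minimal sketch of symmetric per-tensor int8 post-training quantization of a weight matrix. It assumes NumPy; the function names and shapes are illustrative only, not a specific toolchain.

import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float32 weights to int8 plus a single scale factor (sketch)."""
    scale = np.abs(w).max() / 127.0                 # map max |w| onto int8 range
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale             # approximate reconstruction

w = np.random.randn(4096, 4096).astype(np.float32)  # toy weight matrix
q, s = quantize_int8(w)                              # ~4x smaller than float32
err = np.abs(w - dequantize(q, s)).mean()            # mean quantization error

Practical LLM quantization schemes refine this idea with per-channel or per-group scales and calibration data, which is exactly the kind of trade-off space the workshop is intended to cover.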
Workshop Website: tba
Overview
We invite original research papers, case studies, and technical reports on (but not limited to) the following topics:
- Integration of traditional CCTV systems with aerial images
- Innovations in urban monitoring using aerial and satellite images
- AI and machine learning applications in smart city surveillance
- Data fusion and analytics for enhanced urban security
- Privacy and ethical implications of widespread surveillance
- Case studies of real-world implementations of urban surveillance
- Real-time monitoring: applications in traffic management and anomaly detection (e.g., fires, critical events)
- Future trends and challenges in urban surveillance
- Use of synthetic images for training and validation of surveillance algorithms
- Synthetic-to-Real domain adaptation and generalization for surveillance tasks
- Foundation models and their applicability in video surveillance and analysis
- Reducing bias and improving AI model accuracy through synthetic data strategies
- Privacy-preserving model training and deployment
- Action recognition in surveillance systems
- Enabling technologies for aerial monitoring
Workshop Website: available here
Overview
- Synthetic data generation and domain adaptation (sim-to-real, Digital Twin).
- Multimodal data fusion and remote sensing.
- 6D pose estimation, semantic segmentation, and scene understanding.
- Active perception (next-best-view, information-theoretic planning).
- SLAM and long-term mapping in dynamic environments.
- Embodied navigation and decision-making.
- Real-time perception and interaction in the Metaverse and eXtended Reality.
- Augmented perception and vision-based mixed reality.
- Other related topics.
Workshop Website: tba
Overview
Biological systems exhibit remarkable pattern recognition capabilities, achieved through evolutionary adaptation, self-organisation, and efficient information processing. These mechanisms inspire a growing class of computational approaches, drawing from evolutionary computation, swarm intelligence, artificial life, and neuromorphic computing, that offer new solutions beyond conventional optimisation or fixed-model learning. Recent advances demonstrate this potential: evolutionary algorithms automatically discover novel neural architectures, swarm-based methods enable distributed visual processing, and self-organising systems adapt to dynamic pattern sets in real time. This workshop aims to present recent advances in bio-inspired techniques for pattern recognition applications and to bring together researchers and practitioners interested in exploiting the potential of such approaches. It will explore how principles from evolution, collective behaviour, and self-organisation can support robust, adaptive, and resource-efficient pattern recognition. Key questions include: How can evolutionary processes support the automated design of pattern recognition models? What insights does swarm intelligence offer for distributed perception? How do self-organising mechanisms maintain performance under shifting data conditions?
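To make the swarm-intelligence idea concrete, the sketch below shows a minimal particle swarm optimisation loop on a toy objective of the kind used for feature selection or hyperparameter search. It assumes NumPy; the hyperparameters (inertia w, acceleration coefficients c1, c2) are conventional but illustrative.

import numpy as np

def pso(f, dim=10, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    """Minimise f over R^dim with a basic particle swarm (sketch)."""
    rng = np.random.default_rng(0)
    x = rng.uniform(-5, 5, (n_particles, dim))       # particle positions
    v = np.zeros_like(x)                             # particle velocities
    pbest = x.copy()                                 # personal bests
    pbest_f = np.apply_along_axis(f, 1, x)
    gbest = pbest[pbest_f.argmin()]                  # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # inertia + attraction to personal and global bests
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        gbest = pbest[pbest_f.argmin()]
    return gbest, pbest_f.min()

best, val = pso(lambda z: float(np.sum(z**2)))       # toy sphere objective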
Topics:
- Evolutionary neural architecture search;
- Swarm intelligence for distributed pattern recognition;
- Particle swarm optimisation for feature selection and extraction;
- Bio-inspired algorithms for image processing and analysis;
- Artificial life and emergent pattern recognition;
- Evolutionary multi-objective optimisation in computer vision;
- Evolutionary few-shot and zero-shot learning;
- Bio-inspired algorithms for automatic data augmentation;
- Evolutionary meta-learning;
- Hybrid evolutionary-gradient methods;
- Neuromorphic and brain-inspired evolutionary systems;
- Federated evolutionary learning;
- Dynamic and online evolutionary adaptation;
- Evolutionary robotics and embodied vision;
- Quantum-inspired evolutionary pattern recognition;
- Evolutionary explainability and interpretability;
- Bio-inspired hardware-software co-evolution;
- DNA computing for pattern matching.
Workshop Website: tba
Overview
Among others, we aim to address the following topics during the workshop:
- Text-Driven Synthesis: Generating human videos from textual descriptions with accurate action-semantic alignment.
- Audio-Gesture Synchronization: Modeling co-speech gesture dynamics from audio-visual correlations.
- Pose-Guided Motion Transfer: Transferring source motion to target subjects while preserving identity and context.
- Multi-Conditional Animation: Integrating heterogeneous control signals (e.g., text + pose + audio) for compositional generation.
- Long-Form Synthesis: Maintaining consistency and diversity in extended video sequences.
- Interactive Generation: Enabling real-time user control via sketches, prompts, or physical simulations.
- Evaluation Frameworks: Developing metrics for temporal stability, biomechanical plausibility, and environmental interaction.
- Dataset Curation: Constructing cross-modal datasets with annotated human motions and multi-view sequences.
- 3D-Consistent Avatars: Bridging 2D generation with 3D-aware representations for viewpoint-invariant synthesis.
Workshop Website: tba
Overview
In recent years, large-scale vision-language models have seen an explosive rise in capability. As general models, they offer exceptional performance on a range of downstream tasks they were not explicitly trained for. By necessity, the full capability of an architecture is often expressed only in terms of zero-shot performance on benchmark dataset tasks, such as ImageNet classification or ADE20K segmentation. The performance, generality, comprehension, and prompt sensitivity of these architectures in specific vision fields, such as biometrics or medical imaging, leave room for exploration.
Vision-language models have the potential to revolutionise these areas through direct application, novel systems design, and explainability, leading to insights for future model development and a more comprehensive understanding of the tasks they are applied to. This workshop aims to provide a platform for the effective utilisation of such architectures in all fields of computer vision, with their varying requirements.
The workshop topics include (but are not limited to):
- Dataset development and curation
- Metrics and benchmarking methodologies (performance, robustness, fairness, etc.)
- Innovative applications and methodological advances for exploiting vision-language models
- Multimodal learning and representation (for example, synergy between vision and language)
- Biometric analysis and human-computer interaction
- Biomedical imaging and bioinformatics
- Vehicular traffic perception and analysis
- Image, speech, and video processing
- Explainable and privacy-preserving artificial intelligence
Workshop Website: tba
Overview
This workshop aims to bring together researchers and practitioners from computer vision, machine learning, and physics-based modeling to discuss the latest advancements in video generation and restoration. The workshop will emphasize the importance of integrating physical constraints and real-world priors into generative video models to ensure realism, consistency, and applicability in diverse domains.
Physics-aware video generation and restoration are fundamental for applications where realistic motion, temporal consistency, and adherence to physical laws are critical, including:
- Autonomous driving: Accurate motion forecasting and scene reconstruction.
- Medical imaging: High-fidelity generation/restoration for better diagnostics.
- Scientific simulations: Data-driven video generation for physics-based modeling.
- Streaming, AR/VR, and gaming: Real-time video enhancement for immersive experiences.
- Surveillance and forensics: Reconstruction of occluded or degraded video.
- Infant/toddler monitoring: Detecting and restoring subtle movements in low-quality recordings.
- Elderly care and assistive technology: Enhancing visibility and understanding of movement patterns.
As Generative AI (GenAI) advances, new challenges emerge, such as maintaining physical realism, enforcing temporal consistency, and ensuring generalization across domains. This workshop will explore cutting-edge research at the intersection of physics-aware modeling, deep learning, and generative AI to push the boundaries of video generation and restoration.
Workshop Website: tba
Overview
Eye tracking technology is becoming increasingly widespread, thanks to the recent availability of cheap commercial devices, both remote and wearable. At the same time, novel techniques are continually pursued to enhance the precision of gaze detection, and new methods are continuously explored to fully leverage the potential of eye data. Regardless of the considered context (human-computer interaction, user behavior understanding, biometrics, or others), pattern recognition often plays a significant role. The purpose of the ETTAC2026 workshop is to present recent eye tracking research that directly or indirectly exploits any form of pattern recognition.
The event’s scope includes (but is not limited to) the following topics:
- Gaze detection techniques
- Gaze-based human-computer interaction (e.g., assistive technologies, hybrid interfaces, etc.)
- Eye tracking and AI integration (e.g., combination of gaze and LLMs)
- Eye tracking for usability inspection
- Eye tracking in VR/AR applications
- User behavior understanding from eye data
- Eye movement analysis for biometrics, security, and privacy
- Eye tracking in medicine and health care
- Machine/deep learning for gaze analysis
Workshop Website: tba
Overview
Modern AI systems combine information from multiple sources (such as images, audio, and text) to enable better pattern recognition and understanding. MCMI brings together researchers from various fields, including audio processing, computer vision, and natural language processing, to share their work and ideas.
Topics:
Technological Advancements in Multimodal Systems:
- Multimodal Fusion Strategies
- Deep Learning Architectures for Multimodal Fusion
- Cross-Modal Feature Extraction and Learning
- Audio-Visual Speech Recognition
- Scene and Object Recognition from Audio-Visual Cues
- Temporal Dynamics in Multimodal Data Processing
- Robustness and Generalization across Diverse Modalities
Applications and Interaction in Multimodal Systems:
- Emotion Recognition in Multimodal Data
- Natural Language Processing for Multimedia Content
- Multimodal Datasets and Benchmarks
- Novel Interaction Paradigms for Multimodal Data Acquisition
Security, Ethics, and Transparency in Multimodal Systems:
- Ethical Considerations in Multimodal Data Processing
- Privacy-Preserving Multimodal Learning
- Explainable AI (XAI) in Multimodal Systems
- Security and Robustness in Multimodal Systems
- Adversarial Attacks and Defense in Multimodal Systems
Workshop Website: available here
Overview
The topics covered by the workshop are naturally explainable AI methods, post-hoc explanation methods for deep neural networks (including transformers and generative artificial intelligence), and ethical considerations when using pattern recognition models.
Technical issues in explainability relate to the creation of explanations, their representation, and the quantification of their confidence. Those in AI ethics include automated audits, detection of bias, the ability to control AI systems to prevent harm, and other methods to improve AI explainability in general, including algorithms and evaluation methods, user interfaces and visualization for achieving more explainable and ethical AI, and real-world applications and case studies.
Workshop Website: available here
Overview
The workshop welcomes contributions on topics including (but not limited to):
Machine unlearning in document AI
- unlearning for document image classification
- unlearning for handwritten text recognition
- unlearning for document visual question answering
- unlearning for multimodal document understanding
Robustness in document image recognition systems
- adversarial attacks & defenses
- robustness to data noise, layout variation, and handwriting styles
Privacy in document image understanding
- differential privacy for document image datasets
- membership inference and training data leakage prevention
- privacy-preserving OCR and document representation learning
Explainability and interpretability
- explaining document model decisions at pixel, region, and semantic levels
- visualization tools for understanding features learned from document images
Evaluation, benchmarks, and best practices
- metrics and datasets for assessing unlearning, privacy, robustness, and explainability
- standardized evaluation pipelines for trustworthy document intelligence
Applications and case studies
- legal, financial, medical, and government documents
- lifelong learning and compliance-driven document AI
Workshop Website: available here
Overview
The Workshop on Computer Vision for Biodiversity Monitoring and Conservation (CVBMC) aims to bridge the gap between advanced pattern recognition and ecological preservation. While computer vision has reached maturity in industrial and urban domains, its application to the natural world presents unique, high-dimensional challenges that require specialized technical approaches. The workshop will explore the deployment of state-of-the-art deep learning and AI methodologies to automate the extraction of biological information from unstructured visual datasets.
The subject is highly relevant to Pattern Recognition because biodiversity monitoring requires solving complex problems such as fine-grained visual categorization for species identification, often under conditions of high occlusion, varying illumination, and camouflage. Furthermore, several tasks in this domain (e.g., long-term population monitoring, study of behavioral changes) require handling of temporal data, thus increasing the dimensionality of the problem and calling for temporal-aware visual modeling.
Topics to be addressed include:
- animal and plant species identification
- organism tracking and movement analysis
- land-cover mapping, deforestation, and habitat monitoring
- classification of different organisms (e.g., by subspecies)
- assessment of organism behavior or behavior changes
- computer vision tools for ecological assessment
- counting and biodiversity monitoring
- analysis of terrestrial and underwater wildlife
- ecosystems and conservation case studies
Workshop Website: available here
Overview
The workshop will provide participants with foundational and applied skills for developing advanced AI agents, including preparing diverse data types to be neural-network ready, understanding model fusion techniques and the distinctions between early, late, and intermediate fusion, and performing PDF data extraction using OCR. Participants will also learn to differentiate between modality orchestration and agent orchestration, and will gain hands-on experience customizing NVIDIA AI Blueprints, particularly the Video Search and Summarization (VSS) blueprint, to design and deploy powerful multimodal AI agents.
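To illustrate the fusion distinctions mentioned above, a minimal sketch contrasting early and late fusion of two modalities is given below. It assumes PyTorch; all module and dimension names are illustrative and are not taken from the NVIDIA blueprint.

import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Concatenate modality features first, then learn a joint head."""
    def __init__(self, d_img=512, d_txt=256, d_out=10):
        super().__init__()
        self.head = nn.Linear(d_img + d_txt, d_out)
    def forward(self, img, txt):
        return self.head(torch.cat([img, txt], dim=-1))

class LateFusion(nn.Module):
    """Score each modality independently, then combine the predictions."""
    def __init__(self, d_img=512, d_txt=256, d_out=10):
        super().__init__()
        self.img_head = nn.Linear(d_img, d_out)
        self.txt_head = nn.Linear(d_txt, d_out)
    def forward(self, img, txt):
        return 0.5 * (self.img_head(img) + self.txt_head(txt))

Intermediate fusion sits between these two extremes, exchanging features between modality branches at hidden layers rather than at the input or the output.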
Workshop Website: available here
Overview
The workshop focuses on recent developments in underwater surveillance, including underwater imaging, multimodal sensor fusion, acoustic–optical perception, generative data enhancement, graph-based reasoning, and autonomous underwater vehicles.
Topics include:
- Underwater image enhancement, dehazing, and restoration
- Object detection, segmentation, and tracking in underwater scenes
- Sensor fusion using optical, sonar, and LiDAR data
- AI-driven underwater communication and networking
- Learning-based AUV navigation and marine robotics
- Applications in environmental monitoring, Blue Economy, maritime security, and industrial inspection
Workshop Website: available here
Overview
In the areas of text analysis and synthesis, many recent research topics, such as text editing in images and videos, vision-language models, and text style transfer, are receiving special attention. This workshop is aimed at the processing and synthesis of textual data, whether it appears as plain text or within images and videos.
Topics of interest include, but are not limited to:
- Text editing in images and videos
- Text style transfer
- Vision Language models
- Sentiment analysis from text
- Personality traits detection
- Natural Language Processing
- Multimodal document understanding
- Stylistic text recognition
- Font analysis and synthesis
- Detection of synthetic manipulation in documents, and document forensics
- Analysis and interpretation of graphical documents
- Recognition and analysis of low-resource languages
- Complex Handwriting recognition
- Historical document analysis
- Text summarization and translation
- Language model for document information extraction
- Scene and video text detection and recognition
- NLP+Vision multimodal approaches
Workshop Website: tba
Overview
Human attention modeling, including gaze modeling and eye tracking, has been a key area of focus in computer vision and pattern recognition research over the past decade. These dynamic fields have evolved from various perspectives, shaped by the diverse expertise and objectives of researchers. Despite recent advancements in technologies such as large vision and language models, these innovations have yet to be fully integrated into current research. Additionally, while the potential applications of gaze estimation, attention prediction, and eye tracking across images, videos, and audio are vast, there remains a lack of groundbreaking approaches that push the boundaries of the field.
The Workshop on Human Gaze and Visual Attention Modeling aims to address this gap by providing a platform for introducing novel ideas and methodologies in human attention modeling, gaze estimation, and eye tracking. The workshop will also explore the intersection of these fields with emerging domains such as human-computer interaction (HCI), robotics, autonomous systems, and medical image analysis. By highlighting the broader, yet still underexplored, opportunities of gaze, human-attention, and eye-tracking research in real-world applications, we aim to spark new interdisciplinary collaborations.
Through this workshop, we aim to encourage forward-looking discussions that move beyond conventional computer vision applications and address meaningful real-world challenges. Our goal is to inspire the next wave of research and broaden the impact of human visual attention, eye-tracking, and gaze-based methods across diverse industries and domains.
Main topics that will be covered in the workshop:
- Computational Modeling of Human Visual Attention & Deep Learning for Visual Saliency
- Advances in Eye-Tracking Technologies and Applications
- Scanpath Prediction and Temporal Dynamics of Gaze in Video Understanding
- Active Vision and Real-World Applications
- Human Visual Perception for Computer Vision & Cognitive Modeling
- Benchmarking, Evaluation, and Privacy-Preserving Methods for Gaze Analysis
- Applications of Human Visual Attention in Medical Imaging & Vision-Based Interaction
- Robotics and HCI: Eye-Gaze Interaction and Gaze-Based Interfaces
Workshop Website: available here
Overview
The topic is Reproducible Research (RR) in direct relation to the Pattern Recognition domains, corresponding to most of the ICPR topics where research results rely on algorithms and code implementation. This workshop is an important activity of the newly created TC-22, focused specifically on computational reproducibility. As in all previous editions, RRPR 2026 is intended both as a short participative course on computational reproducibility, leading to open discussions with the participants, and as a practical workshop on how to actually practice RR. Another key goal in gathering the research community is to further advance the scientific aspects of reproducibility specifically in pattern recognition research. The call for papers includes three main tracks: (1) RR frameworks (including experiences, frameworks, or complete platforms), (2) RR results focusing on the quality of reproducible research results, and (3) short papers. RRPR short papers are tailored to allow authors the extra space needed to document the steps they have taken to make their regular ICPR submission more reproducible. Sometimes this valuable information is not made available in conference papers, but providing it certainly grows awareness and increases visibility for reproducibility as a core aspect of pattern recognition research. RRPR is also an excellent forum for ICPR authors to share and discuss best practices in reproducibility, thereby advancing the field.
Reproducibility is an important topic in general, and it is particularly important for PhD students and young researchers to learn best practices in research, covering not only the scientific paper but also the source code and the data. The special track is also related to the Deep Learning and Geometry fields, where reproducibility and reliability are key points. The RRPR workshop has special relevance at ICPR 2026, since this time the conference will grant a Reproducibility Badge to recognize and celebrate authors' efforts towards trustworthy reproducible research, transparency, and reliable science. A selection of papers will be presented in RRPR, and all ICPR attendees are welcome to participate in this workshop, for which we expect substantial discussions around computational reproducibility.
Workshop Website: available here
Overview
This workshop deals with research activities on comics. The MANPU workshop targets researchers in image processing, image analysis, pattern recognition, and even knowledge representation. Indeed, the large variability of comic books and the complexity of their contents make comics analysis an advanced problem of document image understanding or, more generally, knowledge-based scene recognition.
The topics of interest include, among others:
- Comics Image Processing
- Comics Analysis and Understanding
- Comics Recognition
- Comics Retrieval and Spotting
- Comics Enrichment
- Born-digital comics
- Reading Behavior Analysis
- Comics Generation
- Copy Protection and Fraud Detection
- Physical/Digital Comics Interfaces
Workshop Website: tba
Overview
GREEN-PR aims to bring together researchers working on more sustainable approaches to pattern recognition and/or on recent pattern recognition contributions to environmental applications.
Topics (non-exhaustive list):
- Low-complexity and more energy-efficient PR approaches: lightweight models, sparse and modular representations, edge and federated learning, etc.
- Development of pattern recognition for environmental applications: remote sensing and earth observation, climate and environmental monitoring, smart cities, wildlife and ecosystem models, agriculture, etc.
- Efficient data-driven learning approaches: transfer learning, distillation, active learning, etc.
- Lifecycle and sustainable AI frameworks, economic assessment, measurement of carbon footprint, and best practices for PR and AI development.
Workshop Website: available here
Overview
The main subject of the workshop, in the broadest sense, is the assessment and discussion of the current state of the mathematical theory of image analysis and the prospects for its development, as well as its application to particularly important, difficult, and socially significant applied problems.
The main purposes of the IMTA workshops are:
- to observe, review, and discuss the state of the art in the mathematical foundations of image mining;
- to fuse modern fundamental mathematical approaches and techniques for image analysis and pattern recognition with the requirements of applications.
This workshop is intended to cover, but is not limited to, the following topics:
- Methodological and mathematical advances in image analysis and pattern recognition with a special focus on:
- New Mathematical Techniques in Image-Mining
- Image Models, Representations and Features
- Automation of Image- and Data-Mining
- Artificial Intelligence Techniques in Image-Mining
- Applied Problems of Image-Mining
Workshop Website: available here
Overview
Bacterial infections are the second-leading cause of death globally. Antimicrobial resistance threatens the effectiveness of modern healthcare and is recognized by the WHO as one of the top global public health threats. Infections also lead to significant economic costs, further motivating the urgent need to develop novel techniques for monitoring spread, to develop accessible and accurate diagnostics, and to track information spread and societal impact. These urgent challenges all require different forms of pattern recognition. We envision a workshop covering three broad topics: (i) diagnostics based on biomedical image and video analysis, (ii) bioinformatics as a tool for surveillance and prediction of spread, and (iii) artificial intelligence for analysis of communication strategies and societal response.
Technical pattern recognition challenges central to each topic could include, for (i): deep learning for highly skewed datasets, rare-case and anomaly detection, explainable AI, low-computational-demand networks for point-of-care diagnostics, real-time video analysis of pathogen growth, and domain-adapted evaluation approaches. For (ii): genomic sequence pattern analysis for resistance and emerging pathogen detection, GNNs for modeling transmission networks and co-occurrence networks for tracing resistance genes across populations, time-series analysis of genomic data for modeling spread and mutations, and forecasting outbreaks. For (iii): NLP for detection of misinformation patterns, information diffusion modeling, behavioural response analysis, and multi-modal AI for understanding how communication strategies affect the societal response in emergency situations.
Workshop Website: tba
Overview
Key Workshop Highlights – GenAAI-2026
1. Generative Video Models as Autonomous Reasoning Engines: This theme examines how diffusion-based video models, 3D-aware world models, and large multimodal transformers can serve as cognitive backbones for agentic systems. Participants will explore how generative priors improve temporal coherence, fill in missing frames, simulate potential outcomes, and support high-level reasoning over complex dynamic scenes. By integrating planning modules and feedback loops, generative agents can autonomously interpret human activities, detect anomalies, and propose future scene evolutions.
2. Agentic AI for Real-World Environments: Agentic AI introduces capabilities such as autonomous task decomposition, context-aware decision-making, and self-optimization. This segment focuses on how these properties enhance video understanding in challenging real-world conditions—crowded surveillance scenes, variable lighting, occlusions, abrupt motion patterns, and multi-agent interactions. Discussions will center on vision-language-action models, tool-use agents, embodied decision systems, and reinforcement learning frameworks designed to operate in unconstrained environments.
3. Spatiotemporal Representation Learning & Predictive Modeling: High-quality video understanding depends on structured spatiotemporal representations. This section explores world models, graph neural video architectures, motion-aware transformers, and predictive generative models that anticipate future states of complex scenes. Use cases include trajectory prediction, pedestrian intent estimation, environmental hazard forecasting, and robotics navigation. The session will also investigate multimodal fusion across video, LiDAR, IMU, and audio, enabling agents to form a holistic perception of real-world environments.
We invite submissions and participation under the theme “GenAAI 2026”, including but not limited to:
- Agentic AI for Video Reasoning and Decision-Making
- Video Models and Spatiotemporal Generative Representations
- Vision-Language-Action Agents
- Generative Trajectory Prediction and Scene Forecasting
- Real-World Video Surveillance, Crowd Analytics, and Behavior Understanding
- Video Anomaly Detection Using Generative Priors
- Self-Supervised and Foundation Models for Long Video Sequences
- Generative Augmentation, Reconstruction, and Inpainting in Videos
- Few-shot and Zero-shot Video Understanding
- Edge and Real-Time Deployment of Generative Agents
Workshop Website: tba