Publications
2025
- Adaptive clustering for efficient phenotype segmentation of UAV hyperspectral data
  Ciem Cornelissen, Sam Leroux, Pieter Simoens
  WACV 2025, the Winter Conference on Applications of Computer Vision
- Exploring Correlated Facial Attributes in Text-to-Image Models: Unintended Consequences in Synthetic Face Generation
  Sander De Coninck, Sam Leroux, Pieter Simoens
  Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops
- Maximum causal entropy inverse constrained reinforcement learning
  Mattijs Baert, Pietro Mazzaglia, Sam Leroux, Pieter Simoens
  MACHINE LEARNING
  When deploying artificial agents in real-world environments where they interact with humans, it is crucial that their behavior is aligned with the values, social norms or other requirements specific to that environment. However, many environments have implicit constraints that are difficult to specify and transfer to a learning agent. To address this challenge, we propose a novel method that utilizes the principle of maximum causal entropy to learn constraints and an optimal policy that adheres to these constraints, using demonstrations of agents that abide by the constraints. We prove convergence in a tabular setting and provide a practical implementation which scales to complex environments. We evaluate the effectiveness of the learned policy by assessing the reward received and the number of constraint violations, and we evaluate the learned cost function based on its transferability to other agents. Our method has been shown to outperform state-of-the-art approaches across a variety of tasks and environments, and it is able to handle problems with stochastic dynamics and a continuous state-action space.
- Reward machine inference for robotic manipulation
  Mattijs Baert, Sam Leroux, Pieter Simoens
  8th Workshop on Generalization in Planning (GenPlan), part of AAAI 2025
- Embedding-based pair generation for contrastive representation learning in audio-visual surveillance data
  Wei-Cheng Wang, Sander De Coninck, Sam Leroux, Pieter Simoens
  FRONTIERS IN ROBOTICS AND AI
  Smart cities deploy various sensors such as microphones and RGB cameras to collect data to improve the safety and comfort of the citizens. As data annotation is expensive, self-supervised methods such as contrastive learning are used to learn audio-visual representations for downstream tasks. Focusing on surveillance data, we investigate two common limitations of audio-visual contrastive learning: false negatives and the minimal sufficient information bottleneck. Irregular, yet frequently recurring events can lead to a considerable number of false-negative pairs and disrupt the model's training. To tackle this challenge, we propose a novel method for generating contrastive pairs based on the distance between embeddings of different modalities, rather than relying solely on temporal cues. The semantically synchronized pairs can then be used to ease the minimal sufficient information bottleneck along with the new loss function for multiple positives. We experimentally validate our approach on real-world data and show how the learnt representations can be used for different downstream tasks, including audio-visual event localization, anomaly detection, and event search. Our approach reaches similar performance as state-of-the-art modality- and task-specific approaches.
- Exploring and learning structure : active inference approach in navigational agents
  Daria de Tinguy, Tim Verbelen, Bart Dhoedt
  Active inference : 5th international workshop, IWAI 2024, revised selected papers
  Drawing inspiration from animal navigation strategies, we introduce a novel computational model for navigation and mapping, rooted in biologically inspired principles. Animals exhibit remarkable navigation abilities by efficiently using memory, imagination, and strategic decision-making to navigate complex and aliased environments. Building on these insights, we integrate traditional cognitive mapping approaches with an Active Inference Framework (AIF) to learn an environment structure in a few steps. Through the incorporation of topological mapping for long-term memory and AIF for navigation planning and structure learning, our model can dynamically apprehend environmental structures and expand its internal map with predicted beliefs during exploration. Comparative experiments with the Clone-Structured Cognitive Graph (CSCG) model highlight our model’s ability to rapidly learn environmental structures in a single episode, with minimal navigation overlap. This is achieved without prior knowledge of the dimensions of the environment or the type of observations, showcasing its robustness and effectiveness in navigating ambiguous environments.
2024
- Hybrid edge-cloud models for bearing failure detection in a fleet of machines
  Sam Leroux, Pieter Simoens
  ELECTRONICS
  Real-time condition monitoring of machinery is increasingly being adopted to minimize costs and enhance operational efficiency. By leveraging large-scale data acquisition and intelligent algorithms, failures can be detected and predicted, thereby reducing machine downtime. In this paper, we present a novel hybrid edge-cloud system for detecting rotational bearing failures using accelerometer data. We evaluate both supervised and unsupervised neural network approaches, highlighting their respective strengths and limitations. Supervised models demonstrate high accuracy but require labeled datasets representative of the failures of interest, data that are challenging to acquire due to the rarity of anomalies. Conversely, unsupervised models rely on data from normal operational conditions, which is more readily available. However, these models classify all deviations from normalcy as anomalies, including those unrelated to failure, leading to costly false positives. To address these challenges, we propose a distributed system that integrates supervised and unsupervised learning. A compact unsupervised model is deployed on edge devices near the machines to compress sensor data, which are then transmitted to a centralized cloud-based system. Over time, these data are automatically labeled and used to train a supervised model, improving the accuracy of failure predictions. Our approach enables efficient, scalable failure detection across a fleet of machines while balancing the trade-offs between supervised and unsupervised learning.
- Learning dynamic cognitive map with autonomous navigation
  Daria de Tinguy, Tim Verbelen, Bart Dhoedt
  FRONTIERS IN COMPUTATIONAL NEUROSCIENCE
  Inspired by animal navigation strategies, we introduce a novel computational model to navigate and map a space rooted in biologically inspired principles. Animals exhibit extraordinary navigation prowess, harnessing memory, imagination, and strategic decision-making to traverse complex and aliased environments adeptly. Our model aims to replicate these capabilities by incorporating a dynamically expanding cognitive map over predicted poses within an active inference framework, enhancing our agent's generative model plasticity to novelty and environmental changes. Through structure learning and active inference navigation, our model demonstrates efficient exploration and exploitation, dynamically expanding its model capacity in response to anticipated novel un-visited locations and updating the map given new evidence contradicting previous beliefs. Comparative analyses in mini-grid environments with the clone-structured cognitive graph model (CSCG), which shares similar objectives, highlight our model's ability to rapidly learn environmental structures within a single episode, with minimal navigation overlap. Our model achieves this without prior knowledge of observation and world dimensions, underscoring its robustness and efficacy in navigating intricate environments.
- A hierarchical active inference model of spatial alternation tasks and the hippocampal-prefrontal circuit
  Toon Van de Maele, Bart Dhoedt, Tim Verbelen, Giovanni Pezzulo
  NATURE COMMUNICATIONS
  Cognitive problem-solving benefits from cognitive maps aiding navigation and planning. Physical space navigation involves hippocampal (HC) allocentric codes, while abstract task space engages medial prefrontal cortex (mPFC) task-specific codes. Previous studies show that challenging tasks, like spatial alternation, require integrating these two types of maps. The disruption of the HC-mPFC circuit impairs performance. We propose a hierarchical active inference model clarifying how this circuit solves spatial interaction tasks by bridging physical and task-space maps. Simulations demonstrate that the model's dual layers develop effective cognitive maps for physical and task space. The model solves spatial alternation tasks through reciprocal interactions between the two layers. Disrupting its communication impairs decision-making, which is consistent with empirical evidence. Additionally, the model adapts to switching between multiple alternation rules, providing a mechanistic explanation of how the HC-mPFC circuit supports spatial alternation tasks and the effects of disruption. How cognitive maps of physical and task space interact when executing cognitive tasks is not fully understood. This paper models how the hippocampal-prefrontal circuit solves memory-guided spatial alternation tasks, by bridging cognitive maps of physical and task space.
- GenRL : multimodal-foundation world models for generalization in embodied agents
  Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, A. Courville, S. Rajeswar
  Advances in Neural Information Processing Systems 37 (NeurIPS 2024)
  Learning generalist embodied agents, able to solve multitudes of tasks in different domains, is a long-standing problem. Reinforcement learning (RL) is hard to scale up as it requires a complex reward design for each task. In contrast, language can specify tasks in a more natural way. Current foundation vision-language models (VLMs) generally require fine-tuning or other adaptations to be adopted in embodied contexts, due to the significant domain gap. However, the lack of multimodal data in such domains represents an obstacle to developing foundation models for embodied applications. In this work, we overcome these problems by presenting multimodal-foundation world models, able to connect and align the representation of foundation VLMs with the latent space of generative world models for RL, without any language annotations. The resulting agent learning framework, GenRL, allows one to specify tasks through vision and/or language prompts, ground them in the embodied domain’s dynamics, and learn the corresponding behaviors in imagination. As assessed through large-scale multi-task benchmarking in locomotion and manipulation domains, GenRL enables multi-task generalization from language and visual prompts. Furthermore, by introducing a data-free policy learning strategy, our approach lays the groundwork for foundational policy learning using generative world models.
- Representing positional information in generative world models for object manipulation
  Stefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, S. Rajeswar
  CoRL 2024 Workshop on Learning Robot Fine and Dexterous Manipulation : Perception and Control, Proceedings
  The ability to predict outcomes of interactions between embodied agents and objects is paramount in the robotic setting. While model-based control methods have started to be employed for tackling manipulation tasks, they have faced challenges in accurately manipulating objects. As we analyze the causes of this limitation, we identify the cause of underperformance in the way current world models represent crucial positional information, especially about the target's goal specification for object positioning tasks. We propose two solutions for generative world models: position-conditioned (PCP) and latent-conditioned (LCP) policy learning. In particular, LCP employs object-centric latent representations that explicitly capture object positional information for goal specification. This naturally leads to the emergence of multimodal capabilities.
- Planning with tensor networks based on active inference
  Samuel Wauthier, Tim Verbelen, Bart Dhoedt, Bram Vanhecke
  MACHINE LEARNING-SCIENCE AND TECHNOLOGY
  Tensor networks (TNs) have seen an increase in applications in recent years. While they were originally developed to model many-body quantum systems, their usage has expanded into the field of machine learning. This work adds to the growing range of applications by focusing on planning by combining the generative modeling capabilities of matrix product states and the action selection algorithm provided by active inference. Their ability to deal with the curse of dimensionality, to represent probability distributions, and to dynamically discover hidden variables makes matrix product states specifically an interesting choice to use as the generative model in active inference, which relies on 'beliefs' about hidden states within an environment. We evaluate our method on the T-maze and Frozen Lake environments, and show that the TN-based agent acts Bayes optimally as expected under active inference.
- Learning temporal task specifications from demonstrations
  Mattijs Baert, Sam Leroux, Pieter Simoens
  EXPLAINABLE AND TRANSPARENT AI AND MULTI-AGENT SYSTEMS, EXTRAAMAS 2024
  As we progress towards real-world deployment, the critical need for interpretability in reinforcement learning algorithms grows more pivotal, ensuring the safety and reliability of intelligent agents. This paper tackles the challenge of acquiring task specifications in linear temporal logic through expert demonstrations, aiming to alleviate the burdensome task of specification engineering. The rich semantics of temporal logics serve as an interpretable framework for delineating intricate, multi-stage tasks. We propose a method which iteratively learns a task specification and a nominal policy solving this task. In each iteration, the task specification is refined to better distinguish expert trajectories from trajectories sampled from the nominal policy. With this process we obtain a concise and interpretable task specification. Unlike previous work, our method is capable of learning directly from trajectories in the original state space and does not require predefined atomic propositions. We showcase the effectiveness of our method on multiple tasks in both an office and a Minecraft-inspired environment.
- Computational modelling of time perception predictors and modulators
  Pieter Simoens, Yara Khaluf
  TIMING & TIME PERCEPTION
- Revisiting edge AI : opportunities and challenges
  Tobias Meuser, Lauri Loven, Monowar Bhuyan, Shishir G. Patil, Schahram Dustdar, Atakan Aral, Suzan Bayhan, Christian Becker, Eyal de Lara, Aaron Yi Ding, Janick Edinger, James Gross, Nitinder Mohan, Andy D. Pimentel, Etienne Riviere, Henning Schulzrinne, Pieter Simoens, Guerkan Solmaz, Michael Welzl
  IEEE INTERNET COMPUTING
  Edge artificial intelligence (AI) is an innovative computing paradigm that aims to shift the training and inference of machine learning models to the edge of the network. This paradigm offers the opportunity to significantly impact our everyday lives with new services such as autonomous driving and ubiquitous personalized health care. Nevertheless, bringing intelligence to the edge involves several major challenges, which include the need to constrain model architecture designs, the secure distribution and execution of the trained models, and the substantial network load required to distribute the models and data collected for training. In this article, we highlight key aspects in the development of edge AI in the past and connect them to current challenges. This article aims to identify research opportunities for edge AI, relevant to bring together the research in the fields of artificial intelligence and edge computing.
- Multi-stage task specification learning from demonstration
  Mattijs Baert, Sam Leroux, Pieter Simoens
  Robotic Tasks and How to Specify Them? Task Specification for General-Purpose Intelligent Robots
- Multimodal foundation world models for generalist embodied agents
  Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, A. Courville, S. Rajeswar
  ICML 2024 Workshop : Multi-modal Foundation Model meets Embodied AI, Proceedings
  Learning generalist embodied agents, able to solve multitudes of tasks in different domains, is a long-standing problem. Reinforcement learning (RL) is hard to scale up as it requires a complex reward design for each task. In contrast, language can specify tasks in a more natural way. Current foundation vision-language models (VLMs) generally require fine-tuning or other adaptations to be functional, due to the significant domain gap. However, the lack of multimodal data in such domains represents an obstacle toward developing foundation models for embodied applications. In this work, we overcome these problems by presenting multimodal foundation world models, able to connect and align the representation of foundation VLMs with the latent space of generative world models for RL, without any language annotations. The resulting agent learning framework, GenRL, allows one to specify tasks through vision and/or language prompts, ground them in the embodied domain’s dynamics, and learn the corresponding behaviors in imagination. As assessed through large-scale multi-task benchmarking, GenRL exhibits strong multi-task generalization performance in several locomotion and manipulation domains. Furthermore, by introducing a data-free RL strategy, it lays the groundwork for foundation model-based RL for generalist embodied agents.
- Reactive shepherding along a dynamic path
  Stef Van Havermaet, Yara Khaluf, Pieter Simoens
  SCIENTIFIC REPORTS
  Shepherding, the task of guiding a herd of autonomous individuals in a desired direction, is an essential skill employed in the herding of animals, crowd control, and evacuation operations. Integrating shepherding capabilities into robots holds promise to perform such tasks with increased efficiency and reduced labor costs. To date, robotic shepherds have only been designed to steer a herd towards a predetermined goal location without constraints on the trajectory. However, the tasks of a sheepdog encompass not only steering the herd but also (i) maintaining the herd within a designated area and (ii) averting dangers, obstacles, or undesirable terrain such as newly sown land. We present a decentralized control algorithm for multi-robot shepherding designed to guide a group of animals along a specified path delineated by two boundaries. The algorithm incorporates the additional objective of preserving the group within these boundaries. Simulation results reveal that, especially in sections of the path with sharp turns and a small distance between the boundaries, the group exhibits a tendency to deviate beyond the prescribed margin. Additionally, our findings emphasize the algorithm's sensitivity to the ratio of robot-group sizes and the magnitude of the group's velocity.
- Label efficient lifelong multi-view broiler detection
  Thorsten Cardoen, Sam Leroux, Pieter Simoens
  2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW
  Broiler localization is crucial for welfare monitoring, particularly in identifying issues such as wet litter. We focus on multi-camera detection systems since multiple viewpoints not only ensure comprehensive pen coverage but also reduce occlusions caused by lighting, feeder and drinking equipment. Previous multi-view detection studies localize subjects either by aggregating ground plane projections of single-view predictions or by developing end-to-end multi-view detectors capable of directly generating predictions. However, single-view detections may suffer from reduced accuracy due to occlusions, and obtaining ground plane labels for training end-to-end multi-view detectors is challenging. In this paper, we combine the strengths of both approaches by using the readily available aggregated single-view detections as labels for training a multi-view detector. Our approach alleviates the need for hard-to-acquire ground-plane labels. Through experiments on a real-world broiler dataset, we demonstrate the effectiveness of our approach.
- Mitigating bias using model-agnostic data attribution
  Sander De Coninck, Sam Leroux, Pieter Simoens
  2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW
  Mitigating bias in machine learning models is a critical endeavor for ensuring fairness and equity. In this paper, we propose a novel approach to address bias by leveraging pixel image attributions to identify and regularize regions of images containing significant information about bias attributes. Our method utilizes a model-agnostic approach to extract pixel attributions by employing a convolutional neural network (CNN) classifier trained on small image patches. By training the classifier to predict a property of the entire image using only a single patch, we achieve region-based attributions that provide insights into the distribution of important information across the image. We propose utilizing these attributions to introduce targeted noise into datasets with confounding attributes that bias the data, thereby constraining neural networks from learning these biases and emphasizing the primary attributes. Our approach demonstrates its efficacy in enabling the training of unbiased classifiers on heavily biased datasets.
- Multi-bit, black-box watermarking of deep neural networks in embedded applications
  Sam Leroux, Stijn Vanassche, Pieter Simoens
  2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW
  The effort required to collect data and train a large neural network requires a significant investment from organizations. Therefore, trained neural networks are often seen as valuable intellectual property that needs to be protected. At the same time, we are increasingly seeing applications where a model is deployed on an edge device. This has several benefits, including improved privacy and reduced latency but it also opens up the possibility for third parties to extract the trained model from the device and to use it for their own purposes. Watermarking techniques aim to safeguard neural networks from unauthorized usage. These methods alter the model’s behavior for specific trigger inputs, enabling the owner to recognize stolen instances. However, existing watermarking algorithms are not suited for distributed edge AI scenarios as they only create a single watermarked model instance. We introduce a novel multi-bit watermarking approach capable of efficiently generating a large number of model instances. Each of these instances maintains functional equivalence but exhibits unique behaviors when prompted with a predefined trigger input. This allows the owner to trace the source of a model leak to a potentially malicious user. We experimentally validate our approach on the MNIST, CIFAR-10, and ImageNet datasets, evaluating model performance and resilience against watermark removal attacks.
- Test-time specialization of dynamic neural networks
  Sam Leroux, D Katare, A.Y. Ding, Pieter Simoens
  2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW
  In recent years, there has been a notable increase in the size of commonly used image classification models. This growth has empowered models to recognize thousands of diverse object types. However, their computational demands pose significant challenges, especially when deploying them on resource-constrained edge devices. In many use cases where a model is deployed on an edge device, only a small subset of the classes will ever be observed by a given model instance. Our proposed test-time specialization of dynamic neural networks allows these models to become faster at recognizing the classes that are observed frequently, while maintaining the ability to recognize all other classes, albeit slightly less efficiently. We benchmark our approach on a real-world edge device, obtaining significant speedups compared to the baseline model without test-time adaptation.
- Committing to the wrong artificial delegate in a collective-risk dilemma is better than directly committing mistakes
  Inês Terrucha, Elias Fernandez Domingos, Pieter Simoens, Tom Lenaerts
  SCIENTIFIC REPORTS
  While autonomous artificial agents are assumed to perfectly execute the strategies they are programmed with, humans who design them may make mistakes. These mistakes may lead to a misalignment between the humans' intended goals and their agents' observed behavior, a problem of value alignment. Such an alignment problem may have particularly strong consequences when these autonomous systems are used in social contexts that involve some form of collective risk. By means of an evolutionary game theoretical model, we investigate whether errors in the configuration of artificial agents change the outcome of a collective-risk dilemma, in comparison to a scenario with no delegation. Delegation is here distinguished from no-delegation simply by the moment at which a mistake occurs: either when programming/choosing the agent (in case of delegation) or when executing the actions at each round of the game (in case of no-delegation). We find that, while errors decrease success rate, it is better to delegate and commit to a somewhat flawed strategy, perfectly executed by an autonomous agent, than to commit execution errors directly. Our model also shows that in the long-term, delegation strategies should be favored over no-delegation, if given the choice.
- Privacy-preserving visual analysis : training video obfuscation models without sensitive labels
  Sander De Coninck, Wei-Cheng Wang, Sam Leroux, Pieter Simoens
  APPLIED INTELLIGENCE
  Visual analysis tasks, including crowd management, often require resource-intensive machine learning models, posing challenges for deployment on edge hardware. Consequently, cloud computing emerges as a prevalent solution. To address privacy concerns associated with offloading video data to remote cloud platforms, we present a novel approach using adversarial training to develop a lightweight obfuscator neural network. Our method focuses on pedestrian detection as an example of visual analysis, allowing the transformation of video frames on the camera itself to retain only essential information for pedestrian detection while preserving privacy. Importantly, the obfuscated data remains compatible with publicly available object detectors, requiring no modifications or significant loss in accuracy. Additionally, our technique overcomes the common limitation of relying on labeled sensitive attributes for privacy preservation. By demonstrating the inability of pedestrian attribute recognition models to detect attributes in obfuscated videos, we validate the efficacy of our privacy protection method. Our results suggest that this scalable approach holds promise for enabling camera usage in video analytics while upholding personal privacy.
- Object-centric scene representations using active inference
  Toon Van de Maele, Tim Verbelen, Pietro Mazzaglia, Stefano Ferraro, Bart Dhoedt
  NEURAL COMPUTATION
  Representing a scene and its constituent objects from raw sensory data is a core ability for enabling robots to interact with their environment. In this letter, we propose a novel approach for scene understanding, leveraging an object-centric generative model that enables an agent to infer object category and pose in an allocentric reference frame using active inference, a neuro-inspired framework for action and perception. For evaluating the behavior of an active vision agent, we also propose a new benchmark where, given a target viewpoint of a particular object, the agent needs to find the best matching viewpoint given a workspace with randomly positioned objects in 3D. We demonstrate that our active inference agent is able to balance epistemic foraging and goal-driven behavior, and quantitatively outperforms both supervised and reinforcement learning baselines by more than a factor of two in terms of success rate.
- Learning safety constraints from demonstration using one-class decision trees
  Mattijs Baert, Sam Leroux, Pieter Simoens
  Workshop on Neuro-Symbolic Learning and Reasoning in the Era of Large Language Models at AAAI 2024, Proceedings
- The art of compensation : how hybrid teams solve collective-risk dilemmas
  Inês Terrucha, Elias Fernandez Domingos, Francisco C. Santos, Pieter Simoens, Tom Lenaerts
  PLOS ONE
  It is widely known how the human ability to cooperate has influenced the thriving of our species. However, as we move towards a hybrid human-machine future, it is still unclear how the introduction of artificial agents in our social interactions affects this cooperative capacity. In a one-shot collective risk dilemma, where enough members of a group must cooperate in order to avoid a collective disaster, we study the evolutionary dynamics of cooperation in a hybrid population. In our model, we consider a hybrid population composed of both adaptive and fixed behavior agents. The latter serve as proxies for the machine-like behavior of artificially intelligent agents who implement stochastic strategies previously learned offline. We observe that the adaptive individuals adjust their behavior as a function of the presence of artificial agents in their groups to compensate for their cooperative efforts (or lack thereof). We also find that risk plays a determinant role when assessing whether or not we should form hybrid teams to tackle a collective risk dilemma. When the risk of collective disaster is high, cooperation in the adaptive population falls dramatically in the presence of cooperative artificial agents. A story of compensation, rather than cooperation, where adaptive agents have to secure group success when the artificial agents are not cooperative enough, but will rather not cooperate if the others do so. On the contrary, when risk of collective disaster is low, success is highly improved while cooperation levels within the adaptive population remain the same. Artificial agents can improve the collective success of hybrid teams. However, their application requires a true risk assessment of the situation in order to actually benefit the adaptive population (i.e. the humans) in the long-term.
- Spatial and temporal hierarchy for autonomous navigation using active inference in minigrid environment
  Daria de Tinguy, Toon Van de Maele, Tim Verbelen, Bart Dhoedt
  ENTROPY
  Robust evidence suggests that humans explore their environment using a combination of topological landmarks and coarse-grained path integration. This approach relies on identifiable environmental features (topological landmarks) in tandem with estimations of distance and direction (coarse-grained path integration) to construct cognitive maps of the surroundings. This cognitive map is believed to exhibit a hierarchical structure, allowing efficient planning when solving complex navigation tasks. Inspired by human behaviour, this paper presents a scalable hierarchical active inference model for autonomous navigation, exploration, and goal-oriented behaviour. The model uses visual observation and motion perception to combine curiosity-driven exploration with goal-oriented behaviour. Motion is planned using different levels of reasoning, i.e., from context to place to motion. This allows for efficient navigation in new spaces and rapid progress toward a target. By incorporating these human navigational strategies and their hierarchical representation of the environment, this model proposes a new solution for autonomous navigation and exploration. The approach is validated through simulations in a mini-grid environment.
- Cyclic Action Graphs for goal recognition problems with inaccurately initialised fluents
  Helen Harman, Pieter Simoens
  KNOWLEDGE AND INFORMATION SYSTEMS
  Goal recognisers attempt to infer an agent's intentions from a sequence of observed actions. This is an important component of intelligent systems that aim to assist or thwart actors; however, there are many challenges to overcome. For example, the initial state of the environment could be partially unknown, and agents can act suboptimally and observations could be missing. Approaches that adapt classical planning techniques to goal recognition have previously been proposed, but, generally, they assume the initial world state is accurately defined. In this paper, a state is inaccurate if any fluent's value is unknown or incorrect. Our aim is to develop a goal recognition approach that is as accurate as the current state-of-the-art algorithms and whose accuracy does not deteriorate when the initial state is inaccurately defined. To cope with this complication, we propose solving goal recognition problems by means of an Action Graph. An Action Graph models the dependencies, i.e. order constraints, between all actions rather than just actions within a plan. Leaf nodes correspond to actions and are connected to their dependencies via operator nodes. After generating an Action Graph, the graph's nodes are labelled with their distance from each hypothesis goal. This distance is based on the number and type of nodes traversed to reach the node in question from an action node that results in the goal state being reached. For each observation, the goal probabilities are then updated based on either the distance the observed action's node is from each goal or the change in distance. Our experimental results, for 15 different domains, demonstrate that our approach is robust to inaccuracies within the defined initial state.
2023
- Inferring hierarchical structure in multi-room maze environmentsDaria de Tinguy, Toon Van de Maele, Tim Verbelen, Bart DhoedtProceedings of the 40th International Conference on Machine LearningBiblioCognitive maps play a crucial role in facilitating flexible behaviour by representing spatial and conceptual relationships within an environment. The ability to learn and infer the underlying structure of the environment is crucial for effective exploration and navigation. This paper introduces a hierarchical active inference model addressing the challenge of inferring structure in the world from pixel-based observations. We propose a three-layer hierarchical model consisting of a cognitive map, an allocentric, and an egocentric world model, combining curiosity-driven exploration with goal-oriented behaviour at the different levels of reasoning from context to place to motion. This allows for efficient exploration and goal-directed search in room-structured mini-grid environments.
- Learning to navigate from scratch using world models and curiosity : the good, the bad, and the uglyDaria de Tinguy, Sven Remmery, Pietro Mazzaglia, Tim Verbelen, Bart DhoedtIROS 2023 : the Workshop World Models and Predictive Coding in Cognitive Robotics, Proceedings
- Learning spatial and temporal hierarchies : hierarchical active inference for navigation in multi-room maze environmentsDaria de Tinguy, Toon Van de Maele, Tim Verbelen, Bart DhoedtIROS 2023 : the Workshop World Models and Predictive Coding in Cognitive Robotics, Proceedings
- The duplication of genomes and genetic networks and its potential for evolutionary adaptation and survival during environmental turmoilMehrshad Ebadi, Quinten Bafort, Eshchar Mizrachi, P. Audenaert, Pieter Simoens, Marc Van Montagu, Dries Bonte, Yves Van de PeerPROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICABiblioThe importance of whole-genome duplication (WGD) for evolution is controversial. Whereas some view WGD mainly as detrimental and an evolutionary dead end, there is growing evidence that polyploidization can help overcome environmental change, stressful conditions, or periods of extinction. However, despite much research, the mechanistic underpinnings of why and how polyploids might be able to outcompete or outlive nonpolyploids at times of environmental upheaval remain elusive, especially for autopolyploids, in which heterosis effects are limited. On the longer term, WGD might increase both mutational and environmental robustness due to redundancy and increased genetic variation, but on the short—or even immediate—term, selective advantages of WGDs are harder to explain. Here, by duplicating artificially generated Gene Regulatory Networks (GRNs), we show that duplicated GRNs—and thus duplicated genomes—show higher signal output variation than nonduplicated GRNs. This increased variation leads to niche expansion and can provide polyploid populations with substantial advantages to survive environmental turmoil. In contrast, under stable environments, GRNs might be maladaptive to changes, a phenomenon that is exacerbated in duplicated GRNs. We believe that these results provide insights into how genome duplication and (auto)polyploidy might help organisms to adapt quickly to novel conditions and to survive ecological uproar or even cataclysmic events.
- To avoid collective disasters, it is better to commit to a flawed AI than to commit the errors ourselvesInês Terrucha, Elias Domingos, Pieter Simoens, Tom LenaertsEDAI 2023 : Evolutionary Dynamics in social, cooperative and hybrid AI, ProceedingsBiblioHumans make mistakes. Even when a strategy is perfectly crafted to address a problem in hand, the implementation of such a strategy can still be plagued by execution errors if conducted by a human. The noise associated with human execution is one of the main contributors to the growth of the AI industry: autonomous artificial agents are expected to execute the strategies that they are programmed to implement without such noise. However, because the designers of such agents are human, errors may occur on the programming of such agents. This might lead to an AI agent that perfectly executes the strategy it was programmed with, but the strategy is actually misaligned with the intended goals of the human who configured it, a problem of AI alignment. In this work, we explore, by means of an evolutionary game-theoretical model, how errors in the configuration of artificial agents (or in the choice of an artificial delegate) change the outcome of a collective risk dilemma (CRD). We find that for high risk situations, errors decrease the success rate in comparison with the case of perfect execution. However, it is better to delegate and commit to a flawed strategy executed perfectly by an autonomous agent, than to make execution errors ourselves.
- The effect of rapport on delegation to virtual agentsNingyuan Sun, Jean Botev, Pieter SimoensPROCEEDINGS OF THE 23RD ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS, IVA 2023BiblioThis paper presents the initial results of a study exploring whether the perceived rapport with a virtual agent can influence users' decisions on delegating critical tasks to the agent. We hypothesize that users are more likely to delegate to virtual agents that attempt to build rapport with users than to agents that avoid building rapport. The samples we collected so far still need to validate the hypothesis fully. Nevertheless, we found that the perceived rapport with a virtual agent is highly relevant to trust in the agent.
- FMCW radar sensing for indoor drones using variational auto-encodersAli Safa, Tim Verbelen, Ozan Catal, Toon Van de Maele, Matthias Hartmann, Bart Dhoedt, Andre Bourdoux2023 IEEE RADAR CONFERENCE, RADARCONF23BiblioThis paper investigates unsupervised learning of low-dimensional representations from FMCW radar data, which can be used for multiple downstream tasks in a drone navigation context. To this end, we release a first-of-its-kind dataset of raw radar ADC data recorded from a radar mounted on a flying drone in an indoor environment, together with ground truth detection targets. We show that, by utilizing our learned representations, we match the performance of conventional radar processing techniques while training our models on different input modalities such as range-doppler maps, range-azimuth maps, or raw ADC samples of only two consecutively transmitted chirps.
- Dermatologist versus artificial intelligence confidence in dermoscopy diagnosis : complementary information that may affect decision-makingPieter Van Molle, Sofie Mylle, Tim Verbelen, Cedric De Boom, Bert Vankeirsbilck, Evelien Verhaeghe, Bart Dhoedt, Lieve BrochezEXPERIMENTAL DERMATOLOGYBiblioIn dermatology, deep learning may be applied for skin lesion classification. However, for a given input image, a neural network only outputs a label, obtained using the class probabilities, which do not model uncertainty. Our group developed a novel method to quantify uncertainty in stochastic neural networks. In this study, we aimed to train such a network for skin lesion classification and evaluate its diagnostic performance and uncertainty, and compare the results to the assessments by a group of dermatologists. By passing duplicates of an image through such a stochastic neural network, we obtained distributions per class, rather than a single probability value. We interpreted the overlap between these distributions as the output uncertainty, where a high overlap indicated a high uncertainty, and vice versa. We had 29 dermatologists diagnose a series of skin lesions and rate their confidence. We compared these results to those of the network. The network achieved a sensitivity and specificity of 50% and 88%, comparable to the average dermatologist (respectively 68% and 73%). Higher confidence/less uncertainty was associated with better diagnostic performance both in the neural network and in dermatologists. We found no correlation between the uncertainty of the neural network and the confidence of dermatologists (R = -0.06, p = 0.77). Dermatologists should not blindly trust the output of a neural network, especially when its uncertainty is high. The addition of an uncertainty score may stimulate the human-computer interaction.
- Delegation to autonomous agents : a key to overcome past failure and focus on the collective target aheadInês Terrucha, E. F. Domingos, R. Suchon, F. C. Santon, Pieter Simoens, T. LenaertsIC²S² 2023 : the 9th International Conference on Computational Social Science
- FOCUS : object-centric world models for robotics manipulationStefano Ferraro, Pietro Mazzaglia, Tim Verbelen, Bart DhoedtRSS 2023 Workshop : Interdisciplinary Exploration of Generalizable Manipulation Policy Learning : Paradigms and Debates, Proceedings
- Mastering the unsupervised reinforcement learning benchmark from PixelsS. Rajeswar, Pietro Mazzaglia, Tim Verbelen, A. Piche, Bart Dhoedt, A. Courville, A. LacosteProceedings of the 40th International Conference on Machine Learning
- Learning logic constraints from demonstrationMattijs Baert, Sam Leroux, Pieter SimoensNEURAL-SYMBOLIC LEARNING AND REASONING 2023, NESY 2023BiblioAutonomous agents operating in real-world settings are often required to efficiently accomplish a task while adhering to certain environmental constraints. For instance, a self-driving car must transport its passengers to their intended destination as fast as possible while complying with traffic regulations. Inverse Constrained Reinforcement Learning (ICRL) is a technique that enables the learning of a policy from demonstrations of expert agents. When these expert agents adhere to the environmental constraints, ICRL thus allows for compliant policies to be learned without the need to define constraints beforehand. However, this approach provides no insight into the constraints themselves although this is desired for safety-critical applications such as autonomous driving. In such settings, it is important to verify what is learned from the given demonstrations. In this work, we propose a novel approach for learning logic rules that represent the environmental constraints given demonstrations of agents that comply with them, thus providing an interpretable representation of the environmental constraints.
- Sparse random neural networks for online anomaly detection on sensor nodesSam Leroux, Pieter SimoensFUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCEBiblioWhether it is used for predictive maintenance, intrusion detection or surveillance, on-device anomaly detection is a very valuable functionality in sensor and Internet-of-things (IoT) systems. In this paper, we introduce a novel anomaly detection technique based on sparse, random neural networks. The sparsity in the model allows for a very efficient implementation on embedded or resource constrained hardware. Our approach supports continuous online learning where the model is deployed to the sensor device without any prior training. As new data becomes available, the model is updated and becomes better at detecting anomalies. We experimentally validate our approach on several default benchmark data sets in the visual domain as well as on industrial quality inspection and predictive maintenance tasks. We show that our approach achieves a very favorable trade-off between computational cost and anomaly detection accuracy.
- Steering herds away from dangers in dynamic environmentsStef Van Havermaet, Pieter Simoens, Tim Landgraf, Yara KhalufROYAL SOCIETY OPEN SCIENCEBiblioShepherding, the task of guiding a herd of autonomous individuals in a desired direction, is an essential skill to herd animals, enable crowd control and rescue from danger. Equipping robots with the capability of shepherding would allow performing such tasks with increased efficiency and reduced labour costs. So far, only single-robot or centralized multi-robot solutions have been proposed. The former is unable to observe dangers at any place surrounding the herd, and the latter does not generalize to unconstrained environments. Therefore, we propose a decentralized control algorithm for multi-robot shepherding, where the robots maintain a caging pattern around the herd to detect potential nearby dangers. When danger is detected, part of the robot swarm positions itself in order to repel the herd towards a safer region. We study the performance of our algorithm for different collective motion models of the herd. We task the robots to shepherd a herd to safety in two dynamic scenarios: (i) to avoid dangerous patches appearing over time and (ii) to remain inside a safe circular enclosure. Simulations show that the robots are always successful in shepherding when the herd remains cohesive, and enough robots are deployed.
- Symmetry and complexity in object-centric deep active inference modelsStefano Ferraro, Toon Van de Maele, Tim Verbelen, Bart DhoedtINTERFACE FOCUSBiblioHumans perceive and interact with hundreds of objects every day. In doing so, they need to employ mental models of these objects and often exploit symmetries in the object's shape and appearance in order to learn generalizable and transferable skills. Active inference is a first principles approach to understanding and modelling sentient agents. It states that agents entertain a generative model of their environment, and learn and act by minimizing an upper bound on their surprisal, i.e. their free energy. The free energy decomposes into an accuracy and complexity term, meaning that agents favour the least complex model that can accurately explain their sensory observations. In this paper, we investigate how inherent symmetries of particular objects also emerge as symmetries in the latent state space of the generative model learnt under deep active inference. In particular, we focus on object-centric representations, which are trained from pixels to predict novel object views as the agent moves its viewpoint. First, we investigate the relation between model complexity and symmetry exploitation in the state space. Second, we do a principal component analysis to demonstrate how the model encodes the principal axis of symmetry of the object in the latent space. Finally, we also demonstrate how more symmetrical representations can be exploited for better generalization in the context of manipulation.
- Inverse reinforcement learning through logic constraint inferenceMattijs Baert, Sam Leroux, Pieter SimoensMACHINE LEARNINGBiblioAutonomous robots start to be integrated in human environments where explicit and implicit social norms guide the behavior of all agents. To assure safety and predictability, these artificial agents should act in accordance with the applicable social norms. However, it is not straightforward to define these rules and incorporate them in an agent's policy. Particularly because social norms are often implicit and environment specific. In this paper, we propose a novel iterative approach to extract a set of rules from observed human trajectories. This hybrid method combines the strengths of inverse reinforcement learning and inductive logic programming. We experimentally show how our method successfully induces a compact logic program which represents the behavioral constraints applicable in a Tower of Hanoi and a traffic simulator environment. The induced program is adopted as prior knowledge by a model-free reinforcement learning agent to speed up training and prevent any social norm violation during exploration and deployment. Moreover, expressing norms as a logic program provides improved interpretability, which is an important pillar in the design of safe artificial agents, as well as transferability to similar environments.
- Disentangling shape and pose for object-centric deep active inference modelsStefano Ferraro, Toon Van de Maele, Pietro Mazzaglia, Tim Verbelen, Bart DhoedtACTIVE INFERENCE, IWAI 2022BiblioActive inference is a first principles approach for understanding the brain in particular, and sentient agents in general, with the single imperative of minimizing free energy. As such, it provides a computational account for modelling artificial intelligent agents, by defining the agent’s generative model and inferring the model parameters, actions and hidden state beliefs. However, the exact specification of the generative model and the hidden state space structure is left to the experimenter, whose design choices influence the resulting behaviour of the agent. Recently, deep learning methods have been proposed to learn a hidden state space structure purely from data, alleviating the experimenter from this tedious design task, but resulting in an entangled, non-interpretable state space. In this paper, we hypothesize that such a learnt, entangled state space does not necessarily yield the best model in terms of free energy, and that enforcing different factors in the state space can yield a lower model complexity. In particular, we consider the problem of 3D object representation, and focus on different instances of the ShapeNet dataset. We propose a model that factorizes object shape, pose and category, while still learning a representation for each factor using a deep neural network. We show that models, with best disentanglement properties, perform best when adopted by an active agent in reaching preferred observations.
- Learning generative models for active inference using tensor networksSamuel Wauthier, Bram Vanhecke, Tim Verbelen, Bart DhoedtACTIVE INFERENCE, IWAI 2022BiblioActive inference provides a general framework for behavior and learning in autonomous agents. It states that an agent will attempt to minimize its variational free energy, defined in terms of beliefs over observations, internal states and policies. Traditionally, every aspect of a discrete active inference model must be specified by hand, i.e. by manually defining the hidden state space structure, as well as the required distributions such as likelihood and transition probabilities. Recently, efforts have been made to learn state space representations automatically from observations using deep neural networks. In this paper, we present a novel approach of learning state spaces using quantum physics-inspired tensor networks. The ability of tensor networks to represent the probabilistic nature of quantum states as well as to reduce large state spaces makes tensor networks a natural candidate for active inference. We show how tensor networks can be used as a generative model for sequential data. Furthermore, we show how one can obtain beliefs from such a generative model and how an active inference agent can use these to compute the expected free energy. Finally, we demonstrate our method on the classic T-maze environment.
- Home run : finding your way home by imagining trajectoriesDaria de Tinguy, Pietro Mazzaglia, Tim Verbelen, Bart DhoedtACTIVE INFERENCE, IWAI 2022BiblioWhen studying unconstrained behaviour and allowing mice to leave their cage to navigate a complex labyrinth, the mice exhibit foraging behaviour in the labyrinth searching for rewards, returning to their home cage now and then, e.g. to drink. Surprisingly, when executing such a “home run”, the mice do not follow the exact reverse path, in fact, the entry path and home path have very little overlap. Recent work proposed a hierarchical active inference model for navigation, where the low level model makes inferences about hidden states and poses that explain sensory inputs, whereas the high level model makes inferences about moving between locations, effectively building a map of the environment. However, using this “map” for planning, only allows the agent to find trajectories that it previously explored, far from the observed mice’s behaviour. In this paper, we explore ways of incorporating before-unvisited paths in the planning algorithm, by using the low level generative model to imagine potential, yet undiscovered paths. We demonstrate a proof of concept in a grid-world environment, showing how an agent can accurately predict a new, shorter path in the map leading to its starting point, using a generative model learnt from pixel-based observations.
- Choreographer : learning and adapting skills in imaginationPietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Lacoste A., Rajeswar S.ICLR 2023, the Eleventh International Conference on Learning Representations, ProceedingsBiblioUnsupervised skill learning aims to learn a rich repertoire of behaviors without external supervision, providing artificial agents with the ability to control and influence the environment. However, without appropriate knowledge and exploration, skills may provide control only over a restricted area of the environment, limiting their applicability. Furthermore, it is unclear how to leverage the learned skill behaviors for adapting to downstream tasks in a data-efficient manner. We present Choreographer, a model-based agent that exploits its world model to learn and adapt skills in imagination. Our method decouples the exploration and skill learning processes, being able to discover skills in the latent state space of the model. During adaptation, the agent uses a meta-controller to evaluate and adapt the learned skills efficiently by deploying them in parallel in imagination. Choreographer is able to learn skills both from offline data and by collecting data simultaneously with an exploration policy. The skills can be used to effectively adapt to downstream tasks, as we show in the URL benchmark, where we outperform previous approaches from both pixels and states inputs. The learned skills also explore the environment thoroughly, finding sparse rewards more frequently, as shown in goal-reaching tasks from the DMC Suite and Meta-World.
2022
- An adaptive metric model for collective motion structures in dynamic environmentsStef Van Havermaet, Pieter Simoens, Yara KhalufSwarm Intelligence, 13th International Conference, ANTS 2022, ProceedingsBiblioRobot swarms often use collective motion. Most models generate collective motion using the repulsion zone, alignment zone, and attraction zone. Despite being widely used, these models have a limited capacity for generating group structures in response to environmental stimuli. Enabling robot swarms to display proper spatial structures is crucial for several swarm robotics tasks. In this paper, we focus on three spatial structures that allow the swarm to adapt its aggregation (coverage) and alignment (order) in response to environmental changes. We show that the metric and long-range models are unable to generate every structure. We propose an extension to the metric model that allows the swarm to display the three structures, which is demonstrated in a simulated dynamic environment where different stimuli appear over time.
- An opt-in framework for privacy protection in audio-based applicationsWei-Cheng Wang, Sander De Coninck, Sam Leroux, Pieter SimoensIEEE PERVASIVE COMPUTINGBiblioInstalling audio-based applications exposes users to the risk of the data processor extracting additional information beyond the task the user permitted. To solve these privacy concerns, we propose to integrate an on-edge data obfuscation between the audio sensor and the recognition algorithm. We introduce a novel privacy loss metric and use adversarial learning to train an obfuscator. Contrary to existing work, our technique does not require users to specify which sensitive attributes they want to protect (opt-out) but instead only provide permission for specific tasks (opt-in). Moreover, we do not require retraining of recognition algorithms, making the obfuscated data compatible with existing methods. We experimentally validate our approach on four voice datasets and show that we can protect several attributes of the speaker, including gender, identity, and emotional state with a minimal recognition accuracy degradation.
- Iterative online 3D reconstruction from RGB imagesThorsten Cardoen, Sam Leroux, Pieter SimoensSENSORSBiblio3D reconstruction is the computer vision task of reconstructing the 3D shape of an object from multiple 2D images. Most existing algorithms for this task are designed for offline settings, producing a single reconstruction from a batch of images taken from diverse viewpoints. Alongside reconstruction accuracy, additional considerations arise when 3D reconstructions are used in real-time processing pipelines for applications such as robot navigation or manipulation. In these cases, an accurate 3D reconstruction is already required while the data gathering is still in progress. In this paper, we demonstrate how existing batch-based reconstruction algorithms lead to suboptimal reconstruction quality when used for online, iterative 3D reconstruction and propose appropriate modifications to the existing Pix2Vox++ architecture. When additional viewpoints become available at a high rate, e.g., from a camera mounted on a drone, selecting the most informative viewpoints is important in order to mitigate long term memory loss and to reduce the computational footprint. We present qualitative and quantitative results on the optimal selection of viewpoints and show that state-of-the-art reconstruction quality is already obtained with elementary selection algorithms.
- Enforcing object permanence using hierarchical, object-centric generative modelsToon Van de Maele, Stefano Ferraro, Tim Verbelen, Bart DhoedtNeurIPS 2022, thirty-sixth Conference on Neural Information Processing Systems, 4th Workshop on Shared Visual Representations in Human and Machine Visual Intelligence (SVRHM), Proceedings
- Tensor networks for active inference with discrete observation spacesSamuel Wauthier, Bram Vanhecke, Tim Verbelen, Bart DhoedtMachine Learning and the Physical Sciences (ML4PS) workshop, part of NeurIPS2022, the 36th Conference on Neural Information Processing Systems, Proceedings
- Computational optimization of image-based reinforcement learning for roboticsStefano Ferraro, Toon Van de Maele, Pietro Mazzaglia, Tim Verbelen, Bart DhoedtSENSORSBiblioThe robotics field has been deeply influenced by the advent of deep learning. In recent years, this trend has been characterized by the adoption of large, pretrained models for robotic use cases, which are not compatible with the computational hardware available in robotic systems. Moreover, such large, computationally intensive models impede the low-latency execution which is required for many closed-loop control systems. In this work, we propose different strategies for improving the computational efficiency of the deep-learning models adopted in reinforcement-learning (RL) scenarios. As a use-case project, we consider an image-based RL method on the synergy between push-and-grasp actions. As a first optimization step, we reduce the model architecture in complexity, by decreasing the number of layers and by altering the architecture structure. Second, we consider downscaling the input resolution to reduce the computational load. Finally, we perform weight quantization, where we compare post-training quantization and quantization-aware training. We benchmark the improvements introduced in each optimization by running a standard testing routine. We show that the optimization strategies introduced can improve the computational efficiency by around 300 times, while also slightly improving the functional performance of the system. In addition, we demonstrate closed-loop control behaviour on a real-world robot, while processing everything on a Jetson Xavier NX edge device.
- Unsupervised model-based pre-training for data-efficient reinforcement learning from pixelsSai Rajeswar, Pietro Mazzaglia, Tim Verbelen, Alexandre Piché, Bart Dhoedt, Aaron Courville, Alexandre LacosteDecision Awareness in Reinforcement Learning : workshop at the International Conference on Machine Learning (ICML) 2022, ProceedingsBiblioReinforcement learning (RL) aims at autonomously performing complex tasks. To this end, a reward signal is used to steer the learning process. While successful in many circumstances, the approach is typically data-hungry, requiring large amounts of task-specific interaction between agent and environment to learn efficient behaviors. To alleviate this, unsupervised RL proposes to collect data through self-supervised interaction to accelerate task-specific adaptation. However, whether current unsupervised strategies lead to improved generalization capabilities is still unclear, more so when the input observations are high-dimensional. In this work, we advance the field by closing the performance gap in the Unsupervised RL Benchmark, a collection of tasks to be solved in a data-efficient manner, after interacting with the environment in a self-supervised way. Our approach uses unsupervised exploration for collecting experience to pre-train a world model. Then, when fine-tuning for downstream tasks, the agent leverages the learned model and a hybrid planner to efficiently adapt for the given tasks, achieving comparable results to task-specific baselines, while using 20x less data. We extensively evaluate our work, comparing several exploration methods and improving the fine-tuning process by studying the interactions between the learned components. Furthermore, we investigate the limitations of the pre-trained agent, gaining insights into how these influence the decision process and shedding light on new research directions.
- TinyMLOps : operational challenges for widespread edge AI adoptionSam Leroux, Pieter Simoens, Meelis Lootus, Kartik Thakore, Akshay Sharma2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2022)BiblioDeploying machine learning applications on edge devices can bring clear benefits such as improved reliability, latency and privacy but it also introduces its own set of challenges. Most works focus on the limited computational resources of edge platforms but this is not the only bottleneck standing in the way of widespread adoption. In this paper we list several other challenges that a TinyML practitioner might need to consider when operationalizing an application on edge devices. We focus on tasks such as monitoring and managing the application, common functionality for a MLOps platform, and show how they are complicated by the distributed nature of edge deployment. We also discuss issues that are unique to edge applications such as protecting a model's intellectual property and verifying its integrity.
- Permanence with object-centric representationsToon Van de Maele, Tim Verbelen, Stefano Ferraro, Bart DhoedtICDL 22 : the 2022 IEEE International Conference on Development and Learning
- Embodied object representation learning and recognitionToon Van de Maele, Tim Verbelen, Ozan Catal, Bart DhoedtFRONTIERS IN NEUROROBOTICSBiblioScene understanding and decomposition is a crucial challenge for intelligent systems, whether it is for object manipulation, navigation, or any other task. Although current machine and deep learning approaches for object detection and classification obtain high accuracy, they typically do not leverage interaction with the world and are limited to a set of objects seen during training. Humans on the other hand learn to recognize and classify different objects by actively engaging with them on first encounter. Moreover, recent theories in neuroscience suggest that cortical columns in the neocortex play an important role in this process, by building predictive models about objects in their reference frame. In this article, we present an enactive embodied agent that implements such a generative model for object interaction. For each object category, our system instantiates a deep neural network, called Cortical Column Network (CCN), that represents the object in its own reference frame by learning a generative model that predicts the expected transform in pixel space, given an action. The model parameters are optimized through the active inference paradigm, i.e., the minimization of variational free energy. When provided with a visual observation, an ensemble of CCNs each vote on their belief of observing that specific object category, yielding a potential object classification. In case the likelihood on the selected category is too low, the object is detected as an unknown category, and the agent has the ability to instantiate a novel CCN for this category. We validate our system in a simulated environment, where it needs to learn to discern multiple objects from the YCB dataset. We show that classification accuracy improves as an embodied agent can gather more evidence, and that it is able to learn about novel, previously unseen objects. Finally, we show that an agent driven through active inference can choose their actions to reach a preferred observation.
- Theory of mind and delegation to robotic virtual agentsNingyuan Sun, Jean Botev, Yara Khaluf, Pieter Simoens2022 31ST IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (IEEE RO-MAN 2022)BiblioDespite already being commonplace, delegation to robotic virtual agents (VAs) is often considered challenging and error-prone in critical situations by the general public. Theory of mind, the human capacity to take another person's perspective, is deemed an important enabler for human-human cooperation. This study explores the effect of a robotic VA's ability to use theory of mind on users' delegation behavior. To this end, we conducted a between-subjects experiment with participants playing the Colored Trails game with robotic VAs of varying levels of theory of mind. The results invalidate our hypothesis that the ToM level is a reliable indicator of delegation choices. Instead, we found that the participants' performance strongly correlates with their delegatory intentions. Therefore, to facilitate delegation, designers of robots and robotic agents may consider refraining from using ToM-resemblance features and focusing on balancing user performance perception instead to induce the desired delegation behaviors.
- The free energy principle for perception and action : a deep learning perspectivePietro Mazzaglia, Tim Verbelen, Ozan Catal, Bart DhoedtENTROPYBiblioThe free energy principle, and its corollary active inference, constitute a bio-inspired theory that assumes biological agents act to remain in a restricted set of preferred states of the world, i.e., they minimize their free energy. Under this principle, biological agents learn a generative model of the world and plan actions in the future that will maintain the agent in a homeostatic state that satisfies its preferences. This framework lends itself to being realized in silico, as it comprehends important aspects that make it computationally affordable, such as variational inference and amortized planning. In this work, we investigate the tool of deep learning to design and realize artificial agents based on active inference, presenting a deep-learning oriented presentation of the free energy principle, surveying works that are relevant in both machine learning and active inference areas, and discussing the design choices that are involved in the implementation process. This manuscript probes newer perspectives for the active inference framework, grounding its theoretical aspects into more pragmatic affairs, offering a practical guide to active inference newcomers and a starting point for deep learning practitioners that would like to investigate implementations of the free energy principle.
- Curiosity-driven exploration via latent Bayesian surprisePietro Mazzaglia, Ozan Catal, Tim Verbelen, Bart DhoedtPROCEEDINGS OF THE AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCEBiblioThe human intrinsic desire to pursue knowledge, also known as curiosity, is considered essential in the process of skill acquisition. With the aid of artificial curiosity, we could equip current techniques for control, such as Reinforcement Learning, with more natural exploration capabilities. A promising approach in this respect has consisted of using Bayesian surprise on model parameters, i.e. a metric for the difference between prior and posterior beliefs, to favour exploration. In this contribution, we propose to apply Bayesian surprise in a latent space representing the agent’s current understanding of the dynamics of the system, drastically reducing the computational costs. We extensively evaluate our method by measuring the agent's performance in terms of environment exploration, for continuous tasks, and looking at the game scores achieved, for video games. Our model is computationally cheap and compares positively with current state-of-the-art methods on several problems. We also investigate the effects caused by stochasticity in the environment, which is often a failure case for curiosity-driven agents. In this regime, the results suggest that our approach is resilient to stochastic transitions.
- Foraging behaviour and patch size distribution jointly determine population dynamics in fragmented landscapesJohannes Nauta, Pieter Simoens, Yara Khaluf, Ricardo Martinez-GarciaJOURNAL OF THE ROYAL SOCIETY INTERFACEBiblioIncreased fragmentation caused by habitat loss represents a major threat to the persistence of animal populations. How fragmentation affects populations depends on the rate at which individuals move between spatially separated patches. Whereas negative effects of habitat loss on biodiversity are well known, the effects of fragmentation per se on population dynamics and ecosystem stability remain less well understood. Here, we use a spatially explicit predator-prey model to investigate how the interplay between fragmentation and optimal foraging behaviour affects predator-prey interactions and, subsequently, ecosystem stability. We study systems wherein prey occupies isolated patches and are consumed by predators that disperse following Levy random walks. Our results show that the Levy exponent and the degree of fragmentation jointly determine coexistence probabilities. In highly fragmented landscapes, Brownian and ballistic predators go extinct and only scale-free predators can coexist with prey. Furthermore, our results confirm that predation causes irreversible habitat loss in fragmented landscapes owing to overexploitation of smaller patches of prey. Moreover, we show that predator dispersal can reduce, but not prevent or minimize, the amount of lost habitat. Our results suggest that integrating optimal foraging theory into population and landscape ecology is crucial to assessing the impact of fragmentation on biodiversity and ecosystem stability.
- Bio-inspired monocular drone SLAMOzan Catal, Tim Verbelen, Ni Wang, Matthias Hartmann, Bart DhoedtDroneSE and RAPIDO : System Engineering for constrained embedded systemsBiblioDrone navigation in GPS-denied, indoor environments, is still a challenging problem. As drones can perceive the environment from a richer set of viewpoints, simultaneous localization and mapping (SLAM) becomes more complex, while having stringent compute and energy constraints. To tackle that problem, this research displays a biologically inspired deep-learning algorithm for monocular SLAM on a drone platform. We propose an unsupervised representation learning method that yields low-dimensional latent state descriptors, that mitigates the sensitivity to perceptual aliasing, and works on power-efficient, embedded hardware. We compare our method against ORB-SLAM3, and showcase increased robustness and an order of magnitude lower memory overhead.
- Group size and resource fractality drive multimodal search strategies : a quantitative analysis on group foragingJohannes Nauta, Pieter Simoens, Yara KhalufPHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS
- The value of measuring uncertainty in neural networks in dermoscopyPieter Van Molle, Lieve Brochez, Tim Verbelen, Cedric De Boom, Bert Vankeirsbilck, Evelien Verhaeghe, Sofie Mylle, Pieter Simoens, Bart DhoedtJOURNAL OF THE AMERICAN ACADEMY OF DERMATOLOGY
- Pass/fail prediction in programming coursesCharlotte Van Petegem, Louise Deconinck, Dieter Mourisse, Rien Maertens, Niko Strijbol, Bart Dhoedt, Bram De Wever, Peter Dawyndt, Bart MesuereJOURNAL OF EDUCATIONAL COMPUTING RESEARCHBiblioWe present a privacy-friendly early-detection framework to identify students at risk of failing in introductory programming courses at university. The framework was validated for two different courses with annual editions taken by higher education students (N = 2 080) and was found to be highly accurate and robust against variation in course structures, teaching and learning styles, programming exercises and classification algorithms. By using interpretable machine learning techniques, the framework also provides insight into what aspects of practising programming skills promote or inhibit learning or have no or minor effect on the learning process. Findings showed that the framework was capable of predicting students’ future success already early on in the semester.
- Model reduction through progressive latent space pruning in deep active inferenceSamuel Wauthier, Cedric De Boom, Ozan Catal, Tim Verbelen, Bart DhoedtFRONTIERS IN NEUROROBOTICSBiblioAlthough still not fully understood, sleep is known to play an important role in learning and in pruning synaptic connections. From the active inference perspective, this can be cast as learning parameters of a generative model and Bayesian model reduction, respectively. In this article, we show how to reduce dimensionality of the latent space of such a generative model, and hence model complexity, in deep active inference during training through a similar process. While deep active inference uses deep neural networks for state space construction, an issue remains in that the dimensionality of the latent space must be specified beforehand. We investigate two methods that are able to prune the latent space of deep active inference models. The first approach functions similar to sleep and performs model reduction post hoc. The second approach is a novel method which is more similar to reflection, operates during training and displays "aha" moments when the model is able to reduce latent space dimensionality. We show for two well-known simulated environments that model performance is retained in the first approach and only diminishes slightly in the second approach. We also show that reconstructions from a real world example are indistinguishable before and after reduction. We conclude that the most important difference constitutes a trade-off between training time and model performance in terms of accuracy and the ability to generalize, via minimization of model complexity.
- The art of compensation : how hybrid teams solve collective risk dilemmasInês Terrucha, E. F. Domingos, F. C. Santos, Pieter Simoens, T. LenaertsALA 2020 : workshop at AAMAS 2022
- Automated training of location-specific edge models for traffic countingSam Leroux, Bo Li, Pieter SimoensCOMPUTERS & ELECTRICAL ENGINEERINGBiblioDeep neural networks are the state of the art for various machine learning problems dealing with large amounts of rich sensor data. It is often desirable to evaluate these models on edge devices instead of relying on cloud computing. In this paper, we perform traffic counting using surveillance cameras. Edge computing is required as only aggregated counts should leave the device and not the privacy sensitive video frames. Unfortunately, only small object detection models are suited for edge devices which results in sub-optimal performance. We introduce location specific models that are each trained for one specific camera. The model does not need to generalize to other locations. We show that smaller specialized models can outperform large general purpose models. We propose an automated way to train these small models without human intervention. We experimentally show that we can achieve a similar counting accuracy with 5x fewer parameters than state-of-the-art techniques.
- Iterative neural networks for adaptive inference on resource-constrained devicesSam Leroux, Tim Verbelen, Pieter Simoens, Bart DhoedtNEURAL COMPUTING & APPLICATIONSBiblioThe computational cost of evaluating a neural network usually only depends on design choices such as the number of layers or the number of units in each layer and not on the actual input. In this work, we build upon deep Residual Networks (ResNets) and use their properties to design a more efficient adaptive neural network building block. We propose a new architecture, which replaces the sequential layers with an iterative structure where weights are reused multiple times for a single input image, reducing the storage requirements drastically. In addition, we incorporate an adaptive computation module that allows the network to adjust its computational cost at run time for each input sample independently. We experimentally validate our models on image classification, object detection and semantic segmentation tasks and show that our models only use their full capacity for the hardest input samples and are more efficient on average.
- Multi-branch neural networks for video anomaly detection in adverse lighting and weather conditionsSam Leroux, Bo Li, Pieter Simoens2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)BiblioAutomated anomaly detection in surveillance videos has attracted much interest as it provides a scalable alternative to manual monitoring. Most existing approaches achieve good performance on clean benchmark datasets recorded in well-controlled environments. However, detecting anomalies is much more challenging in the real world. Adverse weather conditions like rain or changing brightness levels cause a significant shift in the input data distribution, which in turn can lead to the detector model incorrectly reporting high anomaly scores. Additionally, surveillance cameras are usually deployed in evolving environments such as a city street of which the appearance changes over time because of seasonal changes or roadworks. The anomaly detection model will need to be updated periodically to deal with these issues. In this paper, we introduce a multi-branch model that is equipped with a trainable preprocessing step and multiple identical branches for detecting anomalies during day and night as well as in sunny and rainy conditions. We experimentally validate our approach on a distorted version of the Avenue dataset and provide qualitative results on real-world surveillance camera data. Experimental results show that our method outperforms the existing methods in terms of detection accuracy while being faster and more robust on scenes with varying visibility.
2021
- Contrastive active inferencePietro Mazzaglia, Tim Verbelen, Bart DhoedtNeurIPS 2021, 35th Conference on Neural Information Processing Systems, ProceedingsBiblioActive inference is a unifying theory for perception and action resting upon the idea that the brain maintains an internal model of the world by minimizing free energy. From a behavioral perspective, active inference agents can be seen as self-evidencing beings that act to fulfill their optimistic predictions, namely preferred outcomes or goals. In contrast, reinforcement learning requires human-designed rewards to accomplish any desired outcome. Although active inference could provide a more natural self-supervised objective for control, its applicability has been limited because of the shortcomings in scaling the approach to complex environments. In this work, we propose a contrastive objective for active inference that strongly reduces the computational burden in learning the agent's generative model and planning future actions. Our method performs notably better than likelihood-based active inference in image-based tasks, while also being computationally cheaper and easier to train. We compare to reinforcement learning agents that have access to human-designed reward functions, showing that our approach closely matches their performance. Finally, we also show that contrastive methods perform significantly better in the case of distractors in the environment and that our method is able to generalize goals to variations in the background.
- LatentSLAM : unsupervised multi-sensor representation learning for localization and mappingOzan Catal, Wouter Jansen, Tim Verbelen, Bart Dhoedt, Jan Steckel2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021)BiblioBiologically inspired algorithms for simultaneous localization and mapping (SLAM) such as RatSLAM have been shown to yield effective and robust robot navigation in both indoor and outdoor environments. One drawback however is the sensitivity to perceptual aliasing due to the template matching of low-dimensional sensory templates. In this paper, we propose an unsupervised representation learning method that yields low-dimensional latent state descriptors that can be used for RatSLAM. Our method is sensor agnostic and can be applied to any sensor modality, as we illustrate for camera images, radar range-doppler maps and lidar scans. We also show how combining multiple sensors can increase the robustness, by reducing the number of false matches. We evaluate on a dataset captured with a mobile robot navigating in a warehouse-like environment, moving through different aisles with similar appearance, making it hard for the SLAM algorithms to disambiguate locations.
- Data-efficient sensor upgrade path using knowledge distillationPieter Van Molle, Cedric De Boom, Tim Verbelen, Bert Vankeirsbilck, Jonas De Vylder, Bart Diricx, Pieter Simoens, Bart DhoedtSENSORSBiblioDeep neural networks have achieved state-of-the-art performance in image classification. Due to this success, deep learning is now also being applied to other data modalities such as multispectral images, lidar and radar data. However, successfully training a deep neural network requires a large dataset. Therefore, transitioning to a new sensor modality (e.g., from regular camera images to multispectral camera images) might result in a drop in performance, due to the limited availability of data in the new modality. This might hinder the adoption rate and time to market for new sensor technologies. In this paper, we present an approach to leverage the knowledge of a teacher network, that was trained using the original data modality, to improve the performance of a student network on a new data modality: a technique known in literature as knowledge distillation. By applying knowledge distillation to the problem of sensor transition, we can greatly speed up this process. We validate this approach using a multimodal version of the MNIST dataset. Especially when little data is available in the new modality (i.e., 10 images), training with additional teacher supervision results in increased performance, with the student network scoring a test set accuracy of 0.77, compared to an accuracy of 0.37 for the baseline. We also explore two extensions to the default method of knowledge distillation, which we evaluate on a multimodal version of the CIFAR-10 dataset: an annealing scheme for the hyperparameter alpha and selective knowledge distillation. Of these two, the first yields the best results. Choosing the optimal annealing scheme results in an increase in test set accuracy of 6%. Finally, we apply our method to the real-world use case of skin lesion classification.
- Dynamic narrowing of VAE bottlenecks using GECO and L0 regularizationCedric De Boom, Samuel Wauthier, Tim Verbelen, Bart Dhoedt2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)BiblioWhen designing variational autoencoders (VAEs) or other types of latent space models, the dimensionality of the latent space is typically defined upfront. In this process, it is possible that the number of dimensions is under- or overprovisioned for the application at hand. In case the dimensionality is not predefined, this parameter is usually determined using time- and resource-consuming cross-validation. For these reasons we have developed a technique to shrink the latent space dimensionality of VAEs automatically and on-the-fly during training using Generalized ELBO with Constrained Optimization (GECO) and the L0-Augment-REINFORCE-Merge (L0-ARM) gradient estimator. The GECO optimizer ensures that we are not violating a predefined upper bound on the reconstruction error. This paper presents the algorithmic details of our method along with experimental results on five different datasets. We find that our training procedure is stable and that the latent space can be pruned effectively without violating the GECO constraints.
- Resource ephemerality influences effectiveness of altruistic behavior in collective foragingJohannes Nauta, Yara Khaluf, Pieter SimoensSWARM INTELLIGENCEBiblioIn collective foraging, interactions between conspecifics can be exploited to increase foraging efficiencies. Many collective systems exhibit short interaction ranges, making information about patches rich in resources only locally available. In environments wherein these patches are difficult to locate, collective systems might exhibit altruistic traits that increase average resource intake compared to non-interacting systems. In this work, we show that resource ephemerality and availability highly influence the benefits of altruistic behavior. We study an agent-based model wherein foragers can recruit others to feed on patches, instead of exploiting these individually. We show that the net gain by recruiting conspecifics can be estimated, effectively reducing the decision on patch detection to one based on a threshold. Patches with qualities above this threshold are expected to increase foraging efficiencies and should therefore induce recruiting of others. By letting foragers assume Levy searches, we show that recruitment strategies with contrasting diffusion characteristics optimize conspecific encounter rates. Our results further indicate that active recruitment is only beneficial when patches are scarce and persistent. Most interestingly, the effect of choosing suboptimal threshold values is small over a wide range of resource ephemeralities. This suggests that the decision of whether to recruit others is more impactful than fine-tuning the recruitment decision. Finally, we show that the advantages of active recruitment depend greatly on both forager density and their interaction radius, as we observe passive strategies to be more efficient, but only when forager densities or interaction ranges are large.
- Bayesian inverse reinforcement learning for strategy extraction in the iterated prisoner's dilemma gameMatthias Cami, Inês Terrucha, Yara Khaluf, Pieter SimoensProceedings of BNAIC/BeneLearn 2021, 33rd Benelux Conference on Artificial Intelligence and 30th Belgian-Dutch Conference on Machine Learning
- ChronoPilot : modulating time perceptionJ. Botev, K. Drewing, H. Hamann, Yara Khaluf, Pieter Simoens, A. Vatakis2021 4TH IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND VIRTUAL REALITY (AIVR 2021)BiblioAlthough time can be measured objectively, human time perception is remarkably subjective and influenced by cognitive states, individual motivations, and social factors. This malleability of perceived time can be evidenced, for instance, in stressful situations where one might experience a lack of time, while one might lose track of time in more relaxing circumstances. Based on fundamental knowledge from psychology and cognitive science, the ChronoPilot project aims at developing a prototype technology driven by artificial intelligence to extend or compress human subjective time adaptively and whenever required. Mediated-reality approaches, such as virtual and augmented reality, have enormous potential for presenting the users with visual, auditory, and haptic stimulation patterns that directly or indirectly influence their subjective time and which are difficult to reproduce in the real world. Going beyond individual settings, ChronoPilot will also investigate how to coordinate time plasticity in collaborative environments where one group member's actions may affect other members' perception. Different scenarios, where humans alone or humans and robots have to collaborate in realistic and virtual environments, will validate the planned research. In this paper, we present the fundamental concepts of our project ChronoPilot, which is a work in progress.
- Disentangling what and where for 3D object-centric representations through active inferenceToon Van de Maele, Tim Verbelen, Ozan Catal, Bart DhoedtMACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021, PT IBiblioAlthough modern object detection and classification models achieve high accuracy, these are typically constrained in advance on a fixed train set and are therefore not flexible to deal with novel, unseen object categories. Moreover, these models most often operate on a single frame, which may yield incorrect classifications in case of ambiguous viewpoints. In this paper, we propose an active inference agent that actively gathers evidence for object classifications, and can learn novel object categories over time. Drawing inspiration from the human brain, we build object-centric generative models composed of two information streams, a what- and a where-stream. The what-stream predicts whether the observed object belongs to a specific category, while the where-stream is responsible for representing the object in its internal 3D reference frame. We show that our agent (i) is able to learn representations for many object categories in an unsupervised way, (ii) achieves state-of-the-art classification accuracies, actively resolving ambiguity when required and (iii) identifies novel object categories. Furthermore, we validate our system in an end-to-end fashion where the agent is able to search for an object at a given pose from a pixel-based rendering. We believe that this is a first step towards building modular, intelligent systems that can be used for a wide range of tasks involving three dimensional objects.
- Decoupled appearance and motion learning for efficient anomaly detection in surveillance videoBo Li, Sam Leroux, Pieter SimoensCOMPUTER VISION AND IMAGE UNDERSTANDINGBiblioAutomating the analysis of surveillance video footage is of great interest when urban environments or industrial sites are monitored by a large number of cameras. As anomalies are often context-specific, it is hard to predefine events of interest and collect labeled training data. A purely unsupervised approach for automated anomaly detection is much more suitable. For every camera, a separate algorithm could then be deployed that learns over time a baseline model of appearance and motion related features of the objects within the camera viewport. Anything that deviates from this baseline is flagged as an anomaly for further analysis downstream. We propose a new neural network architecture that learns the normal behavior in a purely unsupervised fashion. In contrast to previous work, we use latent code predictions as our anomaly metric. We show that this outperforms frame reconstruction-based and prediction-based methods on different benchmark datasets both in terms of accuracy and robustness against changing lighting and weather conditions. By decoupling an appearance and a motion model, our model can also process 16 to 45 times more frames per second than related approaches which makes our model suitable for deploying on the camera itself or on other edge devices.
- Robot navigation as hierarchical active inferenceOzan Catal, Tim Verbelen, Toon Van de Maele, Bart Dhoedt, Adam SafronNEURAL NETWORKS
- Towards bio-inspired unsupervised representation learning for indoor aerial navigationNi Wang, Ozan Catal, Tim Verbelen, Matthias Hartmann, Bart DhoedtICRA2021, the IEEE International Conference on Robotics and Automation, Workshops, Proceedings
- A learning gap between neuroscience and reinforcement learningSamuel Wauthier, Pietro Mazzaglia, Ozan Catal, Cedric De Boom, Tim Verbelen, Bart DhoedtBRAIN2AI, How Can Findings About The Brain Improve AI Systems, ICLR 2021 Workshop, ProceedingsBiblioHistorically, artificial intelligence has drawn much inspiration from neuroscience to fuel advances in the field. However, current progress in reinforcement learning is largely focused on benchmark problems that fail to capture many of the aspects that are of interest in neuroscience today. We illustrate this point by extending a T-maze task from neuroscience for use with reinforcement learning algorithms, and show that state-of-the-art algorithms are not capable of solving this problem. Finally, we point out where insights from neuroscience could help explain some of the issues encountered.
- No more hand-tuning rewards : masked constrained policy optimization for safe reinforcement learningStef Van Havermaet, Yara Khaluf, Pieter SimoensAAMAS '21, Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent SystemsBiblioIn safe Reinforcement Learning (RL), the agent attempts to find policies which maximize the expectation of accumulated rewards and guarantee its safety to remain above a given threshold. Hence, it is straightforward to formalize safe RL problems by both a reward function and a safety constraint. We define safety as the probability of survival in environments where taking risky actions could lead to early termination of the task. Although the optimization problem is already constrained by a safety threshold, reward signals related to unsafe terminal states influence the original maximization objective of the task. Selecting the appropriate value of these signals is often a time consuming and challenging reward engineering task, which requires expert knowledge of the domain. This paper presents a safe RL algorithm, called Masked Constrained Policy Optimization (MCPO), in which the learning process is constrained by safety and excludes the risk reward signals. We develop MCPO as an extension of gradient-based policy search methods, in which the updates of the policy and the expected reward models are masked. Our method benefits from having a high probability of satisfying the given constraints for every policy in the learning process. We validate the proposed algorithm in two continuous tasks. Our findings prove the proposed algorithm is able to neglect risk reward signals, thereby resolving the desired safety-performance trade-off without the need for hand-tuning rewards.
- Active vision for robot manipulators using the free energy principleToon Van de Maele, Tim Verbelen, Ozan Catal, Cedric De Boom, Bart DhoedtFRONTIERS IN NEUROROBOTICSBiblioOcclusions, restricted field of view and limited resolution all constrain a robot's ability to sense its environment from a single observation. In these cases, the robot first needs to actively query multiple observations and accumulate information before it can complete a task. In this paper, we cast this problem of active vision as active inference, which states that an intelligent agent maintains a generative model of its environment and acts in order to minimize its surprise, or expected free energy according to this model. We apply this to an object-reaching task for a 7-DOF robotic manipulator with an in-hand camera to scan the workspace. A novel generative model using deep neural networks is proposed that is able to fuse multiple views into an abstract representation and is trained from data by minimizing variational free energy. We validate our approach experimentally for a reaching task in simulation in which a robotic agent starts without any knowledge about its workspace. Each step, the next view pose is chosen by evaluating the expected free energy. We find that by minimizing the expected free energy, exploratory behavior emerges when the target object to reach is not in view, and the end effector is moved to the correct reach position once the target is located. Similar to an owl scavenging for prey, the robot naturally prefers higher ground for exploring, approaching its target once located.
- Self-supervised exploration via latent Bayesian surprisePietro Mazzaglia, Ozan Catal, Tim Verbelen, Bart DhoedtICLR 2021, Self-supervision for Reinforcement Learning, Proceedings
- Leveraging the Bhattacharyya coefficient for uncertainty quantification in deep neural networksPieter Van Molle, Tim Verbelen, Bert Vankeirsbilck, Jonas De Vylder, Bart Diricx, Tom Kimpe, Pieter Simoens, Bart DhoedtNEURAL COMPUTING & APPLICATIONSBiblioModern deep learning models achieve state-of-the-art results for many tasks in computer vision, such as image classification and segmentation. However, its adoption into high-risk applications, e.g. automated medical diagnosis systems, happens at a slow pace. One of the main reasons for this is that regular neural networks do not capture uncertainty. To assess uncertainty in classification, several techniques have been proposed casting neural network approaches in a Bayesian setting. Amongst these techniques, Monte Carlo dropout is by far the most popular. This particular technique estimates the moments of the output distribution through sampling with different dropout masks. The output uncertainty of a neural network is then approximated as the sample variance. In this paper, we highlight the limitations of such a variance-based uncertainty metric and propose a novel approach. Our approach is based on the overlap between output distributions of different classes. We show that our technique leads to a better approximation of the inter-class output confusion. We illustrate the advantages of our method using benchmark datasets. In addition, we apply our metric to skin lesion classification, a real-world use case, and show that this yields promising results.
- Intelligent frame selection as a privacy-friendlier alternative to face recognitionMattijs Baert, Sam Leroux, Pieter SimoensPPAI-21, the 2nd AAAI Workshop on Privacy-Preserving Artificial Intelligence, Proceedings