Sandbox: research, tech, dev.
A collection of projects on: Artificial Intelligence, Neuroscience, Neural Engineering, Big Data, Robotics and Coding.
After earning a PhD in Neuroscience and Robotics from TUM in 2016, I spent one more year with the TUM Center of Competence Neuroengineering before joining Huawei Research Center in Munich. Since 2017 I have been a Senior AI and ML Research Engineer at Huawei's largest research center outside China. At the same time, I lead the Audi Konfuzius-Institut Ingolstadt Lab, a Sino-German research initiative focused on combining modern AI and VR technology for applications ranging from sports technology to medical rehabilitation. Each term, I teach AI and ML to undergraduates at TH Ingolstadt.
A recent curriculum vitae:
VIRTOOAIR: VIrtual Reality TOOlbox for Avatar Intelligent Reconstruction
The project focuses on designing and developing a Deep Learning framework for improved avatar representations in immersive collaborative virtual environments. The proposed infrastructure will be built on a modular architecture tackling: a) a predictive avatar tracking module; b) an inverse kinematic learning module; c) an efficient data representation and compression module.
In order to perform precise predictive tracking of the body without using a camera motion capture system we need proper calibration data of the 18 degrees-of-freedom provided by the VR devices, namely the HMD and the two hand controllers. Such a calibration procedure involves the mathematical modelling of a complex geometrical setup. As a first component of VIRTOOAIR we propose a novel position calibration method using deep artificial neural networks, as depicted in the next figure.
The second component in the VIRTOOAIR toolbox is the inverse kinematics learner, generically described in the following diagram. Learning inverse kinematics for VR avatar interactions is useful when the kinematics of the head, body or controllers are not accurately available, when Cartesian information is not available from camera coordinates, or when the computational complexity of analytical solutions becomes too high.
Data and bandwidth constraints are substantial in remote VR environments. However, such problems can be mitigated through compression techniques and advances in network topologies. VIRTOOAIR proposes to tackle this problem through its third component, a neural network data representation (compression and reconstruction) module, described in the following diagram.
VR MOTION RECONSTRUCTION BASED ON A VR TRACKING SYSTEM AND A SINGLE RGB CAMERA
Our preliminary results demonstrate the advantages of our system’s avatar pose reconstruction. This is mainly due to the use of a powerful learning system, which offers significantly better results than existing heuristic solutions for inverse kinematics. Our system supports the paradigm shift towards learning systems capable of tracking full-body avatars inside Virtual Reality without the need for expensive external tracking hardware. The following figure shows preliminary results of our proposed reconstruction system.
For the upper body reconstruction, the semantically higher VR tracking system data is used. The lower body parts are reconstructed using a state-of-the-art deep learning mechanism which allows pose recovery in an end-to-end manner. The system can be split into five different parts: tracking data acquisition, inverse kinematic regression, image processing, end-to-end recovery and visualization.
Peer reviewed papers
A. Becher, C. Axenie, T. Grauschopf, VIRTOOAIR: VIrtual Reality TOOlbox for Avatar Intelligent Reconstruction, 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR2018).
Online Distributed Machine Learning on Streams
Streams are sequences of events (i.e. tuples containing various types of data) that are generated by various sources (e.g. sensors, machines or humans) in a chronologically ordered fashion. The stream processing paradigm involves applying business functions over the events in the stream. A typical approach to stream processing assumes accumulating such events within certain boundaries at a given time and applying functions on the resulting collection. Such transient event collections are termed windows.
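The notion of a window can be sketched in a few lines. The example below is only an illustration under stated assumptions: the `Event` fields, the fixed-size (tumbling) window policy and the helper names are hypothetical and are not taken from any particular stream processor.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds since stream start (hypothetical unit)
    source: str       # e.g. a sensor identifier
    value: float


def tumbling_windows(events: Iterable[Event], size: float) -> Iterator[tuple]:
    """Group chronologically ordered events into fixed, non-overlapping windows."""
    current_index = None
    bucket = []
    for event in events:
        index = int(event.timestamp // size)
        if current_index is not None and index != current_index:
            # The window boundary was crossed: emit the finished window.
            yield current_index, bucket
            bucket = []
        current_index = index
        bucket.append(event)
    if bucket:
        yield current_index, bucket


def apply_per_window(events: Iterable[Event], size: float,
                     fn: Callable[[list], float]) -> dict:
    """Apply a function to each transient collection of events."""
    return {index: fn(bucket) for index, bucket in tumbling_windows(events, size)}


events = [
    Event(0.5, "s1", 2.0),
    Event(1.2, "s2", 4.0),
    Event(2.1, "s1", 6.0),
    Event(4.7, "s1", 1.0),
]
sums = apply_per_window(events, size=2.0, fn=lambda b: sum(e.value for e in b))
print(sums)  # {0: 6.0, 1: 6.0, 2: 1.0}
```

Each window only lives until its boundary is crossed, which keeps memory use proportional to the window size rather than to the length of the stream.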
The stream processing paradigm simplifies parallel software and hardware by restricting the parallel computation that can be performed. Given a sequence of data (a stream), a series of operations (functions) is applied to each element in the stream. This is done in a declarative way: we specify what we want to achieve and not how.
Big Data Stream Online Learning is more challenging than batch or offline learning, since the data may not preserve the same distribution over the lifetime of the stream. Moreover, each example coming in a stream can only be processed once, or needs to be summarized with a small memory footprint, and the learning algorithms must be efficient.
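As a small illustration of the single-pass, small-footprint constraint, the sketch below maintains a running mean and variance with Welford's update. It is a generic textbook technique chosen here for illustration, not one of the project's algorithms, and the class name is hypothetical.

```python
class RunningStats:
    """Single-pass mean and variance with O(1) memory (Welford's update)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Each sample is seen exactly once and then discarded.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Unbiased sample variance; defined as 0.0 for fewer than two samples.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
print(stats.mean, stats.variance)
```

Only three numbers are stored regardless of how many samples arrive, which is the kind of summary that the one-pass setting requires.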
The need for Online Machine Learning
How can we compute the entropy of an unbounded collection of data, where the domain of the variables can be huge and the number of classes of objects is not known a priori? How can we maintain the k most frequent items in a retail data warehouse with 3 TB of data and hundreds of GB of new sales records added daily, covering millions of different items? What becomes of statistical computations when the learner can only afford one pass through each data sample because of time and memory constraints, and when the learner has to decide on the fly what is relevant and should be processed and what is redundant and can be discarded?
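One classical answer to the frequent-items question is the Misra-Gries summary, sketched below. It is shown only as an example of a one-pass, bounded-memory technique and is not claimed to be the method used in this project; the function name and the toy stream are hypothetical.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Any item occurring more than n / k times in a stream of length n is
    guaranteed to survive in the summary. The stored counts are lower
    bounds on the true frequencies.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Toy stream of 12 sales records: "a" occurs 6 times, more than 12 / 3.
sales = ["a", "b", "a", "c", "a", "b", "a", "d", "a", "e", "a", "b"]
heavy = misra_gries(sales, k=3)
print(heavy)  # {'a': 3}
```

The summary never holds more than k - 1 entries, so its size is independent of both the stream length and the number of distinct items.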
This project focuses on developing new Online Machine Learning algorithms that run in a distributed fashion on clusters using a stream processor. The characteristics of streaming data call for a new perspective because:
- Data are made available through unbounded streams that flow continuously over time, possibly at high speed;
- The underlying regularities may evolve over time rather than being stationary;
- The data can no longer be considered independent and identically distributed;
- The data are now often situated in space as well as in time.
Peer reviewed papers
C. Axenie et al., STARLORD: Sliding window Temporal Accumulate-Retract Learning for Online Reasoning on Datastreams, 2018 IEEE International Conference on Machine Learning and Applications (ICMLA2018). (submitted)
D. Foroni, C. Axenie et al., Moira: A Goal-Oriented Incremental Machine Learning Approach to Dynamic Resource Cost Estimation in Distributed Stream Processing Systems, International Workshop on Real-Time BI and Analytics, VLDB 2018.
Deep Learning for Autonomous Systems
The current research project aims at exploring object detection algorithms using a novel neuromorphic vision sensor with deep learning neural networks for autonomous electric cars. More precisely, this work will be conducted with the Schanzer Racing Electric (SRE) team at the Technical University of Ingolstadt. SRE is a team of around 80 students who design, develop and manufacture an all-electric racing car every year to compete in Formula Student Electric. The use of neuromorphic vision sensors together with deep neural networks for object detection is the innovation that the project proposes. The project was supported by NVIDIA through a GPU Research Grant.
Autonomous driving is a highly discussed topic, but in order to operate autonomously, the car needs to sense its environment. Vision provides the most informative and structured modality capable of grounding perception in autonomous vehicles. In the last decades, classical computer vision algorithms were used not only to locate relevant objects in the scene, but also to classify them. In recent years, however, major improvements were achieved when the first deep learning object detectors were developed. In general, such object detectors use a convolutional feature extractor as their basis. Due to the multitude of feature extraction algorithms, there are numerous combinations of feature extractors and object detectors, which influence a system designer’s approach. One of the most interesting niches is the analysis of traffic scenarios. Such scenarios require fast computation of features and classification for decision making.
Our approach to object detection, recognition and decision making aims at “going away from frames”. Instead of using traditional RGB cameras we aim at utilizing dynamic vision sensors (DVS - https://inivation.com/dvs/). Dynamic vision sensors mimic basic characteristics of human visual processing (i.e. neuromorphic vision) and have created a new paradigm in vision research. Similar to photoreceptors in the human retina, a single DVS pixel (receptor) can generate events in response to a change of detected illumination. Events encode dynamic features of the scene, e.g. moving objects, as a spatio-temporal set of events. Since DVS sensors drastically reduce redundant pixels (e.g. static background features) and encode objects in a frame-less fashion with high temporal resolution (about 1 μs), they are well suited for fast motion analysis and tracking. DVS are capable of operating in uncontrolled environments with varying lighting conditions because of their high dynamic range of operation (120 dB).
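The event-generation principle can be illustrated with a simplified software model. The sketch below assumes the commonly used idealisation in which a pixel emits an event when its log intensity has changed by more than a fixed threshold since its last event; the threshold value, the synthetic scene and the function name are assumptions made for illustration and do not describe the actual sensor circuitry.

```python
import numpy as np


def dvs_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Idealised DVS model: emit (t, x, y, polarity) on log-intensity change.

    `frames` is an array of shape (T, H, W) with positive intensities.
    The threshold and eps values are illustrative assumptions.
    """
    reference = np.log(frames[0] + eps)  # log intensity at each pixel's last event
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_intensity = np.log(frame + eps)
        diff = log_intensity - reference
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, int(x), int(y), polarity))
            reference[y, x] = log_intensity[y, x]  # reset only the pixel that fired
    return events


# Synthetic scene: static 4x4 background with a bright dot moving along row 1.
frames = np.full((3, 4, 4), 0.1)
for i in range(3):
    frames[i, 1, i] = 1.0
timestamps = [0.0, 1e-3, 2e-3]

events = dvs_events(frames, timestamps)
for event in events:
    print(event)
```

In this toy scene only the pixels touched by the moving dot fire, so the two new frames (32 pixel samples) reduce to four events, while the static background produces no output at all.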
As traffic situations require fast detection and precise estimation, we plan to use such an event-based visual representation together with two convolutional networks proven suitable for the task. The two algorithms we plan to explore are the Single Shot Multibox Detector (SSMD), which became popular for its fast computation speed, and the Faster Region-Based Convolutional Neural Network (Faster RCNN), which is known to be a slower but more accurate detector.
The project aims to lay exploratory groundwork, both in terms of sensory data for environment perception and of neural network architectures for the considered task. The experiments also aim at identifying the points where better accuracy can only be obtained by sacrificing computation time. The two architectures we chose sit at opposite ends of this trade-off. The first is the SSMD network with Inception V2 as a feature extractor. This network has a low computation time with acceptable accuracy. Its counterpart is the Faster RCNN with ResNet-101 as its feature extractor. Its accuracy is among the highest, whereas its computation time is relatively long. Whereas features are well established for frame-based computer vision problems, no solution exists yet to determine unique features in event streams. This is the first step towards more complex algorithms operating on the sparse event stream. The possibility of creating unique filter responses gives rise to the notion of temporal features. This opens the exploratory work we envision in this project: investigating the use of SSMD and Faster RCNN networks with event-based input in a natively parallel processing pipeline.
The initial step was to train a single-shot detector (MobileNet) for cone detection, a stage in the preparation for the Formula Student Electric competition. Experiments were carried out on an NVIDIA GTX 1080 Ti GPU using TensorRT. The performance evaluation is shown in the following diagrams.
Adaptive Neuromorphic Sensorimotor Control
Efficient sensorimotor processing is inherently driven by physical real-world constraints that an acting agent faces in its environment. Sensory streams contain certain statistical dependencies determined by the structure of the world, which impose constraints on a system’s sensorimotor affordances.
This limits the number of possible sensory information patterns and plausible motor actions. Learning mechanisms allow the system to extract the underlying correlations in sensorimotor streams.
This research direction focused on exploring sensorimotor learning paradigms for embedding adaptive behaviors in robotic systems and on demonstrating flexible control systems using neuromorphic hardware and neural-based adaptive control. I employed large-scale neural networks for gathering and processing complex sensory information, learning sensorimotor contingencies, and providing adaptive responses.
To investigate the properties of such systems, I developed flexible embodied robot platforms and integrated them within a rich tool suite for specifying neural algorithms that can be implemented in embedded neuromorphic hardware.
The mobile manipulator I developed at NST for adaptive sensorimotor systems consists of an omni-directional (holonomic) mobile manipulation platform with embedded low-level motor control and multimodal sensors.
The on-board micro-controller receives desired commands via WiFi and continuously adapts the platform's velocity controller. The robot’s integrated sensors include wheel encoders for estimating odometry, a 9DoF inertial measurement unit, a proximity bump-sensor ring and three event-based embedded dynamic vision sensors (eDVS) for visual input.
The mobile platform carries an optional 6-axis robotic arm with a reach of more than 40 cm. The arm is composed of a set of links connected by revolute joints and can lift objects of up to 800 grams. The mobile platform contains an on-board battery of 360 Wh, which allows autonomous operation for well over 5 h.
Peer reviewed journal papers
F. Mirus, C. Axenie, T. C. Stewart, J. Conradt, Neuromorphic Sensorimotor Adaptation for Robotic Mobile Manipulation: From Sensing to Behaviour, Cognitive Systems Research, 2018.
I. Sugiarto, C. Axenie, J. Conradt, FPGA-based Hardware Accelerator for an Embedded Factor Graph with Configurable Optimization, Journal of Circuits, Systems and Computers, 2018.
Synthesis of Distributed Cognitive Systems - Learning and Development of Multisensory Integration
My research interest is in developing sensor fusion mechanisms for robotic applications. In order to extend the interacting areas framework, a second direction in my research focuses on learning and development mechanisms.
Human perception improves through exposure to the environment. A wealth of sensory streams, which provide a rich experience, continuously refine the internal representations of the environment and of one's own state. Furthermore, these representations enable more precise motor planning.
An essential component in motor planning and navigation, in both real and artificial systems, is egomotion estimation. Given the multimodal nature of the sensory cues, learning crossmodal correlations improves the precision and flexibility of motion estimates.
During development, the biological nervous system must constantly combine various sources of information and moreover track and anticipate changes in one or more of the cues. Furthermore, the adaptive development of the functional organisation of the cortical areas seems to depend strongly on the available sensory inputs, which gradually sharpen their response, given the constraints imposed by the cross-sensory relations.
Learning processes which take place during the development of a biological nervous system enable it to extract mappings between external stimuli and its internal state. Precise egomotion estimation is essential to keep these external and internal cues coherent given the rich multisensory environment. In this work we present a learning model which, given various sensory inputs, converges to a state providing a coherent representation of the sensory space and the cross-sensory relations.
The model is based on Self-Organizing Maps (SOM) and Hebbian learning (see Figure 1) using sparse population-coded representations of sensory data. The SOM is used to represent the sensory data, while the Hebbian linkage extracts the coactivation pattern given the input modalities eliciting peaks of activity in the neural populations. The model was able to learn the intrinsic sensory data statistics without any prior knowledge (see Figure 2).
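The interplay of the two mechanisms can be sketched for two scalar modalities. The code below is a heavily simplified illustration and not the published model: the one-dimensional maps, the Gaussian population code, the decaying Hebbian rule, all parameter values and the synthetic relation between the two inputs are assumptions.

```python
import numpy as np


def train_som_hebbian(x1, x2, n_units=20, epochs=20,
                      lr_som=0.1, lr_hebb=0.005, seed=0):
    """Two 1-D SOMs (one per modality) linked by a Hebbian co-activation matrix."""
    rng = np.random.default_rng(seed)
    data = (np.asarray(x1, dtype=float), np.asarray(x2, dtype=float))
    # Preferred values of each unit, initialised randomly over the input range.
    centers = [rng.uniform(d.min(), d.max(), n_units) for d in data]
    # Tuning-curve width of the population code (assumed: input range / units).
    widths = [(d.max() - d.min()) / n_units for d in data]
    hebb = np.zeros((n_units, n_units))
    unit_index = np.arange(n_units)

    for epoch in range(epochs):
        # Neighbourhood radius shrinks over time, as in a standard SOM.
        radius = max(n_units / 2.0 * (1.0 - epoch / epochs), 1.0)
        for i in rng.permutation(len(data[0])):
            activity = []
            for m in range(2):
                x = data[m][i]
                winner = np.argmin(np.abs(centers[m] - x))
                neighbourhood = np.exp(-((unit_index - winner) ** 2)
                                       / (2.0 * radius ** 2))
                # SOM update: pull the winner and its neighbours towards the input.
                centers[m] += lr_som * neighbourhood * (x - centers[m])
                # Population-coded response of the map to the same input.
                response = np.exp(-((x - centers[m]) ** 2)
                                  / (2.0 * widths[m] ** 2))
                activity.append(response / response.sum())
            # Hebbian linkage with decay: strengthen co-active unit pairs.
            hebb += lr_hebb * (np.outer(activity[0], activity[1]) - hebb)
    return centers[0], centers[1], hebb


# Synthetic pair of correlated sensory streams (hypothetical relation).
rng = np.random.default_rng(1)
x1 = rng.uniform(-1.0, 1.0, 200)
x2 = 3.0 * x1
c1, c2, hebb = train_som_hebbian(x1, x2)
print(hebb.shape, round(float(hebb.sum()), 6))
```

After training, the rows of `hebb` indicate which units of the second map tend to be active together with each unit of the first map, which is the kind of cross-sensory relation the paragraph above describes.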
The developed model, implemented for 3D egomotion estimation on a quadrotor, provides precise estimates for roll, pitch and yaw angles (setup depicted in Figure 3a, b).
Given the relatively complex and multimodal scenarios in which robotic systems operate, with noisy and partially observable environment features, the capability to extract precise and timely estimates of egomotion critically influences the set of possible actions.
Utilising simple and computationally efficient mechanisms, the proposed model is able to learn the intrinsic correlational structure of sensory data and provide more precise estimates of egomotion (see Figure 4a, b).
Moreover, by learning the sensory data statistics and distribution, the model is able to judiciously allocate resources for efficient representation and computation without any prior assumptions and simplifications. Alleviating the need for tedious design and parametrisation, it provides a flexible and robust approach to multisensory fusion, making it a promising candidate for robotic applications.
Peer reviewed journal papers
C. Axenie, C. Richter, J. Conradt, A Self-Synthesis Approach to Perceptual Learning for Multisensory Fusion in Robotics, Sensors Journal, 2016 (PDF)
Peer reviewed conference papers
C. Axenie, J. Conradt, Learning Sensory Correlations for 3D Egomotion Estimation, International Conf. on Biomimetics and Biohybrid Systems, 2015
C. Axenie, J. Conradt, A model for development and emergence in multisensory integration, Bernstein Conference on Computational Neuroscience, Göttingen, 2014. (PDF)
Synthesis of Distributed Cognitive Systems - Interacting Cortical Maps for Environmental Interpretation
The core focus of my research interest is in developing sensor fusion mechanisms for robotic applications. These mechanisms enable a robot to obtain a consistent and global percept of its environment using available sensors by learning correlations between them in a distributed processing scheme inspired by cortical mechanisms.
Environmental interaction is a significant aspect in the life of every physical entity, which allows it to update its internal state and acquire new behaviors. Such interaction is performed by repeated iterations of a perception-cognition-action cycle, in which the entity acquires and memorizes relevant information from the noisy and partially observable environment, to develop a set of applicable behaviors (see Figure 5).
This recently started research project is in the area of mobile robotics, and more specifically in explicit methods for acquiring and maintaining such environmental representations. State-of-the-art implementations build upon probabilistic reasoning algorithms, which typically aim at optimal solutions at the cost of high processing requirements.
In this project I am developing an alternative, neurobiologically inspired method for real-time interpretation of sensory stimuli in mobile robotic systems: a distributed networked system with merged information storage and processing that allows efficient parallel reasoning. This networked architecture will comprise interconnected heterogeneous software units, each encoding a different feature of the state of the environment in a local representation (see Figure 6).
Such extracted pieces of environmental knowledge interact by mutual influence to ensure overall system coherence. A sample instantiation of the developed system focuses on mobile robot heading estimation (see Figure 7). In order to obtain a robust and unambiguous description of the robot’s current orientation within its environment, inertial, proprioceptive and visual cues are fused (see image). Given available sensory data, the network relaxes to a globally consistent estimate of the robot's heading angle and position.
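The idea of relaxation by mutual influence can be illustrated with a toy example for heading only. The sketch below is not the network from the cited papers: it assumes three units that each hold a heading estimate, are weakly pulled towards their own sensor reading and strongly pulled towards the other units. The gains, the number of steps and the sensor values are hypothetical.

```python
import numpy as np


def wrap(angle):
    """Wrap an angle (or array of angles) to the interval [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi


def relax_heading(sensor_headings, eta_sensor=0.05, eta_coupling=0.5, steps=200):
    """Iteratively reconcile per-sensor heading estimates (radians)."""
    s = np.asarray(sensor_headings, dtype=float)
    h = s.copy()  # each unit starts from its own sensor reading
    n = len(h)
    for _ in range(steps):
        to_sensor = wrap(s - h)                   # pull towards own evidence
        pairwise = wrap(h[None, :] - h[:, None])  # pairwise[i, j] = h[j] - h[i]
        to_others = pairwise.sum(axis=1) / (n - 1)  # mean pull towards other units
        h = wrap(h + eta_sensor * to_sensor + eta_coupling * to_others)
    return h


# Hypothetical headings from inertial, proprioceptive and visual cues.
sensors = [0.50, 0.55, 0.45]
estimates = relax_heading(sensors)
print(estimates)
```

With these gains the disagreement between the units shrinks to a small fraction of the disagreement between the raw readings, while their average stays at the average of the sensors. A stronger coupling gain trades fidelity to individual sensors for tighter agreement.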
Peer reviewed journal papers
I. Sugiarto, C. Axenie, J. Conradt, From Adaptive Reasoning to Cognitive Factory: Bringing Cognitive Intelligence to Manufacturing Technology, International Journal of Industrial Research and Applied Engineering (2016) (PDF)
I. Susnea, C. Axenie, Cognitive Maps for Indirect Coordination of Intelligent Agents, Studies in Informatics and Control Vol. 24 (2015) (PDF)
C. Axenie, J. Conradt, Cortically inspired sensor fusion network for mobile robot egomotion estimation, Robotics and Autonomous Systems (2014) (PDF)
Peer reviewed conference papers
C. Axenie, J. Conradt, Cortically Inspired Sensor Fusion Network for Mobile Robot Heading Estimation, International Conf. on Artificial Neural Networks (ICANN), 2013, pp. 240-47. (PDF)
C. Axenie, M. Firouzi, M. Mulas, J. Conradt, Multimodal sensor fusion network for mobile robot egomotion estimation, Bernstein Conference on Computational Neuroscience, Göttingen, 2014. (PDF)
M. Firouzi, C. Axenie, J. Conradt, Multi-sensory cue integration with reliability encoding, using Line Attractor Dynamics, searching for optimality, Bernstein Conference on Computational Neuroscience, Göttingen, 2014. (PDF)
C. Axenie, M. Firouzi, J. Conradt, Multisensory Integration Network for Mobile Robot Self-motion Estimation, Bernstein Conference on Computational Neuroscience, Tübingen, 2013. (PDF)
C. Axenie, J. Conradt, Synthesis of Distributed Cognitive Systems: Interacting Maps for Sensor Fusion, Bernstein Conference on Computational Neuroscience, München, 2012. (PDF)
Adaptive Nonlinear Control Algorithm for Fault Tolerant Robot Navigation
Today’s trends in control engineering and robotics are gradually converging on a challenging area: the development of fault-tolerant real-time applications. Such applications should deliver synchronized data sets in a timely manner, minimize latency in their response and meet their performance specifications in the presence of disturbances. Fault-tolerant behavior in mobile robots refers to the ability to autonomously detect and identify faults, as well as the capability to continue operating after a fault has occurred. This work introduces a real-time distributed control application with fault tolerance capabilities for differential wheeled mobile robots (see Figure 8).
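To make the setting concrete, the sketch below shows the standard kinematics of a differential wheeled robot together with a very simple residual check on the wheel speeds. The residual test is only a generic illustration of fault detection and is not the adaptive controller presented in the papers listed below; the class, the function, the robot dimensions and the threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiffDrive:
    wheel_radius: float  # metres
    track_width: float   # metres, distance between the two wheels

    def body_velocity(self, omega_left: float, omega_right: float):
        """Map wheel angular speeds (rad/s) to linear and angular body velocity."""
        v_left = self.wheel_radius * omega_left
        v_right = self.wheel_radius * omega_right
        linear = (v_right + v_left) / 2.0
        angular = (v_right - v_left) / self.track_width
        return linear, angular


def detect_fault(commanded, measured, threshold):
    """Flag each wheel whose measured speed deviates too far from the command."""
    return [abs(c - m) > threshold for c, m in zip(commanded, measured)]


robot = DiffDrive(wheel_radius=0.05, track_width=0.30)
v, w = robot.body_velocity(omega_left=10.0, omega_right=12.0)
print(v, w)  # approximately 0.55 m/s and 0.333 rad/s

# The right wheel reports 7.5 rad/s although 12.0 rad/s was commanded.
flags = detect_fault(commanded=(10.0, 12.0), measured=(10.1, 7.5), threshold=1.0)
print(flags)  # [False, True]
```

Identifying which wheel deviates is the first step; continuing to operate after the fault additionally requires the controller to adapt, which is outside the scope of this sketch.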
Furthermore, the application was extended with a novel implementation of environment mapping for mobile robots with limited sensing. The developed algorithm is a SLAM implementation. It acquires real-time data from the sonar ring and feeds this information to the mapping module for offline mapping (see Figure 9).
The latter runs on top of the real-time fault-tolerant control application for mobile robot trajectory tracking (see Figures 10, 11).
Alongside the developed application, the mechanical and electronic design and implementation were carried out as part of the BA and MA thesis projects.
Peer reviewed conference papers
Axenie Cristian, Stancu Alexandru, Zanoschi Aurelian, Pascalin Andrei, Perjeru Marius, Maftei Florentina, A Client-Server Based Real-Time Control Tool for Complex Distributed Systems, Proc. of 9th Real-Time Linux Workshop, Linz, Austria, November 2007 (PDF)
Cristian Axenie, “Mobile Robot Fault Tolerant Control. Introducing ARTEMIC.” In the 9th International Conference on Signal Processing, Robotics and Automation WSEAS Conference Proceedings Included in ISI/SCI Web of Science and Web of Knowledge, University of Cambridge, UK, February 2010 (PDF)
Cristian Axenie, “A New Approach in Mobile Robot Fault Tolerant Control. Minimizing Costs and Extending Functionality”, Included in ISI / SCI (Web of Science) WSEAS TRANSACTIONS ON SYSTEMS AND CONTROL 2010 (PDF)
Cristian Axenie, Cernega Daniela, “Mobile Robot Fault Tolerant Control”, IEEE/IACSIT ICIEE 2010 (International Conference on Information and Electronics Engineering), Shanghai, China, June 2010 (PDF)
Cristian Axenie, Cernega Daniela, “Adaptive Sliding Mode Controller Design for Mobile Robot Fault Tolerant Control”, 19th IEEE International Workshop on Robotics in Alpe-Adria-Danube Region, Budapest, Hungary, June 2010 (PDF)
Cristian Axenie, Razvan Solea, “Real Time Control Design for Mobile Robot Fault Tolerant Control”, 2010 IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications, July 15-17, 2010, Qingdao, ShanDong, China (PDF)