To that end, we present a curve-based representation of a sequence, based on a few joints of a 3D skeleton, and a deformation-based distance function. We further introduce a time-variation model that is specifically designed for assessing the quality of a motion; we refer to a distance function that is based on such a model as a motion quality distance. The overall advantages of the proposed approach are 1) a lower-dimensional yet representative sequence representation and 2) a distance function that emphasizes time variation, the motion quality distance, which is a particularly important property for quality assessment. We validate our approach using a publicly available dataset, the SPHERE-StairCase2014 dataset. Qualitative and quantitative results show promising performance.

Pose Encoding for Robust Skeleton-Based Action Recognition
Demisse, Girum ; Papadopoulos, Konstantinos ; Aouada, Djamila et al.
in CVPRW: Visual Understanding of Humans in Crowd Scene, Salt Lake City, Utah, June 18-22, 2018 (2018, June 18)
Some of the main challenges in skeleton-based action recognition systems are redundant and noisy pose transformations. Earlier works in skeleton-based action recognition explored different approaches for filtering linear noise transformations, but neglected to address potential nonlinear transformations. In this paper, we present an unsupervised learning approach for estimating nonlinear noise transformations in pose estimates. Our approach starts by decoupling linear and nonlinear noise transformations. While the linear transformations are modelled explicitly, the nonlinear transformations are learned from data. Subsequently, we use an autoencoder with L2-norm reconstruction error and show that it does capture nonlinear noise transformations and recovers a denoised pose estimate, which in turn improves performance significantly. We validate our approach on a publicly available dataset, NW-UCLA.

Deformation Based 3D Facial Expression Representation
Demisse, Girum ; Aouada, Djamila ; Ottersten, Björn
in ACM Transactions on Multimedia Computing, Communications, & Applications (2018)
We propose a deformation based representation for analyzing expressions from 3D faces. A point cloud of a 3D face is decomposed into an ordered deformable set of curves that start from a fixed point. Subsequently, a mapping function is defined to identify the set of curves with an element of a high dimensional matrix Lie group, specifically the direct product of SE(3). Representing 3D faces as an element of a high dimensional Lie group has two main advantages. First, using the group structure, facial expressions can be decoupled from a neutral face. Second, an underlying non-linear facial expression manifold can be captured with the Lie group and mapped to a linear space, the Lie algebra of the group. This opens up the possibility of classifying facial expressions with linear models without compromising the underlying manifold. Alternatively, linear combinations of linearised facial expressions can be mapped back from the Lie algebra to the Lie group. The approach is tested on the BU-3DFE and the Bosphorus datasets. The results show that the proposed approach performed comparably, on the BU-3DFE dataset, without using features or extensive landmark points.
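The mapping between a Lie group and its Lie algebra mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: for brevity it assumes only the rotation part SO(3) of SE(3), and the function names (`so3_exp`, `so3_log`, `blend`) are hypothetical. It shows group elements being linearised with the log map, combined linearly, and mapped back with the exponential map.

```python
import math

def matmul3(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def so3_exp(w):
    """Rodrigues' formula: axis-angle vector w (Lie algebra) -> rotation matrix."""
    theta = math.sqrt(sum(c * c for c in w))
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    if theta < 1e-12:
        return identity
    kx, ky, kz = (c / theta for c in w)
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]  # skew-symmetric axis matrix
    K2 = matmul3(K, K)
    s, c1 = math.sin(theta), 1.0 - math.cos(theta)
    return [[identity[i][j] + s * K[i][j] + c1 * K2[i][j] for j in range(3)]
            for i in range(3)]

def so3_log(R):
    """Rotation matrix -> axis-angle vector; assumes a rotation angle below pi."""
    cos_theta = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    if theta < 1e-12:
        return [0.0, 0.0, 0.0]
    f = theta / (2.0 * math.sin(theta))
    return [f * (R[2][1] - R[1][2]), f * (R[0][2] - R[2][0]), f * (R[1][0] - R[0][1])]

def blend(rotations, weights):
    """Linear combination in the Lie algebra, mapped back to the group."""
    logs = [so3_log(R) for R in rotations]
    w = [sum(wt * l[i] for wt, l in zip(weights, logs)) for i in range(3)]
    return so3_exp(w)
```

For example, `blend([R_a, R_b], [0.5, 0.5])` averages two rotations in the linear space before returning to the group; a full treatment of SE(3) would also need the translation part of the exponential and log maps.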

Deformation Based Curved Shape Representation
Demisse, Girum
Doctoral thesis (2017)
Representation and modelling of an object's shape is critical in object recognition, synthesis, tracking and many other applications in computer vision. As a result, there is a wide range of approaches in formulating representation space and quantifying the notion of similarity between shapes. A similarity metric between shapes is a basic building block in modelling shape categories, optimizing shape valued functionals, and designing a classifier. Consequently, any subsequent shape based computation is fundamentally dependent on the computational efficiency, robustness, and invariance to shape preserving transformations of the defined similarity metric. In this thesis, we propose a novel finite dimensional shape representation framework that leads to a computationally efficient, closed-form, and noise-tolerant similarity distance function. Several important characteristics of the proposed curved shape representation approach are discussed in relation to earlier works. Subsequently, two different solutions are proposed for optimal parameter estimation of curved shapes, hence providing two possible solutions for the point correspondence estimation problem between two curved shapes. Later in the thesis, we show that several statistical models can readily be adapted to the proposed shape representation framework for object category modelling. The thesis concludes by exploring potential applications of the proposed curved shape representation in 3D facial surface and facial expression representation and modelling.

Facial Expression Recognition via Joint Deep Learning of RGB-Depth Map Latent Representations
Oyedotun, Oyebade ; Demisse, Girum ; Shabayek, Abd El Rahman et al.
in 2017 IEEE International Conference on Computer Vision Workshop (ICCVW) (2017, August 21)
Humans use facial expressions successfully for conveying their emotional states. However, replicating such success in the human-computer interaction domain is an active research problem. In this paper, we propose a deep convolutional neural network (DCNN) for joint learning of robust facial expression features from fused RGB and depth map latent representations. We posit that learning jointly from both modalities results in a more robust classifier for facial expression recognition (FER) as opposed to learning from either of the modalities independently. Particularly, we construct a learning pipeline that allows us to learn several hierarchical levels of feature representations and then perform the fusion of RGB and depth map latent representations for joint learning of facial expressions. Our experimental results on the BU-3DFE dataset validate the proposed fusion approach, as a model learned from the joint modalities outperforms models learned from either of the modalities.

Deformation Based Curved Shape Representation
Demisse, Girum ; Aouada, Djamila ; Ottersten, Björn
in IEEE Transactions on Pattern Analysis & Machine Intelligence (2017)
In this paper, we introduce a deformation based representation space for curved shapes in $\mathbb{R}^n$. Given an ordered set of points sampled from a curved shape, the proposed method represents the set as an element of a finite dimensional matrix Lie group. Variations due to scale and location are filtered in a preprocessing stage, while shapes that vary only in rotation are identified by an equivalence relationship. The use of a finite dimensional matrix Lie group leads to a similarity metric with an explicit geodesic solution. Subsequently, we discuss some of the properties of the metric and its relationship with a deformation by least action. Furthermore, invariance to reparametrization or estimation of point correspondence between shapes is formulated as an estimation of a sampling function. Thereafter, two possible approaches are presented to solve the point correspondence estimation problem. Finally, we propose an adaptation of k-means clustering for shape analysis in the proposed representation space. Experimental results show that the proposed representation is robust to uninformative cues, e.g. local shape perturbation and displacement. In comparison to state of the art methods, it achieves a high precision on the Swedish and the Flavia leaf datasets and a comparable result on MPEG-7, Kimia99 and Kimia216 datasets.
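As a rough illustration of representing an ordered point set by rigid transformations, the following sketch encodes a 2D curve as a starting point plus one rotation-and-translation per step, and then rebuilds the curve by applying them in sequence. It is a sketch under stated assumptions, not the method of the paper: the restriction to 2D, the particular choice of rotation angle, and the names `encode_curve` and `decode_curve` are all hypothetical.

```python
import math

def encode_curve(points):
    """Return (start, transforms); transforms[i] = (theta, tx, ty) maps points[i] to points[i+1]."""
    transforms = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Assumed convention: rotate by the angle between the two position vectors,
        # then translate so that the rotated point lands exactly on the next point.
        theta = math.atan2(y1, x1) - math.atan2(y0, x0)
        c, s = math.cos(theta), math.sin(theta)
        rx, ry = c * x0 - s * y0, s * x0 + c * y0
        transforms.append((theta, x1 - rx, y1 - ry))
    return points[0], transforms

def decode_curve(start, transforms):
    """Rebuild the curve by successively applying each rigid transformation."""
    x, y = start
    rebuilt = [(x, y)]
    for theta, tx, ty in transforms:
        c, s = math.cos(theta), math.sin(theta)
        x, y = c * x - s * y + tx, s * x + c * y + ty
        rebuilt.append((x, y))
    return rebuilt
```

Calling `decode_curve(*encode_curve(points))` returns the original points up to floating-point error, which is the sense in which the sequence of transformations carries the full curve.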

Similarity Metric For Curved Shapes In Euclidean Space
Demisse, Girum ; Aouada, Djamila ; Ottersten, Björn
in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016 (2016, June 26)
In this paper, we introduce a similarity metric for curved shapes that can be described, distinctively, by ordered points. The proposed method represents a given curve as a point in the deformation space, the direct product of rigid transformation matrices, such that the successive action of the matrices on a fixed starting point reconstructs the full curve. In general, both open and closed curves are represented in the deformation space modulo shape orientation and orientation preserving diffeomorphisms. The use of direct product Lie groups to represent curved shapes led to an explicit formula for geodesic curves and the formulation of a similarity metric between shapes by the $L^{2}$-norm on the Lie algebra. Additionally, invariance to reparametrization or estimation of point correspondence between shapes is performed as an intermediate step for computing geodesics. Furthermore, since there is no computation of differential quantities on the curves, our representation is more robust to local perturbations and needs no pre-smoothing. We compare our method with the elastic shape metric defined through the square root velocity (SRV) mapping, and other shape matching approaches.

Visual and human-interpretable feedback for assisting physical activity
Goncalves Almeida Antunes, Michel ; Baptista, Renato ; Demisse, Girum et al.
in European Conference on Computer Vision (ECCV) Workshop on Assistive Computer Vision and Robotics, Amsterdam (2016)
Physical activity is essential for stroke survivors for recovering some autonomy in daily life activities. Post-stroke patients are initially subject to physical therapy under the supervision of a health professional, but for economic reasons, home based rehabilitation is eventually suggested.
In order to support the physical activity of stroke patients at home, this paper presents a system for guiding the user in how to properly perform certain actions and movements. This is achieved by presenting feedback in the form of visual information and human-interpretable messages. The core of the proposed approach is the analysis of the motion required for aligning body-parts with respect to a template skeleton pose, and how this information can be presented to the user in the form of simple recommendations. Experimental results on three datasets show the potential of the proposed framework.

Template-Based Statistical Shape Modelling on Deformation Space
Demisse, Girum ; Aouada, Djamila ; Ottersten, Björn
in 22nd IEEE International Conference on Image Processing (2015)
A statistical model for shapes in $\mathbb{R}^2$ or $\mathbb{R}^3$ is proposed. Shape modelling is a difficult problem mainly due to the non-linear nature of its space. Our approach considers curves as shape contours, and models their deformations with respect to a deformable template shape. Contours are uniformly sampled into a discrete sequence of points. Hence, the deformation of a shape is formulated as an action of transformation matrices on each of these points. A parametrized stochastic model based on a Markov process is proposed to model shape variability in the deformation space. The model's parameters are estimated from a labeled training dataset. Moreover, a similarity metric based on the Mahalanobis distance is proposed. Subsequently, the model has been successfully tested for shape recognition, synthesis, and retrieval.

Interpreting Thermal 3D Models of Indoor Environments for Energy Efficiency
Demisse, Girum ; Borrman, Dorit ; Nuchter, Andreas
in Journal of Intelligent and Robotic Systems, Springer (2014)
In recent years, 3D models of buildings have been used in maintenance and inspection, preservation, and other building related applications. However, the usage of these models is limited because most models are pure representations with no or little associated semantics. In this paper we present a pipeline of techniques used for interior interpretation, object detection, and adding energy related semantics to windows of a 3D thermal model. A sequence of algorithms is presented for building the fundamental semantics of a 3D model. Among other things, these algorithms enable the system to differentiate between objects in a room and objects that are part of the room, e.g. floor, windows. Subsequently, the thermal information is used to construct a stochastic mathematical model, namely a Markov Random Field, of the temperature distribution of the detected windows. As a result, the MAP (Maximum a posteriori) framework is used to further label the windows as either open, closed or damaged based upon their temperature distribution. The experimental results showed the robustness of the techniques. Furthermore, a strategy to optimize the free parameters is described, for cases where an ample training dataset is available.