Publication Tag: Signal Processing

An overview of all publications carrying this tag.

2022
3 citations
Proximally Sensitive Error for Anomaly Detection and Feature Learning
A. Gudi, F. Büttner, J. van Gemert
Mean squared error (MSE) is widely used to measure differences between multi-dimensional entities, including images. However, MSE lacks local sensitivity, as it does not consider the spatial arrangement of pixel differences, which is crucial for structured data like images. Such spatial arrangements provide information about the source of differences; therefore, an error function that incorporates the location of errors can offer a more meaningful distance measure. We introduce Proximally Sensitive Error (PSE), suggesting that emphasizing regions in the error measure can highlight semantic differences between images over syntactic or random deviations. We demonstrate that this emphasis can be leveraged for anomaly or occlusion detection. Additionally, we explore its utility as a loss function to help models focus on learning representations of semantic objects instead of minimizing syntactic reconstruction noise.
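One plausible way to make an error measure spatially sensitive, sketched below in plain Python, is to smooth the signed difference map before squaring it, so that clustered, same-sign errors reinforce each other while scattered deviations average out. The box filter, the 6x6 toy images, and the name `proximally_weighted_error` are illustrative choices, not the paper's exact formulation:

```python
def box_blur(diff, k=1):
    """Smooth a 2-D signed difference map with a (2k+1)x(2k+1) box filter."""
    h, w = len(diff), len(diff[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [diff[a][b]
                    for a in range(max(0, i - k), min(h, i + k + 1))
                    for b in range(max(0, j - k), min(w, j + k + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out

def mse(x, y):
    n = len(x) * len(x[0])
    return sum((xi - yi) ** 2
               for rx, ry in zip(x, y) for xi, yi in zip(rx, ry)) / n

def proximally_weighted_error(x, y, k=1):
    """Blur the signed differences, then square: clustered errors survive
    the blur, isolated ones are diluted by their zero neighbourhood."""
    diff = [[xi - yi for xi, yi in zip(rx, ry)] for rx, ry in zip(x, y)]
    smooth = box_blur(diff, k)
    n = len(x) * len(x[0])
    return sum(v ** 2 for row in smooth for v in row) / n

# Two corruptions of a flat 6x6 image that have identical MSE:
ref = [[0.0] * 6 for _ in range(6)]
clustered = [row[:] for row in ref]     # a 2x2 block: a structural change
for i in (2, 3):
    for j in (2, 3):
        clustered[i][j] = 1.0
scattered = [row[:] for row in ref]     # the same 4 unit errors, spread out
for i, j in [(0, 0), (0, 5), (5, 0), (5, 5)]:
    scattered[i][j] = 1.0

print(mse(ref, clustered), mse(ref, scattered))  # equal
print(proximally_weighted_error(ref, clustered),
      proximally_weighted_error(ref, scattered))  # clustered scores higher
```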
2004
5 citations
Real time automatic scene classification
M. Israël, E.L. van den Broek, P. van der Putten, M.J. den Uyl
This work, part of the EU VICAR and SCOFI projects, aimed to develop a real-time video indexing, classification, annotation, and retrieval system. The authors introduced a generic approach for visual scene recognition using “typed patches”—groups of adjacent pixels characterized by local pixel distribution, brightness, and color. Each patch is described using an HSI color histogram and texture features. A fixed grid overlays the image, segmenting each cell into patches categorized by a classifier. Frequency vectors of these classified patches are concatenated to represent the entire image. Testing on eight scene categories from the Corel database showed 87.5% accuracy in patch classification and 73.8% in scene classification. The method’s advantages include low computational complexity and versatility for image classification, segmentation, and matching. However, manual classification of training patches is a drawback, prompting the development of algorithms for automatic extraction of relevant patch types. The approach was implemented in the VICAR project’s video indexing system for the Netherlands Institute for Sound and Vision and in the SCOFI project’s real-time Internet pornography filter, achieving 92% accuracy with minimal overblocking and underblocking.
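The core mechanism described above, classifying patches and then summarizing the image as a frequency vector of patch labels, can be sketched in a few lines. The labels and the intensity-threshold patch classifier below are toy stand-ins; the actual system characterized patches with HSI color histograms and texture features and used learned classifiers:

```python
from collections import Counter

# Hypothetical patch types; the real system learned these from annotated patches.
LABELS = ["sky", "vegetation", "other"]

def classify_patch(patch):
    """Toy stand-in for the patch classifier: label a patch by its mean
    intensity (the real system used colour histograms and texture)."""
    mean = sum(patch) / len(patch)
    if mean > 0.7:
        return "sky"
    if mean < 0.3:
        return "vegetation"
    return "other"

def scene_vector(patches):
    """Scene-classifier input: normalized frequency vector of patch labels."""
    counts = Counter(classify_patch(p) for p in patches)
    return [counts[l] / len(patches) for l in LABELS]

# A 4x4 grid of patches, each summarized here by a few intensity samples.
patches = ([[0.9, 0.8, 0.85]] * 8 +   # bright patches
           [[0.1, 0.2, 0.15]] * 6 +   # dark patches
           [[0.5, 0.5, 0.5]] * 2)     # mid patches
print(scene_vector(patches))  # [0.5, 0.375, 0.125]
```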
2004
29 citations
Automating the Construction of Scene Classifiers for Content-Based Video Retrieval
M. Israël, E.L. van den Broek, P. van der Putten, M.J. den Uyl
This paper introduces a real-time automatic scene classifier within content-based video retrieval. In the proposed approach, end users like documentalists, not image processing experts, build classifiers interactively by simply indicating positive examples of a scene. Classification consists of a two-stage procedure: first, small image fragments called patches are classified; second, frequency vectors of these patch classifications are fed into a second classifier for global scene classification. The first-stage classifiers can be seen as a set of highly specialized, learned feature detectors, serving as an alternative to having an image processing expert determine features a priori. The paper presents results from experiments on a variety of patch and image classes. The scene classifier has been used successfully within television archives and for Internet porn filtering.
2006
15 citations
Learning a Sparse Representation from Multiple Still Images for On-Line Face Recognition in an Unconstrained Environment
J.W.H. Tangelder, B.A.M. Schouten
In a real-world environment, a face detector can be applied to extract multiple face images from multiple video streams without constraints on pose and illumination. The extracted face images will have varying image quality and resolution. Moreover, the detected faces will not be precisely aligned. This paper presents a new approach to on-line face identification from multiple still images obtained under such unconstrained conditions. Our method learns a sparse representation of the most discriminative descriptors of the detected face images according to their classification accuracies. On-line face recognition is supported using a single descriptor of a face image as a query. We apply our method to our newly introduced BHG descriptor, the SIFT descriptor, and the LBP descriptor, which offer limited robustness against illumination, pose, and alignment errors. Our experimental results, using a video face database of pairs of unconstrained low-resolution video clips of ten subjects, show that our method achieves a recognition rate of 94% with a sparse representation containing 10% of all available data, at a false acceptance rate of 4%.
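A minimal sketch of the selection step, assuming per-descriptor classification accuracies have already been measured: keep only the top fraction of descriptors by that score. The ranking criterion, the `keep_fraction` value, and all names below are illustrative, not the paper's exact procedure:

```python
def sparse_representation(descriptors, accuracies, keep_fraction=0.1):
    """Keep the most discriminative descriptors, ranked by a hypothetical
    per-descriptor classification accuracy measured beforehand."""
    k = max(1, int(len(descriptors) * keep_fraction))
    ranked = sorted(range(len(descriptors)),
                    key=lambda i: accuracies[i], reverse=True)
    return [descriptors[i] for i in ranked[:k]]

# Dummy gallery of 50 descriptors with made-up accuracy scores.
gallery = [f"desc_{i}" for i in range(50)]
scores = [i / 50 for i in range(50)]
model = sparse_representation(gallery, scores)
print(model)  # the 5 highest-scoring descriptors, i.e. 10% of the data
```

At query time, a single descriptor of a new face image would be matched only against this reduced gallery, which is what makes on-line recognition cheap.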
2007
14 citations
Distance Measures for Gabor Jets-Based Face Authentication: A Comparative Evaluation
D. González-Jiménez, M. Bicego, J.W.H. Tangelder, B.A.M. Schouten, O. Ambekar, J.L. Alba-Castro, E. Grosso, M. Tistarelli
Local Gabor features have been widely used in face recognition systems. Once the sets of jets have been extracted from the two faces to be compared, a proper measure of similarity between corresponding features should be chosen. For instance, in the well-known Elastic Bunch Graph Matching approach and other Gabor-based face recognition systems, the cosine distance was used as a measure. In this paper, we provide an empirical evaluation of seven distance measures for comparison, using a recently introduced face recognition system based on Shape Driven Gabor Jets. Moreover, we evaluate different normalization factors that are used to pre-process the jets. Experimental results on the BANCA database suggest that the concrete type of normalization applied to jets is a critical factor, and that some combinations of normalization and distance achieve better performance than the classical cosine measure for jet comparison.
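The classical cosine measure, and why normalization interacts with the distance choice, can be illustrated in a few lines. The jet values below are made up; real Gabor jets hold filter-response magnitudes over many scales and orientations:

```python
import math

def cosine_similarity(a, b):
    """Classical jet comparison: angle between jets, magnitude ignored."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def l2_normalize(jet):
    n = math.sqrt(sum(x * x for x in jet))
    return [x / n for x in jet]

jet1 = [0.2, 0.5, 0.1, 0.8]
jet2 = [0.4, 1.0, 0.2, 1.6]  # same direction, doubled magnitude

print(cosine_similarity(jet1, jet2))  # ~1.0: cosine is scale-invariant
# After L2 normalization, even plain Euclidean distance becomes
# magnitude-invariant, so other distances become viable alternatives:
euclid = math.sqrt(sum((x - y) ** 2
                       for x, y in zip(l2_normalize(jet1), l2_normalize(jet2))))
print(euclid)  # ~0.0
```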
2004
1808 citations
A survey of content based 3D shape retrieval methods
J.W.H. Tangelder, R.C. Veltkamp
Recent developments in techniques for modeling, digitizing, and visualizing 3D shapes have led to an explosion in the number of available 3D models on the Internet and in domain-specific databases. This has led to the development of 3D shape retrieval systems that, given a query object, retrieve similar 3D objects. For visualization, 3D shapes are often represented as a surface, in particular polygonal meshes, for example in VRML format. Often these models contain holes or intersecting polygons, are not manifold, and do not enclose a volume unambiguously. In contrast, 3D volume models, such as solid models produced by CAD systems or voxel models, enclose a volume properly. This paper surveys the literature on methods for content based 3D retrieval, taking into account the applicability to surface models as well as to volume models. The methods are evaluated with respect to several requirements of content based 3D shape retrieval, such as: (1) shape representation requirements, (2) properties of dissimilarity measures, (3) efficiency, (4) discrimination abilities, (5) ability to perform partial matching, (6) robustness, and (7) necessity of pose normalization. Finally, the advantages and limitations of the several approaches in content based 3D shape retrieval are discussed.
2012
8 citations
User assisted stereo image segmentation
H.E. Tasli, A.A. Alatan
The wide availability of stereoscopic 3D displays has created a considerable market for content producers, encouraging researchers to focus on methods to alter and process such content for various purposes. This study concentrates on user-assisted image segmentation and proposes a method to extend previous techniques for monoscopic image segmentation to stereoscopic footage with minimum effort. User assistance is required to indicate representative locations of the image as object and background regions. An MRF-based energy minimization technique is utilized, where user inputs are applied to only one of the stereoscopic pairs. A key contribution of the proposed study is the elimination of dense disparity estimation by introducing a sparse feature matching idea. Segmentation results are evaluated with objective metrics on a ground-truth stereo segmentation dataset; competitive results with minimum user interaction are obtained even without dense disparity estimation.
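The flavor of MRF-based energy minimization, a data (unary) term derived from user-indicated object/background regions plus a smoothness term penalizing label changes between neighbors, can be shown on a toy 1-D problem small enough to solve by brute force. All costs and the smoothness weight below are invented; the actual system works on 2-D stereo images with far more efficient optimization:

```python
from itertools import product

# Toy unary costs for a row of 6 pixels: unary[i][l] is the cost of
# giving pixel i label l (0 = background, 1 = object). In the real
# system such costs come from user scribbles and appearance models.
unary = [
    [0.1, 0.9], [0.2, 0.8],              # left pixels look like background
    [0.6, 0.4], [0.7, 0.3],              # middle pixels lean towards object
    [0.9, 0.1], [0.8, 0.2],              # right pixels look like object
]
LAMBDA = 0.5  # smoothness weight: cost per neighbouring-label disagreement

def energy(labels):
    data = sum(unary[i][l] for i, l in enumerate(labels))
    smooth = LAMBDA * sum(labels[i] != labels[i + 1]
                          for i in range(len(labels) - 1))
    return data + smooth

# Brute-force minimization over all 2^6 labelings (fine for a toy problem).
best = min(product([0, 1], repeat=len(unary)), key=energy)
print(best)  # (0, 0, 1, 1, 1, 1): one clean object/background boundary
```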
2014
124 citations
Remote PPG based vital sign measurement using adaptive facial regions
H.E. Tasli, A. Gudi, M. den Uyl
This paper introduces a remote photoplethysmography technique that analyzes human skin color variations to monitor vital signs, such as average heart rate and its variability. Utilizing a non-invasive video camera, the method employs facial appearance modeling to stabilize color variations in selected facial regions during signal acquisition. A novel signal processing approach is presented to extract the periodic components of raw color signals for accurate heart rate estimation. The authors collected a ground truth dataset using a PPG instrument attached to the subject’s skin and demonstrated a strong correlation between the estimated heart rate and the ground truth values.
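The heart of the estimation step, recovering the dominant periodic component of a skin-color trace, can be sketched with a simple autocorrelation search over plausible pulse periods. This is only an illustration under assumed parameters (30 fps video, a synthetic 1.2 Hz pulse); the paper's actual signal processing approach is its own:

```python
import math

def estimate_heart_rate(signal, fps, min_bpm=40, max_bpm=180):
    """Estimate heart rate (bpm) from a mean skin-colour trace by finding
    the lag with maximal autocorrelation inside a plausible pulse range."""
    mean = sum(signal) / len(signal)
    x = [v - mean for v in signal]
    min_lag = int(fps * 60 / max_bpm)   # shortest plausible period
    max_lag = int(fps * 60 / min_bpm)   # longest plausible period

    def autocorr(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return 60.0 * fps / best_lag

# Synthetic 10 s green-channel trace at 30 fps with a 1.2 Hz (72 bpm) pulse.
fps = 30
trace = [0.5 + 0.01 * math.sin(2 * math.pi * 1.2 * t / fps)
         for t in range(10 * fps)]
print(round(estimate_heart_rate(trace, fps)))  # 72
```

A real pipeline would first stabilize the facial region (the paper uses appearance modeling for this) and detrend or band-pass the trace before period estimation.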
2014
9 citations
Integrating Remote PPG in Facial Expression Analysis Framework
H.E. Tasli, A. Gudi, M. den Uyl
This demonstration paper presents the FaceReader framework, which analyzes human face images and skin color variations to observe facial expressions and vital signs, including average heart rate, heart rate variability, stress levels, and confidence levels. Remote monitoring of facial and vital signs can be beneficial for a wide range of applications. FaceReader utilizes active appearance modeling for facial analysis and novel signal processing techniques for heart rate and variability estimation. The performance has been objectively evaluated, and psychological guidelines for stress measurements are incorporated into the framework for analysis.
2016
44 citations
Human Pose Estimation in Space and Time using 3D CNN
A. Grinciunaite, A. Gudi, E. Tasli, M. Den Uyl
This paper explores the capability of convolutional neural networks to deal with a task that is easily manageable for humans: perceiving the 3D pose of a human body from varying angles, here restricted to a monocular vision system. For this purpose, we apply a convolutional neural network to RGB videos and extend it to three-dimensional convolutions. This is done by encoding the time dimension of videos as the third dimension in convolutional space and directly regressing human body joint positions in 3D coordinate space. This research shows the ability of such a network to achieve state-of-the-art performance on the selected Human3.6M dataset, demonstrating the possibility of successfully representing temporal data with an additional dimension in the convolutional operation.
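Treating time as a third convolutional dimension means a clip of shape (time, height, width) is convolved with a (t, h, w) kernel, so all three axes shrink alike. A naive pure-Python sketch of the shape arithmetic (the real network is, of course, a deep learned model):

```python
def conv3d(volume, kernel):
    """Naive 'valid' 3-D convolution (cross-correlation, as in CNNs) over
    a (time, height, width) volume: the time axis is treated exactly like
    a spatial one, which is the core idea of the paper."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(T - t + 1):
        plane = []
        for j in range(H - h + 1):
            row = []
            for k in range(W - w + 1):
                row.append(sum(
                    volume[i + a][j + b][k + c] * kernel[a][b][c]
                    for a in range(t) for b in range(h) for c in range(w)))
            plane.append(row)
        out.append(plane)
    return out

# An 8-frame clip of 16x16 "images", all ones; a 3x3x3 averaging kernel.
clip = [[[1.0] * 16 for _ in range(16)] for _ in range(8)]
kernel = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
out = conv3d(clip, kernel)
print(len(out), len(out[0]), len(out[0][0]))  # 6 14 14
```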
