Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test
2024
Article
Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns.
Note to Practitioners: We demonstrate how to apply the kernel two-sample test to a surface-recognition task, discuss opportunities for improvement, and explain how to use this framework for other classification problems with similar properties. Automating surface recognition could benefit both surface inspection and robot manipulation. Our algorithm quantifies class similarity and therefore outputs an ordered list of similar surfaces. This technique is well suited for quality assurance and documentation of newly received materials or newly manufactured parts.
More generally, our automated classification pipeline can handle heterogeneous data sources including images and high-frequency time-series measurements of vibrations, forces and other physical signals. As our approach circumvents the time-consuming process of feature engineering, both experts and non-experts can use it to achieve high-accuracy classification. It is particularly appealing for new problems without existing models and heuristics. In addition to strong theoretical properties, the algorithm is straightforward to use in practice since it requires only kernel evaluations. Its transparent architecture can provide fast insights into the given use case under different sensing combinations without costly optimization. Practitioners can also use our procedure to obtain the minimum data-acquisition time for independent time-series data from new sensor recordings.
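The data-versus-data idea described above can be illustrated with a minimal sketch. It is not the authors' released code: it assumes each recording has already been turned into a set of fixed-length feature vectors, uses a Gaussian kernel with a hand-picked bandwidth, and computes the standard biased estimate of the squared maximum mean discrepancy (MMD), the statistic behind the kernel two-sample test. The function names `rbf_kernel`, `mmd2_biased`, and `rank_classes` are hypothetical, and how several modalities would be combined is left out.

```python
import numpy as np


def rbf_kernel(X, Y, bandwidth):
    """Gaussian kernel matrix between the rows of X (n, d) and Y (m, d)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Guard against tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd2_biased(X, Y, bandwidth=1.0):
    """Biased estimate of the squared MMD between sample sets X and Y.

    Only kernel evaluations are needed: the mean within-set similarities
    minus twice the mean cross-set similarity.
    """
    k_xx = rbf_kernel(X, X, bandwidth).mean()
    k_yy = rbf_kernel(Y, Y, bandwidth).mean()
    k_xy = rbf_kernel(X, Y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


def rank_classes(query, references, bandwidth=1.0):
    """Return (label, mmd2) pairs sorted from most to least similar.

    `references` maps a class label to a reference sample set; the bandwidth
    is an assumed, manually chosen value in this sketch.
    """
    scores = {
        label: mmd2_biased(query, ref, bandwidth)
        for label, ref in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

Calling `rank_classes(query, {"wood": wood_samples, "metal": metal_samples})` with hypothetical sample arrays would return the labels ordered by increasing discrepancy, which corresponds to the ordered list of similar surfaces mentioned in the note; the first entry is the predicted class.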
Author(s): | Behnam Khojasteh and Friedrich Solowjow and Sebastian Trimpe and Katherine J. Kuchenbecker |
Journal: | IEEE Transactions on Automation Science and Engineering |
Volume: | 21 |
Number (issue): | 3 |
Pages: | 4432--4447 |
Year: | 2024 |
Month: | July |
Department(s): | Haptic Intelligence |
Research Project(s): | Surface Interactions as Probability Distributions in Embedding Spaces |
Bibtex Type: | Article (article) |
Paper Type: | Journal |
DOI: | 10.1109/TASE.2023.3296569 |
State: | Published |
BibTex: |
@article{Khojasteh23-TASE-Recognition,
  title = {Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test},
  author = {Khojasteh, Behnam and Solowjow, Friedrich and Trimpe, Sebastian and Kuchenbecker, Katherine J.},
  journal = {IEEE Transactions on Automation Science and Engineering},
  volume = {21},
  number = {3},
  pages = {4432--4447},
  month = jul,
  year = {2024},
  doi = {10.1109/TASE.2023.3296569},
  month_numeric = {7}
} |