Multimodality modeling

Our goal is to obtain realistic structured models from multimodal, possibly dynamic, data to be used in AR systems for interaction management, visualization, or annotation. Two projects are described:

  • Acquisition and modeling of the vocal tract from multimodal data.

    Having a realistic augmented head displaying both external and internal articulators could help language learning technology progress. The long-term aim of the project is the acquisition of articulatory data and the design of a 3D+t articulatory model from various image modalities: external articulators are extracted from stereovision data, the tongue shape is acquired through ultrasound imaging, 3D images of all articulators can be obtained with MRI for sustained sounds, and magnetic sensors are used to recover the position of the tongue tip.

  • Surgical workflow analysis.

    The focus of this work is the development of statistical methods that permit the modeling and monitoring of surgical processes, based on signals available in the operating room. The work has been achieved within N. Padoy's PhD thesis in collaboration with the Technische Universität München.

Designing a multimodal acquisition system

  • Design of the system, calibration, synchronization and registration procedures:
    • Coupling electromagnetic sensors and ultrasound images for tongue tracking: acquisition setup and preliminary results, ISSP 2006 pdf
    • Details on the system are available here.
  • Registration and processing of articulatory data:
    • Multimodality acquisition of articulatory data and processing, EUSIPCO 2008 pdf
    • Registration of multimodal data for estimating the parameters of an articulatory model, ICASSP 2009 pdf
    • A video of the dynamic fused data.
    • Extracting the tongue contour in MRI images using shape priors: A shape based framework to segmentation of tongue contours from MRI data, ICASSP 2010.
    • Studying MRI acquisition protocols of sustained sounds with a multimodal acquisition system, ISSP 2014 pdf
    • Using a biomechanical model for tongue tracking in ultrasound images, ISBMS 2014 pdf
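A core step behind the calibration and registration procedures listed above is estimating the rigid transform that maps point correspondences from one modality's coordinate frame into another's (e.g. magnetic sensor positions into the ultrasound or MRI frame). As a minimal sketch of that step, assuming corresponding 3D points are already paired (the function name and setup are illustrative, not taken from the papers), a least-squares rigid alignment can be computed with the Kabsch algorithm:

```python
import numpy as np

def rigid_register(src, dst):
    """Estimate rotation R and translation t such that dst ~ src @ R.T + t,
    in the least-squares sense (Kabsch algorithm).
    src, dst: (N, 3) arrays of corresponding 3D points, e.g. sensor
    positions observed in two coordinate frames."""
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    src_c = src - src_mean                  # center both point clouds
    dst_c = dst - dst_mean
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```

In practice the correspondences would come from a calibration phantom or from features visible in both modalities, and the residual alignment error gives a direct check on calibration quality.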

Building a realistic augmented head

We address the problem of obtaining realistic facial animation within the augmented head application. The main idea of this work is to transfer the dynamics learned on sparse meshes of the face onto a dense 3D mesh acquired with a scanner.
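The transfer idea can be illustrated in a few lines: displacements measured at the sparse mesh vertices are propagated to every vertex of the dense rest-pose mesh by a weighted interpolation. The sketch below uses simple inverse-distance weighting as a stand-in for the densification scheme of the papers (the function name, weighting choice, and parameters are assumptions for illustration):

```python
import numpy as np

def transfer_deformation(sparse_rest, sparse_def, dense_rest, eps=1e-9):
    """Propagate displacements measured on a sparse mesh onto the
    vertices of a dense rest-pose mesh by inverse-distance weighting.
    sparse_rest, sparse_def: (M, 3) sparse vertices at rest / deformed.
    dense_rest: (N, 3) dense mesh vertices at rest."""
    disp = sparse_def - sparse_rest                       # (M, 3) displacements
    # squared distances from each dense vertex to each sparse vertex
    d2 = ((dense_rest[:, None, :] - sparse_rest[None, :, :]) ** 2).sum(-1)
    w = 1.0 / (d2 + eps)                                  # (N, M) weights
    w /= w.sum(axis=1, keepdims=True)                     # normalize rows
    return dense_rest + w @ disp                          # deformed dense mesh
```

Because the weights are normalized, a rigid translation of the sparse mesh translates the dense mesh by exactly the same amount, which is a useful sanity check for any densification scheme.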

  • Realistic face animation from sparse stereo meshes, Audiovisual Speech Processing 2007 pdf
  • Realistic face animation for audiovisual speech applications: a densification approach driven by sparse stereo meshes, Mirage 2009, Computer Vision / Computer Graphics Collaboration Techniques and Applications pdf
  • A video showing the realistic animation.

Surgical workflow analysis

The focus of this work is the development of statistical methods that permit the modeling and monitoring of surgical processes, based on signals available in the operating room. The goal is to combine low-level signals with high-level information in order to detect events and trigger pre-defined actions. A main application is the development of context-aware operating rooms, providing adaptive user interfaces, better synchronization within the surgery department and automatic documentation.
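One common statistical tool for this kind of on-line monitoring is a hidden Markov model over surgical phases, updated recursively as operating-room signals arrive. The sketch below shows only the generic forward (filtering) recursion, not the specific models of the cited work; all names and parameters are illustrative:

```python
import numpy as np

def online_phase_filter(A, pi, emission_probs):
    """Recursively estimate the current phase of a process from a stream
    of observation likelihoods (HMM forward / filtering recursion).
    A: (K, K) phase transition matrix, A[i, j] = p(phase j | phase i).
    pi: (K,) initial phase distribution.
    emission_probs: iterable of (K,) vectors p(obs_t | phase)."""
    belief = None
    for b in emission_probs:
        if belief is None:
            belief = pi * b               # first observation
        else:
            belief = (A.T @ belief) * b   # predict, then correct
        belief = belief / belief.sum()    # renormalize to a distribution
        yield belief
```

Each yielded vector is the posterior over phases given all signals so far; thresholding or taking its argmax is what would trigger the pre-defined actions mentioned above.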

  • On-line Recognition of Surgical Activity for Monitoring in the Operating Room, Proceedings of the 20th Conference on Innovative Applications of Artificial Intelligence (IAAI 2008) pdf
  • A video showing the annotation of a surgery from the model learned on exemplary recordings.

copyright INRIA / Photos C. Lebedinsky