A wide range of topics related to sound, vision, multimedia technologies, multimodal interfaces, and many others are the subject of research in the Multimedia Systems Department (MSD). Solutions developed by MSD have been presented at a number of exhibitions, and many of them have received awards. Most of the developed solutions are also patented. The results of research are published in scientific journals and presented at scientific conferences. The publications, conference papers and patents are listed in the Database of references.

The most important topics of the current research are listed below.

  • Studio technology:
    • audio and video recording
    • signal processing
    • editing and mastering
    • post-production of audio and video
    • virtual reality
    • multichannel sound
  • Sound analysis and processing:
    • restoration of audio recordings
    • speech recognition
    • sound synthesis
    • algorithms for enhancing speech intelligibility and quality
    • spatial filtration and sound source localization
    • solutions assisting persons with hearing and/or speech disorders
  • Image and video analysis and processing:
    • object recognition in video
    • detection and tracking of moving objects
    • visual speech recognition
    • analysis of images from ToF, thermal and infrared cameras
    • computer character animation
    • motion capture systems
  • Multimodal human-machine interfaces:
    • interfaces for disabled people
    • eye-gaze interfaces for controlling a computer
    • gesture-driven interfaces
    • voice-driven interfaces
    • analysis of brain waves
  • Security technologies:
    • biometric recognition of persons
    • signature validation
    • detection of security threats in camera footage
    • detection of audio events
  • Information technology systems:
    • environmental monitoring, noise maps
    • multimedia telemedicine systems – hearing and eyesight diagnosis
  • Implementations of algorithms for multimedia processing:
    • digital signal processors (DSP)
    • development boards and embedded systems (Intel Galileo, Raspberry Pi, etc.)
    • parallel processing platforms (e.g. GPU)
    • data processing on supercomputers
  • Multimedia applications of machine learning:
    • cognitive systems
    • recognition and classification of sound and images
  • Musical acoustics:
    • recognition of musical sound and phrases
    • subjective assessment processing
    • listening tests
    • singing voice quality assessment
  • Sound reinforcement:
    • acoustical design of rooms
    • reinforcement systems design
    • acoustic adaptation of rooms
    • acoustic measurements in rooms
  • Mobile technologies:
    • diagnostics and health monitoring
    • novel methods of human communication

Research projects

The Multimedia Systems Department participates in many research projects, both European ones and those funded by the Polish Ministry of Science and Education and other Polish institutions.

Polish projects - ongoing

IDENT – Multimodal, biometric system for verification of bank client identity. The aim of the project is to develop a technology for automatic identity verification, resulting in high accuracy of the verification and increased efficiency of client verification systems. The project assumes developing a multimodal system, consisting of the hardware layer and a specialized software for data acquisition from various sensors, data processing and fusion, leading to a reliable verification of a bank client. The developed technology will be validated on a group of 10 000 persons.

HCIBRAIN – methods of human-machine interaction for diagnosis and stimulation of patients with severe brain damage. The project aims to develop an integrated multimodal system for stimulation of patients with brain damage, and for recording ABR, EEG and ERP signals, as well as gaze tracking. A validated procedure for diagnosis and polysensorial cognitive therapy will be developed, constituting an efficient and widely available approach to diagnosis and rehabilitation of patients who are unable to communicate, mainly those in a coma. Six specialized medical centers will take part in the evaluation of the developed prototype.

INPREDO – the main aim of the project is to develop an intelligent system for determining the optimal traffic speed limits. A set of tools will be developed for providing guidelines on the allowed traffic speed depending on various conditions. Detailed recommendations concerning criteria and procedures of setting traffic speed limits will also be created. An additional aim is to create Dynamic Maps showing the current state of roads, including the traffic density and the determined speed limits.

ALOFON – the project aims to develop a methodology for automatic phonetic transcription of English speech, based on the analysis of audio and video. Relationships between the allophonic differentiation in speech and the objective signal parameters will be researched. The project assumes that a method for detecting small differences in allophones and accent will be developed. The automated phonetic transcription method will be used in many solutions, including English language learning (especially remote learning), phonetic and phonological research (language corpora processing), automatic accent recognition, etc.

European projects - finished

COPCAMS (COgnitive & Perceptive CAMeraS) - ARTEMIS project funded under grant agreement No. 332913 (2013-2016). The project consortium consists of 21 partners from seven European countries. COPCAMS leverages recent advances in embedded computing platforms to develop large-scale, integrated vision systems. It aims to exploit new programmable accelerators, particularly many-cores, to power a new generation of greener, low-power smart cameras and gateways. This will be possible owing to a paradigm change: whereas the previous generation of systems had simple cameras connected to powerful centralised computing servers through high-bandwidth networking, the COPCAMS vision is to push low-power, high-performance computing to the edge of the system and into the distributed aggregators. These “smart cameras” and “smart aggregators” will process video streams, extract significant semantic information and decide locally whether or not the streams’ content is of interest and is worth propagating. The decentralised, distributed decision-making will save both energy and bandwidth, while opening up opportunities for new distributed applications.
Project page.

ADDPRIV (Automatic Data relevancy Discrimination for a PRIVacy-sensitive video surveillance) – the project seeks to improve public safety by ensuring individuals' right to privacy, enriching current video surveillance systems through automatic discrimination of the relevant recorded data. The project addresses the challenge of determining, in an automatic, accurate and reliable manner, which information obtained from a distributed system of surveillance cameras is relevant from the security perspective and which is not and can be safely deleted. This will limit unnecessary data storage and protect citizens' right to privacy. ADDPRIV proposes novel knowledge and developments to limit the storage of unnecessary data throughout existing multicamera networks, so that they better comply with citizens' privacy rights. ADDPRIV proposes solutions for automatic discrimination of relevant data recorded on a multicamera network and related to an individual whose suspicious behavior triggered an alert. Relevant data comprises not only video scenes capturing the individual's suspicious behavior (smart video surveillance), but also automatically extracted images of that individual recorded before and after the suspicious event and across the surveillance network.
Project page.

PERFORM - an integrated, high-budget European project from the domain of telemedicine, coordinated by Siemens. The Multimedia Systems Department is responsible for developing teleinformatic equipment designed for the remote monitoring of patients who suffer from neurodegenerative diseases (mainly Parkinson's disease).

INDECT - continuation and extension of the SECURITY project. The project is a Europe-wide venture, which will be realized through the cooperation of Polish, German and European police, and many prominent Polish and European technical universities. The Gdansk University of Technology is the initiator of and main contributor to the project. The project was approved in September 2007 by the European Commission, which funds it, and has a budget of several million euros. SECURITY is the first integrated European project from the domain of security technologies prepared and coordinated in Poland.

PRESTOSPACE - Preservation towards storage and access, Standardised Practices for Audiovisual Contents in Europe (FP6-IST-707336)
An integrated project of the 6th Framework Programme, realized in cooperation with corporations such as the BBC and RAI. The Gdansk University of Technology was responsible for developing tools for the reconstruction of archive materials (old recordings and films). European archive repositories store nearly 200 million audio-visual materials, part of which could be protected from further deterioration through the use of these tools.

DESYME - Development System for Mobile Services, European CELTIC project.
An international project, completed in 2007, that enables the independent design and programming of various mobile phone services (formerly reserved exclusively for mobile network operators).

Polish projects - finished

MODALITY – a project realized by the Audio Acoustics Laboratory, Multimedia Systems Department and Intel Technologies Poland. The project aims to enhance the audio and audiovisual communication with mobile computers. The experiments are aimed at improving the parameters of the audio system on mobile computers and human-machine interfaces. The two main topics are: Smart Sound technology and audiovisual speech recognition.

INNOTECH – a system for spatial recognition of gestures with feedback. A project realized by the Multimedia Systems Department and Samsung Electronics Poland, co-funded by the National Center for Research and Development (NCBR) as a part of the In-Tech program INNOTECH (INNOTECH-K1_IN1_41_159382_NCBR_12).

MULTIMODAL - A new range of computer multimodal interfaces and their implementation in education and medicine (6 ZR9 2007 C/06828). The aim of the project is to develop and implement novel methods of human-computer communication that interact with the user through means other than a traditional mouse and keyboard. The computer will be able to communicate with the user in several ways - through tracking eye movement and visual attention, through an "intelligent ball pen" in cases where dyslexia therapy is necessary, or through tracking lip movement as an aid for people with paralyzed hands.

MAYDAY EURO 2012 - Supercomputer Platform for Contextual Analysis of Multimedia Data Streams for Identification of Specified Objects or Hazardous Events. A structural project under the Operational Programme Innovative Economy 2007-2013, Priority 2, R&D Infrastructure, Measure 2.3. Investments related to the development of the infrastructure of science, Sub-measure 2.3.3. Projects in the development of advanced communication services and applications. Main tasks: construction of the KASKADA platform (Context Analysis of Camera Data Streams for Alarm Defining Applications) with fiber-optic network connections from specific locations in the Tri-city; development of algorithms and analysis services for multimedia streams necessary for the development of three pilot applications: (1) protection of intellectual property, (2) supporting medical research, (3) identification of persons and events; development of repository services for the construction of further applications, and the qualitative and utilitarian assessment of those applications.
Project page.
KASKADA platform home page.

SYNAT (System for Science and Technics) – a research task realized in 2010-2013 by a network of 16 Polish scientific institutions. The aim of the project was to develop a hosting and communication platform for digitized knowledge, to be used by researchers, scientific institutions, students, etc. The project was funded by the National Center for Research and Development (NCBR), SP/I/1/77065/10. The Multimedia Systems Department carried out three tasks: Semantic methods of data searching in large collections of text documents; Methodology of integration of heterogeneous knowledge sources; Subsystems for the analysis of multimedia repositories, archiving and searching for multimedia data.

SECURITY - Multimedia system assisting in identification and prevention of delinquency (including violence in schools) and terrorism (R00-O0005/3).
The project is supported by the Polish Internal Security Platform. Its results will make it possible to monitor the degree of security in stadiums, schools and other places threatened by acts of terror. The idea of the project is to design and develop teleinformatic tools that would supplement the functions of existing audio and video monitoring systems. The extension will be a function for automatic image and sound interpretation, which will let computer systems automatically discover potential threats and send alerts to the services responsible for public order and security.

NOISE - Methods for monitoring of urban agglomerations using modern information technology solutions and geoinformation technologies (R02 010 01)
The project's aim is to develop teleinformatic tools capable of monitoring noise and road traffic in agglomerations. The concept of this project was used by the City Council of Gdansk. Independently, the Gdansk University of Technology signed a license agreement with the DGT company concerning the implementation of intelligent wireless monitoring stations in other cities.

APARATY_SŁUCHOWE (HEARING_AIDS) - New Methods of Signal Processing for Hearing Aids Applications (3 T11E02829)
A project dedicated to special non-invasive hearing aids, especially for newborn babies.

LARYNX - New Electronic Device for Patients after Laryngectomy.
An original concept of an artificial larynx for persons after laryngectomy, i.e. after amputation of the larynx. A digital larynx and a miniaturized synthesizer were designed and produced in cooperation with "Intech", a company from Gdansk. The project was subsidized by a dedicated grant from the Chief Technical Organization Federation of Scientific and Technical Associations. "Intech" is now starting serial production of the larynx prosthesis under license from the Gdansk University of Technology.

CEMET - Centre of Medical Technologies (FP5 Excellence Center)

International Center of Hearing and Speech, PROKSIM, Warsaw-Gdańsk (Excellence Center)

Dithering Strategy Applied to Tinnitus Masking (project co-funded by the Institute of Physiology of Hearing)

VoIP - Hybrid Speech Codec for VoIP Telephony Employing Combined Source and Perceptual Coding (No. 3 T11D 004 28).
A project dedicated to the invention and development of more effective speech coders intended for Internet telephony.

SDSA - Engineering and introduction to clinical tests of a prototype series of digital speech prostheses based on spectral modification of signals in the auditory feedback loop - a project whose aim was to miniaturize the speech prosthesis (invented and developed at the Gdansk University of Technology) intended for persons who stutter.

INFOPILOT - Air force digital system for the recording and the restoration of speech (148346/C-T00/2002).
A system that records speech, improves its quality and handles transmission between ground stations and military aircraft pilots. Implemented in 2005 at the Polish military pilot training school in Deblin.

Expert System for Automatic Classification of Singing Voices (3 T11F 023 30)

New Methods for Forming and Ranking Musical Rhythm Hypotheses in Musical Excerpts (3 T11F02729)

Development and Implementation of the Universal System for the Diagnosis of Environmental Noise (internal University grant)

Development of the concepts and applications of intelligent multimedia techniques - under the Subsidy for Scholars of the Foundation for Polish Science

4T11D01422 - New methods for searching and discovering multimedia content in telecommunication networks

7T11E05220 - Method for the assessment of cochlear implants efficiency

8T11D00218 - Methods of sound processing for the purpose of multichannel multimedia transmission

8T11D02819 - Perceptual coding of audio employing intelligent decision algorithms

8T11E03415 - New algorithms of digital hearing aids and methods for hearing aid fitting

8T11D02112 - New methods of intelligent filtration and coding of audio

8T11E03310 - New methods for the diagnosis and therapy of hearing impairments employing digital signal processing technology

4PO5D01609 - Correction of speech impairments based on signal modification in the auditory feedback loop

8T11C02808 - Applications of artificial intelligence methods to data analysis and processing in acoustics

7TO7B02009 - New methods of digital sound synthesis

8S50302106 - Rough sets applications

8T11D00208 - Development of methods for digital restoration and processing of audio signals

8S50401005 - Digital restoration and processing of audio signals

883169203 - Computer speech recognition system