
Nanyang Technological University School of Computer
Engineering (http://www.ntu.edu.sg/sce)
have recently been working on a number of audio detection projects. The
outcome is a set of algorithms to detect the presence of various types
of sound.
The system has also been tested with some other sounds, and we are working on ways for a user to train it on new sounds. The system is now being developed into a small handheld unit for the hearing impaired.
The portable system alerts the user through an LCD display that indicates which sound is currently being heard. Additionally, some urgent or important sounds (like the fire alarm) can cause the unit to vibrate like a pager for instant attention.
When development is
complete, the portable unit will be small enough to fit into a pocket or
be carried in the palm of one hand.
It will run on rechargeable batteries that should allow up to a week of continuous operation.
The system is under active development. The software is still being finalised, but the prototype hardware has been designed and will be manufactured by the end of April 2001. The prototype can then be put into production in 2002, depending on the availability of industry backing.
Please feel free to send any comments, questions
or suggestions to:
Assistant Professor Dr Ian McLoughlin
email: asian@ntu.edu.sg
tel: 790 6230
website: http://www.lintech.org
Detailed Technical Description
The software samples 16-bit audio at around 8 kHz from an electret microphone connected to a CD-quality analogue-to-digital converter. The sampled audio is divided into frames, usually of 512 samples (about 64 ms at 8 kHz). A set of audio features, such as zero-crossing rate (ZCR), average magnitude difference function (AMDF), frequency centroid and others, is computed over each entire frame.
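As a rough illustration of the three named features, the following Python sketch frames a sample stream and computes ZCR, AMDF and frequency centroid for one frame. It is not the project's code: the function names, the non-overlapping framing, the choice of AMDF lag and the naive DFT are all assumptions made for the example, using only the sample rate and frame size quoted above.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, "around 8 kHz" in the text
FRAME_LEN = 512     # samples per frame, the usual size in the text


def frames(samples, frame_len=FRAME_LEN):
    """Yield consecutive non-overlapping frames (overlap is not specified in the text)."""
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        yield samples[start:start + frame_len]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def amdf(frame, lag):
    """Mean absolute difference between the frame and a copy shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def frequency_centroid(frame, sample_rate=SAMPLE_RATE):
    """Magnitude-weighted mean frequency in Hz, using a naive one-sided DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


if __name__ == "__main__":
    # Illustrative input: a 1 kHz test tone quantised to 16-bit-style integers.
    tone = [
        int(round(10000 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)))
        for t in range(FRAME_LEN)
    ]
    print(zero_crossing_rate(tone), amdf(tone, 8), frequency_centroid(tone))
```

For a periodic sound the AMDF dips towards zero when the lag matches the period, which is why it is often used as a cheap pitch cue. A real embedded implementation would more likely use an FFT than the naive DFT shown here.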
A complex fuzzy neural network, previously trained to recognise such sounds, then attempts to classify the current sound using these features, along with some higher-order statistics of those features. It does this nominally twice per second.
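The figures quoted above imply roughly eight frames per classification. The sketch below works through that arithmetic and shows one plausible way to summarise a per-frame feature over such a window. The text does not say which higher-order statistics are used, so variance, skewness and excess kurtosis are assumptions here, as is the function name `moments`.

```python
import math

SAMPLE_RATE = 8000         # Hz, from the text
FRAME_LEN = 512            # samples per frame, from the text
DECISIONS_PER_SECOND = 2   # "nominally twice per second"

frames_per_second = SAMPLE_RATE / FRAME_LEN                     # 15.625
frames_per_decision = frames_per_second / DECISIONS_PER_SECOND  # 7.8125


def moments(values):
    """Return (mean, variance, skewness, excess kurtosis) of a sequence.

    Population (divide-by-n) definitions are used; this is an illustrative
    choice, not something stated in the source.
    """
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    var = sum(c * c for c in centred) / n
    if var == 0:
        # A constant feature has no shape to describe.
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(var)
    skew = sum(c ** 3 for c in centred) / (n * std ** 3)
    kurt = sum(c ** 4 for c in centred) / (n * var ** 2) - 3.0
    return mean, var, skew, kurt
```

Applied to, say, the last eight ZCR values, `moments` yields four numbers that could be appended to the classifier's input alongside the raw features.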
The network can classify a sound
into (at present) one of 16 different types. If it has never heard
exactly that type of sound, it tries to generalise by deciding whether it is
similar to one of the built-in sounds, for example "speech-like"
or "alarm-like", and informs the user accordingly, with an indication
that it is not too sure of this sound.
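The fallback behaviour can be pictured with a much simpler stand-in than the fuzzy neural network itself. The following sketch assumes the classifier yields one score per known sound. If the best score is below a threshold, it pools the scores by broad family and reports the family, flagged as uncertain. The threshold value, the sound labels other than the fire alarm, the family mapping and all function names are hypothetical.

```python
from typing import Dict, NamedTuple


class Decision(NamedTuple):
    label: str
    confident: bool


# Hypothetical mapping from specific sounds to the broad families in the text.
FAMILY = {
    "fire alarm": "alarm-like",
    "clock alarm": "alarm-like",
    "speech": "speech-like",
}


def decide(scores: Dict[str, float], threshold: float = 0.6) -> Decision:
    """Pick a specific sound if its score is high enough, else a broad family."""
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return Decision(best, True)
    # Not sure of any single sound: add up the scores within each family.
    pooled: Dict[str, float] = {}
    for label, score in scores.items():
        family = FAMILY[label]
        pooled[family] = pooled.get(family, 0.0) + score
    return Decision(max(pooled, key=pooled.get), False)


def describe(decision: Decision) -> str:
    """Format the text that a display might show (wording is illustrative)."""
    suffix = "" if decision.confident else " (not sure)"
    return decision.label + suffix
```

For example, `describe(decide({"fire alarm": 0.35, "clock alarm": 0.3, "speech": 0.35}))` gives `alarm-like (not sure)`, because no single sound reaches the threshold but the two alarm scores together outweigh speech.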
The hardware setup is as follows:
The unit is driven by a 220 MHz 32-bit RISC processor, the StrongARM SA1100. This processor is also used in other devices such as MP3 players, the HP Jornada PDA and the Corel Netwinder office internet server.
It runs the Linux operating system and, with reprogramming, is capable of functioning as a full electronic organiser with alarm clock, voice recorder, MP3 player and so on. Software updates can be downloaded to the unit by connecting it to a PC.
It contains 4 MBytes of RAM and 4 MBytes of flash EPROM. There are stereo 16-bit ADC and DAC converters onboard, two RS232 ports, and support for IrDA and other connections.
The hardware comprises a 6-layer 9 cm x 5 cm PCB and a separate LCD module.
The hardware and software have been designed in conjunction with students of the NTU School of Computer Engineering. The system is entirely designed and built in Singapore.