Lab 2


Task

The goal of the lab is to construct a lossy coder and decoder for music data. The coder should meet given demands on distortion and rate when coding the test music.

You should work in groups of 1-2 students.

Demands

To pass you should submit a written report, describing how your coder works and what results you get.
In order to pass you must meet these demands:
  • At a rate of at most 192 kbit/s, the signal-to-noise ratio (measured over the whole song) when coding the two test songs should be at least 30 dB. The cost of all side information (quantization parameters, code trees, etc.) must be included in the rate.
  • The coder should be general, i.e. it should be able to code any music and not just the two test songs, giving roughly the same results.
  • The coding must be done in such a way that the decoder can jump to any position in the song without having to decode all the data up to that position. To make this possible, the data should be coded in blocks of at most 4096 stereo samples (4096 samples from each of the left and right channels). This limits the size of any transforms used and also means that any adaptive coding can only depend on the data inside the block. Variable length codes and quantization parameters are allowed to depend on the whole file. These parameters can be seen as header information that is sent once at the beginning of the file.
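
As a rough illustration of the first and third demands, the sketch below (in Python, since any language is allowed) computes an SNR over a whole signal, converts a bit count into kbit/s, and splits a channel into blocks of at most 4096 samples. The function names are made up for this example, and the SNR is assumed here to be total signal energy divided by total error energy; check that against the definition used in the course.

```python
import math


def snr_db(original, decoded):
    """SNR in dB over the whole signal (assumed: signal energy / error energy)."""
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / noise_energy)


def rate_kbit_per_s(total_bits, num_stereo_samples, fs=44100):
    """Average rate in kbit/s. total_bits must include all side information."""
    duration_s = num_stereo_samples / fs
    return total_bits / duration_s / 1000.0


def split_into_blocks(samples, block_size=4096):
    """Split one channel into consecutive blocks; the last block may be shorter."""
    return [samples[i:i + block_size] for i in range(0, len(samples), block_size)]
```

For the stereo test songs, one reasonable choice is to sum the energies over both channels before taking the ratio, and to pass the number of stereo samples (not twice that number) to rate_kbit_per_s.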
You don't have to go all the way down to the bitstream level when constructing your coder and decoder. It's OK to just count how many bits you would need if you created an actual bitstream (coded file).
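
One way to count bits without building a bitstream is sketched below. It assumes an ideal entropy coder matched to the empirical distribution of the quantizer indices, which gives a lower bound; if you use a Huffman code, sum the actual codeword lengths instead, since they will be somewhat longer. The function names and the 16 bits per table entry are hypothetical choices for this example, not something prescribed by the lab.

```python
import math
from collections import Counter


def ideal_code_bits(symbols):
    """Bits used by an ideal entropy coder for this symbol sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-c * math.log2(c / total) for c in counts.values())


def side_info_bits(num_distinct_symbols, bits_per_table_entry=16):
    """Hypothetical header cost: one fixed-size table entry per symbol."""
    return num_distinct_symbols * bits_per_table_entry


# Toy quantizer indices; real data would come from your own coder.
quantized = [0, 0, 1, 0, -1, 0, 0, 2, 0, 1]
total_bits = ideal_code_bits(quantized) + side_info_bits(len(set(quantized)))
print(f"estimated total: {total_bits:.1f} bits")
```

Whatever counting method you use, the header cost is part of the total that must stay within the rate limit.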

The report should also mention the following things:

  • Listen to the decoded music and give a subjective assessment of how well your coder works. You could also try to code other music besides the two given test songs.
  • What is the lowest rate you can get with your coder without hearing any difference between the original music and the decoded music?
  • What is the lowest rate you can get and still have a decent sound quality?

There will also be a top list of everybody's results, sorted by SNR.

Test data

The music files to be coded can be found here.

The music is in stereo, 16 bits per sample, sampling frequency 44.1 kHz. The raw data rate is thus 1411.2 kbit/s.
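
To get a feeling for what the 192 kbit/s limit means, the arithmetic can be worked out as in the short Python sketch below. The constants come from the text above; the variable names are only for illustration.

```python
FS = 44100             # sampling frequency in Hz
CHANNELS = 2           # stereo
BITS_PER_SAMPLE = 16   # resolution of the original data
TARGET_RATE = 192_000  # bit/s, the rate limit from the demands
BLOCK_SIZE = 4096      # maximum number of stereo samples per block

raw_rate = FS * CHANNELS * BITS_PER_SAMPLE              # bit/s
compression_ratio = raw_rate / TARGET_RATE
bits_per_sample_budget = TARGET_RATE / (FS * CHANNELS)  # per channel sample
bits_per_block_budget = TARGET_RATE * BLOCK_SIZE / FS   # per full block

print(f"raw rate: {raw_rate / 1000:.1f} kbit/s")
print(f"compression ratio: {compression_ratio:.2f}")
print(f"budget: {bits_per_sample_budget:.2f} bits per sample")
print(f"budget: {bits_per_block_budget:.0f} bits per full block")
```

So the coder has to get from 16 bits down to a little over 2 bits per sample on average, with any header information counted against the same budget.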

The files are available as complete WAV files and also split into smaller parts.

Methods

You can use any coding methods that you learn in this course to solve the problem.

You are free to choose any programming language for your implementation. Matlab is probably the easiest choice, since there are ready-made functions for transforms and other matrix operations. Beware though that you might end up with slow programs if you're not careful.

A few things to remember if you choose to use Matlab:

  • Loops are very slow in Matlab. Try to use Matlab's vector operations as much as possible.
  • Avoid adding elements at the end of large vectors so that they change size. Instead you should create sufficiently large vectors at the start (for instance by using zeros) and then change the contents of the vector.

In Matlab you can read WAV files using the function audioread. This gives a matrix where the samples have been scaled to lie between -1 and 1.
To write the decoded data to a WAV file, use the function audiowrite.
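
If you work in Python instead of Matlab, something similar can be done with the standard library, as in the sketch below. It only handles 16-bit PCM files, and the function names read_wav_16bit and write_wav_16bit are made up for this example. Scaling by 1/32768 is an assumption chosen so that the samples end up between -1 and 1.

```python
import array
import sys
import wave


def read_wav_16bit(path):
    """Return (channels, fs); each channel is a list of floats in [-1, 1)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        num_channels = wf.getnchannels()
        fs = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    data = array.array("h")
    data.frombytes(raw)
    if sys.byteorder == "big":
        data.byteswap()  # WAV sample data is little-endian
    # Samples are interleaved (L, R, L, R, ...), so take every n-th value.
    channels = [[s / 32768.0 for s in data[c::num_channels]]
                for c in range(num_channels)]
    return channels, fs


def write_wav_16bit(path, channels, fs):
    """Write equally long channels of floats in [-1, 1] as 16-bit PCM."""
    data = array.array("h")
    for n in range(len(channels[0])):
        for ch in channels:
            value = int(round(ch[n] * 32768.0))
            data.append(max(-32768, min(32767, value)))  # clip to 16 bits
    if sys.byteorder == "big":
        data.byteswap()
    with wave.open(path, "wb") as wf:
        wf.setnchannels(len(channels))
        wf.setsampwidth(2)
        wf.setframerate(fs)
        wf.writeframes(data.tobytes())
```

Note that the decoded signal is clipped before writing, since quantization errors can push samples slightly outside the valid range.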

Report

Send an electronic version of your report (in PDF format) to Harald. Give the name, personal identity number and email address of every group member.

Deadline

Lab results are reported to Ladok at the same time as exam results, three times a year. If you want to get your points early, you should send in your report no later than May 27th.

Acknowledgements

Thanks to Kaye for letting us use her songs "Kiss-think-blink-think" and "Chanson de nuit" as test data for this lab.

Questions?

If you have any questions about the lab, contact Harald.

Page responsible: Harald Nautsch
Last updated: 2019-03-25