
Music Decoding (SO close!)

BGNG

New member
In the F-Zero X ROM (North American, Non-byte swapped: Z64), I zeroed all the bytes from offset 0x800000 to 0x810000. This happens to be a chunk of the Mute City music data. When listening to the music in-game, the music simply stops dead silent during the zeroed data and picks back up right where it would have been if I hadn't done anything at all.

This suggests that the encoding scheme for the music uses a very rudimentary method. This also suggests that the music is stored with a constant bitrate.

When I copied the data to offsets 0x810000 through 0x820000, however, the music didn't repeat itself. Instead, there was some very loud, distorted noise. This suggests that there is at least some form of compression involved; the music is not raw PCM data.
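A minimal Python sketch of the two ROM experiments above (zeroing a range, then copying it over the following range). The helper names and the file names in the usage comment are hypothetical; only the offsets come from the post.

```python
def zero_range(rom: bytearray, start: int, end: int) -> None:
    """Overwrite rom[start:end] with zero bytes, in place."""
    if not 0 <= start <= end <= len(rom):
        raise ValueError("range lies outside the ROM image")
    rom[start:end] = bytes(end - start)


def copy_range(rom: bytearray, src: int, dst: int, length: int) -> None:
    """Copy `length` bytes from offset `src` to offset `dst`, in place."""
    if src < 0 or dst < 0 or length < 0:
        raise ValueError("offsets and length must be non-negative")
    if max(src, dst) + length > len(rom):
        raise ValueError("range lies outside the ROM image")
    rom[dst:dst + length] = rom[src:src + length]


def patch_file(in_path: str, out_path: str, patch) -> None:
    """Load a ROM image, apply `patch(rom)`, and write the result to a new file."""
    with open(in_path, "rb") as f:
        rom = bytearray(f.read())
    patch(rom)
    with open(out_path, "wb") as f:
        f.write(rom)


# Hypothetical usage (file names are placeholders, not from the post):
# patch_file("fzerox.z64", "fzerox_zeroed.z64",
#            lambda rom: zero_range(rom, 0x800000, 0x810000))
# patch_file("fzerox.z64", "fzerox_copied.z64",
#            lambda rom: copy_range(rom, 0x800000, 0x810000, 0x10000))
```

Writing to a separate output file keeps the original ROM untouched, so each experiment starts from the same data.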
__________

To verify my suspicions, I dumped the data from offset 0x800000 to 0x810000 to a file on disk and read it as if it were raw PCM data at varying sample rates. What I found was that the data most certainly is not PCM data at 8 or 16 bits (big or little endian), with 1 or 2 channels.

What I did find, however, is that at 8 bits, 1 channel, 11025 Hz, you can identify the exact sounds of the music through all the static, albeit somewhat distorted. This could mean a number of things, but I figure it's one of two: 1) The static may represent block headers in the music, in which case decoding it completely is only a simple matter of writing the code; or 2) The static may represent sequencing information that tells the audio driver to continue playing/looping certain chunks of previous data.

My basis for number 2 is the fact that I was unable to identify any bass in the extracted ROM data, but there's plenty of bass in the decoded music.
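For anyone wanting to repeat the "read it as raw PCM" test, here is a rough Python sketch using the standard `wave` module. The function name is made up for illustration; the defaults match the 8-bit, 1-channel, 11025 Hz settings described above.

```python
import io
import wave


def raw_to_wav(raw: bytes, sample_rate: int = 11025,
               sample_width: int = 1, channels: int = 1) -> bytes:
    """Wrap raw bytes in a WAV header without changing the bytes themselves.

    WAV treats 8-bit samples as unsigned and 16-bit samples as signed
    little-endian, so big-endian 16-bit data would need its byte pairs
    swapped before calling this.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)   # bytes per sample: 1 = 8-bit, 2 = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(raw)
    return buf.getvalue()


# Hypothetical usage: try several interpretations of the same dump.
# with open("ROM Data.bin", "rb") as f:
#     dump = f.read()
# for rate in (8000, 11025, 22050):
#     with open(f"guess_{rate}.wav", "wb") as out:
#         out.write(raw_to_wav(dump, sample_rate=rate))
```

Keep the volume low when playing the results, since compressed data read as PCM is mostly loud noise.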
__________

So I come to EmuTalk for help. I've prepared a .wav file of the ROM data at 8 bits, 1 channel, 11025 Hz. I've included it in a ZIP file with a plain dump of the ROM data and an MP3 I recorded that shows what the music in the extracted chunk sounds like when it's finished decoding. The files are as follows:

Data Wave (really loud).wav - The prepared wave file
Real Thing.mp3 - The recorded MP3
ROM Data.bin - The extracted ROM data

CAUTION: When listening to the data wave file, turn your speakers down almost all the way. The static in the file produces incredibly loud noise that may damage your ears if your speaker's volume is set too high. Nonetheless, if you listen to the file, you can hear the music in question rather easily through the noise.

The Real Thing MP3 was recorded off of F-Zero X in a real N64. I cut out other parts of the data, and what's left is reasonably close to the exact decoded data that can be generated by the given ROM data file. Also in that MP3 is the overlaying sound of the machine engine; a sound which will not be present when the ROM data is decoded.
__________

As you may recall: about a year ago, I was asking about MIO0 decoding so I could extract the course texture images for use in a level editor for F-Zero X. I've decided to give that project another try, and I'd eventually like the option to choose which music plays when you play on an edited course that is patched to the ROM. In order to do a music preview, I'll need to know how to decode the music. (I've figured out the MIO0 compression algorithm, by the way)

So I ask for help decoding the music. I offer to anyone who helps me reach my goal the opportunity to beta test the level editor before its first release.
__________

NOTE: The F-Zero X Expansion Kit, released in Japan only for the N64DD, enabled the music to be played in stereo, while the retail version of the game only gives mono music even if the sound option in the Options menu is set to stereo. I suspect that the data stored in the retail ROM might be decodable to stereo output, but the game mixes the channels to mono before it's output from the N64. Bear this in mind if you try to decode the ROM data file I've provided.
 

Azimer

Emulator Developer
Moderator
N64 music is encoded as wavetable synthesis. Basically MIDI. You provide sound banks (instruments) and a sound control file. The synthesizer takes care of the rest. It is much more efficient in size than pure PCM or MP3. There isn't a lot of information on the internal formats. I do know most of the sound banks are encoded as ADPCM and the sound control files are not in a standard format (like MIDI).
 

BGNG

New member
Thank you for your response, Azimer. ADPCM may provide some insight. However, I do not believe that F-Zero X adheres to the general rule in regards to music.

The N64 supports on-chip the ability to play sampled data as musical notes, but that isn't the way it works in all N64 games. The perfect example of deviance is H2O's Tetrisphere and The New Tetris. The music author, Neil Voss, wrote custom audio drivers to play his music data; it doesn't use the standard sound chip output method for music.

F-Zero X, if you listen to the tunes, has music too advanced to be played as sequenced music notes. I have a few pieces of evidence to back this up:

1) The music sounds very bright and well-mastered. The notes are played in a way that it would be very difficult to sequence it without advanced, dedicated software which is, to date, not to be found on N64 software.

2) When I zeroed out the chunk of data in question and listened to the music in-game, the music stopped completely; all at once. If the notes were sequenced, then the notes that were (at the time) currently playing would have continued playing throughout the remainder of their duration, but this was not the case.

3) There's a GameShark/Action Replay code out there which lets you modify the pitch of the music that plays. For example, at a lower pitch, the music will take longer to play and all the sounds will be lower and distorted. If the music was sequenced, then only one sound or perhaps the tempo would be lowered; but not every sample of the tune.
__________

I will give ADPCM a try, though; thanks for mentioning that. The audio program I was working with (Audacity) doesn't support it, so I didn't try it. But thanks for the heads-up. It just might get me closer to where I want to go.
 

blight

New member
BGNG said:
... My basis for number 2 is the fact that I was unable to identify any bass in the extracted ROM data, but there's plenty of bass in the decoded music....

Sounds much like ADPCM to me ;-)
 

BGNG

New member
I don't quite understand. What about the bass tipped you off about it being ADPCM? Is that simply a common "tell-tale sign" of ADPCM?
 

BGNG

New member
Okay, I've created an ADPCM codec and I must say: it's an INGENIOUS little algorithm. Nevertheless, the algorithm alone couldn't decode the ROM data I dumped into the Mute City music... quite.

What I have in the ROM data is a vaguely recognizable Mute City track with a bunch of static. After ADPCM decompression, I'm left with an easily recognizable Mute City track in 16-bit with some bass and... still, a bunch of static. The music that I decoded plays a bit slow, which is a dead giveaway that there's more data there than just the audio data.

I also discovered during my work with ADPCM that there is no such thing as "random access," which means that the way the music resumed playing after my zeroed-out chunk should theoretically be impossible. The .wav and .aiff file formats account for this by storing samples in blocks of data with short headers, each of which include the "predictor" variable values at that point in the stream as used by the decompressor.
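To illustrate why the predictor state matters, here is a small Python sketch of the standard IMA ADPCM nibble decoder. This is the common IMA variant, not necessarily what F-Zero X uses, and the function names are only for illustration. The point is that each output sample depends on a running predictor and step index, so a decoder can only pick up mid-stream if that state is stored somewhere, such as in a block header.

```python
# Standard IMA ADPCM tables (89 step sizes, 8 index adjustments).
STEP_TABLE = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143,
    157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544,
    598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707,
    1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871,
    5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
INDEX_TABLE = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_nibble(nibble: int, predictor: int, index: int):
    """Decode one 4-bit code; return the updated (predictor, index) state."""
    step = STEP_TABLE[index]
    diff = step >> 3
    if nibble & 4:
        diff += step
    if nibble & 2:
        diff += step >> 1
    if nibble & 1:
        diff += step >> 2
    if nibble & 8:              # top bit of the nibble is the sign
        predictor -= diff
    else:
        predictor += diff
    predictor = max(-32768, min(32767, predictor))          # clamp to 16-bit
    index = max(0, min(88, index + INDEX_TABLE[nibble & 7]))
    return predictor, index


def decode_block(nibbles, predictor: int, index: int):
    """Decode a run of nibbles from a given starting state into 16-bit samples."""
    samples = []
    for n in nibbles:
        predictor, index = decode_nibble(n, predictor, index)
        samples.append(predictor)
    return samples
```

Decoding the same nibbles from two different starting states gives different samples: `decode_block([7, 7], 0, 0)` and `decode_block([7, 7], 0, 8)` do not match. That is why a stream without stored state cannot be entered at an arbitrary point.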
__________

I'm fairly certain I'll eventually uncover the structure of those headers myself, but by chance does anyone know off-hand how Nintendo likes to store ADPCM data in their ROMs?
 

thakis

New member
The ADPCM codec used for the audio frames in the THP video codec (a GameCube video format) is described in yagcd; perhaps it was used on the N64 as well (I doubt it, though).
 

BGNG

New member
Okay, I've found a "Nintendo Confidential" document that describes some of the lower-level implementations of the N64 music system.

http://n64.icequake.net/doc/n64intro/kantan/step2/3-4.html

It's looking like AIFC ADPCM is the way stuff is compressed in Nintendo's games, so I'll look into that.

EDIT:
I've attached a ZIP file with files describing the AIFC file format and the IMA4 compression scheme as used in the AIFC file format.
 

thakis

New member
The information on compression in the Apple spec is not very informative, but the other document included in the ZIP looks like a really good overview to me, thanks :) (I was looking for ADPCM docs when I was trying to understand THP audio, and I found no good ones - I found the Apple spec, but, as I said...)

The confidential document only gives a 404 though. The link was working when I checked it at university this morning, but now that I wanted to download this document, it's gone :-(
 

BGNG

New member
It seems to be fine at the moment. But in the event it crashes again, I've copied the file and attached it to this post in a ZIP file.
__________

In other news, I've found that the music data in the F-Zero X ROM almost always has an unusually low value between every set of 8 samples, which suggests that the data is stored in somewhat smaller blocks than the AIFC specification describes. I'll analyze some of my own ADPCM data to see what's up with that.
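One way to check for a recurring low value is to average the bytes at each position within a guessed block length. The Python sketch below does that. The block length in bytes is an assumption to be scanned over, since it isn't known here, and the function name is hypothetical.

```python
def mean_by_phase(data: bytes, period: int) -> list:
    """Average byte value at each offset within a repeating block of `period` bytes.

    If the data really is laid out in blocks of `period` bytes with a small
    header value at a fixed position, that position's mean stands out as low.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    sums = [0] * period
    counts = [0] * period
    for i, b in enumerate(data):
        sums[i % period] += b
        counts[i % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Hypothetical usage: scan a few candidate block lengths on the dump.
# with open("ROM Data.bin", "rb") as f:
#     dump = f.read()
# for period in range(2, 40):
#     means = mean_by_phase(dump, period)
#     print(period, round(max(means) - min(means), 1))
```

A block length whose spread between the highest and lowest mean is much larger than its neighbours' is a candidate for the real layout. Multiples of the true length will show the same effect.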
 
