Notes on Creating Voice Files for DMS
=====================================

DMS Voices
==========

DMS voices consist of one 256-byte page of "envelope" information followed by some number of 256-byte "waveforms" for the voice.

Each byte in the envelope page is a number between 1 and n pointing to one of the n following pages of waveforms. As a note is sounded, the envelope bytes are fetched consecutively, with each being used to generate 256 samples (about 23 milliseconds). During each 256-sample period, the envelope byte indicates which following waveform page to use for sound generation.

NOTE THAT NO RANGE CHECKING IS DONE ON THESE VALUES BEFORE THEY ARE USED. PLAYING ANY VOICE WITH AN ENVELOPE VALUE OUTSIDE THE VALID RANGE OF 1..N (WHERE N WAVEFORMS FOLLOW) WILL RESULT IN LOSS OF CONTROL BY DMS, REQUIRING THE MACHINE TO BE RESTARTED.

Each byte in a waveform page is a pointer to the page in memory containing the DMS pulse generator for that particular sample value, ranging from $08 to $27.

NOTE THAT NO RANGE CHECKING IS DONE ON THESE VALUES BEFORE THEY ARE USED. ANY SAMPLE VALUE OUTSIDE THIS RANGE WILL RESULT IN LOSS OF CONTROL BY DMS, REQUIRING THE MACHINE TO BE RESTARTED.

The program to be discussed automatically creates voice files with proper pre-checked values, so that they can be played by DMS.

Instrument Samples
==================

Since the sample rate for all voices is 11kHz, and each waveform contains a single cycle of the instrument's fundamental frequency in 256 samples, the sampled frequency should be approximately 43Hz (11,025 samples per second divided by 256 samples per cycle), or the note F in octave 1.

A good starting point for creating a new voice is a sample of the desired instrument playing an F1, with a sample rate of 11.025kHz, monophonic, with 8-bit unsigned samples. Such a .wav or .snd file can be manipulated to create a DMS voice.

Not all instruments are candidates for DMS voices. No frequency variation during the sounding of a note is supported.
As a result, any instrument whose intonation involves a fundamental frequency "twang" or a vibrato is unsuitable.

You may find it useful to preview or pre-process an instrument sample file using SOUND.EDITOR. It is limited to viewing and editing sounds of less than two seconds duration, but that is enough for many instruments. The instructions for SOUND.EDITOR are included as an Apple text file (SOUND.ED.DOC) on the accompanying disk archive.

Converting a Sample to a Voice
==============================

I have written an Applesoft program to help with the creation of new voice files, but it should be understood that it is only a tool written for my personal use--it is in no way polished or complete. Anyone well-versed in Applesoft BASIC will be able to understand both how it can help with voice file creation and how far it is from a finished tool!

The Applesoft BASIC program GEN.TONAL is included on the accompanying ProDOS disk.

The GEN.TONAL program takes an instrument sample file (.wav or .snd) and displays it so that the user can isolate single-cycle waveforms that characterize its attack phase and its sustain or decay phase. As the user moves along the timeline and selects waveforms, the program constructs the envelope table and adds the waveforms to the voice file.

Using GEN.TONAL
===============

When RUN, GEN.TONAL will ask for the "debug level". You should usually answer zero ("0"). It will then ask for the name of the sample file. (You can see a CATALOG of the current directory by simply typing a RETURN.)

GEN.TONAL will then ask "How many subvoices (amplitudes)?", to which you should answer "1" (this feature is not yet implemented).

The program will then read in the sample file and display the first 280 samples on the screen, with a grid indicating the 32 quantization levels that will actually be used by DMS and the start and end of the 256-byte waveform sample window.
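
As a rough illustration of the 32-level grid, the following Python sketch reduces 8-bit unsigned samples to 32 levels and then to waveform byte values in the range $08..$27. It is not taken from GEN.TONAL: the function names, the use of a simple right shift, and in particular the encoding of level k as byte $08 + k are assumptions made for illustration only.

```python
PAGE_BASE = 0x08   # lowest valid waveform byte, per the DMS Voices notes
LEVELS = 32        # $08..$27 inclusive is 32 distinct values


def quantize_sample(sample):
    """Map one 8-bit unsigned sample (0..255) to a level 0..31."""
    if not 0 <= sample <= 255:
        raise ValueError("sample must be an 8-bit unsigned value")
    return sample >> 3   # 256 input values / 32 levels = 8 values per level


def to_waveform_byte(sample):
    """ASSUMED encoding: level k is stored as PAGE_BASE + k."""
    return PAGE_BASE + quantize_sample(sample)


def quantize_window(samples):
    """Convert a 256-sample window into one 256-byte waveform page."""
    if len(samples) != 256:
        raise ValueError("a waveform page holds exactly 256 samples")
    return bytes(to_waveform_byte(s) for s in samples)
```

Because every output byte is computed from a checked 8-bit input, the result cannot fall outside $08..$27, which is the property that matters given that DMS does no range checking of its own.
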
The text display at the bottom of the screen shows the current values of important program variables. (If your sample was saved as a .wav file, the first 64 bytes of the sample will be the .wav prefix, and you should position past it when setting the starting point for waveforms.)

When the waveform is displayed, the program waits for keyboard input to set modes and perform actions:

=Left and right arrows move the waveform back and forth in the sample window by one sample if pressed alone. If the Open-Apple modifier key is held down while an arrow is pressed, the waveform will be moved by 10 samples. If the Closed-Apple key is held down while an arrow is pressed, the waveform will be moved by one waveform period (about 256 samples). The arrow keys should be used to position the waveform with respect to the fixed left "start" mark for the waveform. Generally, the best starting point is the beginning of a region of several near-zero samples.

=The "<" and ">" keys will move the waveform end mark by one sample, and the Open-Apple and Closed-Apple keys will modify the amount of movement in the same way as they modify the arrow keys. The "<" and ">" keys should be used to move the right waveform end mark near the beginning of the next region of several near-zero samples one period after the start mark.

=The RETURN key accepts the current settings of waveform marks and causes the program to automatically resample the waveform to 256 samples and appends it to the voice file. It also fills out the envelope table with the previous waveform until the current time point is reached, then stores the pointer to the current waveform.

=The "I" key will invert the entire waveform. When selecting waveforms, it is important to remember that each sampled cycle should begin and end near a zero sample value. Some samples will work better if their waveforms are inverted, and this option performs that inversion.

=The "F" key will cause the number of samples used for each cycle of the waveform to be "frozen" for all subsequent cycles. Usually, after the first cycle or two, instruments will have settled to a constant period, and this speeds up the remainder of the processing.

=The "S" key tells the program that the instrument has reached its final or sustaining amplitude, and causes the voice to be finished by repeating the last waveform for the rest of its duration. This amplitude can be a constant audible amplitude or it can be a zero amplitude, or silence, depending on the instrument.

=The "N" key forces the amplitude of the waveform to be "normalized" so that its loudest waveform is the maximum possible amplitude. This is often desirable, though you may prefer that some instrument voices be intrinsically softer than others. In the latter case, the sampled instrument should be set to the desired volume and normalization should not be used.

=The "X" key disables "ramping", or the automatic subtraction of the lower envelope of the waveform from the waveform. Ramping subtracts the (approximate) lower envelope of a waveform from all samples, so that the output waveform's negative peaks are essentially tangent to the zero axis. The reason for this transformation, which clearly introduces low-frequency (and usually inaudible) artifacts, is to cause all waveforms to begin and end with a sample value very near zero. This has the aesthetic effect of blending the start of a note with the continuous stream of zero samples that is played by the synthesizer while it is idle, so no "pops" are audible at the beginning of notes. It also satisfies the technical requirement that sample waveforms will frequently approach zero, since DMS will only end a note when it has a (5-bit) sample value between 0 and 3, again so that there is no audible "pop" at the end of a note. In general, ramping should not be disabled, so the "X" key should not be pressed.

=The "A" key puts the program into automatic mode, in which the voice is completed without further interaction. This key may be pressed after it is clear that the period of the waveforms is constant and no further "tweaking" is needed. When the "A" key is pressed, you will be asked whether the voice is sustained at an audible level or decays to silence. If you respond that it is sustained, you will be asked how many more cycles to include before sustaining at a constant level. Each waveform cycle adds 256 bytes to the voice length, so there is a compromise to be made between the number of decay levels and the length of the voice. (You can skip past waveforms when making a voice to save space, as long as the skip will not cause the voice's envelope to sound harsh.)

The decoding for keypresses begins at line 800 in GEN.TONAL, if you would like to examine the program and perhaps modify it for your own uses.

After all voice waveforms have been selected and the envelope table completed, the program will ask for the General MIDI voice number to save the voice under. The program will construct a voice filename of "V." followed by the General MIDI voice number. This is the standard naming convention used by DMS and the VoicePak Editor.

Hints
=====

To conserve space, only significantly different waveforms should be selected for inclusion in the voice file. During the attack phase, 3-6 waveforms may be needed. If the instrument has a sustained sound, only one "final" waveform is needed, and the program will automatically extend its use throughout the remainder of the envelope table. When the program is put into "automatic" mode ("A"), it will select waveforms with significantly differing amplitudes until the final amplitude is reached--either a constant amplitude (sustained) or silence (non-sustained).
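
The envelope bookkeeping described in these notes can be sketched in Python. This is not GEN.TONAL's code: the function names, the list-of-pairs input, and the assumption that a voice is held in memory as exactly one envelope page followed by whole waveform pages (with no file header) are all hypothetical, chosen for illustration. The first function holds each selected waveform until the next selection point and extends the last one to the end of the table; the second checks the two ranges that DMS itself does not check.

```python
ENVELOPE_LEN = 256                  # one envelope page
PAGE_LEN = 256                      # one waveform page
SAMPLE_MIN, SAMPLE_MAX = 0x08, 0x27


def build_envelope(selections):
    """Build a 256-byte envelope from (start_slot, waveform_number) pairs.

    A slot is one 256-sample period; waveform numbers start at 1.
    The pairs must be sorted by slot, and the first must start at slot 0.
    """
    if not selections or selections[0][0] != 0:
        raise ValueError("the first waveform must start at slot 0")
    envelope = []
    for i, (start, wave) in enumerate(selections):
        # Each waveform is held until the next selection or the table end.
        last = i + 1 == len(selections)
        end = ENVELOPE_LEN if last else selections[i + 1][0]
        if not start < end <= ENVELOPE_LEN:
            raise ValueError("slots must increase and stay below 256")
        envelope.extend([wave] * (end - start))
    return bytes(envelope)


def check_voice(voice):
    """Return a list of problems found; an empty list means none."""
    if len(voice) < ENVELOPE_LEN + PAGE_LEN or len(voice) % PAGE_LEN:
        return ["length is not an envelope page plus whole waveform pages"]
    n = len(voice) // PAGE_LEN - 1  # number of waveform pages that follow
    problems = []
    for i, b in enumerate(voice[:ENVELOPE_LEN]):
        if not 1 <= b <= n:
            problems.append(f"envelope byte {i} = {b}, outside 1..{n}")
    for i, b in enumerate(voice[ENVELOPE_LEN:]):
        if not SAMPLE_MIN <= b <= SAMPLE_MAX:
            problems.append(f"waveform byte {i} = ${b:02X}, outside $08..$27")
    return problems
```

For example, `build_envelope([(0, 1), (2, 2), (5, 3)])` uses waveform 1 for two periods, waveform 2 for three, and waveform 3 for the remaining 251, which corresponds to a short attack followed by a sustained final waveform.
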
Note that GEN.TONAL does not support instrument sounds with oscillating amplitude or timbre, though it would certainly be possible to manipulate the envelope table (with another program) to re-use waveforms already in the voice to produce such voices. (DMS cannot support vibrato, or low-frequency variation of the *frequency* of a note.)

Conclusion
==========

Constructing voices for DMS is not an easy process. I have personally made about thirty voices, and most of them did not work as I had hoped. Perseverance and experience are the best guides to good-sounding voices.

I wish you all the best, and encourage you to share your best efforts!

-Michael Mahon