MWEngine "core actors"
The important "actors" for getting sequenced audio out of MWEngine are just these five:
AudioBuffer, AudioChannel, AudioEvent, Sequencer and Engine.
audiobuffer.h
An AudioBuffer is in essence what the whole shebang boils down to (!), as it is the object that contains the samples that are written into the audio hardware. An AudioBuffer contains a snippet (or snippets, when multichannel) of audio of variable length. Audio is represented as a list of floating point values in the -1.0 to 1.0 range. For the ring buffer that is used by the render thread (see Engine), the buffer size should be as short as is feasible to ensure low-latency response, while it is also possible to have a single AudioBuffer that contains audio that lasts for minutes or hours (or rather for as long as you can supply it with RAM).
An AudioBuffer describes the number of channels it contains and the size of the buffer (which is equal for all channels within the buffer). It supplies methods to retrieve individual channels from the buffer, as well as convenience methods to clone and merge buffer contents.
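To make the description above concrete, here is a minimal sketch of such a buffer. `SketchBuffer` and all of its method names are hypothetical and chosen for illustration only; they are not the declarations found in audiobuffer.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the AudioBuffer described above.
// Names and signatures are illustrative, not taken from audiobuffer.h.
class SketchBuffer {
public:
    SketchBuffer(std::size_t channelCount, std::size_t sizeInSamples)
        : channels_(channelCount, std::vector<float>(sizeInSamples, 0.0f)),
          size_(sizeInSamples) {}

    std::size_t channelCount() const { return channels_.size(); }
    std::size_t size() const { return size_; }  // equal for every channel

    // Retrieve the samples of a single channel (values in the -1.0 to 1.0 range).
    std::vector<float>& channel(std::size_t index) {
        assert(index < channels_.size());
        return channels_[index];
    }

    // Deep copy of the buffer and its contents.
    SketchBuffer clone() const { return *this; }

    // Add the contents of another buffer into this one, scaled by mixVolume.
    // Only the overlapping channels and samples are merged.
    void mergeFrom(const SketchBuffer& other, float mixVolume) {
        const std::size_t channelsToMix = std::min(channels_.size(), other.channels_.size());
        const std::size_t samplesToMix  = std::min(size_, other.size_);
        for (std::size_t c = 0; c < channelsToMix; ++c) {
            for (std::size_t i = 0; i < samplesToMix; ++i) {
                channels_[c][i] += other.channels_[c][i] * mixVolume;
            }
        }
    }

private:
    std::vector<std::vector<float>> channels_;
    std::size_t size_;
};
```

A short ring-buffer-sized instance would be something like `SketchBuffer(2, 512)`, while a long sample could use the same class with a much larger size.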
audiochannel.h
An AudioChannel can be thought of as a track on a mixer. In MWEngine each channel can represent an instrument: a synthesizer, a drum machine, a sequenced sample player, etc. The AudioChannel holds a vector of AudioEvents which represent musical notes at a given pitch and time. The channel also contains mixer properties ( for instance panning position and output volume ) as well as a processing chain. The processing chain contains effects processors ( let's say oscillated filters or delays for echo-generation, etc. ) which apply to the instrument. When queried by the engine, all AudioEvents that are made eligible for output by the sequencer are written into a single AudioBuffer ( the channel-strip on a mixer ), to which the processors apply their process in series. This AudioBuffer is then written by the engine into the combined, single output ( i.e. the current AudioBuffer that will be enqueued into the ring buffer for output ).
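The "processors apply their process in series" part can be sketched as follows. `SketchChannel` and `SketchProcessor` are hypothetical names, the buffer is reduced to a single mono channel, and applying the volume after the chain is an assumption made for this example rather than something the text above specifies.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical processor type: anything that modifies a (mono) buffer in place.
using SketchProcessor = std::function<void(std::vector<float>&)>;

// Hypothetical, simplified channel strip; not the class from audiochannel.h.
struct SketchChannel {
    float volume = 1.0f;                           // mixer property
    std::vector<SketchProcessor> processingChain;  // effects, applied in order

    // Run every processor over the channel buffer, one after the other,
    // then apply the output volume (assumed to happen post-chain here).
    void process(std::vector<float>& channelBuffer) const {
        assert(volume >= 0.0f);
        for (const auto& processor : processingChain) {
            processor(channelBuffer);
        }
        for (float& sample : channelBuffer) {
            sample *= volume;
        }
    }
};
```

Because each processor receives the output of the previous one, the order of the chain matters: an offset followed by a gain does not yield the same result as the gain followed by the offset.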
baseaudioevent.h, sampleevent.h, synthevent.h
An AudioEvent is basically a musical instruction for the instrument it corresponds to. An AudioEvent contains:
- a method for mixing its audio into the channel AudioBuffer
- a property describing the event's length ( duration ) in samples
- properties describing the event's start and end offset in samples. In a musical context, these describe at what part of a measure the note starts sounding and for what duration it lasts.
Depending on the instrument the AudioEvent corresponds to, additional properties are available. For instance an AudioEvent for a synthesizer holds a reference to instrument properties for frequency ( pitch ) and ADSR envelopes to shape the sound.
Certain events are also cacheable ( see the optimization section ) and contain their own cached AudioBuffer. When queried by the sequencer to mix their audio into the channel buffer, the requested segment is simply copied from the cached buffer, omitting the need for resynthesizing a static sound. These events have additional methods for invalidating their contents ( and rendered buffer! ) when the corresponding instrument's properties change or a global sequencer setting such as tempo is altered.
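As an illustration of how a cached event could mix only the requested segment into the channel buffer, consider the sketch below. `SketchEvent` and its members are hypothetical, the audio is mono, and the end offset is treated as exclusive (one past the last sample), which is a convention chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified event; names are not taken from baseaudioevent.h.
struct SketchEvent {
    std::size_t eventStart;     // sample position at which the event starts
    std::vector<float> cached;  // pre-rendered audio, its size is the event length

    std::size_t eventLength() const { return cached.size(); }
    std::size_t eventEnd() const { return eventStart + cached.size(); }  // exclusive

    // Mix the part of the event that overlaps the buffer range
    // [bufferPosition, bufferPosition + out.size()) into out.
    // Returns the number of samples that were mixed.
    std::size_t mixInto(std::vector<float>& out, std::size_t bufferPosition) const {
        const std::size_t bufferEnd = bufferPosition + out.size();
        const std::size_t from = std::max(bufferPosition, eventStart);
        const std::size_t to   = std::min(bufferEnd, eventEnd());
        if (from >= to) {
            return 0;  // event is not audible in this range
        }
        assert(to - from <= out.size());
        for (std::size_t pos = from; pos < to; ++pos) {
            // copy from the cache instead of resynthesizing the sound
            out[pos - bufferPosition] += cached[pos - eventStart];
        }
        return to - from;
    }
};
```

An event that starts halfway through a buffer range only fills the second half of the buffer, and the remainder of the event is picked up by the following ranges.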
sequencer.h
Each request from the engine "steps" the sequencer position ( think of it as the "playback head" ) by the buffer size. Rather than relying on timers ( DON'T! ), sequences are calculated at the buffer level.
To elaborate in musical terms : say we're looping a single measure at 120 beats per minute in 4/4 time. There are 4 beats per measure, so the entire measure lasts for 4 beats / ( 120 beats per minute / 60 seconds ) = 2 seconds.
In programmatic terms, we calculate time in buffer samples. Let's say we are rendering audio at 44.1 kHz. The number of samples for a single measure is : round(( 44100 Hz * 60 seconds ) / 120 bpm ) * 4 beats = 88200 samples.
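The two calculations above can be written out as small helper functions. The function names are hypothetical and only serve this example; they are not part of the MWEngine API.

```cpp
#include <cassert>
#include <cmath>

// Duration of one measure in seconds: beats / ( beats per minute / 60 ).
double measureDurationSeconds(double tempoBpm, int beatsPerMeasure) {
    assert(tempoBpm > 0.0);
    return beatsPerMeasure / (tempoBpm / 60.0);
}

// Number of samples in a single beat, rounded to a whole sample.
long samplesPerBeat(int sampleRateHz, double tempoBpm) {
    assert(tempoBpm > 0.0);
    return std::lround((sampleRateHz * 60.0) / tempoBpm);
}

// Number of samples in a single measure.
long samplesPerMeasure(int sampleRateHz, double tempoBpm, int beatsPerMeasure) {
    return samplesPerBeat(sampleRateHz, tempoBpm) * beatsPerMeasure;
}
```

For 120 bpm in 4/4 time at 44.1 kHz this gives 2 seconds, 22050 samples per beat and 88200 samples per measure.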
For the sake of argument, let's say that after 10 steps and using a buffer size of 512 samples, the sequencer position is at 5120 samples. The current sequencer-step is looking to gather AudioEvents that are audible in the 5120 - 5632 range of the current measure. Just to return to a musical context : the second sixteenth note of the measure starts at 5512 samples ( 88200 / 16, rounded down ), so it becomes audible during this step.
The sequencer's job is to return each sequenced instrument ( represented by an AudioChannel ) as well as the calculated buffer range to the engine, so it can output the AudioEvents that are eligible for playing. If the step exceeds the end position of the available ( or looping ) measure(s), the sequencer begins counting from the first available position for the current loop. For example, after 172 steps the sequencer will request the 88064 - 88576 range, which runs 376 samples past the end of the measure ( the last valid sample position is 88199 ). When looping, this means the sequencer performs a request for the 136 samples from 88064 up to and including 88199, and an additional request for the first 376 samples of the loop ( 0 up to and including 375 ) for seamless audio. The new sequencer position is now 376.
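The stepping and loop wrapping can be sketched as below. `SketchSequencer` and `SketchRange` are hypothetical names, and the sketch uses half-open ranges ( start plus length, with the loop end being one past the last valid sample ) so that every step yields exactly one buffer size worth of samples. With that convention, the step that starts at 88064 is split into 136 samples up to the end of the measure and 376 samples from the start of the loop.

```cpp
#include <cassert>
#include <vector>

// A range of samples covering [start, start + length).
struct SketchRange {
    long start;
    long length;
};

// Hypothetical, simplified sequencer; not the class from sequencer.h.
struct SketchSequencer {
    long position;   // playback head, in samples
    long loopStart;  // first sample of the loop
    long loopEnd;    // exclusive: one past the last valid sample

    // Advance the playback head by bufferSize samples and return the
    // range(s) to render. A step that crosses the loop end is split in two.
    std::vector<SketchRange> step(long bufferSize) {
        assert(bufferSize > 0 && loopEnd > loopStart);
        assert(position >= loopStart && position < loopEnd);
        std::vector<SketchRange> ranges;
        long remaining = bufferSize;
        while (remaining > 0) {
            const long available = loopEnd - position;
            const long take = remaining < available ? remaining : available;
            ranges.push_back({position, take});
            position  += take;
            remaining -= take;
            if (position == loopEnd) {
                position = loopStart;  // wrap around for seamless looping
            }
        }
        return ranges;
    }
};
```

Deriving the position from the number of rendered samples, rather than from a timer, keeps the sequence sample-accurate regardless of when the render thread happens to run.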
native_audio_engine.h
The engine actually contains very little logic, yet it is the brain that binds together all of the above actors into a coherent, musical story :)
It runs the output process, which continually enqueues an AudioBuffer inside a ring buffer ( enqueuing and dequeuing the same two buffers one after the other ), which in turn is fed into the audio hardware for audible output. Each cycle of this thread is executed once the previously enqueued buffer has finished playing and has been dequeued. Each cycle writes only the number of samples that fit in a single buffer, which should be as small as feasible ( as in essence we're writing the audio for the buffer whose playback lies in the future ).
The engine has no rendering logic (as in synthesis or signal processing) apart from that it mixes all the available AudioChannels into a single output. In other words : the engine is only reading channel buffers and writing into a single output AudioBuffer ( i.e. the "master"-strip on a mixer ). What the render process looks like in detail is as follows:
- engine requests from the Sequencer all AudioChannels, which will contain the AudioEvents that are playing during the given buffer start position to buffer end position-range ( buffer end equals the start position plus the buffer size ).
- engine checks whether recording mode is active and, if so, records input for a single buffer size.
- engine processes each individual AudioChannel as follows:
  - each AudioEvent inside the channel renders its content ( i.e. synthesizes audio/writes its cached sample ) for the given buffer range into the channel's buffer.
  - if the AudioChannel contains "live" events ( i.e. channel belongs to an instrument playing non-sequenced input ) these are mixed into the channel buffer.
  - the AudioChannel's processing chain is applied to the channel buffer for application of effects and processing.
  - in case caching is enabled, the engine now writes the result into the channel's cached buffer.
- after all AudioChannel buffers are merged into the "master strip"-buffer, the engine applies limiting and both a LP and HP filter to prevent output clipping.
- audio is now written into the queued output buffer.
- if output recording was enabled, this iteration is now written into the DiskWriter.
- engine increments buffer_start_position with the buffer size for the next iteration.
At this point the cycle is complete, and the thread blocks until the ring buffer has finished playing the currently queued buffer and enqueues the just rendered buffer for the next play cycle. Once this occurs, the loop continues onto its next cycle.
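The merge into the "master strip" can be illustrated with the sketch below. `mixToMaster` is a hypothetical function working on mono buffers, and the hard clamp to the -1.0 to 1.0 range is a deliberately crude stand-in for the limiter and LP/HP filters mentioned above, not the engine's actual processing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum all (already processed) channel buffers into a single output buffer
// and keep the result inside the -1.0 to 1.0 range.
// The hard clamp is a simplification used for illustration only.
std::vector<float> mixToMaster(const std::vector<std::vector<float>>& channelBuffers,
                               std::size_t bufferSize) {
    std::vector<float> master(bufferSize, 0.0f);
    for (const auto& channel : channelBuffers) {
        assert(channel.size() == bufferSize);
        for (std::size_t i = 0; i < bufferSize; ++i) {
            master[i] += channel[i];
        }
    }
    for (float& sample : master) {
        sample = std::max(-1.0f, std::min(1.0f, sample));
    }
    return master;
}
```

In a full render cycle, the returned buffer would be the one handed to the output queue, after which the buffer start position is advanced by the buffer size.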