Table of Contents
- exhale Wiki :: Frequently Asked Questions (FAQ)
- Can the exhale application write or copy metadata to an encoded file?
- How does the bit-rate mode of the exhale application or library work?
- How do I configure foobar2000 for on-the-fly resampling with exhale?
- I get corrupted files when encoding using exhale via foobar2000. Why?
- I compiled exhale into a DLL. How can I use it in my own applications?
- Why is the exhale encoder so slow? Is there a switch for fast encoding?
- Are further encoding options available for experts and for streaming?
- Can I configure the exhale encoder to use a custom audio bandwidth?
exhale Wiki :: Frequently Asked Questions (FAQ)
The following is a list of answers to frequently asked questions regarding the use of the exhale application or library. If you do not find an answer to your question, please notify one of the developers or ask in, e.g., the HydrogenAudio forum (you can find an existing discussion thread in the Other Lossy Codecs section).
Can the exhale application write or copy metadata to an encoded file?
If, by metadata, you mean iTunes-style artist/song/cover metadata: no, and there is currently no plan to add support for this. See this question for a work-around. If you are talking about ReplayGain-style peak/loudness metadata: this is taken care of by the dynamic range control (DRC) metadata according to ISO/IEC 23003-4, which has been supported since version 1.0.1 of this software. See also the exhale release notes in the include subdirectory. Note, however, that album loudness data is not explicitly supported by the exhale application (since it can only treat each audio file independently). The program loudness is measured according to ITU-R BS.1770-4, except that the analysis block size of 400 ms is rounded to an integer multiple of the audio frame duration.
How does the bit-rate mode of the exhale application or library work?
As documented by the exhale application when running it without any arguments on a command-line (see the README.md file), the bit-rate mode is a number # between 0 and 9 or, since exhale 1.1.0, a letter between a and g, where a higher number/letter means a higher constrained variable bit-rate (CVBR) configuration of the exhale encoder library. The resulting bit-rate when encoding a particular audio file with a specific CVBR mode depends on the CVBR mode as well as the content of the audio file (signal characteristics, sampling rate, and channel count). Overall, the following mean (and minimum/maximum) bit-rates, in kilobits per second (kbit/s), are obtained for the given mode:
| Bit-Rate Mode # | Mono Audio, 1 Channel | Stereo Audio, 2 Channels |
|---|---|---|
| 0 | 24 (19/29) kbit/s | 48 (38/58) kbit/s |
| 1 | 32 (26/38) kbit/s | 64 (51/77) kbit/s |
| 2 | 40 (32/48) kbit/s | 80 (64/96) kbit/s |
| 3 | 48 (38/58) kbit/s | 96 (77/115) kbit/s |
| 4 | 56 (45/67) kbit/s | 112 (90/134) kbit/s |
| 5 | 64 (51/77) kbit/s | 128 (102/154) kbit/s |
| 6 | 72 (58/86) kbit/s | 144 (115/173) kbit/s |
| 7 | 80 (64/96) kbit/s | 160 (128/192) kbit/s |
| 8 | 88 (70/106) kbit/s | 176 (141/211) kbit/s |
| 9 | 96 (77/115) kbit/s | 192 (154/230) kbit/s |
Since exhale 1.1.0, it is also possible to specify a letter between a and g instead of a number on the application command-line, which will result in encoding with Spectral Band Replication (SBR) enabled in a dual-rate configuration (the core coded signal will run at half the input sampling rate). Internally, when initializing the exhale library, the letters a-g are mapped to the bit-rate modes 0-6, respectively, with a frame length of 2048 instead of 1024; see also exhale's API. The overall mean (and minimum/maximum) bit-rates for these SBR modes are as follows:
| Bit-Rate Mode SBR | Mono Audio, 1 Channel | Stereo Audio, 2 Channels |
|---|---|---|
| a | 18 (14/22) kbit/s | 36 (29/43) kbit/s |
| b | 24 (19/29) kbit/s | 48 (38/58) kbit/s |
| c | 30 (24/36) kbit/s | 60 (48/72) kbit/s |
| d | 36 (29/43) kbit/s | 72 (58/86) kbit/s |
| e | 42 (34/50) kbit/s | 84 (67/101) kbit/s |
| f | 48 (38/58) kbit/s | 96 (77/115) kbit/s |
| g | 54 (43/65) kbit/s | 108 (86/130) kbit/s |
For multichannel audio, the bit-rates increase accordingly. Note that, for audio files containing passages of digital silence, the resulting bit-rates might be lower than the minimum values listed in the table.
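The values in the two tables above happen to follow a regular pattern: the mean rate grows linearly with the mode, and the listed minimum and maximum lie about 20% below and above the mean. The following Python sketch reproduces the tabulated numbers from that pattern. It is only an observation on the tables, not a description of exhale's rate control, and the function name `table_bitrate` is hypothetical.

```python
def table_bitrate(mode, channels=1):
    """Return (mean, minimum, maximum) in kbit/s as listed in the tables above.

    mode     -- "0".."9" (no SBR) or "a".."g" (SBR), as a one-character string
    channels -- 1 (mono) or 2 (stereo); other layouts are not tabulated
    """
    if channels not in (1, 2):
        raise ValueError("the tables only list mono and stereo")
    if len(mode) == 1 and mode in "0123456789":
        per_channel = 24 + 8 * int(mode)                 # 24, 32, ..., 96
    elif len(mode) == 1 and mode in "abcdefg":
        per_channel = 18 + 6 * (ord(mode) - ord("a"))    # 18, 24, ..., 54
    else:
        raise ValueError("unknown bit-rate mode: %r" % mode)
    mean = per_channel * channels
    # The listed minimum/maximum match the mean -20% / +20%, rounded.
    return mean, round(mean * 4 / 5), round(mean * 6 / 5)


if __name__ == "__main__":
    for m in "0123456789abcdefg":
        print(m, table_bitrate(m, 1), table_bitrate(m, 2))
```

Keep in mind that the actual bit-rate of a given file depends on its content, so these figures are only the overall statistics quoted in the tables.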
How do I configure foobar2000 for on-the-fly resampling with exhale?
If you would like to use exhale to convert your audio files to Extended HE-AAC files with a CVBR mode which does not directly support the sampling rate of your files (such as mode 1, which does not accept 48 kHz, for example), you can configure foobar2000 to apply automatic on-the-fly resampling to a sampling rate which exhale accepts. To do so, make sure you select the "Normal" or "Full" type of install during the foobar2000 setup in order to include the standard DSPs (resampler, etc.) in your installation.
Then, to convert your audio files with the necessary resampling, load the desired input files into foobar2000's playlist, mark all files to be converted, right-click on one of them, and select Convert -> .... In the newly opened window, click on Output format and choose your exhale preset. After clicking on Back, click on Processing and, under "Available DSPs", click on the + of one of the "Resampler" entries (I recommend the dBpoweramp/SSRC version). Now click on the ... of the newly added "Resampler" entry under "Active DSPs" and enter your target sampling rate (e.g., 32000 Hz for CVBR mode 1), as shown below. After selecting OK and Back, you can start the conversion.
I get corrupted files when encoding using exhale via foobar2000. Why?
The reason is a stdin-related incompatibility between foobar2000 and exhale's read-in/write-out routines in older versions of the software. To avoid this issue and to use the most stable versions of these programs, please update your foobar2000 installation to version 1.5.4 or later (see foobar2000's Download page) and your exhale executable to version 1.0.5 or later (see exhale's Releases page).
With older versions of exhale, it was sometimes possible to circumvent the problem by preventing the transfer of metadata tags. This can be achieved by configuring the "Other" section during conversion to Extended HE-AAC as follows:
I compiled exhale into a DLL. How can I use it in my own applications?
This depends on whether you want to use implicit (load-time) or explicit (run-time) linking to the exhale DLL. This page describes the difference between the two linking methods and is recommended reading for anyone new to the subject.
An example of implicit linking to the exhale DLL is actually provided by the exhaleApp source code, but you need to do the following in Visual Studio to configure it that way:
1. Build the DLL as described in the Compilation section of exhale's README.md file. This creates a file "exhaleLib.dll" somewhere (depending on your compiler and its configuration) in the lib subdirectory of your exhale distribution.
2. Open the file "exhaleApp.cpp" of the exhaleApp project and uncomment the line starting with // #define USE_EXHALELIB_DLL ... at the top of that source file. Then it should be possible for you to build the exhaleApp project, e.g., by pressing F7, without any errors.
3. Before you can run the DLL compatible exhale application that you just created, you need to place the exhale DLL into the application's directory. To do so, copy the "exhaleLib.dll" file located in step 1 into the subdirectory of the bin subdirectory where the "exhaleApp.exe" executable is located.
For further details on how to use the public functions exported by the exhale library, please consult
- the source code encapsulated by the USE_EXHALELIB_DLL macro in the "exhaleApp.cpp" source file
- the header file "exhaleDecl.h" in exhale's include subdirectory
- the API function reference documenting exhale's public functions.
Why is the exhale encoder so slow? Is there a switch for fast encoding?
The exhale encoding library includes a method for joint optimization of the spectral coefficient quantization and band-wise scale factor selection, which is computationally complex and, thus, slows down the encoder quite a bit. This method can be disabled by changing
#define EC_TRELLIS_OPT_CODING 1
to
#define EC_TRELLIS_OPT_CODING 0
in file "entropyCoding.h" in the src/lib subdirectory of the source code and then recompiling (see the README.md file on how to do this), but this is not recommended since it degrades the audio quality of the encoded Extended HE-AAC files.
Are further encoding options available for experts and for streaming?
Apart from speeding up the encoder by defining macro EC_TRELLIS_OPT_CODING as described in the answer to the previous question, exhale since version 1.1.7 allows you to change
#define BA_MORE_CBR 0
to
#define BA_MORE_CBR 1
in line 19 of file "bitAllocation.h" in the src/lib source code subdirectory to produce encoded files with more constant bit-rate (CBR). Using this macro, the maximum short-term deviation from the specified target bit-rate (averaged over roughly 10-20 seconds) remains below approximately 15%. In other words, the resulting average bit-rate of each encoded audio file varies less with the audio content itself and is more limited towards high bit-rates, which may be useful in applications with restrictive buffering conditions. The following figure demonstrates the effect of activating the macro on "difficult" short stereo files from here and here, encoded with preset 2 (marked o; target rate 80 kbit/s without SBR) and preset b (marked x; target rate 48 kbit/s with SBR):
Note that the CBR mode is not recommended and should only be used when absolutely necessary since it leads to audible quality degradation on some audio content. Moreover, the exhale command-line encoder provides an extended set of options for expert usage. Please see this forum post for details.
Note that, when the input audio is sampled at 48 kHz, it is possible to perfectly align the I-frames (immediate playout frames, IPFs) in Extended HE-AAC files with those of modern video codecs, which typically use an I-frame every N pictures (where N is an integer multiple of a power of two). For example, if the video is sampled at 25 or 50 fps and uses I-frames every 96 pictures (i.e., every 1.92 seconds), then exhale can use an IPF every 45 or 90 audio frames. Similarly, if the video is sampled at 30 or 60 fps and has an I-frame every 64 or 128 pictures (i.e., every 1.067 or 2.133 seconds), then exhale can write an IPF every 25 or 50 audio frames.
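The frame counts quoted above can be checked with a little arithmetic. The following Python sketch assumes the frame lengths of 1024 samples (without SBR) and 2048 samples (with SBR) mentioned in the bit-rate mode answer; the function name `audio_frames_per_interval` is hypothetical and not part of exhale. It uses exact fractions so that a non-integer result, i.e. an I-frame interval which cannot be aligned, is easy to spot.

```python
from fractions import Fraction


def audio_frames_per_interval(pictures, fps, sample_rate=48000, frame_length=1024):
    """Return the number of audio frames spanning `pictures` video pictures.

    The result is an exact Fraction; alignment of the audio IPFs with the
    video I-frames is only possible if it is a whole number.
    """
    seconds = Fraction(pictures, fps)            # duration of the I-frame interval
    return seconds * sample_rate / frame_length  # audio frames in that duration


if __name__ == "__main__":
    # 96 pictures at 50 fps (1.92 s), frame lengths 1024 and 2048
    print(audio_frames_per_interval(96, 50))
    print(audio_frames_per_interval(96, 50, frame_length=2048))
    # 64 pictures at 60 fps (about 1.067 s), frame lengths 1024 and 2048
    print(audio_frames_per_interval(64, 60))
    print(audio_frames_per_interval(64, 60, frame_length=2048))
    # At 44.1 kHz the same interval does not yield a whole number of frames.
    print(audio_frames_per_interval(96, 50, sample_rate=44100).denominator == 1)
```

Under these assumptions, the 1.92-second interval corresponds to 90 frames of 1024 samples or 45 frames of 2048 samples, and the 1.067-second interval to 50 or 25 frames, which matches the numbers given above.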
Can I configure the exhale encoder to use a custom audio bandwidth?
No, not from the command-line application; you would have to modify the source code and recompile exhale. Note, however, that exhale was carefully designed to provide good quality for a variety of input signals and for many people with different levels of hearing acuity. Moreover, the code contains an input audio bandwidth detector and a psychoacoustic model, which optimize the coding bandwidth for each audio frame. Therefore, in order not to jeopardize exhale's audio quality, it is not recommended to make any such changes.


