Ensoniq AudioPCI ES1371 and the driver bug nobody found
September 19, 2023
Ensoniq’s AudioPCI sound cards were notorious when they came out back in the latter half of the 1990s. A manufacturer best known for synthesizers built the first mass-market PCI sound solution for PCs, complete with software-based backwards compatibility for older DOS games, the very requirement that had kept the rest of the industry stuck on the aging ISA bus. This notoriety led industry giant Creative Labs to acquire a struggling Ensoniq at the dawn of 1998, adding AudioPCI cards to their Sound Blaster family, and harnessing the existing backwards compatibility software for the Sound Blaster Live! later that year.
In typical Creative fashion, the AudioPCI line-up eventually became a mess; did you know there was an AudioPCI-based “Sound Blaster 16 PCI” which was a completely different card to the million ISA Sound Blaster 16 models? All sorts of cards were made under different names by different OEMs, using an assortment of chips with Ensoniq and/or Creative logos and model numbers. There’s a lot of misinformation about these chips available online, not helped by their similarity in look, behavior, branding and sometimes drivers; this is a rundown of what I’ve gathered over the years:
| Chip | Description |
|---|---|
| Ensoniq ES1370 | The original, paired with Asahi Kasei AK4531 codec |
| Creative 5507 | Rebadged ES1370 |
| Ensoniq ES1371 | Replaced AK4531 codec with the then-new AC'97 standard |
| Creative ES1373 | Added S/PDIF digital output and I2S digital input |
| Creative CT5880 | Added quadraphonic output with a 4-channel AC'97 codec |
| Creative EV1938 | CT5880 + 4-channel AC'97 codec combined in one chip |
The aforementioned driver similarity is best seen in the AC’97 sub-family, where all ES1371, ES1373 and first-revision CT5880 chips share the same PCI device ID of `1274:1371`, differing only in their revision codes. This results in Creative/Ensoniq’s Windows drivers being interchangeable when they shouldn’t really be, and this post is all about one case where they are not universally compatible.
Accurate bug emulation
I fell down this rabbit hole when we at the 86Box project decided to overhaul our ES1371 emulation, since the existing one (inherited from PCem) had issues with quite a few drivers, especially the DOS compatibility and Linux ones. OBattler went through the datasheet, implementing proper default values, read-only bits and access width checks (“this can only be read as 32 bits”) on all of the chip’s configuration registers, while I added a proper AC’97 codec implementation, and a third contributor fixed up the NMI system used by the DOS driver. After some tweaks, everything seemed fine… except for the latest and often-used 5.12.01 WDM driver for Windows 98 SE and newer.
With that driver, distributed by Creative’s support website to this day on its “Ensoniq AudioPCI” section and perfectly installable on ES1371 cards, the system ends up blue screening almost as soon as it starts playing audio, no matter if it’s running Windows 98 SE, XP or even 7. The crash details are not helpful, just indicating that something somewhere attempted to read from invalid memory address `0x00000000`, and logging on 86Box indicates an interrupt had just been triggered. While this driver, which identifies as `sbpci.sys`, is affected, the older `es1371mp.sys` driver shipped with Windows 2000, XP and even Vista works perfectly fine.
We did all the testing we could to try and diagnose this issue without access to a real card: trying out VMware, which emulates an ES1371 card of its own yet is not affected, changing the access width checks, trying other PCI revision codes… and then it hit. Setting the PCI revision code to a value greater than or equal to `0x04` fixed the BSOD with no further ill effects; however, the Linux ALSA driver source code, as well as the appearance of an “enable S/PDIF” option in a Windows 9x driver control panel, suggested that the “fix” was just pretending to be the very similar ES1373 chip instead, and that’s how it actually sat for a few versions of 86Box.
Viable options were running out, but a new one eventually showed up…
Debugging driver DPCs
With the addition of serial passthrough to 86Box, I could finally attach a proper kernel debugger to an emulated machine running Windows XP, reproduce the bug and determine where the fault came from. Nowhere did the `sbpci.sys` driver show up in the fault log; however, the stack trace suggested that the `KiRetireDpcList` kernel function was attempting to call another function at unmapped address `0x00000000`, triggering a page fault, handled by `KiTrap0E` which gets rightfully confused and throws the `DRIVER_IRQL_NOT_LESS_OR_EQUAL` bugcheck.
If you’ve ever done pro audio or other real-time work on a modern Windows machine, you may have heard of DPC latency as a metric for how reliably the system can maintain itself in real-time tasks. Deferred Procedure Calls (DPCs) are a mechanism provided by the Windows NT kernel to device drivers, which allows them to schedule work to be done as soon as the CPU is available, instead of, say, hogging the device’s hardware interrupt handler to disastrous performance consequences. A DPC is represented by a `KDPC` data structure containing a pointer to the function to be executed.
Armed with this knowledge, I loaded the offending driver into IDA and looked for any calls to the `KeInsertQueueDpc` function, which adds a given `KDPC` to the execution queue managed by `KiRetireDpcList`. There was just one call, right in the interrupt service routine.
The interrupt handler code reads the interrupt status register at I/O offset `0x04`, checks if bit 13 is set (status AND `0x2000`), calls a function, then schedules the DPC if it wasn’t already scheduled (the flag at offset 72+36). What do the available AudioPCI datasheets say about status bit 13?
- ES1371: reserved, always reads as 1
- ES1373: GPIO[1] interrupt pending
This looks innocent enough. The ES1373 added a feature to trigger interrupts from logic level changes to its general purpose input pins, and some reserved bits were repurposed to allow software to check for that. The driver sees one specific GPIO interrupt flag, calls the function I’ve named `ToggleGPIOInt` to clear it by disabling then reenabling interrupts for that GPIO, and schedules the DPC to do something I can’t quite figure out which involves calling a timer - that’s a second DPC! - which then mutes all analog and digital outputs, messes around with the sample rate converter RAM, and finally unmutes all outputs.
The kicker is that, for some reason, the interrupt handler doesn’t check if GPIO interrupts are actually supported by the chip, and always schedules the DPC if the bit is set, which on paper is always true on ES1371 cards. Even if we set aside the massive performance implication of constantly scheduling a DPC for execution, it should not crash if the `KDPC` is initialized properly… right… right?
Can you believe no one tested this?
One of the driver’s initialization functions is in charge of setting up both DPCs, and it only does that if the chip is supposed to have the GPIO interrupt feature, which by its definition is PCI device ID `0x1371` with a revision code greater than or equal to `0x04`, or device ID `0x5880` with any revision, the latter used by all but the first CT5880 chips which were still `1274:1371` with revision `0x07`.
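The feature check described above can be sketched in a few lines of Python. This is only a restatement of the criteria from the disassembly, not the driver's actual code; the function name `has_gpio_interrupt` is made up for illustration, and the revision used for the later CT5880 row is an arbitrary placeholder, since any revision passes for that device ID.

```python
def has_gpio_interrupt(device_id: int, revision: int) -> bool:
    """Return True if the driver would consider the chip GPIO-interrupt capable.

    Hypothetical helper: it only restates the criteria described in the text.
    """
    if device_id == 0x1371:
        return revision >= 0x04   # ES1373 and first-revision CT5880
    if device_id == 0x5880:
        return True               # later CT5880, any revision
    return False


# Illustrative rows: 0x02 is the revision seen in 1274:1371 listings,
# 0x07 is the first CT5880; the last revision value is a placeholder.
for label, device_id, revision in [
    ("1274:1371 rev 0x02", 0x1371, 0x02),
    ("1274:1371 rev 0x04", 0x1371, 0x04),
    ("1274:1371 rev 0x07", 0x1371, 0x07),
    ("1274:5880 rev 0x02", 0x5880, 0x02),
]:
    print(label, "->", "DPCs set up" if has_gpio_interrupt(device_id, revision) else "DPCs skipped")
```

Only the first row skips the DPC setup, which is exactly the case that matters for the rest of this post.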
Since the chip model is only checked during initialization and within the DPC functions themselves (returning `STATUS_NOT_SUPPORTED` on the latter), if these criteria are not met but the GPIO interrupt condition is set like it always is on ES1371, the first interrupt to come in - usually issued by the chip when it’s done transferring the first chunk of currently-playing audio samples - ends up scheduling an uninitialized DPC with a null function pointer. That breaks the kernel the next time it goes to execute DPCs, as kernel-mode drivers are held to higher expectations of doing their job right.
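To make the sequence of events easier to follow, here is a toy Python model of the flaw. It is a sketch under heavy assumptions: `ToyDriver`, `Dpc`, `BugCheck` and `first_interrupt` are all invented names, a `None` routine stands in for the null function pointer, and nothing here resembles the real kernel API beyond the order in which things happen.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class BugCheck(Exception):
    """Stands in for the bugcheck raised when the kernel calls address 0."""


@dataclass
class Dpc:
    routine: Optional[Callable[[], None]] = None  # None models a null pointer
    queued: bool = False


class ToyDriver:
    STATUS_GPIO1 = 0x2000  # interrupt status bit 13

    def __init__(self, device_id: int, revision: int) -> None:
        self.supported = (device_id == 0x1371 and revision >= 0x04) or device_id == 0x5880
        self.dpc = Dpc()
        self.dpc_queue: List[Dpc] = []
        self.dpc_runs = 0
        if self.supported:
            # Initialization only sets up the DPC on capable chips.
            self.dpc.routine = self._gpio_dpc

    def _gpio_dpc(self) -> None:
        self.dpc_runs += 1

    def isr(self, status: int) -> None:
        # The flaw: no check of self.supported before queueing.
        if status & self.STATUS_GPIO1 and not self.dpc.queued:
            self.dpc.queued = True
            self.dpc_queue.append(self.dpc)

    def retire_dpc_list(self) -> None:
        # Models the kernel draining the queue after the interrupt returns.
        while self.dpc_queue:
            dpc = self.dpc_queue.pop(0)
            dpc.queued = False
            if dpc.routine is None:
                raise BugCheck("call to unmapped address 0x00000000")
            dpc.routine()


def first_interrupt(device_id: int, revision: int, status: int) -> str:
    """Run one interrupt through the toy driver and report what happened."""
    driver = ToyDriver(device_id, revision)
    driver.isr(status)
    try:
        driver.retire_dpc_list()
    except BugCheck:
        return "bugcheck"
    return "ok"


print(first_interrupt(0x1371, 0x02, 0x2000))  # ES1371, bit 13 reads as 1
print(first_interrupt(0x1371, 0x04, 0x2000))  # revision reported as 0x04
print(first_interrupt(0x1371, 0x02, 0x0000))  # bit 13 forced to read as 0
```

In this model, only the first call ends in a bugcheck; the other two correspond to the two emulator-side workarounds, reporting a revision of `0x04` or making bit 13 read as 0.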
That `0x04` revision code check explains how 86Box managed to make the driver work properly by pretending to be an ES1373. Changing the specific GPIO interrupt bit to read as 0 naturally also worked fine, though the datasheet table states it should always be 1, and that’s corroborated by the listed default value. We’ve learned to not trust datasheets over the years - I once ran around in circles for days trying to decipher VIA’s documentation to fix a chipset’s IDE controller - however, this situation showed itself to be probably worth the benefit of the doubt, as we’ve also faced a fair amount of actually broken drivers in those same years.
But there’s absolutely no way Creative could ship a driver that is broken like that… right… right… RIGHT?
Stranger than fiction
This was one of those situations that would really benefit from access to the real thing: an actual ES1371 card. However, these were far less common than the earlier ES1370 or newer ES1373 or CT5880, with most online recognition around the ES1371 based on VMware’s emulation, which probably subsequently led PCem and later 86Box to emulate it as well. When we asked around some retro hardware communities, we only got offers for those different cards, or at the closest, an integrated ES1371 chip on a dead motherboard that tells no tales.
It all changed when a non-programming (but still important) contributor to 86Box noticed our frustrations and offered to hook us up with an ES1371 card, as those are plentiful where they live, which is not the case where any of us live. With the card in hand, they loaded it up into a Windows 7 machine to make logistics easier - we know the driver works just fine there, despite being from around the time of Windows XP’s release - and I remoted in and installed the driver.
Almost immediately, the click sound effect produced by Windows Explorer when changing folders BSODed the system, to the amusement of that contributor who was sitting at the system and watching the situation unfold. As expected, it entered a boot loop from trying and failing to play the startup sound, but at least we now knew that the driver is indeed broken.
Detour through VMware
Through all this effort, I would often joke that there are more VMware users out there than actual ES1371 cards in circulation; rightfully so, as the absolute majority of PCI device listings containing a `1274:1371` revision `0x02` posted in forums, mailing lists and other places are from VMware users. Once this driver bug and its circumstances were figured out, I did some probing into VMware’s emulation to see what it does differently to make this driver work. My `pcireg` tool for DOS can be used to locate the ES1371 on the PCI bus and get the base I/O address assigned to its registers by the BIOS, after which the I/O ports can be accessed through good old `DEBUG`.
It turns out the VMware ES1371 implementation is not very accurate to begin with. Most if not all registers are initialized to `0x00000000` instead of their datasheet-specified default values, never triggering the GPIO interrupt code path, and there is no masking of the interrupt status register’s read-only bits, as shown by the successful writing and reading back of different bit patterns (all 1s as well as alternating bits in `0x55AA55AA` and `0xAA55AA55`); the access width constraints appear not to be implemented either. In further testing with the real card, all the documented behaviors are indeed present in hardware, and the default values on a cold power-on do match where it matters.
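The write-and-read-back probe can be illustrated with a small Python model of a 32-bit register. This is a simplification with assumed values: `Register32` and `readback_probe` are invented names, and the "accurate" register below treats only bit 13 as read-only with a default of 1, whereas the real interrupt status register has more read-only bits than that.

```python
PATTERNS = (0xFFFFFFFF, 0x55AA55AA, 0xAA55AA55)


class Register32:
    """Toy 32-bit register with an optional read-only bit mask."""

    def __init__(self, default: int = 0, read_only_mask: int = 0) -> None:
        self.value = default & 0xFFFFFFFF
        self.read_only_mask = read_only_mask & 0xFFFFFFFF

    def write(self, data: int) -> None:
        writable = ~self.read_only_mask & 0xFFFFFFFF
        # Read-only bits keep their current value; the rest take the new data.
        self.value = (self.value & self.read_only_mask) | (data & writable)

    def read(self) -> int:
        return self.value


def readback_probe(reg: Register32) -> bool:
    """Return True if every pattern reads back unchanged (no masking seen)."""
    unchanged = True
    for pattern in PATTERNS:
        reg.write(pattern)
        if reg.read() != pattern:
            unchanged = False
    return unchanged


unmasked = Register32()                                        # all zeros, fully writable
masked = Register32(default=0x2000, read_only_mask=0x2000)     # bit 13 stuck at 1

print("unmasked register looks writable:", readback_probe(unmasked))
print("masked register looks writable:", readback_probe(masked))
```

Using more than one pattern matters: in this model the all-1s pattern reads back intact even on the masked register, because the stuck bit happens to match, and only `0x55AA55AA` exposes the mask.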
The devil is in the details
Going back to Creative’s website, this mess definitely originates from the WDM driver’s download listing, as its short blurb does list “Sound Blaster Ensoniq AudioPCI” as a supported card, which sounds about right since ES1371 is AudioPCI…
…until you expand the description, which ends in a very interesting note.
It appears as though this driver has a shortlist of supported cards that nobody read. From what I could gather, all listed models have CT5880 or at least ES1373 chips; while low-resolution pictures don’t do any favors, you can easily tell ES1370, ES1371 and early ES1373 chips from a distance by the black-on-white logo etching they use. At least one of these models was pictured in a Creative-branded “Ensoniq AudioPCI” box, while the CT4730 listed as specifically incompatible uses the later EV1938 chip which has its own PCI ID.
This situation is also not helped by the numerous redistributions of this package on retro driver websites, sometimes under different file names, without this note about card compatibility because let’s be real, it works on all of the very common suspects - VMware’s ES1371 as well as actual ES1373 and CT5880 cards - and statistically nobody had actual ES1371 cards.
One working driver, please
While the latest Creative driver is troublesome on ES1371 cards, both in the flesh and in 86Box 4.0 and higher, other options do still exist:
- For Windows 98 SE and Me, the VxD driver for 95/98 FE does still work and provides DOS compatibility; select Windows 95 on the download page and get "Standalone driver for Creative ENSONIQ AudioPCI on Windows 95/98".
- For Windows 2000, XP and Vista, the inbox driver included with those versions does not have this bug and works perfectly fine, as mentioned earlier.
- There's no inbox driver on Windows 7, but the Windows Update driver package for Vista might just work.
After all, one of my most successful posts is also about finding a working driver for an obtuse piece of hardware, so rambling about obtuseness while not including a solution would feel like a disservice.
Last update: Feb 08, 2024