Chapter 12: The Programmer's View of Linux
Tools and Skills to Help You Write Assembly Code under a True 32-Bit OS
Where to Now?
Where indeed? If you've followed me this far, you've been exposed to nearly every concept commonly used in assembly language work. As a working environment we've been using MS-DOS, which made a lot of things easier—made most of it possible, in fact. DOS is simple, forgiving, and present in nearly all Windows machines either as a lurker-beneath-the-windows (for Windows 9x) or a very high quality emulation (Windows NT). Either way, it was likely that you had access to DOS if you had a PC anywhere in your life.
The trouble is, DOS is the past. At best, it's a training ground for understanding the environments where all the real action is now taking place. And that's basically one of two places these days: Windows and Unix. Most other environments have withered severely and exist primarily as "legacy support"—that is, for people who can't afford the money or effort required to move from where they are to Windows or Unix.
On the x86 family of processors (which is what we've been discussing), the undisputed king of Unix implementations is Linux. And where we're going is Linux. It's a true 32-bit protected mode operating system, and it offers the chance to create real 32-bit flat model programs in assembly without a prohibitive amount of head banging. So, what remains of this book will serve to get you started on learning assembly coding for Linux.
Why Not Windows?
The first edition of this book was published in 1992. In the last few years, I've received many letters from readers of the first edition, requesting a second edition that explained how to write Microsoft Windows programs in assembly code. I looked into it. I paled. And I shook my head. Don't go there—you may never come back.
The problem is this: A Windows application isn't so much a stand-alone program as a custom-built extension of Windows itself. A DOS assembly program begins at the top, runs down from there, may do some looping back, but eventually it ends. It may touch the operating system from time to time by making system calls, but the nature of those calls is simple: You set up some parameters in registers or on the stack, and you make an INT 21H call into DOS. When DOS does what it must, it returns control to your program. That's about all there is to it.
The relationship between Windows and its applications is much closer and far more complex. When a Windows program is running and the user presses a mouse button, Windows intercepts the mouse signal and (in effect) taps your program on the shoulder and whispers: "The user just clicked the right mouse button. What are you going to do about it?" A tremendously complex system of events and responses, of messages passed and messages intercepted, runs through Windows and all of its applications like the threads of water flowing over a rocky streambed. From a distance, it's gorgeous. Up close, it borders on chaotic. And in assembly language, you're up as close as it gets.
Just understanding how Windows and Windows applications work at the assembly level could take you months of study. Coding a sizeable app could take a year. Balance against this the fact that a lot of the work in dealing with Windows is always done in precisely the same ways, and you have a tailor-made excuse for drop-in software components and boilerplate code. This is what you get with programming environments like Visual C++, Visual Basic, and Delphi, which basically hand you a generic Windows program with all the infrastructure in place—windows, scroll bars, mouse support, the works—but nothing in the line of specifics. Nonetheless, getting that massive a head start pretty much eliminates any advantage you might have in working in assembly.
But what about speed and size? Nothing beats assembly at speed and size, right? Well, nothing beats good assembly at the speed and size game. However . . . you need to keep in mind that when a Windows application is running, much or even most of the time code execution is actually somewhere down in Windows, executing DLLs or other Windows machinery that you have no control over. The parts that you actually write will not likely be what dominate the user's perception of the application's speed.
Besides, today's C and Pascal compilers have gotten mighty damned good at generating near-optimal machine code for a specified sequence of high-level language statements. Ace assembly hacks can do better, but it's a little discouraging to ponder just how close to your heels the wolves are snapping.
In truth, coding in assembly for Windows is good for one thing and one thing only: to gain a bit-level, way-down-deep under-the-skin understanding of how Windows works. This can be a very good and valuable thing, and if you want to pursue it, I salute you. I also suspect that once you gain that hard-won understanding of Windows internals, you'll run screaming to the most efficient Windows RAD (Rapid Application Development) environment you can find. (For me, that was Delphi.)
Only one book to my knowledge has ever been written about coding in assembly for Windows: Windows Assembly Language and Systems Programming, by Barry Kauler (R & D Books, 1997). And for all that it's 400 pages long, it's only a start. Most of what you need to know will have to be found elsewhere, in Microsoft's massive technical documentation.
Good luck. Heh-heh. You'll need it.
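For contrast with the Windows model, the simple DOS calling pattern described earlier in this section (load some registers, then make an INT 21H call) can be sketched in a few lines. This is only an illustration and not a listing from the text: it assumes NASM syntax and a .COM-style program, and the file name and message string are made up for the example.

```nasm
; hello.asm -- hypothetical example of the DOS calling pattern.
; Assumed build command:  nasm -f bin hello.asm -o hello.com

        org 100h                ; .COM programs load at offset 0100H

        mov dx, msg             ; DS:DX -> string terminated by '$'
        mov ah, 09h             ; DOS service 09H: display string
        int 21h                 ; call DOS; control returns here afterward

        mov ax, 4C00h           ; DOS service 4CH: terminate, return code 0
        int 21h                 ; DOS ends the program; no return

msg:    db 'Hello from DOS!', 0Dh, 0Ah, '$'
```

The whole conversation with the operating system is two register setups and two INT 21H calls, and the program runs from top to bottom exactly once. Nothing in it waits to be told what the user did, which is the part that a Windows application cannot avoid.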
And Why Linux?
The decision to cover Linux was not automatic. There were actually two other contenders—or maybe a contender and a half. The half-of-a-contender was DOS protected mode, using a 32-bit DOS extender and the DOS Protected Mode Interface, or DPMI. This would have been reasonably simple, and I almost went that way. I turned back because DOS and DPMI just aren't used anymore by anything that isn't legacy. Why make brand-new antiques? No, strike that—the metaphor is inapt; antiques are by definition valuable. Why make brand-new kitsch?
Besides, DPMI, for all that it works, is really a crutch under a small and very unpowerful OS. For all the effort you will eventually put into learning assembly technology, you deserve to work with more horsepower than that.
The true alternate contender was something called a Windows console application. These are special programs written to be run under Windows NT, in a console—basically, a true 32-bit text-mode window rather than a 16-bit text-mode DOS emulation window. NT console applications are genuine 32-bit programs and are relatively simple to write. They can even do cool Windows-ish things such as display graphical message boxes without a prohibitive amount of fuss. One problem: You must run them under Windows NT, which isn't cheap and currently isn't all that common. On DOS and Windows 9x systems, Windows console applications won't run at all.
Ultimately, I chose Linux because it was every bit as powerful as Windows NT (especially in the realm we're discussing in this book) as well as free. Furthermore, there is an immense amount of free code out there on the Internet written for use with Linux. You can install a Linux partition on the same hard disk as a Windows partition, so you don't have to give up your "real work" in Windows to play around with Linux coding.
Finally, Linux (as the reigning x86 king of the Unix world) is one of the last places where x86 text-mode programming is still done in a big way. Windows console applications are little-used exceptions to the GUI rule in the Microsoft world. In Linux, text mode is still mainstream.
That's where we're going. Let's see what it'll take to get there.