Assembly Language Step-by-Step: Programming with DOS and Linux, 2nd Ed. [Electronic resources] (text version)


Jeff Duntemann


Genuflecting to the C Culture


I made it plain in the previous chapter that Linux was a C world from top to bottom. Some people think that by this I mean most of the programs written for Linux are written in C, that the people who created Linux were C people, and so on. True enough—but not enough truth. C was created for Unix, and Unix was created in C. The two evolved together and left indelible marks on one another. Even if Linux or some other species of Unix were reimplemented in Pascal (a very good idea, in my view), the C flavor would still be there, and would have to be there, or what we would have would not be Unix at all.


The Primacy of Libraries


Not all of this C culture is pertinent to assembly language work, but a good part of it is. The part that most affects assembly work, ironically, is the primacy of the standard C libraries. Linux and the standard C libraries are inseparable. The libraries are the way that applications and utilities communicate with the Linux kernel. They stand in place of the DOS INT 21H interface I explained in earlier chapters.

There are basically three reasons for this:



Portability. This is less important than it used to be, and for those of us who feel that the CPU wars were won by Intel long ago, it may not be important at all. But it's a fact that the standard C libraries were created to make the porting of Unix to other processors easier.



Complexity management. Linux is an order of magnitude (at least) more complex than DOS. It can do more, and can do it (thanks to some of that complexity) with far greater robustness and flexibility. Much of that complexity can be hidden from typical end-user utilities and applications, and the C library is the most important means by which that hiding is done.




Kernel evolution. Linux—like Unix itself—is a work in progress. One reason Unix has had such staying power is that it has been able to evolve to meet the needs of modern users on modern machines, irrespective of its origins on creaking ancient minicomputers with less processor power than a Wal-Mart video game. One reason that this has been possible is that the kernel is not much burdened by layers of "legacy obligations" like those that have made the DOS/Windows 9x chimera such an unholy and crash-prone muddle. The main reason it remains thus unburdened is that the kernel is off limits and not accessed directly by utility and application code. Any legacy burden is borne by the standard C library. The kernel is free to move in the directions that it must, and the standard C libraries are rewritten as necessary so that the same face is presented to utilities and applications.




The INT 80H Kernel Function Interface


This last item brings up a subject I'm asked about a lot: the Linux INT 80H kernel function call interface. Just as there is a software interrupt-based function call interface to DOS, there is a way to call the Linux kernel through software interrupts. Instead of INT 21H it uses INT 80H, but the basic idea is almost identical: You set up parameters in registers and then execute the INT 80H instruction. There are over 200 kernel primitives that may be called this way. If you keep to these primitives, you don't need the C library.

The INT 80H interface seems to pull at the imaginations of people who have an aversion to C. Many of these are Europeans, on whose continent Pascal still thrives; and being a Pascal guy myself, I can well understand it. That being said, I advise against it, and I won't explain the INT 80H mechanism further in this book. Some information can be found at the Web site of Konstantin Boldyshev at http://lightning.voshod.com/asm. This is a marvelous (and humbling) site, and worth digesting for the context even if you never intend to try some of the tricks he describes.

The INT 80H interface is what the C library uses to communicate with the kernel, and the authors of Linux make it clear that they reserve the right to change the parameters and semantics (that is, what the calls do) of kernel primitives as necessary without notice or apology. If you make use of kernel primitives through INT 80H, your Linux programs will become version-specific. This is not a good thing and will not endear you to users of your software.

If you intend to do any kind of programming at all under Linux, you will have to cut a personal karmic truce with the C language. If you intend to work in assembly, you will have to move beyond an uneasy truce (hey, is there ever an easy truce?) to active and willing collaboration. It can be done. I do it all the time.

Get used to it.


C Calling Conventions


One of the most peculiar things I learned early about Linux programs (peculiar to me, at least) is that the main portion of a Linux program is a subroutine call—called from the startup code linked in at the link stage. That is, when Linux executes a program, it loads that program into memory and runs it—but before your code runs, some standard library code runs, and then executes a CALL instruction to the main: label in the program. (Yes, ye purists and gurus, there is some other grimbling involved). This is the reason that the main program portion of a C program is called the main function. It really is a function, the standard C library code calls it, and it returns control to the standard C library code by executing a RET instruction. I diagrammed this in Figure 12.2 in the previous chapter, and it might be useful to take another look at the figure if this still isn't clear to you.
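As a minimal sketch of this arrangement (assuming NASM syntax and a 32-bit GCC toolchain to supply the startup code; the file name and the value 7 are illustrative only), the program below is nothing but a main procedure that hands a value back to its caller with RET:

```nasm
; retmain.asm (hypothetical file name)
; Assumed 32-bit build:  nasm -f elf32 retmain.asm
;                        gcc -m32 retmain.o -o retmain

        global  main            ; the startup code links against this symbol

        section .text
main:
        mov     eax, 7          ; the value main returns goes in EAX
        ret                     ; return to the library startup code that called main
```

Run from a shell, the program should leave 7 as its exit status (check with `echo $?`), because the startup code passes whatever main returns along as the process's exit code.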

The way the main program obtains control is therefore the first example you'll see of a set of rules we call the C calling conventions. The C library is nothing if not consistent, and that is its greatest virtue. All C library functions implemented on x86 processors follow these rules. Bake them into your synapses early, and you'll lose a lot less hair than I did trying to figure them out by beating your head against them.

Perforce:



A procedure (which is the more generic term for what C calls a function) must preserve the values of the EBX, ESP, EBP, ESI, and EDI 32-bit registers. That is, although it may use those registers, when it returns control to its caller, the values those registers have must be the same values they had before the function was called. The contents of all other general-purpose registers may be altered at will. (Because Linux is a protected mode operating system, this pointedly does not include the segment registers, which are off limits and should not be altered for any reason.)



A procedure's return value is returned in EAX if it is a value 32 bits in size or smaller. Sixty-four-bit integer values are returned in EDX and EAX, with the low 32 bits in EAX and the high 32 bits in EDX. Floating-point return values are returned at the top of the floating-point stack. (I won't be covering floating-point numerics work in this book.) Strings, structures, and other items larger than 32 bits in size are returned by reference; that is, the procedure returns a pointer to them in EAX.



Parameters passed to procedures are pushed onto the stack in reverse order. That is, given the C function MyFunc(foo, bar, bas), bas is pushed onto the stack first, bar second, and foo last. More on this later.



Procedures do not remove parameters from the stack. The caller must do that after the procedure returns, either by popping the parameters off or (more commonly, since it is usually faster) by adding an offset to the stack pointer ESP. (Again, I'll explain what this means in detail later on, when we actually do it.)
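Seen from the procedure's side, these rules might look like the following sketch. It assumes NASM syntax and 32-bit code; the procedure name widemul and its C prototype are hypothetical, chosen only to show a 64-bit return value.

```nasm
; widemul.asm (hypothetical) - a procedure callable from C as:
;   unsigned long long widemul(unsigned int a, unsigned int b);

        global  widemul

        section .text
widemul:
        push    ebp             ; preserve the caller's EBP
        mov     ebp, esp        ; [ebp+4] = return address, [ebp+8] = first parameter
        mov     eax, [ebp+8]    ; a: pushed last by the caller, so nearest the return address
        mul     dword [ebp+12]  ; EDX:EAX = a * b (high 32 bits in EDX, low 32 bits in EAX)
        pop     ebp             ; restore EBP; EBX, ESI, and EDI were never touched
        ret                     ; plain RET: the parameters stay on the stack for the caller to remove
```

Only EAX and EDX are altered, which the conventions allow, and the 64-bit product comes back in exactly the register pair described above.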



Understanding these rules thoroughly will allow you to make calls to the multitude of functions in the standard C library, as well as other extremely useful libraries such as ncurses, all of which are written in C (either currently or originally) and follow the conventions as I've described them. Much of what I have to teach you about Linux assembly language work involves how to call library functions. Most of the rest of it is no different from DOS—and that you already know!
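From the caller's side, a call to a standard C library function could be sketched as below. This assumes NASM syntax and a 32-bit GCC toolchain with the C library available; the labels fmt and subject, the file name, and the build flags are illustrative assumptions rather than anything prescribed by the text.

```nasm
; callc.asm (hypothetical file name)
; Assumed 32-bit build:  nasm -f elf32 callc.asm
;                        gcc -m32 -no-pie callc.o -o callc

        global  main
        extern  printf          ; resolved from the standard C library at link time

        section .data
fmt:     db     "%s has %d letters", 10, 0
subject: db     "truce", 0

        section .text
main:
        push    ebp             ; main is itself a procedure, so it preserves
        mov     ebp, esp        ; the registers the conventions require
        push    ebx             ; (saved here as boilerplate, although this
        push    esi             ; sketch never modifies them)
        push    edi

        ; printf(fmt, subject, 5): parameters are pushed in reverse order
        push    dword 5         ; third parameter first
        push    dword subject   ; then the second
        push    dword fmt       ; the first parameter is pushed last
        call    printf
        add     esp, 12         ; the caller removes three 4-byte parameters

        xor     eax, eax        ; main's own return value: 0

        pop     edi             ; restore in the reverse order of the pushes
        pop     esi
        pop     ebx
        mov     esp, ebp
        pop     ebp
        ret                     ; back to the startup code
```

The program prints "truce has 5 letters". Adding 12 to ESP does the same job as three POP instructions, without disturbing any register.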

