Saturday, September 15, 2012

Excerpts from "Programming From The Ground Up"



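I have recently been reading "Programming From The Ground Up". Although the book uses assembly language, it covers many topics and explains them thoroughly, which has deepened my understanding of how computers work. With every chapter I finish I want to read more, which I suppose is the mark of a good book. Below are some excerpts I noted down while reading.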

The kernel is both a fence and a gate. As a gate, it allows programs to access hardware in a uniform way. As a fence, the kernel prevents programs from accidentally overwriting each other’s data and from accessing files and devices that they don’t have permission to access.

In fact, in a computer, there is no difference between a program and a program's data except how it is used by the computer. They are both stored and accessed the same way.

Think of a register as a place on your desk - it holds things you are currently working on.

The size of a typical register is called a computer's word size.

Addresses which are stored in memory are also called pointers, because instead of having a regular value in them, they point you to a different location in memory. (A pointer is a register or memory word whose value is an address.)
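To make the idea of a pointer concrete, here is a minimal sketch that models memory as a plain Python list. The list `memory` and the addresses 2 and 5 are made up for illustration; real memory is addressed by the hardware, not by a list index.

```python
# Toy model: each list index plays the role of a memory address.
memory = [0] * 8

memory[5] = 42   # a regular value stored at address 5
memory[2] = 5    # a pointer: the value stored at address 2 is itself an address

pointer = memory[2]              # read the pointer (gives the address 5)
pointed_value = memory[pointer]  # follow it to the location it points to

print(pointer, pointed_value)
```

Reading `memory[2]` alone only yields an address; the useful value appears after a second lookup, which is what "dereferencing" a pointer means.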

Remember, computers can only store numbers, so letters, pictures, music, web pages, documents, and anything else are just long sequences of numbers in the computer, which particular programs know how to interpret.
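As a small illustration of this point, the sketch below turns a short piece of text into the numbers that represent it and back again. It assumes the ASCII encoding; the sample string is arbitrary.

```python
text = "Hi!"

# Each character is stored as a number (its ASCII code).
as_numbers = list(text.encode("ascii"))
print(as_numbers)  # [72, 105, 33]

# The same numbers only become text again when a program interprets them as such.
restored = bytes(as_numbers).decode("ascii")
print(restored)    # Hi!
```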

Think of a stack as a pile of papers on your desk which can be added to indefinitely. You generally keep the things that you're working on toward the top, and you take things off as you're finished working with them.

Well, we say it's the top, but the "top" of the stack is actually the bottom of the stack's memory. In memory the stack starts at the top of memory and grows downward due to architectural considerations.
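The downward growth can be sketched with a toy simulation. The class `DownwardStack`, the 32-byte memory, and the 4-byte word size are assumptions chosen for illustration; the sketch does no overflow checking.

```python
class DownwardStack:
    """Toy stack that starts at the top of memory and grows toward address 0."""

    def __init__(self, size=32, word=4):
        self.memory = bytearray(size)
        self.word = word
        self.sp = size  # stack pointer: an empty stack points just past the top

    def push(self, value):
        # Pushing moves the stack pointer DOWN, then stores the value there.
        self.sp -= self.word
        self.memory[self.sp:self.sp + self.word] = value.to_bytes(self.word, "little")

    def pop(self):
        # Popping reads the value at the stack pointer, then moves it back UP.
        value = int.from_bytes(self.memory[self.sp:self.sp + self.word], "little")
        self.sp += self.word
        return value


stack = DownwardStack()
stack.push(10)
stack.push(20)
addr_after_two_pushes = stack.sp  # 32 - 2 * 4 = 24
top = stack.pop()                 # 20, the most recently pushed value
print(addr_after_two_pushes, top)
```

The most recently pushed value lives at the lowest address in use, which is why the "top" of the stack is the bottom of its memory.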

The base pointer (%ebp) is a special register used for accessing function parameters and local variables.
Parameter #N      <---  N*4 + 4(%ebp)
...
Parameter 2       <---  12(%ebp)
Parameter 1       <---  8(%ebp)
Return Address    <---  4(%ebp)
Old %ebp          <---  (%ebp)
Local Variable 1  <---  -4(%ebp)
Local Variable 2  <---  -8(%ebp) and (%esp)
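
The offsets in the layout above follow a simple pattern, which can be sketched as a few helper functions. The function names are made up for illustration, and the sketch assumes 4-byte words as on 32-bit x86.

```python
WORD = 4  # bytes per stack slot on 32-bit x86


def parameter_offset(n):
    """Offset of parameter n (1-based) from %ebp: N*4 + 4."""
    return WORD * n + WORD


def local_offset(n):
    """Offset of local variable n (1-based) from %ebp: negative, below %ebp."""
    return -WORD * n


def operand(offset):
    """Format an offset as an AT&T-style memory operand relative to %ebp."""
    return f"{offset}(%ebp)" if offset else "(%ebp)"


print(operand(parameter_offset(1)))  # 8(%ebp)
print(operand(parameter_offset(2)))  # 12(%ebp)
print(operand(local_offset(2)))      # -8(%ebp)
```

Parameters sit above the saved %ebp and return address (positive offsets), while local variables sit below (negative offsets).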

The only difference between global and static variables is that static variables are only used by one function, while global variables are used by many functions. Assembly language treats them exactly the same, although most other languages distinguish them.

When a function is done executing, it does three things:
  1. It stores its return value in %eax.
  2. It resets the stack to what it was when it was called (it gets rid of the current stack frame and puts the stack frame of the calling code back into effect).
  3. It returns control to wherever it was called from. This is done using the ret instruction, which pops whatever value is at the top of the stack and sets the instruction pointer, %eip, to that value.

The way that variables are stored and the parameters and return values are transferred by the computer varies from language to language. This variance is known as a language's calling convention, because it describes how functions expect to get and receive data when they're called.

UNIX files, no matter what program created them, can all be accessed as a sequential stream of bytes.

A buffer is a continuous block of bytes used for bulk data transfer. When you request to read a file, the operating system needs to have a place to store the data it reads. That place is called a buffer.
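The following sketch reads a file as a sequential stream of bytes through a small fixed-size buffer, using Python's `os.open` and `os.read`, which are thin wrappers over the corresponding system calls. The function name `read_in_chunks`, the 8-byte buffer size, and the temporary sample file are assumptions made for illustration; real programs use much larger buffers.

```python
import os
import tempfile

BUFFER_SIZE = 8  # deliberately tiny so that several reads are needed


def read_in_chunks(path, buffer_size=BUFFER_SIZE):
    """Read a whole file, at most buffer_size bytes per read call."""
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buffer = os.read(fd, buffer_size)  # fills the buffer with up to buffer_size bytes
            if not buffer:                     # an empty result means end of file
                break
            chunks.append(buffer)
    finally:
        os.close(fd)
    return chunks


# Create a throwaway sample file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a sequential stream of bytes")
    sample_path = tmp.name

chunks = read_in_chunks(sample_path)
os.remove(sample_path)
print(chunks)
```

Each read picks up where the previous one stopped, so joining the chunks in order reproduces the file's contents.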

Linux programs usually have at least three open file descriptors when they begin. They are: STDIN, STDOUT, STDERR.
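File descriptors are just small integers; by convention STDIN is 0, STDOUT is 1, and STDERR is 2. The sketch below writes to a descriptor by number with `os.write`. The helper `write_line` is a made-up name, and a pipe stands in for the terminal so the result can be read back and inspected.

```python
import os

# Conventional descriptor numbers for the three streams open at program start.
STDIN, STDOUT, STDERR = 0, 1, 2


def write_line(fd, text):
    """Write one line of ASCII text to the given file descriptor."""
    return os.write(fd, (text + "\n").encode("ascii"))


# A pipe gives us two fresh descriptors: one to read from, one to write to.
read_end, write_end = os.pipe()
written = write_line(write_end, "hello")  # returns the number of bytes written
os.close(write_end)
received = os.read(read_end, 64)
os.close(read_end)
print(written, received)
```

Calling `write_line(STDOUT, "normal output")` or `write_line(STDERR, "error output")` works the same way, because the function only ever sees a number.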

Testing isn't just about making sure your program works; it's about making sure your program doesn't break.

..., all of the code was contained within the source file. Such programs are called statically-linked executables, because they contain all of the necessary functionality for the program that wasn't handled by the kernel. When you use shared libraries, your program is then dynamically-linked, which means that not all of the code needed to run the program is actually contained within the program file itself, but in external libraries.

The reason that parameters are pushed in reverse order is that some functions, like printf, take a variable number of parameters. The parameters pushed in last will be in a known position relative to the top of the stack. The program can then use these parameters to determine where on the stack the additional arguments are, and what type they are. For example, printf uses the format string to determine how many other parameters are being sent. If we pushed the known arguments first, you wouldn't be able to tell where they were on the stack.
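A toy simulation can show why the reverse order helps. The names `call_printf_like` and `callee` are made up, the stack is a Python list whose last element is the top, only `%d` is understood, and the return address that a real call would push is left out for simplicity.

```python
def call_printf_like(format_string, *extra):
    """Caller side: push the arguments in reverse, so the format string goes last."""
    stack = []
    for arg in reversed((format_string,) + extra):
        stack.append(arg)
    return callee(stack)


def callee(stack):
    """Callee side: it only knows that the format string is on top of the stack."""
    fmt = stack[-1]          # known position, however many arguments follow
    count = fmt.count("%d")  # the format string says how many more there are
    # The k-th extra argument sits k slots below the top of the stack.
    extras = [stack[-1 - k] for k in range(1, count + 1)]
    result = fmt
    for value in extras:
        result = result.replace("%d", str(value), 1)
    return result


print(call_printf_like("%d + %d", 1, 2))  # 1 + 2
```

Had the format string been pushed first, its distance from the top would depend on the number of extra arguments, which is exactly the thing the callee does not know yet.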

Every piece of data on the computer not in a register has an address. The address of data which spans several bytes is the same as the address of its first byte.
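
As a rough illustration, the sketch below stores a 4-byte number in a simulated memory and shows that its address is simply where its first byte lives. The `bytearray` called `memory`, the address 4, and the little-endian byte order (as used on x86) are assumptions for the example.

```python
import struct

memory = bytearray(16)  # toy memory: the index is the address
ADDRESS = 4             # hypothetical address chosen for the value

# Store a 4-byte unsigned integer at ADDRESS in little-endian order.
struct.pack_into("<I", memory, ADDRESS, 0x11223344)

first_byte = memory[ADDRESS]                          # lowest-order byte comes first
value = struct.unpack_from("<I", memory, ADDRESS)[0]  # read all four bytes back

print(hex(first_byte), hex(value))  # 0x44 0x11223344
```

The value occupies addresses 4 through 7, but a program refers to it only by address 4.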

A computer looks at memory as a long sequence of numbered storage locations. (These storage locations are called bytes.)

Every Memory Address is a Lie.
a) Physical memory refers to the actual RAM chips inside your computer and what they contain.
b) Virtual memory is the way your program thinks about memory.

Before loading your program, Linux finds an empty physical memory space large enough to fit your program, and then tells the processor to pretend that this memory is actually at the address 0x08048000 to load your program into.

Virtual memory can be mapped to more than physical memory; it can be mapped to disk as well. [Swap partitions on Linux do this job.]
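
The mapping from virtual to physical addresses can be sketched with a toy page table. Everything here is hypothetical: the 4096-byte page size is typical for x86 but assumed, the frame and swap slot numbers are invented, and a real kernel and processor do this with hardware support rather than a dictionary.

```python
PAGE_SIZE = 4096  # assumed page size

# Toy page table: virtual page number -> (where the page lives, frame or slot number).
page_table = {
    0x08048: ("ram", 0x1A2B3),  # backed by a physical memory frame
    0x08049: ("swap", 7),       # currently stored on disk in swap slot 7
}


def translate(virtual_address):
    """Return (backing store, byte location within that store) for an address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    kind, index = page_table[page]
    return kind, index * PAGE_SIZE + offset


print(translate(0x08048010))  # ('ram', 439038992), i.e. physical address 0x1A2B3010
print(translate(0x08049004))  # ('swap', 28676)
```

The program only ever sees addresses such as 0x08048010; where the bytes really are is decided by the table.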

The pool of memory used by memory managers is commonly referred to as the heap.

The way a computer handles decimals is by storing them at a fixed precision (number of significant bits). A computer stores decimal numbers in two parts - the exponent and the mantissa. For example, 12345.2 is stored as 1.23452 * 10^4. The mantissa is 1.23452 and the exponent is 4. Now, the mantissa and the exponent are only so long, which leads to some interesting problems. For example, when a computer stores an integer, if you add 1 to it, the resulting number is one larger. This does not necessarily happen with floating point numbers. If the number is sufficiently big, like 5.234 * 10^5000, adding 1 to it might not even register in the mantissa (remember, both parts are only so long). This affects several things, especially order of operations. Let’s say that I add 1 to 5.234 * 10^5000 a few billion or trillion times. Guess what - the number won’t change at all. However, if I add one to itself enough times, and then add it to the original number, it might make a dent.
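
This effect is easy to reproduce. The sketch below uses Python floats, which are 64-bit doubles; since a number as large as 10^5000 does not fit in a double, it uses 1e16 instead, which is already big enough that adding 1 is lost. The variable names and the count of 1000 additions are arbitrary.

```python
big = 1e16  # large enough that neighbouring doubles are 2 apart

# Adding 1 many times, one at a time: every single addition is rounded away.
unchanged = big
for _ in range(1000):
    unchanged += 1.0

# Adding the ones to each other first, then adding the total once.
ones = 0.0
for _ in range(1000):
    ones += 1.0
dented = big + ones

print(unchanged == big)  # True
print(dented - big)      # 1000.0
```

The same 1000 ones either vanish or count in full depending only on the order of operations.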
