=====================================================================
4. Processes, threads and memory
=====================================================================
Processors, ironically, do not execute processes - they execute threads.
A process can be considered to be a container for a number of threads, so it needs at least 1 thread to actually do anything.
Simple processes can exist with just 1 thread as they do not need any more; more complex applications may create and destroy threads all the time, and run threads on many CPUs at the same time.
A thread is a piece of code with its own "stack", which is a list indicating which functions were called so that it can keep track of where to return to once a call is complete.
A stack follows the principle of "LIFO" (Last In, First Out).
The typical analogy for a stack is to imagine a stack of plates - the plate currently at the top of the stack must be removed first, so the most recently added plate is always the first to come off.
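As a rough illustration of the LIFO principle, this Python sketch models a call stack as a simple list. The function names are made up for the example, and a real stack holds return addresses and local data rather than names.

```python
# Toy model of a thread's call stack. The function names are hypothetical.
call_stack = []

def call(function_name):
    """Entering a function pushes it onto the top of the stack."""
    call_stack.append(function_name)

def ret():
    """Returning pops the most recently called function off the top."""
    return call_stack.pop()

call("main")
call("open_file")
call("read_block")

first_to_return = ret()   # "read_block" was called last, so it comes off first
second_to_return = ret()  # then "open_file"
print(first_to_return, second_to_return, call_stack)
```

Only "main" is left on the stack at the end, just as the bottom plate is the last one to come off.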
All threads in a process share the same memory space and the process's base priority class, so a thread that misbehaves could start screwing up life for its sibling threads.
"Multi-tasking" is the ability for an operating system to run threads pseudo-concurrently.
I use the word "pseudo" because on a single core, non-hyperthreading processor there can only be 1 active thread at any given time - the operating system has multiple threads which it switches between, and each gets a little bit of time to run before the switch to the next thread.
Every time the processor stops execution of one thread and starts running another, this is called a "context switch".
A running thread may either use its allotted time slice, put itself into a "waiting" state when an I/O operation is pending, or be "pre-emptively" forced back onto the waiting queue by an interrupt.
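The switching described above can be sketched as a toy round-robin scheduler. This is only a simplified model under my own assumptions (fixed time slices, no priorities, no I/O waits); the thread names and work units are invented, and it is not how the Windows scheduler is actually implemented.

```python
from collections import deque

def round_robin(work_remaining, time_slice):
    """Give each thread at most one time slice, then switch to the next.

    work_remaining maps a (hypothetical) thread name to the units of work it
    still needs. Returns the order in which threads ran and the number of
    context switches that took place.
    """
    ready_queue = deque(work_remaining.items())
    run_order = []
    context_switches = 0
    while ready_queue:
        name, remaining = ready_queue.popleft()
        if run_order and run_order[-1] != name:
            context_switches += 1  # the processor changes from one thread to another
        run_order.append(name)
        remaining -= min(time_slice, remaining)
        if remaining > 0:
            ready_queue.append((name, remaining))  # slice used up: back of the queue
    return run_order, context_switches

order, switches = round_robin({"A": 3, "B": 1, "C": 2}, time_slice=1)
print(order, switches)
```

With a time slice of 1 unit, thread A needs three turns on the processor, and every change of thread in between counts as a context switch.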
With a multi-CPU (or multi-core) system there is more than one logical (or physical) processor, so true concurrent processing can take place - 2 distinct threads can be running at the same time.
An application can be written to be multi-threaded so that it can take advantage of multiple processors, and Windows itself is a Symmetric Multi-Processor (SMP) operating system which means that it can run its threads on any processor with equal priority.
(The alternative to this is to have the OS run on 1 CPU while its applications run on the others.)
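As a rough sketch of what a multi-threaded application looks like, here is a small Python program in which 4 threads in one process all update the same variable. The names (counter, worker) are made up for the example. Note that the standard Python interpreter normally lets only one thread execute Python code at a time, so this shows the structure and the shared memory rather than true concurrency across processors.

```python
import threading

counter = 0                      # lives in the process's memory, visible to every thread
counter_lock = threading.Lock()  # without this, updates from sibling threads could be lost

def worker(iterations):
    """Each thread runs this function on its own stack."""
    global counter
    for _ in range(iterations):
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish

print(counter)
```

The lock is the point of interest: because the threads share one memory space, they have to co-operate to avoid trampling on each other's updates.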
To discuss processes properly we need to explain how they see memory and also the difference between user mode and kernel mode. First, a little aside on naming conventions for storage and throughput...
Gigabytes vs gibibytes
Historically computer designers and programmers have referred to kilobytes, megabytes and gigabytes but not meant their absolutely correct scientific definition.
The reasoning behind this is that computers use base 2 so we don't get convenient round figures often and names were assigned which were close to the correct values.
i.e.
a "kilobyte" officially is 1,000 bytes, but 2^10 is 1,024
a "megabyte" officially is 1,000,000 bytes, but 2^20 is 1,048,576
a "gigabyte" officially is 1,000,000,000 bytes, but 2^30 is 1,073,741,824
a "terabyte" officially is 1,000,000,000,000 bytes, but 2^40 is 1,099,511,627,776
The origins of these scientific prefixes are Greek:
"kilo" is derived from "khilioi", meaning 1,000
"mega" is derived from "megas", meaning "great"
"giga" is derived from "gigas", meaning "giant"
"tera" is derived from "teras", meaning "monster"
There was a big fuss over hard disk manufacturers claiming their products' capacities were N "megabytes" but when reported by an operating system they were shown to apparently have less than this.
It was in the disk manufacturers' interest to make their products have the biggest numbers for sales purposes, so they used the official scientific definition of "mega".
Memory is, as far as I am aware, still advertised as N megabytes or gigabytes despite meaning the base 2 definition.
As you can see from the above figures, the larger the unit approximation, the larger the deviation gets - a gigabyte vs a gibibyte could be a difference of over 73,000,000 bytes.
Recent years have seen some people trying to change the nomenclature used with storage capacity, though so far I have not seen much evidence of the new terms being widely used - maybe when the US and the UK eventually turn from the imperial measurement system we could see a change.
The abbreviations and names for the units have been subtly altered to indicate the difference between base 2 and base 10 values:
kB = kilobyte = 1,000
KiB = kibibyte = 1,024
MB = megabyte = 1,000,000
MiB = mebibyte = 1,048,576
GB = gigabyte = 1,000,000,000
GiB = gibibyte = 1,073,741,824
TB = terabyte = 1,000,000,000,000
TiB = tebibyte = 1,099,511,627,776
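To put numbers on the deviation, this short Python sketch compares the two sets of units from the list above. The helper names are my own, and the 500GB drive is just an example figure.

```python
DECIMAL_UNITS = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
BINARY_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

def deviation(decimal_unit, binary_unit):
    """Number of bytes by which the base 2 unit exceeds the base 10 unit."""
    return BINARY_UNITS[binary_unit] - DECIMAL_UNITS[decimal_unit]

def advertised_gb_in_gib(gigabytes):
    """Size that an OS reporting in GiB would show for a drive sold as N GB."""
    return gigabytes * DECIMAL_UNITS["GB"] / BINARY_UNITS["GiB"]

for dec, binary in zip(DECIMAL_UNITS, BINARY_UNITS):
    print(f"{binary} - {dec} = {deviation(dec, binary):,} bytes")

print(f"500GB = {advertised_gb_in_gib(500):.1f}GiB")
```

This is the hard disk "fuss" in miniature: the drive really does hold 500,000,000,000 bytes, but an operating system counting in base 2 units shows a smaller number.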
I try to refer to "KiB", "MiB" and "GiB" where appropriate (and I remember!), although when talking I will still abbreviate them to "K", "megs" and "gigs" because I would just feel foolish talking about "kibs, mibs and gibs".
The "bit" and the "byte" are the basic unambiguous definitions.
A "nibble" (4 bits or half a byte) is something I have seen references to in documents, but never used in practice.
The size of a "word" unfortunately changes depending on the platform being discussed, so is also not a good way to standardise (most commonly 16 or 32 bits, but word sizes of up to 60 bits have been used).
When discussing bitrates (e.g. audio file sampling rate or theoretical network speed) we use the base 10 definitions, so:
28.8kbps = 28.8kbit/s = 28,800 bits per second
10Mbps = 10Mbit/s = 10,000,000 bits per second
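As a worked example of the base 10 bitrates above, this sketch converts a rate to bits per second and estimates how long a transfer would take. The function names are my own, and it assumes the theoretical rate with no protocol overhead.

```python
BITRATE_PREFIXES = {"": 1, "k": 10**3, "M": 10**6, "G": 10**9}

def bits_per_second(value, prefix):
    """Convert e.g. (28.8, "k") to a plain number of bits per second."""
    return value * BITRATE_PREFIXES[prefix]

def transfer_seconds(size_bytes, rate_bps):
    """Time to move size_bytes at rate_bps, counting 8 bits per byte."""
    return size_bytes * 8 / rate_bps

modem = bits_per_second(28.8, "k")
ethernet = bits_per_second(10, "M")
one_mib = 2**20  # storage size in base 2, bitrate in base 10

print(f"1MiB over 28.8kbps: {transfer_seconds(one_mib, modem):.1f}s")
print(f"1MiB over 10Mbps: {transfer_seconds(one_mib, ethernet):.2f}s")
```

Note how the two conventions meet in one calculation: the file size is counted in base 2 and the line speed in base 10.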
Care should be taken when using anything higher than a byte as confusion can arise.
15 bits, 128 bit/s, 72Kbit/s, 10Mbit/s, 128 bytes, 45 byte/s are all unambiguous, but what about:
48K - is that 48kB (48,000) or 48KiB (49,152)? What about 48k?
88kBps - is that 88 kilobytes per second or 88 kibibytes per second? (capitalised 'B' would imply bytes rather than bits)
12MBps - is that 12 megabytes per second or 12 mebibytes per second? (same with 12MB/s)
A (32-bit) process running on Windows sees, by default, a 4GiB virtual memory space.
The reason for this is that 32 bits can represent 2^32 distinct addresses, which is 4GiB (4,294,967,296 in decimal); the highest address is therefore 2^32-1 (0xFFFFFFFF in hexadecimal).
The lower memory addresses (0x00000000-0x7FFFFFFF) are for the application itself and represent "user mode".
The upper memory addresses (0x80000000-0xFFFFFFFF) are for private Windows processes & drivers and represent "kernel mode".
Debugging tip:
If tracing a process crash you can spot if a call in the stack or memory access is in user mode or kernel mode by seeing between which of the above boundaries the address falls.
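The check in the tip above can be sketched in a few lines of Python. The function name is my own, and it assumes a 32-bit system with the default 2+2 split; the example addresses are arbitrary.

```python
USER_MODE_END = 0x7FFFFFFF      # top of the default user mode range
KERNEL_MODE_START = 0x80000000  # bottom of the default kernel mode range

def classify_address(address):
    """Say whether a 32-bit virtual address falls in user or kernel mode."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return "user" if address <= USER_MODE_END else "kernel"

for addr in (0x00401000, 0x77F51234, 0x80001000, 0xF7A3C000):
    print(f"0x{addr:08X} -> {classify_address(addr)} mode")
```

In practice you would just eyeball the first hex digit: 0 to 7 means user mode, 8 to F means kernel mode.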
Kernel mode is shared between all user mode applications, because there only ever needs to be 1 copy of it - whereas you could run 1,000 copies of Notepad all with their own unique areas of memory.
If one of those 1,000 instances of Notepad hangs or is terminated, the other 999 remain unaffected.
If a problem is detected in kernel mode, every single application is affected and system stability is at risk, so Windows takes measures to prevent possible data corruption or loss - you may know this as a bugcheck or "blue screen (of death)".
This addressing system is true regardless of how much memory the system really has - it is what allows Windows programmers to not deal with the trivialities of memory management or system requirements - as far as they are concerned their portion of memory is 2GiB large.
Of course, loading a copy of Notepad does not make Windows try to allocate 2GiB of memory (physical or virtual) - the program can indicate its "working set" preferences, or Windows will just assign pages of memory as the program requires them.
Some applications may want more than 2GiB of memory, and this is where the "/3GB" switch in BOOT.INI comes into play - this moves the split in virtual memory space from 2+2 to 3+1.
User mode gets 3GiB and kernel mode gets just 1GiB - this affects ALL applications, so Windows now has only 50% of the kernel memory space it had to play with before.
If an application is not written to require or use more than 2GiB then it will get no advantage from this switch (but Windows itself could get a major disadvantage).
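To see where the boundary moves to, this sketch works out the address ranges for the 2+2 and 3+1 splits. The helper name is my own; it simply does the arithmetic on a 4GiB address space.

```python
GIB = 2**30
ADDRESS_SPACE = 4 * GIB  # what a 32-bit process can address

def address_ranges(user_gib):
    """Return the (first, last) address of the user and kernel portions."""
    boundary = user_gib * GIB
    return {"user": (0, boundary - 1), "kernel": (boundary, ADDRESS_SPACE - 1)}

def size_gib(address_range):
    first, last = address_range
    return (last - first + 1) // GIB

default_split = address_ranges(2)   # standard 2+2
three_gb_split = address_ranges(3)  # with /3GB

for label, split in (("default", default_split), ("/3GB", three_gb_split)):
    for mode, (first, last) in split.items():
        print(f"{label:8} {mode:6} 0x{first:08X}-0x{last:08X}")
```

With the switch in place the kernel starts at 0xC0000000 rather than 0x80000000, so the address-spotting trick for user mode vs kernel mode needs adjusting accordingly.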
To recap, to ensure this point is clear:
An error occurring in user mode is only critical to that application; it cannot crash Windows.
An error occurring in kernel mode is fatal and will cause a bugcheck.
Note: An "error" in this context really means "unhandled exception" - an illegal operation, such as an attempt to read memory outside of the process's owned space. It is possible that data in memory can get corrupted through "errors", but have non-fatal results such as corrupted graphics or sound.
The kernel, besides Windows itself, includes hardware drivers such as disk, graphics and sound, and "filter" drivers such as open file backup agents and anti-virus.
These all share memory and have the potential to corrupt each other which is why great care should be taken when writing code for execution in kernel mode - this is why the most common bugchecks are driver-related and updating drivers can sometimes have a crippling effect.
Debugging tip:
Trying to nail down the cause of bugcheck error codes can be tricky, even if the problem is reproducible (it most often is not).
If the bugcheck is related to memory corruption or leaking ("!analyze -v" in windbg can give you a hint) then you can either:
- use the tool "verifier" to put a watch on all 3rd party drivers
- enable "special pool"
These options will instruct Windows to pay a lot more attention to the drivers or to all access to paged and non-paged pool. The bugcheck may change (and become more frequent) as you are now picking up when the corruption occurs rather than when it is detected - so you get a more meaningful memory dump.
Windows has this concept of "virtual memory", where disk space can be used to "page" data or code out of physical memory as needed - this is where PAGEFILE.SYS comes into play.
While some people recommend turning off all swap files on systems with lots of memory, there are a couple of things to take into account:
- some applications will check for a swap file of a certain minimum size, and may refuse to run
- memory dumps when Windows encounters bugchecks require the swap file to be as large as physical memory "plus a bit more" to store the dump temporarily
The working set of a process is the amount of physical memory it is currently using.
Windows can choose to page to disk some or all of the working set of a process if:
- it runs short on free/zeroed pages (empty physical memory)
- the process has been idle for a period of time and the memory manager does some housekeeping to maximise available memory
- the process is minimised and housekeeping kicks in
If a process requests some data which is not in physical memory but has been paged to disk, then a "page fault" is incurred - the system has to temporarily switch to another process to locate and retrieve the data so the original thread can continue.
Despite its name, a page fault is not an error; it is a normal, expected operation.
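A toy model may help to show why page faults are routine. The class below is my own simplification: it assumes a fixed number of physical page frames and evicts the least recently used page, whereas the real Windows memory manager is considerably more involved.

```python
from collections import OrderedDict

class ToyMemory:
    """A handful of physical page frames, with everything else 'on disk'."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # pages in physical memory, oldest first
        self.page_faults = 0

    def access(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)  # mark as recently used
            return "hit"
        # Not in physical memory: fetch it, paging something else out if full.
        self.page_faults += 1
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)  # evict the least recently used page
        self.resident[page] = True
        return "page fault"

memory = ToyMemory(frames=2)
results = [memory.access(page) for page in (1, 2, 1, 3, 2)]
print(results, memory.page_faults)
```

Nothing here goes wrong at any point; a fault just means the data had to be fetched before the access could complete.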
The amount of virtual memory requested by a process does not necessarily indicate how much it has actually committed, just how much is reserved.
A process which appears to have 200MiB of virtual memory might have nowhere near that much physical or virtual memory in use at all; the figure could be just an estimate based on what the programmer thought it might want to use at peak.
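The difference between reserved and committed memory can be modelled with a couple of counters. This is purely a bookkeeping sketch with invented names and figures; it does not call any real Windows API.

```python
MIB = 2**20

class ToyAddressSpace:
    """Tracks how much address space is reserved and how much is committed."""

    def __init__(self):
        self.reserved = 0
        self.committed = 0

    def reserve(self, size):
        """Set aside a range of addresses; costs no memory or pagefile space yet."""
        self.reserved += size

    def commit(self, size):
        """Back part of the reserved range with real storage."""
        if self.committed + size > self.reserved:
            raise ValueError("cannot commit more than has been reserved")
        self.committed += size

process = ToyAddressSpace()
process.reserve(200 * MIB)  # the programmer's guess at peak usage
process.commit(12 * MIB)    # what the program has needed so far

print(f"reserved {process.reserved // MIB}MiB, committed {process.committed // MIB}MiB")
```

A tool that only shows the first figure would make this process look more than 16 times hungrier than it is.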
There are also 2 "pools" of memory which are shared across all processes for dynamic use - paged and non-paged.
Paged pool memory can be paged to disk if required, and has an absolute maximum size of 492MiB on Windows 2000, or 650MiB on XP/2003 (the current maximum is calculated at boot time based on the physical memory in the system).
Non-paged pool memory is never paged to disk, so it is most commonly used by drivers, which cannot afford to incur a page fault while servicing an interrupt - the absolute maximum here is 256MiB.
(The maximums quoted here are based on 32-bit Windows without the /3GB or /USERVA switches used.)
Dynamic Link Libraries (DLLs) are collections of functions which can be called from applications and are actually "memory mapped".
If 2 programs refer to the same DLL then they actually use the same instance, it is not loaded twice in memory (unless one of them happens to perform a write operation, in which case a "copy on write" takes place and the affected pages of the DLL are duplicated in read/write mode for this application alone).
One thing to be aware of is that Windows memory management still counts the size of the DLL against every process which refers to it - so in the case of Internet Explorer, for example, the reported memory usage for iexplore.exe is actually less than Task Manager claims as it uses a number of already-loaded system libraries.
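The double counting is easy to demonstrate with some arithmetic. All of the sizes below are invented for the example (in KiB), and the model ignores copy on write.

```python
# Invented sizes in KiB, purely for illustration.
SHARED_DLLS = {"kernel32.dll": 900, "user32.dll": 600, "shell32.dll": 8000}

PROCESSES = {
    "notepad.exe": {"private": 1200, "dlls": ["kernel32.dll", "user32.dll"]},
    "iexplore.exe": {"private": 9000,
                     "dlls": ["kernel32.dll", "user32.dll", "shell32.dll"]},
}

def reported_usage(process):
    """Per-process figure: private memory plus the full size of every DLL it uses."""
    return process["private"] + sum(SHARED_DLLS[name] for name in process["dlls"])

def physical_usage(processes):
    """Memory really needed: private memory per process, each DLL counted once."""
    private = sum(p["private"] for p in processes.values())
    loaded = {name for p in processes.values() for name in p["dlls"]}
    return private + sum(SHARED_DLLS[name] for name in loaded)

total_reported = sum(reported_usage(p) for p in PROCESSES.values())
total_physical = physical_usage(PROCESSES)
print(total_reported, total_physical, total_reported - total_physical)
```

The gap between the two totals is exactly the size of the DLLs that were counted more than once, which is why adding up the per-process figures overstates the real usage.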