Files
pintos_22/doc/codebase.texi
2024-10-01 23:37:39 +01:00

635 lines
23 KiB
Plaintext

@node Task 0--Codebase
@chapter Task 0: Alarm Clock
This task is divided into two parts, a codebase preview and a small coding exercise.
The codebase preview has been designed to help you understand how PintOS is structured.
The exercise requires you to answer a short worksheet (handed out through Scientia)
that contains a few questions to check your understanding of the provided PintOS code.
The coding exercise has been designed to help you understand how PintOS works and is structured.
The exercise is concerned with developing a simple feature in PintOS, called Alarm Clock.
@menu
* Task 1 Background::
* Task 0 Requirements::
@end menu
@node Task 1 Background
@section Background
@menu
* Understanding Threads::
* Task 1 Source Files::
* Task 1 Synchronization::
@end menu
@node Understanding Threads
@subsection Understanding Threads
The first step is to read and understand the code for the initial thread
system.
PintOS already implements thread creation and thread completion,
a simple scheduler to switch between threads, and synchronization
primitives (semaphores, locks, condition variables, and optimization
barriers).
Some of this code might seem slightly mysterious. If
you haven't already compiled and run the base system, as described in
the introduction (@pxref{Introduction}), you should do so now. You
can read through parts of the source code to see what's going
on. If you like, you can add calls to @func{printf} almost
anywhere, then recompile and run to see what happens and in what
order. You can also run the kernel in a debugger and set breakpoints
at interesting spots, single-step through code and examine data, and
so on.
When a thread is created, you are creating a new context to be
scheduled. You provide a function to be run in this context as an
argument to @func{thread_create}. The first time the thread is
scheduled and runs, it starts from the beginning of that function
and executes in that context. When the function returns, the thread
terminates. Each thread, therefore, acts like a mini-program running
inside PintOS, with the function passed to @func{thread_create}
acting like @func{main}.
At any given time, exactly one thread runs and the rest, if any,
become inactive. The scheduler decides which thread to
run next. (If no thread is ready to run
at any given time, then the special ``idle'' thread, implemented in
@func{idle}, runs.)
Synchronization primitives can force context switches when one
thread needs to wait for another thread to do something.
The mechanics of a context switch can be found in @file{threads/switch.S}, which is 80@var{x}86 assembly code.
(You don't have to understand it.)
It is enough to know that it saves the state of the currently running thread and restores the state of the thread we're switching to.
Using the GDB debugger on the PintOS kernel (@pxref{GDB}), you can slowly trace through a context switch to see what happens in the C code.
You can set a breakpoint on @func{schedule} to start out, and then
single-step from there.@footnote{GDB might tell you that
@func{schedule} doesn't exist, which is arguably a GDB bug.
You can work around this by setting the breakpoint by filename and
line number, e.g.@: @code{break thread.c:@var{ln}} where @var{ln} is
the line number of the first declaration in @func{schedule}.} Be sure
to keep track of each thread's address
and state, and what procedures are on the call stack for each thread.
You will notice that when one thread calls @func{switch_threads},
another thread starts running, and the first thing the new thread does
is to return from @func{switch_threads}. You will understand the thread
system once you understand why and how the @func{switch_threads} that
gets called is different from the @func{switch_threads} that returns.
@xref{Thread Switching}, for more information.
@strong{Warning}: In PintOS, each thread is assigned a small,
fixed-size execution stack just under @w{4 kB} in size. The kernel
tries to detect stack overflow, but it cannot do so perfectly. You
may cause bizarre problems, such as mysterious kernel panics, if you
declare large data structures as non-static local variables,
e.g. @samp{int buf[1000];}. Alternatives to stack allocation include
the page allocator and the block allocator (@pxref{Memory Allocation}).
@node Task 1 Source Files
@subsection Source Files
Despite PintOS being a tiny operating system, the code volume can be quite discouraging at first sight.
Don't panic: the Alarm Clock exercise for task 0 will help you understand PintOS by working on a small fragment of the code.
The coding required for the later tasks will be more extensive,
but in general should still be limited to a few hundred lines over only a few files.
Here, the hope is that presenting an overview of all source files will give you a start on what
code to look at.
@menu
* devices code::
* thread code::
* lib files::
@end menu
@node devices code
@subsubsection @file{devices} code
The basic threaded kernel includes the following files in the
@file{devices} directory:
@table @file
@item timer.c
@itemx timer.h
System timer that ticks, by default, 100 times per second. You will
modify this code in this task.
@item vga.c
@itemx vga.h
VGA display driver. Responsible for writing text to the screen.
You should have no need to look at this code. @func{printf}
calls into the VGA display driver for you, so there's little reason to
call this code yourself.
@item serial.c
@itemx serial.h
Serial port driver. Again, @func{printf} calls this code for you,
so you don't need to do so yourself.
It handles serial input by passing it to the input layer (see below).
@item block.c
@itemx block.h
An abstraction layer for @dfn{block devices}, that is, random-access,
disk-like devices that are organized as arrays of fixed-size blocks.
Out of the box, PintOS supports two types of block devices: IDE disks
and partitions. Block devices, regardless of type, won't actually be
used until task 2.
@item ide.c
@itemx ide.h
Supports reading and writing sectors on up to 4 IDE disks.
@item partition.c
@itemx partition.h
Understands the structure of partitions on disks, allowing a single
disk to be carved up into multiple regions (partitions) for
independent use.
@item kbd.c
@itemx kbd.h
Keyboard driver. Handles keystrokes passing them to the input layer
(see below).
@item input.c
@itemx input.h
Input layer. Queues input characters passed along by the keyboard or
serial drivers.
@item intq.c
@itemx intq.h
Interrupt queue, for managing a circular queue that both kernel
threads and interrupt handlers want to access. Used by the keyboard
and serial drivers.
@item rtc.c
@itemx rtc.h
Real-time clock driver, to enable the kernel to determine the current
date and time. By default, this is only used by @file{thread/init.c}
to choose an initial seed for the random number generator.
@item speaker.c
@itemx speaker.h
Driver that can produce tones on the PC speaker.
@item pit.c
@itemx pit.h
Code to configure the 8254 Programmable Interrupt Timer. This code is
used by both @file{devices/timer.c} and @file{devices/speaker.c}
because each device uses one of the PIT's output channel.
@end table
@node thread code
@subsubsection @file{thread} code
Here is a brief overview of the files in the @file{threads}
directory.
@table @file
@item loader.S
@itemx loader.h
The kernel loader. Assembles to 512 bytes of code and data that the
PC BIOS loads into memory and which in turn finds the kernel on disk,
loads it into memory, and jumps to @func{start} in @file{start.S}.
@xref{PintOS Loader}, for details. You should not need to look at
this code or modify it.
@item start.S
Does basic setup needed for memory protection and 32-bit
operation on 80@var{x}86 CPUs. Unlike the loader, this code is
actually part of the kernel. @xref{Low-Level Kernel Initialization},
for details.
@item kernel.lds.S
The linker script used to link the kernel. Sets the load address of
the kernel and arranges for @file{start.S} to be near the beginning
of the kernel image. @xref{PintOS Loader}, for details. Again, you
should not need to look at this code
or modify it, but it's here in case you're curious.
@item init.c
@itemx init.h
Kernel initialization, including @func{main}, the kernel's ``main
program.'' You should look over @func{main} at least to see what
gets initialized. You might want to add your own initialization code
here. @xref{High-Level Kernel Initialization}, for details.
@item thread.c
@itemx thread.h
Basic thread support. @file{thread.h} defines @struct{thread}, which you are likely to modify
in all four tasks. See @ref{struct thread} and @ref{Threads} for
more information.
@item switch.S
@itemx switch.h
Assembly language routine for switching threads. Already discussed
above. @xref{Thread Functions}, for more information.
@item palloc.c
@itemx palloc.h
Page allocator, which hands out system memory in multiples of 4 kB
pages. @xref{Page Allocator}, for more information.
@item malloc.c
@itemx malloc.h
A simple implementation of @func{malloc} and @func{free} for
the kernel. @xref{Block Allocator}, for more information.
@item interrupt.c
@itemx interrupt.h
Basic interrupt handling and functions for turning interrupts on and
off. @xref{Interrupt Handling}, for more information.
@item intr-stubs.S
@itemx intr-stubs.h
Assembly code for low-level interrupt handling. @xref{Interrupt
Infrastructure}, for more information.
@item synch.c
@itemx synch.h
Basic synchronization primitives: semaphores, locks, condition
variables, and optimization barriers. You will need to use these for
synchronization in all
four tasks. @xref{Synchronization}, for more information.
@item io.h
Functions for I/O port access. This is mostly used by source code in
the @file{devices} directory that you won't have to touch.
@item vaddr.h
@itemx pte.h
Functions and macros for working with virtual addresses and page table
entries. These will be more important to you in task 3. For now,
you can ignore them.
@item flags.h
Macros that define a few bits in the 80@var{x}86 ``flags'' register.
Probably of no interest. See @bibref{IA32-v1}, section 3.4.3, ``EFLAGS
Register,'' for more information.
@end table
@node lib files
@subsubsection @file{lib} files
Finally, @file{lib} and @file{lib/kernel} contain useful library
routines. (@file{lib/user} will be used by user programs, starting in
task 2, but it is not part of the kernel.) Here's a few more
details:
@table @file
@item ctype.h
@itemx inttypes.h
@itemx limits.h
@itemx stdarg.h
@itemx stdbool.h
@itemx stddef.h
@itemx stdint.h
@itemx stdio.c
@itemx stdio.h
@itemx stdlib.c
@itemx stdlib.h
@itemx string.c
@itemx string.h
A subset of the standard C library. @xref{C99}, for
information
on a few recently introduced pieces of the C library that you might
not have encountered before. @xref{Unsafe String Functions}, for
information on what's been intentionally left out for safety.
@item debug.c
@itemx debug.h
Functions and macros to aid debugging. @xref{Debugging Tools}, for
more information.
@item random.c
@itemx random.h
Pseudo-random number generator. The actual sequence of random values
may vary from one PintOS run to another.
@item round.h
Macros for rounding.
@item syscall-nr.h
System call numbers. Not used until task 2.
@item kernel/list.c
@itemx kernel/list.h
Doubly linked list implementation. Used all over the PintOS code, and
you'll probably want to use it a few places yourself in task 0 and task 1.
@item kernel/bitmap.c
@itemx kernel/bitmap.h
Bitmap implementation. You can use this in your code if you like, but
you probably won't have any need for it in task 0 or task 1.
@item kernel/hash.c
@itemx kernel/hash.h
Hash table implementation. Likely to come in handy for task 3.
@item kernel/console.c
@itemx kernel/console.h
@item kernel/stdio.h
Implements @func{printf} and a few other functions.
@end table
@node Task 1 Synchronization
@subsection Synchronization
Proper synchronization is an important part of the solutions to these
problems. Any synchronization problem can be easily solved by turning
interrupts off: while interrupts are off, there is no concurrency, so
there's no possibility for race conditions. Therefore, it's tempting to
solve all synchronization problems this way, but @strong{don't}.
Instead, use semaphores, locks, and condition variables to solve the
bulk of your synchronization problems. Read the tour section on
synchronization (@pxref{Synchronization}) or the comments in
@file{threads/synch.c} if you're unsure what synchronization primitives
may be used in what situations.
In the PintOS tasks, the only class of problem best solved by
disabling interrupts is coordinating data shared between a kernel thread
and an interrupt handler. Because interrupt handlers can't sleep, they
can't acquire locks. This means that data shared between kernel threads
and an interrupt handler must be protected within a kernel thread by
turning off interrupts.
This task only requires accessing a little bit of thread state from
interrupt handlers. For the alarm clock, the timer interrupt needs to
wake up sleeping threads. Later, in task 1, the advanced scheduler timer
interrupts will need to access a few global and per-thread variables. When
you access these variables from kernel threads, you will need to disable
interrupts to prevent the timer interrupts from interfering with one-another.
When you do turn off interrupts, take care to do so for the least amount
of code possible, or you can end up losing important things such as
timer ticks or input events. Turning off interrupts also increases the
interrupt handling latency, which can make a machine feel sluggish if
taken too far.
The synchronization primitives themselves in @file{synch.c} are
implemented by disabling interrupts. You may need to increase the
amount of code that runs with interrupts disabled here, but you should
still try to keep it to a minimum.
Disabling interrupts can be useful for debugging, if you want to make
sure that a section of code is not interrupted. You should remove
debugging code before turning in your task. (Don't just comment it
out, because that can make the code difficult to read.)
There should be @strong{no} busy waiting in your submission. A tight loop that
calls @func{thread_yield} is one form of busy waiting.
@page
@node Task 0 Requirements
@section Requirements
@menu
* Codebase Preview::
* Alarm Clock Design Document::
* Alarm Clock::
* FAQ::
@end menu
@node Codebase Preview
@subsection Codebase Preview
@menu
* Source Files::
* Questions::
@end menu
For answering the MCQ AnswerBook test questions in the codebase preview you will be expected to have fully read:
@itemize
@item Section 1
@item Section 2.1.1 and 2.1.3
@item Sections A.2-4
@item Sections C, D, E and F
@end itemize
@node Source Files
@subsubsection Source Files
The source files you will have to fully understand:
@table @file
@item src/threads/thread.c
Contains bulk of threading system code
@item src/threads/thread.h
Header file for threads, contains thread struct
@item src/threads/synch.c
Contains the implementation of major synchronisation primitives like
locks and semaphores
@item src/lib/kernel/list.c
Contains PintOS' list implementation
@end table
@node Questions
@subsubsection Task 0 Questions
@include task0_questions.texi
@node Alarm Clock Design Document
@subsection Design Document
When you submit your work for task 0, you must also submit a completed copy of
@uref{devices.tmpl, , the task 0 design document}.
You can find a template design document for this task in @file{pintos/doc/devices.tmpl} and also on Scientia.
You must submit your design document as a @file{.pdf} file.
We recommend that you read the design document template before you start working on the task.
@xref{Task Documentation}, for a sample design document that goes along with a fictitious task.
@node Alarm Clock
@subsection Coding the Alarm Clock
Reimplement @func{timer_sleep}, defined in @file{devices/timer.c}.
Although a working implementation is provided, it ``busy waits,'' that
is, it spins in a loop checking the current time and calling
@func{thread_yield} until enough time has gone by. Reimplement it to
avoid busy waiting.
@deftypefun void timer_sleep (int64_t @var{ticks})
Suspends execution of the calling thread until time has advanced by at
least @w{@var{x} timer ticks}. Unless the system is otherwise idle, the
thread need not wake up after exactly @var{x} ticks. Just put it on
the ready queue after they have waited for the right amount of time.
@func{timer_sleep} is useful for threads that operate in real-time,
e.g.@: for blinking the cursor once per second.
The argument to @func{timer_sleep} is expressed in timer ticks, not in
milliseconds or any another unit. There are @code{TIMER_FREQ} timer
ticks per second, where @code{TIMER_FREQ} is a macro defined in
@code{devices/timer.h}. The default value is 100. We don't recommend
changing this value, because any change is likely to cause many of
the tests to fail.
@end deftypefun
Separate functions @func{timer_msleep}, @func{timer_usleep}, and
@func{timer_nsleep} do exist for sleeping a specific number of
milliseconds, microseconds, or nanoseconds, respectively, but these will
call @func{timer_sleep} automatically when necessary. You do not need
to modify them.
If your delays seem too short or too long, reread the explanation of the
@option{-r} option to @command{pintos} (@pxref{Debugging versus
Testing}).
The alarm clock implementation is needed for Task 1, but is not needed for any later tasks.
@node FAQ
@subsection FAQ
@table @b
@item How much code will I need to write?
Here's a summary of our reference solution, produced by the
@command{diffstat} program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.
@verbatim
devices/timer.c | 40 ++++++++++++++++++++++++++++++++++++++--
devices/timer.h | 9 +++++++++
2 files changed, 47 insertions(+), 2 deletions(-)
@end verbatim
The reference solution represents just one possible solution. Many
other solutions are also possible and many of those differ greatly from
the reference solution. Some excellent solutions may not modify all the
files modified by the reference solution, and some may modify files not
modified by the reference solution.
@item What does @code{warning: no previous prototype for `@var{func}'} mean?
It means that you defined a non-@code{static} function without
preceding it by a prototype. Because non-@code{static} functions are
intended for use by other @file{.c} files, for safety they should be
prototyped in a header file included before their definition. To fix
the problem, add a prototype in a header file that you include, or, if
the function isn't actually used by other @file{.c} files, make it
@code{static}.
@item What is the interval between timer interrupts?
Timer interrupts occur @code{TIMER_FREQ} times per second. You can
adjust this value by editing @file{devices/timer.h}. The default is
100 Hz.
We don't recommend changing this value, because any changes are likely
to cause many of the tests to fail.
@item How long is a time slice?
There are @code{TIME_SLICE} ticks per time slice. This macro is
declared in @file{threads/thread.c}. The default is 4 ticks.
We don't recommend changing this value, because any changes are likely
to cause many of the tests to fail.
@item How do I run the tests?
@xref{Testing}.
@item Why do I get a test failure in @func{pass}?
@xref{The pass function fails}.
You are probably looking at a backtrace that looks something like this:
@example
0xc0108810: debug_panic (lib/kernel/debug.c:32)
0xc010a99f: pass (tests/threads/tests.c:93)
0xc010bdd3: test_mlfqs_load_1 (...threads/mlfqs-load-1.c:33)
0xc010a8cf: run_test (tests/threads/tests.c:51)
0xc0100452: run_task (threads/init.c:283)
0xc0100536: run_actions (threads/init.c:333)
0xc01000bb: main (threads/init.c:137)
@end example
This is just confusing output from the @command{backtrace} program. It
does not actually mean that @func{pass} called @func{debug_panic}. In
fact, @func{fail} called @func{debug_panic} (via the @func{PANIC}
macro). GCC knows that @func{debug_panic} does not return, because it
is declared @code{NO_RETURN} (@pxref{Function and Parameter
Attributes}), so it doesn't include any code in @func{fail} to take
control when @func{debug_panic} returns. This means that the return
address on the stack looks like it is at the beginning of the function
that happens to follow @func{fail} in memory, which in this case happens
to be @func{pass}.
@xref{Backtraces}, for more information.
@item How do interrupts get re-enabled in the new thread following @func{schedule}?
Every path into @func{schedule} disables interrupts. They eventually
get re-enabled by the next thread to be scheduled. Consider the
possibilities: the new thread is running in @func{switch_thread} (but
see below), which is called by @func{schedule}, which is called by one
of a few possible functions:
@itemize @bullet
@item
@func{thread_exit}, but we'll never switch back into such a thread, so
it's uninteresting.
@item
@func{thread_yield}, which immediately restores the interrupt level upon
return from @func{schedule}.
@item
@func{thread_block}, which is called from multiple places:
@itemize @minus
@item
@func{sema_down}, which restores the interrupt level before returning.
@item
@func{idle}, which enables interrupts with an explicit assembly STI
instruction.
@item
@func{wait} in @file{devices/intq.c}, whose callers are responsible for
re-enabling interrupts.
@end itemize
@end itemize
There is a special case when a newly created thread runs for the first
time. Such a thread calls @func{intr_enable} as the first action in
@func{kernel_thread}, which is at the bottom of the call stack for every
kernel thread but the first.
@item Do I need to account for timer values overflowing?
Don't worry about the possibility of timer values overflowing. Timer
values are expressed as signed 64-bit numbers, which at 100 ticks per
second should be good for almost 2,924,712,087 years. By then, we
expect PintOS to have been phased out of the @value{coursenumber} curriculum.
@item What should I expect from the Task 0 code-review?
The code-review for this task will be conducted offline, as it would be logistically
impossible to arrange face-to-face sessions with the whole cohort.
Our Task 0 code-review will cover @strong{four} main areas:
functional correctness, efficiency, design quality and general coding style.
@itemize @bullet
@item For @strong{functional correctness}, we will be looking to see if your solution can handle many
threads going to sleep or waking-up at the same time, without any unnecessary delays.
We will also be checking if your code for @func{timer_sleep} and @func{timer_interrupt} is free of any race conditions.
@item For @strong{efficiency}, we will be looking at what steps you have taken to minimise the time spent
inside your timer interrupt handler. Think about how you store sleeping threads and track
how long they must sleep for. We will also be looking at your use of memory.
@item For @strong{design quality}, we will be looking at how your have integrated your alarm-clock code with
the rest of the provided operating system. We want to see clear module boundaries and use of abstraction.
@item For @strong{general coding style}, we will be paying attention to all of the usual elements of good style
that you should be used to from last year (e.g. consistent code layout, appropriate use of comments, avoiding magic numbers, etc.)
as well as your use of git (e.g. commit frequency and commit message quality).
@end itemize
@end table