commit 8724a2641e
Author: LabTS
Date:   2024-10-01 23:37:39 +01:00

    provided code

697 changed files with 74252 additions and 0 deletions

doc/.gitignore vendored Normal file

@@ -0,0 +1,26 @@
*.aux
*.cp
*.dvi
*.fn
*.info*
*.ky
*.log
*.pg
*.toc
*.tp
*.vr
/pintos-ic.fns
/pintos-ic.tps
/pintos-ic.vrs
mlfqs1.pdf
mlfqs1.png
mlfqs2.pdf
mlfqs2.png
pintos-ic.html
pintos-ic.pdf
pintos-ic.ps
pintos.text
pintos-ic_*.html
projects.html
sample.tmpl.texi
task0_sheet.pdf

doc/44bsd.texi Normal file

@@ -0,0 +1,403 @@
@node 4.4BSD Scheduler
@appendix 4.4@acronym{BSD} Scheduler
@iftex
@macro tm{TEX}
@math{\TEX\}
@end macro
@macro nm{TXT}
@end macro
@macro am{TEX, TXT}
@math{\TEX\}
@end macro
@end iftex
@ifnottex
@macro tm{TEX}
@end macro
@macro nm{TXT}
@w{\TXT\}
@end macro
@macro am{TEX, TXT}
@w{\TXT\}
@end macro
@end ifnottex
@ifhtml
@macro math{TXT}
\TXT\
@end macro
@end ifhtml
@macro m{MATH}
@am{\MATH\, \MATH\}
@end macro
The goal of a general-purpose scheduler is to balance threads' different
scheduling needs. Threads that perform a lot of I/O require a fast
response time to keep input and output devices busy, but need little CPU
time. On the other hand, CPU-bound threads need to receive a lot of
CPU time to finish their work, but have no requirement for fast response
time. Other threads lie somewhere in between, with periods of I/O
punctuated by periods of computation, and thus have requirements that
vary over time. A well-designed scheduler can often accommodate threads
with all these requirements simultaneously.
For task 1, you must implement the scheduler described in this
appendix. Our scheduler resembles the one described in @bibref{McKusick},
which is one example of a @dfn{multilevel feedback queue} scheduler.
This type of scheduler maintains several queues of ready-to-run threads,
where each queue holds threads with a different priority. At any given
time, the scheduler chooses a thread from the highest-priority non-empty
queue. If the highest-priority queue contains multiple threads, then
they run in ``round robin'' order.
Multiple facets of the scheduler require data to be updated after a
certain number of timer ticks. In every case, these updates should
occur before any ordinary kernel thread has a chance to run, so that
there is no chance that a kernel thread could see a newly increased
@func{timer_ticks} value but old scheduler data values.
The 4.4@acronym{BSD} scheduler does not include priority donation.
@menu
* Thread Niceness::
* Calculating Priority::
* Calculating recent_cpu::
* Calculating load_avg::
* 4.4BSD Scheduler Summary::
* Fixed-Point Real Arithmetic::
@end menu
@node Thread Niceness
@section Niceness
Thread priority is dynamically determined by the scheduler using a
formula given below. However, each thread also has an integer
@dfn{nice} value that determines how ``nice'' the thread should be to
other threads. A @var{nice} of zero does not affect thread priority. A
positive @var{nice}, to the maximum of 20, decreases the priority of a
thread and causes it to give up some CPU time it would otherwise receive.
On the other hand, a negative @var{nice}, to the minimum of -20, tends
to take away CPU time from other threads.
The initial thread starts with a @var{nice} value of zero. Other
threads start with a @var{nice} value inherited from their parent
thread. You must implement the functions described below, which are for
use by test programs. We have provided skeleton definitions for them in
@file{threads/thread.c}.
@deftypefun int thread_get_nice (void)
Returns the current thread's @var{nice} value.
@end deftypefun
@deftypefun void thread_set_nice (int @var{new_nice})
Sets the current thread's @var{nice} value to @var{new_nice} and
recalculates the thread's priority based on the new value
(@pxref{Calculating Priority}). If the running thread no longer has the
highest priority, yields.
@end deftypefun
@node Calculating Priority
@section Calculating Priority
Our scheduler has 64 priorities and thus 64 ready queues, numbered 0
(@code{PRI_MIN}) through 63 (@code{PRI_MAX}). Lower numbers correspond
to lower priorities, so that priority 0 is the lowest priority
and priority 63 is the highest. Thread priority is calculated initially
at thread initialization. It is also recalculated for each thread (if necessary)
on every fourth clock tick.
In either situation, it is determined by the formula:
@center @t{@var{priority} = @code{PRI_MAX} - (@var{recent_cpu} / 4) - (@var{nice} * 2)},
@noindent where @var{recent_cpu} is an estimate of the CPU time the
thread has used recently (see below) and @var{nice} is the thread's
@var{nice} value. The result should be rounded down to the nearest
integer (truncated).
The coefficients @math{1/4} and 2 on @var{recent_cpu}
and @var{nice}, respectively, have been found to work well in practice
but lack deeper meaning. The calculated @var{priority} is always
adjusted to lie in the valid range @code{PRI_MIN} to @code{PRI_MAX}.
This formula gives a thread that has received CPU time recently a lower
priority for being reassigned the CPU the next time the scheduler runs.
This is key to preventing starvation:
a thread that has not received any CPU time recently will have a
@var{recent_cpu} of 0, which, barring a high @var{nice} value, should
ensure that it receives CPU time soon.
This technique is sometimes referred to as ``aging'' in the literature.
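As an illustration only, the update might be sketched in C as follows,
assuming that @var{recent_cpu} is kept in the 17.14 fixed-point format
described later (@pxref{Fixed-Point Real Arithmetic}); the names
@code{F} and @code{calc_priority} are ours, not part of the supplied
PintOS sources:
@verbatim
#define PRI_MIN 0               /* Lowest priority. */
#define PRI_MAX 63              /* Highest priority. */
#define F (1 << 14)             /* 17.14 fixed-point scaling factor. */

/* Recomputes one thread's priority from its fixed-point recent_cpu
   and its integer nice value, clamping to the valid range. */
static int
calc_priority (int recent_cpu, int nice)
{
  /* recent_cpu / 4 is still in fixed point; dividing by F then
     truncates it to an integer. */
  int priority = PRI_MAX - (recent_cpu / 4) / F - nice * 2;

  if (priority < PRI_MIN)
    priority = PRI_MIN;
  else if (priority > PRI_MAX)
    priority = PRI_MAX;
  return priority;
}
@end verbatim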
@node Calculating recent_cpu
@section Calculating @var{recent_cpu}
We wish @var{recent_cpu} to measure how much CPU time each process has
received ``recently.'' Furthermore, as a refinement, more recent CPU
time should be weighted more heavily than less recent CPU time. One
approach would use an array of @var{n} elements to
track the CPU time received in each of the last @var{n} seconds.
However, this approach requires O(@var{n}) space per thread and
O(@var{n}) time per calculation of a new weighted average.
Instead, we use an @dfn{exponentially weighted moving average}, which
takes this general form:
@center @tm{x(0) = f(0),}@nm{x(0) = f(0),}
@center @tm{x(t) = ax(t-1) + (1-a)f(t),}@nm{x(t) = a*x(t-1) + (1-a)*f(t),}
@center @tm{a = k/(k+1),}@nm{a = k/(k+1),}
@noindent where @math{x(t)} is the moving average at integer time @am{t
\ge 0, t >= 0}, @math{f(t)} is the function being averaged, and @math{k
> 0} controls the rate of decay. We can iterate the formula over a few
steps as follows:
@center @math{x(0) = f(0)},
@center @am{x(1) = af(0) + (1-a)f(1), x(1) = a*f(0) + (1-a)*f(1)},
@center @am{\vdots, ...}
@center @am{x(4) = a^4f(0) + a^3(1-a)f(1) + a^2(1-a)f(2) + a(1-a)f(3) + (1-a)f(4), x(4) = a**4*f(0) + a**3*(1-a)*f(1) + a**2*(1-a)*f(2) + a*(1-a)*f(3) + (1-a)*f(4)}.
@noindent The value of @math{f(t)} has a weight of @math{(1-a)} at time @math{t},
a weight of @math{a(1-a)} at time @math{t+1}, @am{a^2(1-a), a**2*(1-a)} at time
@math{t+2}, and so on. We can also relate @math{x(t)} to @math{k}:
@math{f(t)} has a weight of approximately @math{1/e} at time @math{t+k},
approximately @am{1/e^2, 1/e**2} at time @am{t+2k, t+2*k}, and so on.
From the opposite direction, @math{f(t)} decays to weight @math{w} at around
time @am{t + \log_aw, t + ln(w)/ln(a)}.
The initial value of @var{recent_cpu} is 0 in the first thread
created, or the parent's value in other new threads. Each time a timer
interrupt occurs, @var{recent_cpu} is incremented by 1 for the running
thread only, unless the idle thread is running. In addition, once per
second the value of @var{recent_cpu}
is recalculated for every thread (whether running, ready, or blocked),
using this formula:
@center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}},
@noindent where @var{load_avg} is a moving average of the number of
threads ready to run (see below). If @var{load_avg} is 1, indicating
that a single thread, on average, is competing for the CPU, then the
current value of @var{recent_cpu} decays to a weight of .1 in
@am{\log_{2/3}.1 \approx 6, ln(.1)/ln(2/3) = approx. 6} seconds; if
@var{load_avg} is 2, then decay to a weight of .1 takes @am{\log_{3/4}.1
\approx 8, ln(.1)/ln(3/4) = approx. 8} seconds. The effect is that
@var{recent_cpu} estimates the amount of CPU time the thread has
received ``recently,'' with the rate of decay inversely proportional to
the number of threads competing for the CPU.
Assumptions made by some of the tests require that these recalculations of
@var{recent_cpu} be made exactly when the system tick counter reaches a
multiple of a second, that is, when @code{timer_ticks () % TIMER_FREQ ==
0}, and not at any other time.
The value of @var{recent_cpu} can be negative for a thread with a
negative @var{nice} value. Do not clamp negative @var{recent_cpu} to 0.
You may need to think about the order of calculations in this formula.
We recommend computing the coefficient of @var{recent_cpu} first, then
multiplying. In the past, some students have reported that multiplying
@var{load_avg} by @var{recent_cpu} directly can cause overflow.
You must implement @func{thread_get_recent_cpu}, for which there is a
skeleton in @file{threads/thread.c}.
@deftypefun int thread_get_recent_cpu (void)
Returns 100 times the current thread's @var{recent_cpu} value, rounded
to the nearest integer.
@end deftypefun
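To make the ordering advice above concrete, here is one possible sketch
of the once-per-second update for a single thread, computing the
coefficient first and using 64-bit intermediates; @code{F} and
@code{update_recent_cpu} are illustrative names only:
@verbatim
#include <stdint.h>

#define F (1 << 14)   /* 17.14 fixed-point scaling factor. */

/* recent_cpu and load_avg are 17.14 fixed-point values;
   nice is a plain integer. */
static int
update_recent_cpu (int recent_cpu, int load_avg, int nice)
{
  /* Coefficient (2*load_avg)/(2*load_avg + 1) in fixed point,
     computed before touching recent_cpu to avoid overflow. */
  int coeff = (int) (((int64_t) 2 * load_avg) * F / (2 * load_avg + F));

  /* Fixed-point multiply, then add nice converted to fixed point. */
  return (int) (((int64_t) coeff) * recent_cpu / F) + nice * F;
}
@end verbatim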
@node Calculating load_avg
@section Calculating @var{load_avg}
Finally, @var{load_avg}, often known as the system load average,
estimates the average number of threads ready to run over the past
minute. Like @var{recent_cpu}, it is an exponentially weighted moving
average. Unlike @var{priority} and @var{recent_cpu}, @var{load_avg} is
system-wide, not thread-specific. At system boot, it is initialized to
0. Once per second thereafter, it is updated according to the following
formula:
@center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}},
@noindent where @var{ready_threads} is the number of threads that are
either running or ready to run at time of update (not including the idle
thread).
Because of assumptions made by some of the tests, @var{load_avg} must be
updated exactly when the system tick counter reaches a multiple of a
second, that is, when @code{timer_ticks () % TIMER_FREQ == 0}, and not
at any other time.
You must implement @func{thread_get_load_avg}, for which there is a
skeleton in @file{threads/thread.c}.
@deftypefun int thread_get_load_avg (void)
Returns 100 times the current system load average, rounded to the
nearest integer.
@end deftypefun
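A corresponding sketch of the load average update, again using
illustrative names and the 17.14 fixed-point format:
@verbatim
#include <stdint.h>

#define F (1 << 14)   /* 17.14 fixed-point scaling factor. */

/* load_avg is a 17.14 fixed-point value; ready_threads counts the
   running and ready threads, excluding the idle thread. */
static int
update_load_avg (int load_avg, int ready_threads)
{
  int c59_60 = (int) (((int64_t) 59) * F / 60);   /* 59/60 = 16,110 */
  int c1_60 = F / 60;                             /*  1/60 =    273 */

  return (int) (((int64_t) c59_60) * load_avg / F) + c1_60 * ready_threads;
}
@end verbatim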
@node 4.4BSD Scheduler Summary
@section Summary
The following formulas summarize the calculations required to implement the
scheduler. They are not a complete description of the scheduler's requirements.
Every thread has a @var{nice} value between -20 and 20 directly under
its control. Each thread also has a priority, between 0
(@code{PRI_MIN}) through 63 (@code{PRI_MAX}), which is recalculated (as necessary)
using the following formula:
@center @t{@var{priority} = @code{PRI_MAX} - (@var{recent_cpu} / 4) - (@var{nice} * 2)}.
@var{recent_cpu} measures the amount of CPU time a thread has received
``recently.'' On each timer tick, the running thread's @var{recent_cpu}
is incremented by 1. Once per second, every thread's @var{recent_cpu}
is updated this way:
@center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}}.
@var{load_avg} estimates the average number of threads ready to run over
the past minute. It is initialized to 0 at boot and recalculated once
per second as follows:
@center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}},
@noindent where @var{ready_threads} is the number of threads that are
either running or ready to run at time of update (not including the idle
thread).
Note that it is important that each of these calculations is based on up-to-date data values.
That is, the calculation of each thread's @var{priority} should be based on the most recent @var{recent_cpu} value
and, similarly, the calculation of @var{recent_cpu} should itself be based on the most recent @var{load_avg} value.
You should take these dependencies into account when implementing these calculations.
You should also think about the efficiency of your calculations.
The more time your scheduler spends working on these calculations,
the less time your actual processes will have to run.
It is important, therefore, to only perform calculations when absolutely necessary.
@node Fixed-Point Real Arithmetic
@section Fixed-Point Real Arithmetic
In the formulas above, @var{priority}, @var{nice}, and
@var{ready_threads} are integers, but @var{recent_cpu} and @var{load_avg}
are real numbers. Unfortunately, PintOS does not support floating-point
arithmetic in the kernel, because it would
complicate and slow the kernel. Real kernels often have the same
limitation, for the same reason. This means that calculations on real
quantities must be simulated using integers. This is not
difficult, but many students do not know how to do it. This
section explains the basics.
The fundamental idea is to treat the rightmost bits of an integer as
representing a fraction. For example, we can designate the lowest 14
bits of a signed 32-bit integer as fractional bits, so that an integer
@m{x} represents the real number
@iftex
@m{x/2^{14}}.
@end iftex
@ifnottex
@m{x/(2**14)}, where ** represents exponentiation.
@end ifnottex
This is called a 17.14 fixed-point number representation, because there
are 17 bits before the decimal point, 14 bits after it, and one sign
bit.@footnote{Because we are working in binary, the ``decimal'' point
might more correctly be called the ``binary'' point, but the meaning
should be clear.} A number in 17.14 format represents, at maximum, a
value of @am{(2^{31} - 1) / 2^{14} \approx, (2**31 - 1)/(2**14) =
approx.} 131,071.999.
Suppose that we are using a @m{p.q} fixed-point format, and let @am{f =
2^q, f = 2**q}. By the definition above, we can convert an integer or
real number into @m{p.q} format by multiplying with @m{f}. For example,
in 17.14 format the fraction 59/60 used in the calculation of
@var{load_avg}, above, is @am{(59/60)2^{14}, 59/60*(2**14)} = 16,110.
To convert a fixed-point value back to an
integer, divide by @m{f}. (The normal @samp{/} operator in C rounds
toward zero, that is, it rounds positive numbers down and negative
numbers up. To round to nearest, add @m{f / 2} to a positive number, or
subtract it from a negative number, before dividing.)
Many operations on fixed-point numbers are straightforward. Let
@code{x} and @code{y} be fixed-point numbers, and let @code{n} be an
integer. Then the sum of @code{x} and @code{y} is @code{x + y} and
their difference is @code{x - y}. The sum of @code{x} and @code{n} is
@code{x + n * f}; difference, @code{x - n * f}; product, @code{x * n};
quotient, @code{x / n}.
Multiplying two fixed-point values has two complications. First, the
decimal point of the result is @m{q} bits too far to the left. Consider
that @am{(59/60)(59/60), (59/60)*(59/60)} should be slightly less than
1, but @tm{16,110\times 16,110}@nm{16,110*16,110} = 259,532,100 is much
greater than @am{2^{14},2**14} = 16,384. Shifting @m{q} bits right, we
get @tm{259,532,100/2^{14}}@nm{259,532,100/(2**14)} = 15,840, or about 0.97,
the correct answer. Second, the multiplication can overflow even though
the answer is representable. For example, 64 in 17.14 format is
@am{64 \times 2^{14}, 64*(2**14)} = 1,048,576 and its square @am{64^2,
64**2} = 4,096 is well within the 17.14 range, but @tm{1,048,576^2 =
2^{40}}@nm{1,048,576**2 = 2**40}, greater than the maximum signed 32-bit
integer value @am{2^{31} - 1, 2**31 - 1}. An easy solution is to do the
multiplication as a 64-bit operation. The product of @code{x} and
@code{y} is then @code{((int64_t) x) * y / f}.
Dividing two fixed-point values has opposite issues. The
decimal point will be too far to the right, which we fix by shifting the
dividend @m{q} bits to the left before the division. The left shift
discards the top @m{q} bits of the dividend, which we can again fix by
doing the division in 64 bits. Thus, the quotient when @code{x} is
divided by @code{y} is @code{((int64_t) x) * f / y}.
This section has consistently used multiplication or division by @m{f},
instead of @m{q}-bit shifts, for two reasons. First, multiplication and
division do not have the surprising operator precedence of the C shift
operators. Second, multiplication and division are well-defined on
negative operands, but the C shift operators are not. Take care with
these issues in your implementation.
The following table summarizes how fixed-point arithmetic operations can
be implemented in C. In the table, @code{x} and @code{y} are
fixed-point numbers, @code{n} is an integer, fixed-point numbers are in
signed @m{p.q} format where @m{p + q = 31}, and @code{f} is @code{1 <<
q}:
@html
<CENTER>
@end html
@multitable @columnfractions .5 .5
@item Convert @code{n} to fixed point:
@tab @code{n * f}
@item Convert @code{x} to integer (rounding toward zero):
@tab @code{x / f}
@item Convert @code{x} to integer (rounding to nearest):
@tab @code{(x + f / 2) / f} if @code{x >= 0}, @*
@code{(x - f / 2) / f} if @code{x <= 0}.
@item Add @code{x} and @code{y}:
@tab @code{x + y}
@item Subtract @code{y} from @code{x}:
@tab @code{x - y}
@item Add @code{x} and @code{n}:
@tab @code{x + n * f}
@item Subtract @code{n} from @code{x}:
@tab @code{x - n * f}
@item Multiply @code{x} by @code{y}:
@tab @code{((int64_t) x) * y / f}
@item Multiply @code{x} by @code{n}:
@tab @code{x * n}
@item Divide @code{x} by @code{y}:
@tab @code{((int64_t) x) * f / y}
@item Divide @code{x} by @code{n}:
@tab @code{x / n}
@end multitable
@html
</CENTER>
@end html
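For convenience, the whole table can be wrapped up as a small set of C
helpers. The sketch below assumes the 17.14 format used throughout
this appendix; none of these names appear in the supplied PintOS
sources, and you are free to structure your own fixed-point code
differently:
@verbatim
#include <stdint.h>

typedef int fixed_point;        /* Signed 17.14 fixed-point value. */
#define F (1 << 14)             /* f = 2**q with q = 14. */

static inline fixed_point int_to_fp (int n) { return n * F; }
static inline int fp_to_int_zero (fixed_point x) { return x / F; }
static inline int
fp_to_int_nearest (fixed_point x)
{
  return x >= 0 ? (x + F / 2) / F : (x - F / 2) / F;
}

static inline fixed_point fp_add (fixed_point x, fixed_point y) { return x + y; }
static inline fixed_point fp_sub (fixed_point x, fixed_point y) { return x - y; }
static inline fixed_point fp_add_int (fixed_point x, int n) { return x + n * F; }
static inline fixed_point fp_sub_int (fixed_point x, int n) { return x - n * F; }

static inline fixed_point
fp_mul (fixed_point x, fixed_point y)   /* 64-bit to avoid overflow. */
{
  return (fixed_point) (((int64_t) x) * y / F);
}
static inline fixed_point fp_mul_int (fixed_point x, int n) { return x * n; }

static inline fixed_point
fp_div (fixed_point x, fixed_point y)   /* 64-bit to keep the top bits. */
{
  return (fixed_point) (((int64_t) x) * F / y);
}
static inline fixed_point fp_div_int (fixed_point x, int n) { return x / n; }
@end verbatim
With helpers like these, the scheduler formulas above can be written
almost exactly as they appear in this appendix.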

doc/Makefile Normal file

@@ -0,0 +1,46 @@
TEXIS = pintos-ic.texi intro.texi codebase.texi threads.texi userprog.texi vm.texi \
	license.texi reference.texi 44bsd.texi standards.texi \
	doc.texi sample.tmpl.texi devel.texi debug.texi installation.texi \
	bibliography.texi localsettings.texi task0_questions.texi localgitinstructions.texi

all: pintos-ic.html pintos-ic.info pintos-ic.dvi pintos-ic.ps pintos-ic.pdf task0_sheet.pdf alarmclock.pdf

pintos-ic.html: $(TEXIS) texi2html
	./texi2html -toc_file=$@ -split=chapter -nosec_nav -nomenu -init_file ./pintos-t2h.init $<

pintos-ic.info: $(TEXIS)
	makeinfo $<

pintos-ic.text: $(TEXIS)
	makeinfo --plaintext -o $@ $<

pintos-ic.dvi: $(TEXIS)
	texi2dvi $< -o $@

pintos-ic.ps: pintos-ic.dvi
	dvips $< -o $@

pintos-ic.pdf: $(TEXIS)
	texi2pdf $< -o $@

task0_sheet.pdf: task0_sheet.texi task0_questions.texi
	texi2pdf $< -o $@

alarmclock.pdf: alarmclock.tex
	pdflatex -shell-escape alarmclock

%.texi: %
	sed < $< > $@ 's/\([{}@]\)/\@\1/g;'

clean:
	rm -f *.info* *.html
	rm -f *.aux *.cp *.dvi *.fn *.fns *.ky *.log *.pdf *.ps *.toc *.tp *.tps *.vr *.vrs *~
	rm -rf WWW
	rm -f sample.tmpl.texi
	rm -f alarmclock.pdf

dist: pintos-ic.html pintos-ic.pdf
	rm -rf WWW
	mkdir WWW WWW/specs
	cp *.html *.pdf *.css *.tmpl WWW
	(cd ../specs && cp -r *.pdf freevga kbd sysv-abi-update.html ../doc/WWW/specs)

doc/alarmclock.tex Normal file

@@ -0,0 +1,385 @@
\documentclass[a4paper,11pt]{article}
\setcounter{tocdepth}{3}
\usepackage[margin=1in]{geometry}
\usepackage{amsthm}
\usepackage{url}
\usepackage{microtype}
\usepackage{xcolor}
\usepackage{minted}
\usepackage[tt=false, type1=true]{libertine}
\usepackage[libertine]{newtxmath}
\usepackage[scaled=0.8, lining]{FiraMono}
\usepackage[T1]{fontenc}
\usepackage{setspace}
\usepackage{hyperref}
\setstretch{1.15}
\definecolor[named]{ACMPurple}{cmyk}{0.55,1,0,0.15}
\definecolor[named]{ACMDarkBlue}{cmyk}{1,0.58,0,0.21}
\hypersetup{colorlinks,
linkcolor=ACMPurple,
citecolor=ACMPurple,
urlcolor=ACMDarkBlue,
filecolor=ACMDarkBlue}
\usemintedstyle{xcode}
%inline code styling
\newmintinline[shell]{shell}{fontsize=\normalsize, breaklines}
\newmintinline[asm]{asm}{fontsize=\normalsize, breaklines}
% For illustrations
\usepackage{tikz}
%% Comments
\newif\ifcomment
% Comment this line to remove the comments
\commenttrue
\newcommand{\genericcomment}[2]{
\ifcomment
\begin{center}
\fbox{
\begin{minipage}{4in}
{\bf {#2}'s comment:} {\it #1}
\end{minipage}}
\end{center}
\fi}
\newcommand{\boxit}[1]{
\begin{center}
\fbox{
\begin{minipage}{6in}
#1
\end{minipage}
}
\end{center}
}
\newcommand{\markcomment}[1]{
\genericcomment{#1}{Mark}}
\begin{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\title{PintOS Task 0 - Codebase Preview}
\date{}
\author{
COMP50007.1 - Laboratory 2 \\
Department of Computing \\
Imperial College London
}
\maketitle
%%%%%%%%%%%%%%%%%%%%%
\section*{Summary}
%%%%%%%%%%%%%%%%%%%%%
This task is divided into two parts: a codebase preview and a small coding exercise.
The codebase preview has been designed to help you familiarise yourself with how PintOS is structured
and requires you to complete a short MCQ AnswerBook assessment to check your understanding of the provided PintOS code.
The coding exercise has been designed to help you understand how PintOS works
and is concerned with developing a simple feature in PintOS, called Alarm Clock.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Submit by 19:00 on Wednesday 9th October 2024}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{What To Do:}
%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection*{Getting the files required for the exercise}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
You have each been provided with a Git repository on the department's \shell{GitLab} server that contains the files required for this exercise.
To obtain this skeleton repository you will need to clone it into your local workspace.
You can do this with the following command:
%
\begin{minted}{shell}
prompt> git clone https://gitlab.doc.ic.ac.uk/lab2425_autumn/pintos_task0_<login>.git
\end{minted}
%
replacing \shell{<login>} with your normal college login.
You will be prompted for your normal college username and password.
You can also clone the skeleton repository via SSH (and avoid having to type in your username/password for every future clone, pull and push) if you have set up the required public/private keys on GitLab with the command:
%
\begin{minted}{shell}
prompt> git clone git@gitlab.doc.ic.ac.uk:lab2425_autumn/pintos_task0_<login>.git
\end{minted}
%
again, replacing \shell{<login>} with your normal college login.
Please feel free to ask a member of the lab support team for help with this if you want to access \shell{GitLab} via SSH but are unsure of how to set it up.
Using either of these commands will create a directory in your current location called \shell{pintos_task0_<login>}.
For more details about the contents of this repository see section 1.1.1 of the PintOS manual.
This is generally the way that we will hand out all lab exercises this year, so you should ensure that you are comfortable with the process.
\subsection*{Finding out about PintOS}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Everything that you need to know for the whole PintOS project can be found in the PintOS manual,
so it is a good idea to read it all eventually.
However, for the purposes of this codebase preview it should be sufficient that you carefully read sections 1 and 2
as well as appendices A, C, D and E.
For some of the MCQ AnswerBook questions, examining the PintOS code-base will also be useful,
particularly \shell{thread.c}, \shell{thread.h} and \shell{synch} in the \shell{src/threads/} directory
and \shell{list.c} in the \shell{src/lib/kernel/} directory.\\
\noindent You can find additional guidance on this Task in section 2 of the PintOS manual: ``Task 0: Alarm Clock''
\subsection*{Working on PintOS}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
You should work on the files in your local workspace, making regular commits and pushes back to your \shell{GitLab} Git repository.
Recall that you will first need to add any new/modified files to your local Git workspace with:
%
\begin{minted}{shell}
prompt> git add <filename>
\end{minted}
%
You can then commit your changes to your local index with:
%
\begin{minted}{shell}
prompt> git commit -m "your *meaningful* commit message here"
\end{minted}
%
Finally you will need to push these changes from your local index to the Git repository with:
%
\begin{minted}{shell}
prompt> git push origin master
\end{minted}
%
You can check that a push succeeded by looking at the state of your repository using the \shell{GitLab} webpages:
\url{https://gitlab.doc.ic.ac.uk/}
\noindent (you will need to log in with your normal college username and password).
You are of course free to utilise the more advanced features of Git such as branching and tagging.
Further details can be found in your first year notes and at:
\url{https://workspace.imperial.ac.uk/computing/Public/files/Git-Intro.pdf}.\\
{\bf Important:} Your final submission will be taken from your \shell{pintos_task0_<login>} \shell{GitLab} repository,
so you must understand how to push your work to it correctly.
If in any doubt, come and get help from the TF office (room 306) or during one of the lab sessions.
It is {\bf your} responsibility to ensure that you submit the correct version of your work.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Part A - Codebase Preview}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
In this part of the task you will be required to answer a series of MCQs (Multiple Choice Questions) that test your understanding of the basic PintOS concepts and the provided PintOS code-base.
If you have completed the pre-reading suggested above, then you should not find the MCQ AnswerBook test particularly challenging.
The MCQ AnswerBook test will be scheduled on Scientia, and the questions will be based on the following areas of PintOS.
The test will be open-book, so you are advised to answer these questions yourself ahead of time.
\paragraph{Part 1:}
Which Git command should you run to retrieve a copy of your individual repository for PintOS Task 0 in your local directory? \\
(\textit{Hint: be specific to this task and think about ease of use.})
\paragraph{Part 2:}
Why is using the {\tt strcpy()} function to copy strings usually a bad idea? \\
(\textit{Hint: be sure to clearly identify the problem.})
\paragraph{Part 3:}
If test \shell{src/tests/devices/alarm-multiple} fails, where would you find its output and result logs? \\
Provide both paths and filenames. \\
(\textit{Hint: you might want to run this test and find out.})
\paragraph{Part 4:}
In PintOS, a thread is characterized by a struct and an execution stack. \\
(a) What are the limitations on the size of these data structures? \\
(b) Explain how this relates to stack overflow and how PintOS identifies if a stack overflow has occurred.
\paragraph{Part 5:}
Explain how thread scheduling in PintOS currently works in roughly 300 words.
Include the chain of execution of function calls. \\
(\textit{Hint: we expect you to at least mention which functions participate in a context switch, how they interact, how and when the thread state is modified and the role of interrupts.})
\paragraph{Part 6:}
In PintOS, what is the default length (in ticks \emph{and} in seconds) of a scheduler time slice? \\
(\textit{Hint: read the Task 0 documentation carefully.})
\paragraph{Part 7:}
In PintOS, how would you print an unsigned 64 bit \shell{int}?
(Consider that you are working with C99). \\
Don't forget to state any inclusions needed by your code.
\paragraph{Part 8:}
Explain the property of {\bf reproducibility} and how the lack of reproducibility will affect debugging.
\paragraph{Part 9:}
In PintOS, locks are implemented on top of semaphores.\\
(a) How do the functions in the API of locks relate to those of semaphores?\\
(b) What extra property do locks have that semaphores do not?
\paragraph{Part 10:}
Define what is meant by a {\bf race-condition}. Why is the test \shell{if (x != null)}
insufficient to prevent a segmentation fault from occurring on an attempted access to a structure through pointer \shell{x}?\\
(\textit{Hint: you should assume that the pointer variable is correctly typed, that the structure was successfully initialised earlier in the program
and that there are other threads running in parallel.})
\pagebreak
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Part B - The Alarm Clock}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
In this part, you are required to implement a simple functionality in PintOS and to answer the design document questions listed below.
\subsection*{Coding the Alarm Clock in PintOS}
Reimplement \shell{timer_sleep()}, defined in \shell{devices/timer.c}.\\
\noindent Although a working implementation of \shell{timer_sleep()} is provided, it ``busy waits,'' that is,
it spins in a loop checking the current time and calling \shell{thread_yield()} until enough time has gone by.
You need to reimplement it to avoid busy waiting.
Further instructions and hints can be found in the PintOS manual.\\
\noindent The marks for this part are awarded as follows:
Passing the Automated Tests ({\bf 8 marks}).
Performance in the Code Review ({\bf 12 marks}).
Answering the Design Document Questions below ({\bf 10 marks}).
\subsection*{Task 0 Design Document Questions:}
\subsubsection*{Data Structures}
A1: ({\bf 2 marks}) \\
Copy here the declaration of each new or changed `\shell{struct}' or `\shell{struct}' member,
global or static variable, `\shell{typedef}', or enumeration.
Identify the purpose of each in roughly 25 words.
\subsubsection*{Algorithms}
A2: ({\bf 2 marks}) \\
Briefly describe what happens in a call to \shell{timer_sleep()}, including the actions performed by the timer interrupt handler on each timer tick. \\
\noindent A3: ({\bf 2 marks}) \\
What steps are taken to minimize the amount of time spent in the timer interrupt handler?
\subsubsection*{Synchronization}
A4: ({\bf 1 mark}) \\
How are race conditions avoided when multiple threads call \shell{timer_sleep()} simultaneously? \\
\noindent A5: ({\bf 1 mark}) \\
How are race conditions avoided when a timer interrupt occurs during a call to \shell{timer_sleep()}?
\subsubsection*{Rationale}
A6: ({\bf 2 marks}) \\
Why did you choose this design? \\
In what ways is it superior to another design you considered?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Testing}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
As you work, you should \emph{add}, \emph{commit} and \emph{push} your changes to your Git repository, as discussed above.
You should also be carefully testing your work throughout the exercise.
You should be used to regularly testing your code locally on your development machine,
but to help you ensure that your code will compile and run as expected in our testing environment,
we have provided you with the Lab Testing Service: \shell{LabTS}.
\shell{LabTS} will clone your \shell{GitLab} repository and run several automated test processes over your work.
This will happen automatically after the deadline, but can also be requested during the course of the exercise (usually on a sub-set of the final tests).
You can access the \shell{LabTS} webpages at:
\url{https://teaching.doc.ic.ac.uk/labts}
\noindent (note that you will be required to log in with your normal college username and password).
If you click through to your \shell{pintos_task0_<login>} repository you will see a list of the different versions of your work that you have pushed.
Next to each commit you will see a button that will allow you to request that this version of your work is run through the automated test process.
If you click this button your work will be tested (this may take a few minutes) and the results will appear in the relevant column.\\
{\bf Important:} It is {\bf your} responsibility to ensure that your code behaves as expected in our automated test environment.
Code that fails to compile/run in this environment will score {\bf zero marks} for implementation correctness.
You should find that this environment behaves like the set-up found on our lab machines.
If you are experiencing any problems in this regard then you should seek help from a lab demonstrator or the lab coordinator at the earliest opportunity.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Submission}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Your \shell{GitLab} repository should contain the final submission for your alarm clock implementation.
\shell{LabTS} can be used to test any revision of your work that you wish.
However, you will still need to submit a \emph{revision id} to Scientia so that we know which version of your code you consider to be your final submission.
Prior to submission, you should check the state of your \shell{GitLab} repository using the \shell{LabTS} webpages:
\url{https://teaching.doc.ic.ac.uk/labts}
\noindent If you click through to your \shell{pintos_task0_<login>} repository you will see a list of the different versions of your work that you have pushed.
Next to each commit you will see a link to that commit on \shell{GitLab} as well as a button to submit that version of your code for assessment.
Pressing this button will redirect you to Scientia (automatically submitting your revision id)
and prompt you to upload an answers file and a design document under the usual ``original work'' disclaimer.
You should submit to Scientia the version of your code that you consider to be ``final''.
You can change this later by submitting a different version to Scientia as usual.
The submission button on LabTS will be replaced with a green confirmation label if the submission has been successful.
You should submit your Task 0 design document (\shell{designT0.pdf}) and the chosen version of your code to Scientia by 19:00 on Wednesday 9th October 2024.\\
%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Assessment}
%%%%%%%%%%%%%%%%%%%%%%%%%%%
In total there are {\bf 50 marks} available in this exercise.\\
These are allocated as follows:
%
\begin{center}
\begin{tabular}{l@{\qquad\qquad}l}
Part A: MCQ Answerbook Test & {\bf 20 marks} \\
Part B: Automated Tests & {\bf 8 marks} \\
Part B: Code Review & {\bf 12 marks} \\
Part B: Design Document & {\bf 10 marks} \\
\end{tabular}
\end{center}
%
Any program that does not compile and run will score {\bf 0 marks} for Part B: Automated Tests.\\[-0.8em]
\noindent The marks for Part A will contribute to your COMP50004 Operating Systems coursework grade,
while your marks for Part B will contribute to your COMP50007.1 Laboratory 2 grade.\\
\noindent \textbf{We aim for feedback on this exercise to be returned by Wednesday 25th October 2023.}
\subsubsection*{What should I expect from the Task 0 code-review?}
The code-review for this task will be conducted offline, as it would be logistically
impossible to arrange face-to-face sessions with the whole cohort.
Our Task 0 code-review will cover \textbf{four} main areas:
functional correctness, efficiency, design quality and general coding style.
\begin{itemize}
\item For \textbf{functional correctness}, we will be looking to see if your solution can handle many threads going to sleep or waking-up at the same time, without any unnecessary delays.
We will also be checking if your code for \shell{timer_sleep} and \shell{timer_interrupt} is free of any race conditions.
\item For \textbf{efficiency}, we will be looking at what steps you have taken to minimise the time spent inside your timer interrupt handler. Think about how you store sleeping threads and track how long they must sleep for. We will also be looking at your use of memory.
\item For \textbf{design quality}, we will be looking at how you have integrated your alarm-clock code with the rest of the provided operating system. We want to see clear module boundaries and use of abstraction.
\item For \textbf{general coding style}, we will be paying attention to all of the usual elements of good style that you should be used to from last year (e.g. code layout, appropriate use of comments, avoiding magic numbers, etc.) as well as your use of git (e.g. commit frequency and commit message quality).
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}

doc/bibliography.texi Normal file

@@ -0,0 +1,154 @@
@node Bibliography
@appendix Bibliography
@macro bibdfn{cite}
@noindent @anchor{\cite\}
[\cite\].@w{ }
@end macro
@menu
* Hardware References::
* Software References::
* Operating System Design References::
@end menu
@node Hardware References
@section Hardware References
@bibdfn{IA32-v1}
IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic
Architecture. Basic 80@var{x}86 architecture and programming
environment. Available via @uref{developer.intel.com}. Section numbers
in this document refer to revision 18.
@bibdfn{IA32-v2a}
IA-32 Intel Architecture Software Developer's Manual
Volume 2A: Instruction Set Reference A-M. 80@var{x}86 instructions
whose names begin with A through M. Available via
@uref{developer.intel.com}. Section numbers in this document refer to
revision 18.
@bibdfn{IA32-v2b}
IA-32 Intel Architecture Software Developer's Manual Volume 2B:
Instruction Set Reference N-Z. 80@var{x}86 instructions whose names
begin with N through Z. Available via @uref{developer.intel.com}.
Section numbers in this document refer to revision 18.
@bibdfn{IA32-v3a}
IA-32 Intel Architecture Software Developer's Manual Volume 3A: System
Programming Guide. Operating system support, including segmentation,
paging, tasks, interrupt and exception handling. Available via
@uref{developer.intel.com}. Section numbers in this document refer to
revision 18.
@bibdfn{FreeVGA}
@uref{specs/freevga/home.htm, , FreeVGA Project}. Documents the VGA video
hardware used in PCs.
@bibdfn{kbd}
@uref{specs/kbd/scancodes.html, , Keyboard scancodes}. Documents PC keyboard
interface.
@bibdfn{ATA-3}
@uref{specs/ata-3-std.pdf, , AT Attachment-3 Interface (ATA-3) Working
Draft}. Draft of an old version of the ATA aka IDE interface for the
disks used in most desktop PCs.
@bibdfn{PC16550D}
@uref{specs/pc16550d.pdf, , National Semiconductor PC16550D Universal
Asynchronous Receiver/Transmitter with FIFOs}. Datasheet for a chip
used for PC serial ports.
@bibdfn{8254}
@uref{specs/8254.pdf, , Intel 8254 Programmable Interval Timer}.
Datasheet for PC timer chip.
@bibdfn{8259A}
@uref{specs/8259A.pdf, , Intel 8259A Programmable Interrupt Controller
(8259A/8259A-2)}. Datasheet for PC interrupt controller chip.
@bibdfn{MC146818A}
@uref{specs/mc146818a.pdf, , Motorola MC146818A Real Time Clock Plus
Ram (RTC)}. Datasheet for PC real-time clock chip.
@node Software References
@section Software References
@bibdfn{ELF1}
@uref{specs/elf.pdf, , Tool Interface Standard (TIS) Executable and
Linking Format (ELF) Specification Version 1.2 Book I: Executable and
Linking Format}. The ubiquitous format for executables in modern Unix
systems.
@bibdfn{ELF2}
@uref{specs/elf.pdf, , Tool Interface Standard (TIS) Executable and
Linking Format (ELF) Specification Version 1.2 Book II: Processor
Specific (Intel Architecture)}. 80@var{x}86-specific parts of ELF.
@bibdfn{ELF3}
@uref{specs/elf.pdf, , Tool Interface Standard (TIS) Executable and
Linking Format (ELF) Specification Version 1.2 Book III: Operating
System Specific (UNIX System V Release 4)}. Unix-specific parts of
ELF.
@bibdfn{SysV-ABI}
@uref{specs/sysv-abi-4.1.pdf, , System V Application Binary Interface:
Edition 4.1}. Specifies how applications interface with the OS under
Unix.
@bibdfn{SysV-i386}
@uref{specs/sysv-abi-i386-4.pdf, , System V Application Binary
Interface: Intel386 Architecture Processor Supplement: Fourth
Edition}. 80@var{x}86-specific parts of the Unix interface.
@bibdfn{SysV-ABI-update}
@uref{specs/sysv-abi-update.html/contents.html, , System V Application Binary
Interface---DRAFT---24 April 2001}. A draft of a revised version of
@bibref{SysV-ABI} which was never completed.
@bibdfn{SUSv3}
The Open Group, @uref{http://www.unix.org/single_unix_specification/,
, Single UNIX Specification V3}, 2001.
@bibdfn{Partitions}
A.@: E.@: Brouwer, @uref{specs/partitions/partition_tables.html, ,
Minimal partition table specification}, 1999.
@bibdfn{IntrList}
R.@: Brown, @uref{http://www.ctyme.com/rbrown.htm, , Ralf Brown's
Interrupt List}, 2000.
@node Operating System Design References
@section Operating System Design References
@bibdfn{Christopher}
W.@: A.@: Christopher, S.@: J.@: Procter, T.@: E.@: Anderson,
@cite{The Nachos instructional operating system}.
Proceedings of the @acronym{USENIX} Winter 1993 Conference.
@uref{http://portal.acm.org/citation.cfm?id=1267307}.
@bibdfn{Dijkstra}
E.@: W.@: Dijkstra, @cite{The structure of the ``THE''
multiprogramming system}. Communications of the ACM 11(5):341--346,
1968. @uref{http://doi.acm.org/10.1145/363095.363143}.
@bibdfn{Hoare}
C.@: A.@: R.@: Hoare, @cite{Monitors: An Operating System
Structuring Concept}. Communications of the ACM, 17(10):549--557,
1974. @uref{http://www.acm.org/classics/feb96/}.
@bibdfn{Lampson}
B.@: W.@: Lampson, D.@: D.@: Redell, @cite{Experience with processes and
monitors in Mesa}. Communications of the ACM, 23(2):105--117, 1980.
@uref{http://doi.acm.org/10.1145/358818.358824}.
@bibdfn{McKusick}
M.@: K.@: McKusick, K.@: Bostic, M.@: J.@: Karels, J.@: S.@: Quarterman,
@cite{The Design and Implementation of the 4.4@acronym{BSD} Operating
System}. Addison-Wesley, 1996.
@bibdfn{Wilson}
P.@: R.@: Wilson, M.@: S.@: Johnstone, M.@: Neely, D.@: Boles,
@cite{Dynamic Storage Allocation: A Survey and Critical Review}.
International Workshop on Memory Management, 1995.
@uref{http://www.cs.utexas.edu/users/oops/papers.html#allocsrv}.

doc/codebase.texi Normal file

@@ -0,0 +1,634 @@
@node Task 0--Codebase
@chapter Task 0: Alarm Clock
This task is divided into two parts, a codebase preview and a small coding exercise.
The codebase preview has been designed to help you understand how PintOS is structured
and requires you to complete a short worksheet (handed out through Scientia)
that contains a few questions to check your understanding of the provided PintOS code.
The coding exercise has been designed to help you understand how PintOS works and is structured.
The exercise is concerned with developing a simple feature in PintOS, called Alarm Clock.
@menu
* Task 1 Background::
* Task 0 Requirements::
@end menu
@node Task 1 Background
@section Background
@menu
* Understanding Threads::
* Task 1 Source Files::
* Task 1 Synchronization::
@end menu
@node Understanding Threads
@subsection Understanding Threads
The first step is to read and understand the code for the initial thread
system.
PintOS already implements thread creation and thread completion,
a simple scheduler to switch between threads, and synchronization
primitives (semaphores, locks, condition variables, and optimization
barriers).
Some of this code might seem slightly mysterious. If
you haven't already compiled and run the base system, as described in
the introduction (@pxref{Introduction}), you should do so now. You
can read through parts of the source code to see what's going
on. If you like, you can add calls to @func{printf} almost
anywhere, then recompile and run to see what happens and in what
order. You can also run the kernel in a debugger and set breakpoints
at interesting spots, single-step through code and examine data, and
so on.
When a thread is created, you are creating a new context to be
scheduled. You provide a function to be run in this context as an
argument to @func{thread_create}. The first time the thread is
scheduled and runs, it starts from the beginning of that function
and executes in that context. When the function returns, the thread
terminates. Each thread, therefore, acts like a mini-program running
inside PintOS, with the function passed to @func{thread_create}
acting like @func{main}.
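For example, a kernel thread could be started as in the sketch below;
@code{hello} and @code{spawn_hello} are illustrative names of our own,
while @func{thread_create}, @code{PRI_DEFAULT} and @func{thread_name}
come from @file{threads/thread.h} and @file{threads/thread.c}:
@verbatim
/* Runs in the context of the new thread, much as main() would in an
   ordinary program; returning from it terminates the thread. */
static void
hello (void *aux UNUSED)
{
  printf ("Hello from thread %s!\n", thread_name ());
}

void
spawn_hello (void)
{
  /* The new thread begins executing hello() the first time the
     scheduler picks it.  (Checking the returned tid is omitted.) */
  thread_create ("hello", PRI_DEFAULT, hello, NULL);
}
@end verbatim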
At any given time, exactly one thread runs and the rest, if any,
become inactive. The scheduler decides which thread to
run next. (If no thread is ready to run
at any given time, then the special ``idle'' thread, implemented in
@func{idle}, runs.)
Synchronization primitives can force context switches when one
thread needs to wait for another thread to do something.
The mechanics of a context switch can be found in @file{threads/switch.S}, which is 80@var{x}86 assembly code.
(You don't have to understand it.)
It is enough to know that it saves the state of the currently running thread and restores the state of the thread we're switching to.
Using the GDB debugger on the PintOS kernel (@pxref{GDB}), you can slowly trace through a context switch to see what happens in the C code.
You can set a breakpoint on @func{schedule} to start out, and then
single-step from there.@footnote{GDB might tell you that
@func{schedule} doesn't exist, which is arguably a GDB bug.
You can work around this by setting the breakpoint by filename and
line number, e.g.@: @code{break thread.c:@var{ln}} where @var{ln} is
the line number of the first declaration in @func{schedule}.} Be sure
to keep track of each thread's address
and state, and what procedures are on the call stack for each thread.
You will notice that when one thread calls @func{switch_threads},
another thread starts running, and the first thing the new thread does
is to return from @func{switch_threads}. You will understand the thread
system once you understand why and how the @func{switch_threads} that
gets called is different from the @func{switch_threads} that returns.
@xref{Thread Switching}, for more information.
@strong{Warning}: In PintOS, each thread is assigned a small,
fixed-size execution stack just under @w{4 kB} in size. The kernel
tries to detect stack overflow, but it cannot do so perfectly. You
may cause bizarre problems, such as mysterious kernel panics, if you
declare large data structures as non-static local variables,
e.g. @samp{int buf[1000];}. Alternatives to stack allocation include
the page allocator and the block allocator (@pxref{Memory Allocation}).
@node Task 1 Source Files
@subsection Source Files
Despite PintOS being a tiny operating system, the code volume can be quite discouraging at first sight.
Don't panic: the Alarm Clock exercise for task 0 will help you understand PintOS by working on a small fragment of the code.
The coding required for the later tasks will be more extensive,
but in general should still be limited to a few hundred lines over only a few files.
Here, the hope is that presenting an overview of all source files will give you a start on what
code to look at.
@menu
* devices code::
* thread code::
* lib files::
@end menu
@node devices code
@subsubsection @file{devices} code
The basic threaded kernel includes the following files in the
@file{devices} directory:
@table @file
@item timer.c
@itemx timer.h
System timer that ticks, by default, 100 times per second. You will
modify this code in this task.
@item vga.c
@itemx vga.h
VGA display driver. Responsible for writing text to the screen.
You should have no need to look at this code. @func{printf}
calls into the VGA display driver for you, so there's little reason to
call this code yourself.
@item serial.c
@itemx serial.h
Serial port driver. Again, @func{printf} calls this code for you,
so you don't need to do so yourself.
It handles serial input by passing it to the input layer (see below).
@item block.c
@itemx block.h
An abstraction layer for @dfn{block devices}, that is, random-access,
disk-like devices that are organized as arrays of fixed-size blocks.
Out of the box, PintOS supports two types of block devices: IDE disks
and partitions. Block devices, regardless of type, won't actually be
used until task 2.
@item ide.c
@itemx ide.h
Supports reading and writing sectors on up to 4 IDE disks.
@item partition.c
@itemx partition.h
Understands the structure of partitions on disks, allowing a single
disk to be carved up into multiple regions (partitions) for
independent use.
@item kbd.c
@itemx kbd.h
Keyboard driver. Handles keystrokes, passing them to the input layer
(see below).
@item input.c
@itemx input.h
Input layer. Queues input characters passed along by the keyboard or
serial drivers.
@item intq.c
@itemx intq.h
Interrupt queue, for managing a circular queue that both kernel
threads and interrupt handlers want to access. Used by the keyboard
and serial drivers.
@item rtc.c
@itemx rtc.h
Real-time clock driver, to enable the kernel to determine the current
date and time. By default, this is only used by @file{thread/init.c}
to choose an initial seed for the random number generator.
@item speaker.c
@itemx speaker.h
Driver that can produce tones on the PC speaker.
@item pit.c
@itemx pit.h
Code to configure the 8254 Programmable Interrupt Timer. This code is
used by both @file{devices/timer.c} and @file{devices/speaker.c}
because each device uses one of the PIT's output channels.
@end table
@node thread code
@subsubsection @file{thread} code
Here is a brief overview of the files in the @file{threads}
directory.
@table @file
@item loader.S
@itemx loader.h
The kernel loader. Assembles to 512 bytes of code and data that the
PC BIOS loads into memory and which in turn finds the kernel on disk,
loads it into memory, and jumps to @func{start} in @file{start.S}.
@xref{PintOS Loader}, for details. You should not need to look at
this code or modify it.
@item start.S
Does basic setup needed for memory protection and 32-bit
operation on 80@var{x}86 CPUs. Unlike the loader, this code is
actually part of the kernel. @xref{Low-Level Kernel Initialization},
for details.
@item kernel.lds.S
The linker script used to link the kernel. Sets the load address of
the kernel and arranges for @file{start.S} to be near the beginning
of the kernel image. @xref{PintOS Loader}, for details. Again, you
should not need to look at this code
or modify it, but it's here in case you're curious.
@item init.c
@itemx init.h
Kernel initialization, including @func{main}, the kernel's ``main
program.'' You should look over @func{main} at least to see what
gets initialized. You might want to add your own initialization code
here. @xref{High-Level Kernel Initialization}, for details.
@item thread.c
@itemx thread.h
Basic thread support. @file{thread.h} defines @struct{thread}, which you are likely to modify
in all four tasks. See @ref{struct thread} and @ref{Threads} for
more information.
@item switch.S
@itemx switch.h
Assembly language routine for switching threads. Already discussed
above. @xref{Thread Functions}, for more information.
@item palloc.c
@itemx palloc.h
Page allocator, which hands out system memory in multiples of 4 kB
pages. @xref{Page Allocator}, for more information.
@item malloc.c
@itemx malloc.h
A simple implementation of @func{malloc} and @func{free} for
the kernel. @xref{Block Allocator}, for more information.
@item interrupt.c
@itemx interrupt.h
Basic interrupt handling and functions for turning interrupts on and
off. @xref{Interrupt Handling}, for more information.
@item intr-stubs.S
@itemx intr-stubs.h
Assembly code for low-level interrupt handling. @xref{Interrupt
Infrastructure}, for more information.
@item synch.c
@itemx synch.h
Basic synchronization primitives: semaphores, locks, condition
variables, and optimization barriers. You will need to use these for
synchronization in all
four tasks. @xref{Synchronization}, for more information.
@item io.h
Functions for I/O port access. This is mostly used by source code in
the @file{devices} directory that you won't have to touch.
@item vaddr.h
@itemx pte.h
Functions and macros for working with virtual addresses and page table
entries. These will be more important to you in task 3. For now,
you can ignore them.
@item flags.h
Macros that define a few bits in the 80@var{x}86 ``flags'' register.
Probably of no interest. See @bibref{IA32-v1}, section 3.4.3, ``EFLAGS
Register,'' for more information.
@end table
@node lib files
@subsubsection @file{lib} files
Finally, @file{lib} and @file{lib/kernel} contain useful library
routines. (@file{lib/user} will be used by user programs, starting in
task 2, but it is not part of the kernel.) Here are a few more
details:
@table @file
@item ctype.h
@itemx inttypes.h
@itemx limits.h
@itemx stdarg.h
@itemx stdbool.h
@itemx stddef.h
@itemx stdint.h
@itemx stdio.c
@itemx stdio.h
@itemx stdlib.c
@itemx stdlib.h
@itemx string.c
@itemx string.h
A subset of the standard C library. @xref{C99}, for
information
on a few recently introduced pieces of the C library that you might
not have encountered before. @xref{Unsafe String Functions}, for
information on what's been intentionally left out for safety.
@item debug.c
@itemx debug.h
Functions and macros to aid debugging. @xref{Debugging Tools}, for
more information.
@item random.c
@itemx random.h
Pseudo-random number generator. The actual sequence of random values
may vary from one PintOS run to another.
@item round.h
Macros for rounding.
@item syscall-nr.h
System call numbers. Not used until task 2.
@item kernel/list.c
@itemx kernel/list.h
Doubly linked list implementation. Used all over the PintOS code, and
you'll probably want to use it in a few places yourself in task 0 and task 1.
@item kernel/bitmap.c
@itemx kernel/bitmap.h
Bitmap implementation. You can use this in your code if you like, but
you probably won't have any need for it in task 0 or task 1.
@item kernel/hash.c
@itemx kernel/hash.h
Hash table implementation. Likely to come in handy for task 3.
@item kernel/console.c
@itemx kernel/console.h
@itemx kernel/stdio.h
Implements @func{printf} and a few other functions.
@end table
@node Task 1 Synchronization
@subsection Synchronization
Proper synchronization is an important part of the solutions to these
problems. Any synchronization problem can be easily solved by turning
interrupts off: while interrupts are off, there is no concurrency, so
there's no possibility for race conditions. Therefore, it's tempting to
solve all synchronization problems this way, but @strong{don't}.
Instead, use semaphores, locks, and condition variables to solve the
bulk of your synchronization problems. Read the tour section on
synchronization (@pxref{Synchronization}) or the comments in
@file{threads/synch.c} if you're unsure what synchronization primitives
may be used in what situations.
In the PintOS tasks, the only class of problem best solved by
disabling interrupts is coordinating data shared between a kernel thread
and an interrupt handler. Because interrupt handlers can't sleep, they
can't acquire locks. This means that data shared between kernel threads
and an interrupt handler must be protected within a kernel thread by
turning off interrupts.
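For instance, a kernel thread touching a list that an interrupt handler
also walks would typically bracket the access as sketched below;
@code{sleeping_list} and @code{cur} are illustrative names, while
@func{intr_disable} and @func{intr_set_level} are declared in
@file{threads/interrupt.h}:
@verbatim
enum intr_level old_level = intr_disable ();

/* Data shared with the interrupt handler may only be touched here,
   while interrupts are off.  (cur points to the current thread.) */
list_push_back (&sleeping_list, &cur->elem);

intr_set_level (old_level);   /* Restore the previous interrupt state. */
@end verbatim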
This task only requires accessing a little bit of thread state from
interrupt handlers. For the alarm clock, the timer interrupt needs to
wake up sleeping threads. Later, in task 1, the advanced scheduler will
need the timer interrupt to access a few global and per-thread variables. When
you access these variables from kernel threads, you will need to disable
interrupts to prevent the timer interrupt handler from interfering.
When you do turn off interrupts, take care to do so for the least amount
of code possible, or you can end up losing important things such as
timer ticks or input events. Turning off interrupts also increases the
interrupt handling latency, which can make a machine feel sluggish if
taken too far.
The synchronization primitives themselves in @file{synch.c} are
implemented by disabling interrupts. You may need to increase the
amount of code that runs with interrupts disabled here, but you should
still try to keep it to a minimum.
Disabling interrupts can be useful for debugging, if you want to make
sure that a section of code is not interrupted. You should remove
debugging code before turning in your task. (Don't just comment it
out, because that can make the code difficult to read.)
There should be @strong{no} busy waiting in your submission. A tight loop that
calls @func{thread_yield} is one form of busy waiting.
@page
@node Task 0 Requirements
@section Requirements
@menu
* Codebase Preview::
* Alarm Clock Design Document::
* Alarm Clock::
* FAQ::
@end menu
@node Codebase Preview
@subsection Codebase Preview
@menu
* Source Files::
* Questions::
@end menu
For answering the MCQ AnswerBook test questions in the codebase preview you will be expected to have fully read:
@itemize
@item Section 1
@item Section 2.1.1 and 2.1.3
@item Sections A.2-4
@item Sections C, D, E and F
@end itemize
@node Source Files
@subsubsection Source Files
The source files you will have to fully understand:
@table @file
@item src/threads/thread.c
Contains bulk of threading system code
@item src/threads/thread.h
Header file for threads, contains thread struct
@item src/threads/synch.c
Contains the implementation of major synchronisation primitives like
locks and semaphores
@item src/lib/kernel/list.c
Contains PintOS' list implementation
@end table
@node Questions
@subsubsection Task 0 Questions
@include task0_questions.texi
@node Alarm Clock Design Document
@subsection Design Document
When you submit your work for task 0, you must also submit a completed copy of
@uref{devices.tmpl, , the task 0 design document}.
You can find a template design document for this task in @file{pintos/doc/devices.tmpl} and also on Scientia.
You must submit your design document as a @file{.pdf} file.
We recommend that you read the design document template before you start working on the task.
@xref{Task Documentation}, for a sample design document that goes along with a fictitious task.
@node Alarm Clock
@subsection Coding the Alarm Clock
Reimplement @func{timer_sleep}, defined in @file{devices/timer.c}.
Although a working implementation is provided, it ``busy waits,'' that
is, it spins in a loop checking the current time and calling
@func{thread_yield} until enough time has gone by. Reimplement it to
avoid busy waiting.
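For reference, the provided implementation is roughly the following
(paraphrased from @file{devices/timer.c}):
@example
void
timer_sleep (int64_t ticks)
@{
  int64_t start = timer_ticks ();

  ASSERT (intr_get_level () == INTR_ON);
  while (timer_elapsed (start) < ticks)
    thread_yield ();    /* Busy waiting: the thread keeps rescheduling
                           itself until enough ticks have elapsed. */
@}
@end example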
@deftypefun void timer_sleep (int64_t @var{ticks})
Suspends execution of the calling thread until time has advanced by at
least @w{@var{ticks} timer ticks}. Unless the system is otherwise idle, the
thread need not wake up after exactly @var{ticks} ticks: just put it on
the ready queue after it has waited for the right amount of time.
@func{timer_sleep} is useful for threads that operate in real-time,
e.g.@: for blinking the cursor once per second.
The argument to @func{timer_sleep} is expressed in timer ticks, not in
milliseconds or any other unit. There are @code{TIMER_FREQ} timer
ticks per second, where @code{TIMER_FREQ} is a macro defined in
@code{devices/timer.h}. The default value is 100. We don't recommend
changing this value, because any change is likely to cause many of
the tests to fail.
@end deftypefun
Separate functions @func{timer_msleep}, @func{timer_usleep}, and
@func{timer_nsleep} do exist for sleeping a specific number of
milliseconds, microseconds, or nanoseconds, respectively, but these will
call @func{timer_sleep} automatically when necessary. You do not need
to modify them.
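Conceptually, these wrappers just scale the requested delay by
@code{TIMER_FREQ}, as in the sketch below; the real code in
@file{devices/timer.c} is more careful about rounding and delays
shorter than one tick:
@example
/* Sketch only: a delay of `ms' milliseconds expressed in ticks. */
timer_sleep (ms * TIMER_FREQ / 1000);
@end example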
If your delays seem too short or too long, reread the explanation of the
@option{-r} option to @command{pintos} (@pxref{Debugging versus
Testing}).
The alarm clock implementation is needed for Task 1, but is not needed for any later tasks.
@node FAQ
@subsection FAQ
@table @b
@item How much code will I need to write?
Here's a summary of our reference solution, produced by the
@command{diffstat} program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.
@verbatim
devices/timer.c | 40 ++++++++++++++++++++++++++++++++++++++--
devices/timer.h | 9 +++++++++
2 files changed, 47 insertions(+), 2 deletions(-)
@end verbatim
The reference solution represents just one possible solution. Many
other solutions are also possible and many of those differ greatly from
the reference solution. Some excellent solutions may not modify all the
files modified by the reference solution, and some may modify files not
modified by the reference solution.
@item What does @code{warning: no previous prototype for `@var{func}'} mean?
It means that you defined a non-@code{static} function without
preceding it by a prototype. Because non-@code{static} functions are
intended for use by other @file{.c} files, for safety they should be
prototyped in a header file included before their definition. To fix
the problem, add a prototype in a header file that you include, or, if
the function isn't actually used by other @file{.c} files, make it
@code{static}.
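For example, for a hypothetical helper @func{wake_sleepers} defined in
@file{devices/timer.c}, either of the following removes the warning:
@example
/* In a header included by timer.c, e.g. devices/timer.h: */
void wake_sleepers (int64_t now);

/* Or, if no other .c file calls it, in timer.c itself: */
static void wake_sleepers (int64_t now);
@end example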
@item What is the interval between timer interrupts?
Timer interrupts occur @code{TIMER_FREQ} times per second. You can
adjust this value by editing @file{devices/timer.h}. The default is
100 Hz.
We don't recommend changing this value, because any changes are likely
to cause many of the tests to fail.
@item How long is a time slice?
There are @code{TIME_SLICE} ticks per time slice. This macro is
declared in @file{threads/thread.c}. The default is 4 ticks.
We don't recommend changing this value, because any changes are likely
to cause many of the tests to fail.
@item How do I run the tests?
@xref{Testing}.
@item Why do I get a test failure in @func{pass}?
@xref{The pass function fails}.
You are probably looking at a backtrace that looks something like this:
@example
0xc0108810: debug_panic (lib/kernel/debug.c:32)
0xc010a99f: pass (tests/threads/tests.c:93)
0xc010bdd3: test_mlfqs_load_1 (...threads/mlfqs-load-1.c:33)
0xc010a8cf: run_test (tests/threads/tests.c:51)
0xc0100452: run_task (threads/init.c:283)
0xc0100536: run_actions (threads/init.c:333)
0xc01000bb: main (threads/init.c:137)
@end example
This is just confusing output from the @command{backtrace} program. It
does not actually mean that @func{pass} called @func{debug_panic}. In
fact, @func{fail} called @func{debug_panic} (via the @func{PANIC}
macro). GCC knows that @func{debug_panic} does not return, because it
is declared @code{NO_RETURN} (@pxref{Function and Parameter
Attributes}), so it doesn't include any code in @func{fail} to take
control when @func{debug_panic} returns. This means that the return
address on the stack looks like it is at the beginning of the function
that happens to follow @func{fail} in memory, which in this case happens
to be @func{pass}.
@xref{Backtraces}, for more information.
@item How do interrupts get re-enabled in the new thread following @func{schedule}?
Every path into @func{schedule} disables interrupts. They eventually
get re-enabled by the next thread to be scheduled. Consider the
possibilities: the new thread is running in @func{switch_thread} (but
see below), which is called by @func{schedule}, which is called by one
of a few possible functions:
@itemize @bullet
@item
@func{thread_exit}, but we'll never switch back into such a thread, so
it's uninteresting.
@item
@func{thread_yield}, which immediately restores the interrupt level upon
return from @func{schedule}.
@item
@func{thread_block}, which is called from multiple places:
@itemize @minus
@item
@func{sema_down}, which restores the interrupt level before returning.
@item
@func{idle}, which enables interrupts with an explicit assembly STI
instruction.
@item
@func{wait} in @file{devices/intq.c}, whose callers are responsible for
re-enabling interrupts.
@end itemize
@end itemize
There is a special case when a newly created thread runs for the first
time. Such a thread calls @func{intr_enable} as the first action in
@func{kernel_thread}, which is at the bottom of the call stack for every
kernel thread but the first.
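For reference, @func{kernel_thread} in @file{threads/thread.c} looks
roughly like this:
@example
static void
kernel_thread (thread_func *function, void *aux)
@{
  ASSERT (function != NULL);

  intr_enable ();       /* The scheduler runs with interrupts off. */
  function (aux);       /* Execute the thread function. */
  thread_exit ();       /* If function() returns, kill the thread. */
@}
@end example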
@item Do I need to account for timer values overflowing?
Don't worry about the possibility of timer values overflowing. Timer
values are expressed as signed 64-bit numbers, which at 100 ticks per
second should be good for almost 2,924,712,087 years. By then, we
expect PintOS to have been phased out of the @value{coursenumber} curriculum.
@item What should I expect from the Task 0 code-review?
The code-review for this task will be conducted offline, as it would be logistically
impossible to arrange face-to-face sessions with the whole cohort.
Our Task 0 code-review will cover @strong{four} main areas:
functional correctness, efficiency, design quality and general coding style.
@itemize @bullet
@item For @strong{functional correctness}, we will be looking to see if your solution can handle many
threads going to sleep or waking up at the same time, without any unnecessary delays.
We will also be checking if your code for @func{timer_sleep} and @func{timer_interrupt} is free of any race conditions.
@item For @strong{efficiency}, we will be looking at what steps you have taken to minimise the time spent
inside your timer interrupt handler. Think about how you store sleeping threads and track
how long they must sleep for. We will also be looking at your use of memory.
@item For @strong{design quality}, we will be looking at how you have integrated your alarm-clock code with
the rest of the provided operating system. We want to see clear module boundaries and use of abstraction.
@item For @strong{general coding style}, we will be paying attention to all of the usual elements of good style
that you should be used to from last year (e.g. consistent code layout, appropriate use of comments, avoiding magic numbers, etc.)
as well as your use of git (e.g. commit frequency and commit message quality).
@end itemize
@end table

702
doc/debug.texi Normal file

@@ -0,0 +1,702 @@
@node Debugging Tools
@appendix Debugging Tools
Many tools lie at your disposal for debugging PintOS. This appendix
introduces you to a few of them.
@menu
* printf::
* ASSERT::
* Function and Parameter Attributes::
* Backtraces::
* GDB::
* Triple Faults::
* Debugging Tips::
@end menu
@node printf
@section @code{printf()}
Don't underestimate the value of @func{printf}. The way
@func{printf} is implemented in PintOS, you can call it from
practically anywhere in the kernel, whether it's in a kernel thread or
an interrupt handler, almost regardless of what locks are held.
@func{printf} is useful for more than just examining data.
It can also help figure out when and where something goes wrong, even
when the kernel crashes or panics without a useful error message. The
strategy is to sprinkle calls to @func{printf} with different strings
(e.g.@: @code{"<1>"}, @code{"<2>"}, @dots{}) throughout the pieces of
code you suspect are failing. If you don't even see @code{<1>} printed,
then something bad happened before that point, if you see @code{<1>}
but not @code{<2>}, then something bad happened between those two
points, and so on. Based on what you learn, you can then insert more
@func{printf} calls in the new, smaller region of code you suspect.
Eventually you can narrow the problem down to a single statement.
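For example, a hypothetical instrumented fragment might look like this,
where @func{suspect_step_one} and @func{suspect_step_two} stand in for
the code under suspicion:
@example
printf ("<1>\n");
suspect_step_one ();
printf ("<2>\n");
suspect_step_two ();
printf ("<3>\n");
@end example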
@xref{Triple Faults}, for a related technique.
@node ASSERT
@section @code{ASSERT}
Assertions are useful because they can catch problems early, before
they'd otherwise be noticed. Ideally, each function should begin with a
set of assertions that check its arguments for validity. (Initializers
for functions' local variables are evaluated before assertions are
checked, so be careful not to assume that an argument is valid in an
initializer.) You can also sprinkle assertions throughout the body of
functions in places where you suspect things are likely to go wrong.
They are especially useful for checking loop invariants.
PintOS provides the @code{ASSERT} macro, defined in @file{<debug.h>},
for checking assertions.
@defmac ASSERT (expression)
Tests the value of @var{expression}. If it evaluates to zero (false),
the kernel panics. The panic message includes the expression that
failed, its file and line number, and a backtrace, which should help you
to find the problem. @xref{Backtraces}, for more information.
@end defmac
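For example, a function that expects a non-null buffer and a positive
size (both names are hypothetical) might begin:
@example
ASSERT (buffer != NULL);
ASSERT (size > 0);
@end example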
@node Function and Parameter Attributes
@section Function and Parameter Attributes
These macros defined in @file{<debug.h>} tell the compiler special
attributes of a function or function parameter. Their expansions are
GCC-specific.
@defmac UNUSED
Appended to a function parameter to tell the compiler that the
parameter might not be used within the function. It suppresses the
warning that would otherwise appear.
@end defmac
@defmac NO_RETURN
Appended to a function prototype to tell the compiler that the
function never returns. It allows the compiler to fine-tune its
warnings and its code generation.
@end defmac
@defmac NO_INLINE
Appended to a function prototype to tell the compiler to never emit
the function in-line. Occasionally useful to improve the quality of
backtraces (see below).
@end defmac
@defmac PRINTF_FORMAT (@var{format}, @var{first})
Appended to a function prototype to tell the compiler that the function
takes a @func{printf}-like format string as the argument numbered
@var{format} (starting from 1) and that the corresponding value
arguments start at the argument numbered @var{first}. This lets the
compiler tell you if you pass the wrong argument types.
@end defmac
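Here is a sketch of how these attributes might be used; all three
function names are hypothetical:
@example
void log_event (const char *format, ...) PRINTF_FORMAT (1, 2);
void shutdown_now (void) NO_RETURN;
void idle_hook (void *aux UNUSED);
@end example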
@node Backtraces
@section Backtraces
When the kernel panics, it prints a ``backtrace,'' that is, a summary
of how your program got where it is, as a list of addresses inside the
functions that were running at the time of the panic. You can also
insert a call to @func{debug_backtrace}, prototyped in
@file{<debug.h>}, to print a backtrace at any point in your code.
@func{debug_backtrace_all}, also declared in @file{<debug.h>},
prints backtraces of all threads.
The addresses in a backtrace are listed as raw hexadecimal numbers,
which are difficult to interpret. We provide a tool called
@command{backtrace} to translate these into function names and source
file line numbers.
Give it the name of your @file{kernel.o} as the first argument and the
hexadecimal numbers composing the backtrace (including the @samp{0x}
prefixes) as the remaining arguments. It outputs the function name
and source file line numbers that correspond to each address.
If the translated form of a backtrace is garbled, or doesn't make
sense (e.g.@: function A is listed above function B, but B doesn't
call A), then it's a good sign that you're corrupting a kernel
thread's stack, because the backtrace is extracted from the stack.
Alternatively, it could be that the @file{kernel.o} you passed to
@command{backtrace} is not the same kernel that produced
the backtrace.
Sometimes backtraces can be confusing without any corruption.
Compiler optimizations can cause surprising behaviour. When a function
has called another function as its final action (a @dfn{tail call}), the
calling function may not appear in a backtrace at all. Similarly, when
function A calls another function B that never returns, the compiler may
optimize such that an unrelated function C appears in the backtrace
instead of A. Function C is simply the function that happens to be in
memory just after A. In the threads task, this is commonly seen in
backtraces for test failures; see @ref{The pass function fails, ,
@func{pass} fails}, for more information.
@menu
* Backtrace Example::
@end menu
@node Backtrace Example
@subsection Example
Here's an example. Suppose that PintOS printed out the following call
stack, which is taken from an actual PintOS submission:
@example
Call stack: 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319
0xc010325a 0x804812c 0x8048a96 0x8048ac8.
@end example
You would then invoke the @command{backtrace} utility as shown below,
cutting and pasting the backtrace information into the command line.
This assumes that @file{kernel.o} is in the current directory. You
would of course enter all of the following on a single shell command
line, even though that would overflow our margins here:
@example
backtrace kernel.o 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67
0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8
@end example
The backtrace output would then look something like this:
@example
0xc0106eff: debug_panic (lib/debug.c:86)
0xc01102fb: file_seek (filesys/file.c:405)
0xc010dc22: seek (userprog/syscall.c:744)
0xc010cf67: syscall_handler (userprog/syscall.c:444)
0xc0102319: intr_handler (threads/interrupt.c:334)
0xc010325a: intr_entry (threads/intr-stubs.S:38)
0x0804812c: (unknown)
0x08048a96: (unknown)
0x08048ac8: (unknown)
@end example
(You will probably not see exactly the same addresses if you run the
command above on your own kernel binary, because the source code you
compiled and the compiler you used are probably different.)
The first line in the backtrace refers to @func{debug_panic}, the
function that implements kernel panics. Because backtraces commonly
result from kernel panics, @func{debug_panic} will often be the first
function shown in a backtrace.
The second line shows @func{file_seek} as the function that panicked,
in this case as the result of an assertion failure. In the source code
tree used for this example, line 405 of @file{filesys/file.c} is the
assertion
@example
ASSERT (file_ofs >= 0);
@end example
@noindent
(This line was also cited in the assertion failure message.)
Thus, @func{file_seek} panicked because it passed a negative file offset
argument.
The third line indicates that @func{seek} called @func{file_seek},
presumably without validating the offset argument. In this submission,
@func{seek} implements the @code{seek} system call.
The fourth line shows that @func{syscall_handler}, the system call
handler, invoked @func{seek}.
The fifth and sixth lines are the interrupt handler entry path.
The remaining lines are for addresses below @code{PHYS_BASE}. This
means that they refer to addresses in the user program, not in the
kernel. If you know what user program was running when the kernel
panicked, you can re-run @command{backtrace} on the user program, like
so (typing the command on a single line, of course):
@example
backtrace tests/filesys/extended/grow-too-big 0xc0106eff 0xc01102fb
0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96
0x8048ac8
@end example
The results look like this:
@example
0xc0106eff: (unknown)
0xc01102fb: (unknown)
0xc010dc22: (unknown)
0xc010cf67: (unknown)
0xc0102319: (unknown)
0xc010325a: (unknown)
0x0804812c: test_main (...xtended/grow-too-big.c:20)
0x08048a96: main (tests/main.c:10)
0x08048ac8: _start (lib/user/entry.c:9)
@end example
You can even specify both the kernel and the user program names on
the command line, like so:
@example
backtrace kernel.o tests/filesys/extended/grow-too-big 0xc0106eff
0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c
0x8048a96 0x8048ac8
@end example
The result is a combined backtrace:
@example
In kernel.o:
0xc0106eff: debug_panic (lib/debug.c:86)
0xc01102fb: file_seek (filesys/file.c:405)
0xc010dc22: seek (userprog/syscall.c:744)
0xc010cf67: syscall_handler (userprog/syscall.c:444)
0xc0102319: intr_handler (threads/interrupt.c:334)
0xc010325a: intr_entry (threads/intr-stubs.S:38)
In tests/filesys/extended/grow-too-big:
0x0804812c: test_main (...xtended/grow-too-big.c:20)
0x08048a96: main (tests/main.c:10)
0x08048ac8: _start (lib/user/entry.c:9)
@end example
Here's an extra tip for anyone who read this far: @command{backtrace}
is smart enough to strip the @code{Call stack:} header and @samp{.}
trailer from the command line if you include them. This can save you
a little bit of trouble in cutting and pasting. Thus, the following
command prints the same output as the first one we used:
@example
backtrace kernel.o Call stack: 0xc0106eff 0xc01102fb 0xc010dc22
0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8.
@end example
@node GDB
@section GDB
You can run PintOS under the supervision of the GDB debugger.
First, start PintOS with the @option{--gdb} option, e.g.@:
@command{pintos --gdb -- run mytest}. Second, open a second terminal on
the same machine and
use @command{pintos-gdb} to invoke GDB on
@file{kernel.o}:@footnote{@command{pintos-gdb} is a wrapper around
@command{gdb} (80@var{x}86) that loads the PintOS macros at startup.}
@example
pintos-gdb kernel.o
@end example
@noindent and issue the following GDB command:
@example
target remote localhost:1234
@end example
@noindent or alternatively issue the following GDB macro:
@example
debugpintos
@end example
Now GDB is connected to the simulator over a local
network connection. You can now issue any normal GDB
commands. If you issue the @samp{c} command, the simulated BIOS will take
control, load PintOS, and then PintOS will run in the usual way. You
can pause the process at any point with @key{Ctrl+C}.
@menu
* Using GDB::
* Example GDB Session::
* GDB FAQ::
@end menu
@node Using GDB
@subsection Using GDB
You can read the GDB manual by typing @code{info gdb} at a
terminal command prompt. Here are a few commonly useful GDB commands:
@deffn {GDB Command} c
Continues execution until @key{Ctrl+C} or the next breakpoint.
@end deffn
@deffn {GDB Command} break function
@deffnx {GDB Command} break file:line
@deffnx {GDB Command} break *address
Sets a breakpoint at @var{function}, at @var{line} within @var{file}, or
@var{address}.
(Use a @samp{0x} prefix to specify an address in hex.)
Use @code{break main} to make GDB stop when PintOS starts running.
@end deffn
@deffn {GDB Command} p expression
Evaluates the given @var{expression} and prints its value.
If the expression contains a function call, that function will actually
be executed.
@end deffn
@deffn {GDB Command} l *address
Lists a few lines of code around @var{address}.
(Use a @samp{0x} prefix to specify an address in hex.)
@end deffn
@deffn {GDB Command} bt
Prints a stack backtrace similar to that output by the
@command{backtrace} program described above.
@end deffn
@deffn {GDB Command} p/a address
Prints the name of the function or variable that occupies @var{address}.
(Use a @samp{0x} prefix to specify an address in hex.)
@end deffn
@deffn {GDB Command} disassemble function
Disassembles @var{function}.
@end deffn
We also provide a set of macros specialized for debugging PintOS,
written by Godmar Back @email{gback@@cs.vt.edu}. You can type
@code{help user-defined} for basic help with the macros. Here is an
overview of their functionality, based on Godmar's documentation:
@deffn {GDB Macro} debugpintos
Attaches the debugger to a waiting @command{pintos} process on the same machine.
Shorthand for @code{target remote localhost:1234}.
@end deffn
@deffn {GDB Macro} dumplist list type element
Prints the elements of @var{list}, which must be passed by reference and should be a @code{struct list}
that contains elements of the given @var{type} (without the word
@code{struct}) in which @var{element} is the @struct{list_elem} member
that links the elements.
Example: @code{dumplist &all_list thread allelem} prints all elements of
@struct{thread} that are linked in @code{struct list all_list} using the
@code{struct list_elem allelem} which is part of @struct{thread}.
@end deffn
@deffn {GDB Macro} dumphash hash type element
Similar to @code{dumplist}. Prints the elements of @var{hash}, which must be passed by reference and should be a @code{struct hash}
that contains elements of the given @var{type} (without the word
@code{struct}) in which @var{element} is the @struct{hash_elem} member
that links the elements.
@end deffn
@deffn {GDB Macro} btthread thread
Shows the backtrace of @var{thread}, which is a pointer to the
@struct{thread} of the thread whose backtrace it should show. For the
current thread, this is identical to the @code{bt} (backtrace) command.
It also works for any thread suspended in @func{schedule},
provided you know where its kernel stack page is located.
@end deffn
@deffn {GDB Macro} btthreadlist list element
Shows the backtraces of all threads in @var{list}, which must be passed by reference and is the @struct{list} in which the threads are kept.
Specify @var{element} as the @struct{list_elem} field used inside @struct{thread} to link the threads together.
Example: @code{btthreadlist &all_list allelem} shows the backtraces of
all threads contained in @code{struct list all_list}, linked together by
@code{allelem}. This command is useful to determine where your threads
are stuck when a deadlock occurs. Please see the example scenario below.
@end deffn
@deffn {GDB Macro} btthreadall
Short-hand for @code{btthreadlist all_list allelem}.
@end deffn
@deffn {GDB Macro} hook-stop
GDB invokes this macro every time the simulation stops, which QEMU will
do for every processor exception, among other reasons. If the
simulation stops due to a page fault, @code{hook-stop} will print a
message explaining whether the page fault occurred in kernel code or in
user code.
If the exception occurred from user code, @code{hook-stop} will say:
@example
pintos-debug: a page fault exception occurred in user mode
pintos-debug: hit 'c' to continue, or 's' to step to intr_handler
@end example
In Task 2, a page fault in a user process leads to the termination of
the process. You should expect those page faults to occur in the
robustness tests where we test that your kernel properly terminates
processes that try to access invalid addresses. To debug those, set a
break point in @func{page_fault} in @file{exception.c}, which you will
need to modify accordingly.
In Task 3, a page fault in a user process no longer automatically
leads to the termination of a process. Instead, it may require reading in
data for the page the process was trying to access, either
because it was swapped out or because this is the first time it's
accessed. In either case, you will reach @func{page_fault} and need to
take the appropriate action there.
If the page fault did not occur in user mode while executing a user
process, then it occurred in kernel mode while executing kernel code.
In this case, @code{hook-stop} will print this message:
@example
pintos-debug: a page fault occurred in kernel mode
@end example
Before Task 3, a page fault exception in kernel code is always a bug
in your kernel, because your kernel should never crash. Starting with
Task 3, the situation will change if you use the @func{get_user} and
@func{put_user} strategy to verify user memory accesses
(@pxref{Accessing User Memory}).
@c ----
@c Unfortunately, this does not work with Bochs's gdb stub.
@c ----
@c If you don't want GDB to stop for page faults, then issue the command
@c @code{handle SIGSEGV nostop}. GDB will still print a message for
@c every page fault, but it will not come back to a command prompt.
@end deffn
@node Example GDB Session
@subsection Example GDB Session
This section narrates a sample GDB session, provided by Godmar Back
(modified by Mark Rutland and Feroz Abdul Salam, and updated by Fidelis Perkonigg).
This example illustrates how one might debug a Task 1 solution in
which occasionally a thread that calls @func{timer_sleep} is not woken
up. With this bug, tests such as @code{mlfqs_load_1} get stuck.
Program output is shown in normal type, user input in @strong{strong}
type.
@noindent First, we start PintOS using the QEMU emulator:
@smallexample
@code{$ pintos -v --qemu --gdb -- -q -mlfqs run mlfqs-load-1}
qemu-system-i386 -drive file=/tmp/GKWoGG8QE6.dsk,index=0,media=disk,format=raw -m 4 -net none -nographic -s -S
@end smallexample
@noindent This starts QEMU but pauses the execution of PintOS immediately to
allow us to attach GDB to PintOS. We open a second window in the same directory
on the same machine and start GDB:
@smallexample
$ @strong{pintos-gdb kernel.o}
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kernel.o...done.
The target architecture is assumed to be i386
@end smallexample
@noindent Then, we tell GDB to attach to the waiting PintOS emulator:
@smallexample
(gdb) @strong{debugpintos}
0x0000fff0 in ?? ()
@end smallexample
@noindent Now we instruct GDB to continue the execution of PintOS by using the
command @code{continue} (or the abbreviation @code{c}):
@smallexample
(gdb) @strong{c}
Continuing.
@end smallexample
@noindent Now PintOS will continue and output:
@smallexample
PiLo hda1
Loading..........
Kernel command line: -q -mlfqs run mlfqs-load-1
PintOS booting with 3,968 kB RAM...
367 pages available in kernel pool.
367 pages available in user pool.
Calibrating timer... 838,041,600 loops/s.
Boot complete.
Executing 'mlfqs-load-1':
(mlfqs-load-1) begin
(mlfqs-load-1) spinning for up to 45 seconds, please wait...
@end smallexample
@noindent
... until it gets stuck due to the bug in the code.
We hit @key{Ctrl+C} in the debugger window to stop PintOS.
@smallexample
Program received signal SIGINT, Interrupt.
intr_get_level () at ../../threads/interrupt.c:66
(gdb)
@end smallexample
@noindent
The thread that was running when we stopped PintOS happened to be the main
thread. If we run @code{backtrace}, it shows this backtrace:
@smallexample
(gdb) @strong{bt}
#0 intr_get_level () at ../../threads/interrupt.c:66
#1 0xc0021103 in intr_enable () at ../../threads/interrupt.c:90
#2 0xc0021150 in intr_set_level (level=INTR_ON) at ../../threads/interrupt.c:83
#3 0xc0023422 in timer_ticks () at ../../devices/timer.c:75
#4 0xc002343d in timer_elapsed (then=23) at ../../devices/timer.c:84
#5 0xc002aabf in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:33
#6 0xc002b610 in run_test (name=0xc0007d4c "mlfqs-load-1") at ../../tests/devices/tests.c:72
#7 0xc00201a2 in run_task (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:290
#8 0xc0020687 in run_actions (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:340
#9 main () at ../../threads/init.c:133
@end smallexample
@noindent
Not terribly useful. What we really like to know is what's up with the
other thread (or threads). Since we keep all threads in a linked list
called @code{all_list}, linked together by a @struct{list_elem} member
named @code{allelem}, we can use the @code{btthreadlist} macro.
@code{btthreadlist} iterates through the list of threads and prints the
backtrace for each thread:
@smallexample
(gdb) @strong{btthreadlist &all_list allelem}
pintos-debug: dumping backtrace of thread 'main' @@0xc000e000
#0 intr_get_level () at ../../threads/interrupt.c:66
#1 0xc0021103 in intr_enable () at ../../threads/interrupt.c:90
#2 0xc0021150 in intr_set_level (level=INTR_ON) at ../../threads/interrupt.c:83
#3 0xc0023422 in timer_ticks () at ../../devices/timer.c:75
#4 0xc002343d in timer_elapsed (then=23) at ../../devices/timer.c:84
#5 0xc002aabf in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:33
#6 0xc002b610 in run_test (name=0xc0007d4c "mlfqs-load-1") at ../../tests/devices/tests.c:72
#7 0xc00201a2 in run_task (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:290
#8 0xc0020687 in run_actions (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:340
#9 main () at ../../threads/init.c:133
pintos-debug: dumping backtrace of thread 'idle' @@0xc0103000
#0 0xc0020dc5 in schedule () at ../../threads/thread.c:579
#1 0xc0020e01 in thread_block () at ../../threads/thread.c:235
#2 0xc0020e6d in idle (idle_started_=0xc000ef7c) at ../../threads/thread.c:414
#3 0xc0020efc in kernel_thread (function=0xc0020e45 <idle>, aux=0xc000ef7c)
at ../../threads/thread.c:439
#4 0x00000000 in ?? ()
@end smallexample
@noindent
In this case, there are only two threads, the main thread and the idle
thread. The kernel stack pages (to which the @struct{thread} points)
are at @t{0xc000e000} and @t{0xc0103000}, respectively. The main thread
was in @func{timer_elapsed}, called from @code{test_mlfqs_load_1} when stopped.
Knowing where threads are can be tremendously useful, for instance
when diagnosing deadlocks or unexplained hangs.
@deffn {GDB Macro} loadusersymbols
You can also use GDB to debug a user program running under PintOS.
To do that, use the @code{loadusersymbols} macro to load the program's
symbol table:
@example
loadusersymbols @var{program}
@end example
@noindent
where @var{program} is the name of the program's executable (in the host
file system, not in the PintOS file system). For example, you may issue:
@smallexample
(gdb) @strong{loadusersymbols tests/userprog/exec-multiple}
add symbol table from file "tests/userprog/exec-multiple" at
.text_addr = 0x80480a0
(gdb)
@end smallexample
After this, you should be
able to debug the user program the same way you would the kernel, by
placing breakpoints, inspecting data, etc. Your actions apply to every
user program running in PintOS, not just to the one you want to debug,
so be careful in interpreting the results: GDB does not know
which process is currently active (because that is an abstraction
the PintOS kernel creates). Also, a name that appears in
both the kernel and the user program will actually refer to the kernel
name. (The latter problem can be avoided by giving the user executable
name on the GDB command line, instead of @file{kernel.o}, and then using
@code{loadusersymbols} to load @file{kernel.o}.)
@code{loadusersymbols} is implemented via GDB's @code{add-symbol-file}
command.
@end deffn
@node GDB FAQ
@subsection FAQ
@table @asis
@item GDB can't connect to QEMU (Error: localhost:1234: Connection refused)
If the @command{target remote} command fails, then make sure that both
GDB and @command{pintos} are running on the same machine by
running @command{hostname} in each terminal. If the names printed
differ, then you need to open a new terminal for GDB on the
machine running @command{pintos}.
@item GDB doesn't recognize any of the macros.
If you start GDB with @command{pintos-gdb}, it should load the PintOS
macros automatically. If you start GDB some other way, then you must
issue the command @code{source @var{pintosdir}/src/misc/gdb-macros},
where @var{pintosdir} is the root of your PintOS directory, before you
can use them.
@item Can I debug PintOS with DDD?
Yes, you can. DDD invokes GDB as a subprocess, so you'll need to tell
it to invoke @command{pintos-gdb} instead:
@example
ddd --gdb --debugger pintos-gdb
@end example
@item Can I use GDB inside Emacs?
Yes, you can. Emacs has special support for running GDB as a
subprocess. Type @kbd{M-x gdb} and enter your @command{pintos-gdb}
command at the prompt. The Emacs manual has information on how to use
its debugging features in a section titled ``Debuggers.''
@end table
@node Triple Faults
@section Triple Faults
When a CPU exception handler, such as a page fault handler, cannot be
invoked because it is missing or defective, the CPU will try to invoke
the ``double fault'' handler. If the double fault handler is itself
missing or defective, that's called a ``triple fault.'' A triple fault
causes an immediate CPU reset.
Thus, if you get yourself into a situation where the machine reboots in
a loop, that's probably a ``triple fault.'' In a triple fault
situation, you might not be able to use @func{printf} for debugging,
because the reboots might be happening even before everything needed for
@func{printf} is initialized.
Currently, the only option is ``debugging by infinite loop.''
Pick a place in the PintOS code, insert the infinite loop
@code{for (;;);} there, and recompile and run. There are two likely
possibilities:
@itemize @bullet
@item
The machine hangs without rebooting. If this happens, you know that
the infinite loop is running. That means that whatever caused the
reboot must be @emph{after} the place you inserted the infinite loop.
Now move the infinite loop later in the code sequence.
@item
The machine reboots in a loop. If this happens, you know that the
machine didn't make it to the infinite loop. Thus, whatever caused the
reboot must be @emph{before} the place you inserted the infinite loop.
Now move the infinite loop earlier in the code sequence.
@end itemize
If you move around the infinite loop in a ``binary search'' fashion, you
can use this technique to pin down the exact spot that everything goes
wrong. It should only take a few minutes at most.
@node Debugging Tips
@section Tips
The page allocator in @file{threads/palloc.c} and the block allocator in
@file{threads/malloc.c} set all of the bytes in freed memory to
@t{0xcc} at the time of the free. Thus, if you see an attempt to
dereference a pointer like @t{0xcccccccc}, or some other reference to
@t{0xcc}, there's a good chance you're trying to reuse a page that's
already been freed. Also, byte @t{0xcc} is the CPU opcode for ``invoke
interrupt 3,'' so if you see an error like @code{Interrupt 0x03 (#BP
Breakpoint Exception)}, then PintOS tried to execute code in a freed page or
block.

108
doc/devel.texi Normal file

@@ -0,0 +1,108 @@
@node Development Tools
@appendix Development Tools
Here are some tools that you might find useful while developing code.
@menu
* Tags::
* cscope::
* Git::
@ifset recommendvnc
* VNC::
@end ifset
@ifset recommendcygwin
* Cygwin::
@end ifset
@end menu
@node Tags
@section Tags
Tags are an index to the functions and global variables declared in a
program. Many editors, including Emacs and @command{vi}, can use
them. The @file{Makefile} in @file{pintos-ic/src} produces Emacs-style
tags with the command @code{make TAGS} or @command{vi}-style tags with
@code{make tags}.
In Emacs, use @kbd{M-.} to follow a tag in the current window,
@kbd{C-x 4 .} in a new window, or @kbd{C-x 5 .} in a new frame. If
your cursor is on a symbol name for any of those commands, it becomes
the default target. If a tag name has multiple definitions, @kbd{M-0
M-.} jumps to the next one. To jump back to where you were before
you followed the last tag, use @kbd{M-*}.
@node cscope
@section cscope
The @command{cscope} program also provides an index to functions and
variables declared in a program. It has some features that tag
facilities lack. Most notably, it can find all the points in a
program at which a given function is called.
The @file{Makefile} in @file{pintos-ic/src} produces @command{cscope}
indexes when it is invoked as @code{make cscope}. Once the index has
been generated, run @command{cscope} from a shell command line; no
command-line arguments are normally necessary. Then use the arrow
keys to choose one of the search criteria listed near the bottom of
the terminal, type in an identifier, and hit @key{Enter}.
@command{cscope} will then display the matches in the upper part of
the terminal. You may use the arrow keys to choose a particular
match; if you then hit @key{Enter}, @command{cscope} will invoke the
default system editor@footnote{This is typically @command{vi}. To
exit @command{vi}, type @kbd{: q @key{Enter}}.} and position the
cursor on that match. To start a new search, type @key{Tab}. To exit
@command{cscope}, type @kbd{Ctrl-d}.
Emacs and some versions of @command{vi} have their own interfaces to
@command{cscope}. For information on how to use these interfaces,
visit @url{http://cscope.sourceforge.net, the @command{cscope} home
page}.
@node Git
@section Git
Git is a version-control system. That is, you can use it to keep
track of multiple versions of files. The idea is that you do some
work on your code and test it, then commit it into the version-control
system. If you decide that the work you've done since your last
commit is no good, you can easily revert to the last version.
Furthermore, you can retrieve any old version of your code
as of some given day and time. The version control logs tell you who
made changes and when.
Whilst Git may not be everyone's preferred version control system, it's
free, has a wealth of documentation, and is easy to install on most
Unix-like environments.
For more information, visit the @uref{https://www.git-scm.com/, , Git
home page}.
@include localgitinstructions.texi
@ifset recommendvnc
@node VNC
@section VNC
VNC stands for Virtual Network Computing. It is, in essence, a remote
display system which allows you to view a computing ``desktop''
environment not only on the machine where it is running, but from
anywhere on the Internet and from a wide variety of machine
architectures. It is already installed on the lab machines.
For more information, look at the @uref{http://www.realvnc.com/, , VNC
Home Page}.
@end ifset
@ifset recommendcygwin
@node Cygwin
@section Cygwin
@uref{http://cygwin.com/, ,Cygwin} provides a Linux-compatible environment
for Windows. It includes an ssh client and an X11 server, Cygwin/X. If your
primary work environment is Windows, you will find Cygwin/X extremely
useful for these tasks. Install Cygwin/X, then start the X server
and open a new xterm. The X11 server also allows you to run pintos while
displaying the qemu-emulated console on your Windows desktop.
@end ifset
@localdevelopmenttools{}

51
doc/devices.tmpl Normal file

@@ -0,0 +1,51 @@
+----------------------+
| OS 211 |
| TASK 0: ALARMCLOCK |
| DESIGN DOCUMENT |
+----------------------+
---- PRELIMINARIES ----
>> If you have any preliminary comments on your submission, or notes for the
>> markers, please give them here.
>> Please cite any offline or online sources you consulted while preparing your
>> submission, other than the Pintos documentation, course text, lecture notes
>> and course staff.
ALARM CLOCK
===========
---- DATA STRUCTURES ----
>> A1: (2 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> A2: (2 marks)
>> Briefly describe what happens in a call to timer_sleep(), including the
>> actions performed by the timer interrupt handler on each timer tick.
>> A3: (2 marks)
>> What steps are taken to minimize the amount of time spent in the timer
>> interrupt handler?
---- SYNCHRONIZATION ----
>> A4: (1 mark)
>> How are race conditions avoided when multiple threads call timer_sleep()
>> simultaneously?
>> A5: (1 mark)
>> How are race conditions avoided when a timer interrupt occurs during a call
>> to timer_sleep()?
---- RATIONALE ----
>> A6: (2 marks)
>> Why did you choose this design?
>> In what ways is it superior to another design you considered?

59
doc/doc.texi Normal file

@@ -0,0 +1,59 @@
@node Task Documentation
@appendix Task Documentation
This chapter presents a sample assignment and a filled-in design
document for one possible implementation. Its purpose is to give you an
idea of what we expect to see in your own design documents.
@menu
* Sample Assignment::
* Sample Design Document::
@end menu
@node Sample Assignment
@section Sample Assignment
Implement @func{thread_join}.
@deftypefun void thread_join (tid_t @var{tid})
Blocks the current thread until thread @var{tid} exits. If @var{A} is
the running thread and @var{B} is the argument, then we say that
``@var{A} joins @var{B}.''
Incidentally, the argument is a thread id, instead of a thread pointer,
because a thread pointer is not unique over time. That is, when a
thread dies, its memory may be, whether immediately or much later,
reused for another thread. If thread @var{A} over time had two children
@var{B} and @var{C} that were stored at the same address, then
@code{thread_join(@var{B})} and @code{thread_join(@var{C})} would be
ambiguous.
A thread may only join its immediate children. Calling
@func{thread_join} on a thread that is not the caller's child should
cause the caller to return immediately. Children are not ``inherited,''
that is, if @var{A} has child @var{B} and @var{B} has child @var{C},
then @var{A} always returns immediately should it try to join @var{C},
even if @var{B} is dead.
A thread need not ever be joined. Your solution should properly free
all of a thread's resources, including its @struct{thread},
whether it is ever joined or not, and regardless of whether the child
exits before or after its parent. That is, a thread should be freed
exactly once in all cases.
Joining a given thread is idempotent. That is, joining a thread
multiple times is equivalent to joining it once, because it has already
exited at the time of the later joins. Thus, joins on a given thread
after the first should return immediately.
You must handle all the ways a join can occur: nested joins (@var{A}
joins @var{B}, then @var{B} joins @var{C}), multiple joins (@var{A}
joins @var{B}, then @var{A} joins @var{C}), and so on.
@end deftypefun
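For instance, a parent that creates a child and joins it twice might
look like this sketch; @func{child_func} is hypothetical, while
@func{thread_create} and @code{PRI_DEFAULT} already exist in PintOS:
@example
tid_t child = thread_create ("child", PRI_DEFAULT, child_func, NULL);
thread_join (child);   /* Blocks until the child exits. */
thread_join (child);   /* The second join returns immediately. */
@end example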
@node Sample Design Document
@section Sample Design Document
@example
@include sample.tmpl.texi
@end example

157
doc/installation.texi Normal file

@@ -0,0 +1,157 @@
@node Installing PintOS
@appendix Installing PintOS
This chapter explains how to install a PintOS development environment on your own machine.
We assume that you have already cloned your pintos git repo onto your machine.
If you are using a PintOS development environment that has been set up by someone else,
you should not need to read this chapter or follow any of these instructions.
The PintOS development environment is targeted at Unix-like systems.
It has been most extensively tested on GNU/Linux, in particular the Debian and Ubuntu distributions, and Solaris.
It is not designed to install under any form of Windows.
@menu
* Mac Specific Initial Set-up::
* Prerequisites::
* Installation::
@end menu
@node Mac Specific Initial Set-up
@section Mac Specific Initial Set-up
This first section is intended to help Mac users prepare to set up PintOS so that they can coexist with LabTS and their Linux-using friends,
without any ongoing hassle.
If you are not attempting to install PintOS on a Mac, then please skip ahead to the next section.
To prepare your Mac for PintOS, you will need to perform the following initial set-up steps:
@itemize
@item Download and unpack a prepared selection of cross-compiled GCC binaries from:
@uref{https://www.doc.ic.ac.uk/~mjw03/OSLab/mac-i686-elf-gcc-binaries.tar}.@*
You can also install these binaries onto your Mac yourself if you prefer.
@item Add the unpacked GCC binaries to your path with a line like:@*
@code{export PATH=$PATH:/your/local/path/here/mac-i686-elf-gcc-binaries/bin}
@end itemize
You should now be ready to follow the remaining instructions in this appendix.
@node Prerequisites
@section Prerequisites
Before attempting to install a PintOS development environment, you should check that the following prerequisites,
on top of standard Unix utilities, are available on your system:
@itemize @bullet
@item
@strong{Required:} @uref{http://gcc.gnu.org/, GCC}.
Version 5.4 or later is preferred.
Version 4.0 or later should work.
If the host machine has an 80@var{x}86 processor (32-bit or 64-bit), then GCC should be available via the command @command{gcc};
otherwise, an 80@var{x}86 cross-compiler should be available via the command @command{i386-elf-gcc}.
If you need a GCC cross-compiler, but one is not already installed on your system, then you will need to search online for an up-to-date installation guide.
@item
@strong{Required:} @uref{http://www.gnu.org/software/binutils/, GNU binutils}.
PintOS uses the Unix utilities @command{addr2line}, @command{ar}, @command{ld}, @command{objcopy}, and @command{ranlib}.
If the host machine does not have an 80@var{x}86 processor, then versions targeting 80@var{x}86 should be available to install with an @samp{i386-elf-} prefix.
@item
@strong{Required:} @uref{http://www.perl.org, Perl}.
Version 5.20.0 or later is preferred.
Version 5.6.1 or later should work.
@item
@strong{Required:} @uref{http://www.gnu.org/software/make/, GNU make}.
Version 4.0 or later is preferred.
Version 3.80 or later should work.
@item
@strong{Required:} @uref{http://fabrice.bellard.free.fr/qemu/, QEMU}.
The QEMU emulator required to run PintOS is @command{qemu-system-i386} which is part of the @command{qemu-system} package on most modern Unix platforms.
We recommend using version 2.10 or later, but at least version 2.5.
@item
@strong{Recommended:} @uref{http://www.gnu.org/software/gdb/, GDB}.
GDB is helpful in debugging (@pxref{GDB}).
If the host machine is not an 80@var{x}86, a version of GDB targeting 80@var{x}86 should be available as @samp{i386-elf-gdb}.
@item
@strong{Recommended:} @uref{http://www.x.org/, X}.
Being able to use an X server makes the virtual machine feel more like a physical machine, but it is not strictly necessary.
@item
@strong{Optional:} @uref{http://www.gnu.org/software/texinfo/, Texinfo}.
Version 4.5 or later.
Texinfo is required to build the PDF version of the main PintOS documentation.
@item
@strong{Optional:} @uref{http://www.tug.org/, @TeX{}}.
@TeX{} is required to build the PDF versions of the support documentation.
@item
@strong{Optional:} @uref{http://www.vmware.com/, VMware Player}.
This is another emulation platform that can be used to run PintOS instead of QEMU.
You will need to search online for an up-to-date installation guide.
@end itemize
@node Installation
@section Installation
Once you have checked that the prerequisites are available,
follow these instructions to install a PintOS development environment:
@enumerate 1
@item
Compile the PintOS utilities in @file{src/utils}.
To do this, open a terminal in the @file{src/utils} directory of your PintOS project and run @command{make}.
@item
Install scripts from @file{src/utils}.
The easiest way to do this is to reconfigure your system's @env{PATH} to include the @file{src/utils} directory of your PintOS project.
You can instead copy the files
@file{backtrace},
@file{pintos},
@file{pintos-gdb},
@file{pintos-mkdisk},
@file{pintos-set-cmdline},
@file{Pintos.pm}
and
@file{squish-pty}
into your system's default @env{PATH}.
If your Perl is older than version 5.8.0, then you will also need to install @file{setitimer-helper}; otherwise, it is unneeded.
@item
Install the GDB macros from @file{src/misc/gdb-macros}.
The easiest way to do this is to use a text editor to update your previously installed copy of @file{pintos-gdb}
so that the definition of @env{GDBMACROS} points to your local @file{gdb-macros} file.
You can instead copy the @file{pintos-gdb} file into a system directory of your choice,
but you will still need to update the definition of @env{GDBMACROS} in your installed copy of @file{pintos-gdb}.
Test the GDB macro installation by running @command{pintos-gdb} without any arguments.
If it does not complain about missing @file{gdb-macros}, it is installed correctly.
@item
PintOS should now be ready for use.
To test your installation, open a terminal in the @file{src/devices} directory of your PintOS project and run @command{make check}.
This will run the tests for Task 0 and should take no more than a few minutes.
@item
@strong{Optional:} Install alternative emulation software.
To support VMware Player, install @file{squish-unix} (from the @file{src/utils} directory); otherwise it is unneeded.
@item
@strong{Optional:} Build the PintOS documentation.
Open a terminal in the @file{doc} directory of your PintOS project and run @command{make dist}.
This will create a @file{WWW} subdirectory within @file{doc} that contains both HTML and PDF versions of the documentation,
plus the design document templates and various hardware specifications referenced by the documentation.
@end enumerate

564
doc/intro.texi Normal file

@@ -0,0 +1,564 @@
@node Introduction
@chapter Introduction
Welcome to PintOS. PintOS is a simple operating system framework for
the 80@var{x}86 architecture. It supports kernel threads, loading and
running user programs, and a file system, but it implements all of
these in a very simple way. During the PintOS tasks, you and your
group will strengthen its support in two of these areas.
You will also add a virtual memory implementation.
PintOS could, theoretically, run on a regular IBM-compatible PC.
Unfortunately, it is impractical to supply every student
with a dedicated PC for use with PintOS. Therefore, we will be running PintOS
in a system simulator, that is, a program that simulates an 80@var{x}86
CPU and its peripheral devices accurately enough that unmodified operating
systems and software can run under it. In particular, we will be using the
@uref{http://fabrice.bellard.free.fr/qemu/, ,
QEMU} simulator. PintOS has also been tested with the
@uref{http://www.vmware.com/, , VMware Player}.
These tasks are hard. The PintOS exercises have a reputation for taking a lot of
time, and deservedly so. We will do what we can to reduce the workload, such
as providing a lot of support material, but there is plenty of
hard work that needs to be done. We welcome your
feedback. If you have suggestions on how we can reduce the unnecessary
overhead of assignments, cutting them down to the important underlying
issues, please let us know.
This version of the exercise has been adapted for use at Imperial College
London, and is significantly different to the original exercise designed at
Stanford University. It's recommended that you only use the Imperial version
of the documentation to avoid unnecessary confusion.
This chapter explains how to get started working with PintOS. You
should read the entire chapter before you start work on any of the
tasks.
@menu
* Getting Started::
* Testing::
* Submission::
* Grading::
* Legal and Ethical Issues::
* Acknowledgements::
* Trivia::
@end menu
@comment ----------------------------------------------------------------------
@node Getting Started
@section Getting Started
To get started, you'll have to log into a machine that PintOS can be
built on.
@localmachines{}
We will test your code on these machines, and the instructions given
here assume this environment. We do not have the manpower to provide support for installing
and working on PintOS on your own machine, but we provide instructions
for doing so nonetheless (@pxref{Installing PintOS}).
If you are using bash (the default shell for CSG-run machines), several PintOS
utilities will already be in your PATH. If you are not using bash on a CSG-run machine,
you will need to add these utilities manually.
@localpathsetup{}
@menu
* Source Tree Overview::
* Building PintOS::
* Running PintOS::
@end menu
@comment ----------------------------------------------------------------------
@node Source Tree Overview
@subsection Source Tree Overview
For Task 0 each student has been provided with a Git repository on the department's @code{GitLab}
server that contains the files needed for this exercise.
To obtain this initial skeleton repository you will need to clone it into your local workspace.
You can do this with the following command:
@example
git clone @value{localindivgitpath}
@end example
@noindent replacing @code{<login>} with your normal college login.
For the remaining tasks, each group will be provided with a Git repository on the department's @code{GitLab}
server that contains the files needed for the entire PintOS project.
To obtain this skeleton repository you will need to clone it into your local workspace.
You can do this with the following command:
@example
git clone @value{localgitpath}
@end example
@noindent replacing @code{<gnum>} with your group number, which can be found on the @code{GitLab} website.
You should work on the files in your local workspace, making regular commits back to the corresponding Git repository.
Your final submissions will be taken from these @code{GitLab} repositories, so make sure that you push your work to them correctly.
Let's take a look at what's inside the full PintOS repository.
Here's the directory structure that you should see in @file{pintos/src}:
@table @file
@item devices/
Source code for I/O device interfacing: keyboard, timer, disk, etc.
You will modify the timer implementation in task 0. Otherwise
you should have no need to change this code.
@item threads/
Source code for the base kernel, which you will modify in
task 1.
@item userprog/
Source code for the user program loader, which you will modify
in task 2.
@item vm/
An almost empty directory. You will implement virtual memory here in
task 3.
@item filesys/
Source code for a basic file system. You will use this file system
in tasks 2 and 3.
@item lib/
An implementation of a subset of the standard C library. The code in
this directory is compiled into both the PintOS kernel and, starting
from task 2, user programs that run under it. In both kernel code
and user programs, headers in this directory can be included using the
@code{#include <@dots{}>} notation. You should have little need to
modify this code.
@item lib/kernel/
Parts of the C library that are included only in the PintOS kernel.
This also includes implementations of some data types that you are
free to use in your kernel code: bitmaps, doubly linked lists, and
hash tables. In the kernel, headers in this
directory can be included using the @code{#include <@dots{}>}
notation.
@item lib/user/
Parts of the C library that are included only in PintOS user programs.
In user programs, headers in this directory can be included using the
@code{#include <@dots{}>} notation.
@item tests/
Tests for each task. You can modify this code if it helps you test
your submission, but we will replace it with the originals before we run
the tests.
@item examples/
Example user programs for use in tasks 2 and 3.
@item misc/
@itemx utils/
These files may come in handy if you decide to try working with PintOS
on your own machine. Otherwise, you can ignore them.
@end table
@comment ----------------------------------------------------------------------
@node Building PintOS
@subsection Building PintOS
As the next step, build the source code supplied for
the first task. First, @command{cd} into the @file{devices}
directory. Then, issue the @samp{make} command. This will create a
@file{build} directory under @file{devices}, populate it with a
@file{Makefile} and a few subdirectories, and then build the kernel
inside. The entire build should take less than 30 seconds.
@localcrossbuild{}
After the build has completed, you will find the following interesting files in the
@file{build} directory:
@table @file
@item Makefile
A copy of @file{pintos/src/Makefile.build}. It describes how to build
the kernel. @xref{Adding Source Files}, for more information.
@item kernel.o
Object file for the entire kernel. This is the result of linking
object files compiled from each individual kernel source file into a
single object file. It contains debug information, so you can run
GDB (@pxref{GDB}) or @command{backtrace} (@pxref{Backtraces}) on it.
@item kernel.bin
Memory image of the kernel, that is, the exact bytes loaded into
memory to run the PintOS kernel. This is just @file{kernel.o} with
debug information stripped out, which saves a lot of space, which in
turn keeps the kernel from bumping up against a @w{512 kB} size limit
imposed by the kernel loader's design.
@item loader.bin
Memory image for the kernel loader, a small chunk of code written in
assembly language that reads the kernel from disk into memory and
starts it up. It is exactly 512 bytes long, a size fixed by the
PC BIOS.
@end table
Subdirectories of @file{build} contain object files (@file{.o}) and
dependency files (@file{.d}), both produced by the compiler. The
dependency files tell @command{make} which source files need to be
recompiled when other source or header files are changed.
@comment ----------------------------------------------------------------------
@node Running PintOS
@subsection Running PintOS
We've supplied a program for conveniently running PintOS in a simulator,
called @command{pintos}. In the simplest case, you can invoke
@command{pintos} as @code{pintos @var{argument}@dots{}}. Each
@var{argument} is passed to the PintOS kernel for it to act on.
Try it out. First @command{cd} into the newly created @file{build}
directory. Then issue the command @code{pintos run alarm-multiple},
which passes the arguments @code{run alarm-multiple} to the PintOS
kernel. In these arguments, @command{run} instructs the kernel to run a
test and @code{alarm-multiple} is the test to run.
PintOS boots and runs the @code{alarm-multiple} test
program, which outputs a few screenfuls of text.
You can log serial output to a file by redirecting at the
command line, e.g.@: @code{pintos run alarm-multiple > logfile}.
The @command{pintos} program offers several options for configuring the
simulator or the virtual hardware. If you specify any options, they
must precede the commands passed to the PintOS kernel and be separated
from them by @option{--}, so that the whole command looks like
@code{pintos @var{option}@dots{} -- @var{argument}@dots{}}. Invoke
@code{pintos} without any arguments to see a list of available options.
You can run the simulator with a debugger (@pxref{GDB}). You can also set the
amount of memory to give the VM.
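For example, a debugging session on the alarm test can be started with something like the following (@pxref{GDB} for the full workflow):
@example
pintos --gdb -- run alarm-multiple
@end example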
The PintOS kernel has commands and options other than @command{run}.
These are not very interesting for now, but you can see a list of them
using @option{-h}, e.g.@: @code{pintos -h}.
@comment ----------------------------------------------------------------------
@page
@node Testing
@section Testing
To help you ensure that your code will compile and run as expected in our testing environment, we have provided you with a Lab Testing Service: LabTS. LabTS will clone your Git repository and run several automated test processes over your work. This will happen automatically when you submit your work, but can also be requested during the course of each task.
You can access the LabTS webpages at @file{https://teaching.doc.ic.ac.uk/labts}.
Note that you will be required to log in with your normal college username and password.
If you click through to the @code{pintos} exercise you will see a list of the different versions of your work that you have pushed.
Next to each commit you will see a button that will allow you to request that this version of your work is run through the automated test process for the currently viewed milestone. If you click on this button your work will be tested (this may take a few minutes) and the results will appear in the relevant column.
@cartouche
@noindent@strong{Important:} submitted code that fails to compile and run on LabTS will be awarded @strong{0 marks} for the automated tests grade!
You should be periodically (but not continuously) testing your code on LabTS.
If you are experiencing problems with the compilation or execution of your code then please seek help/advice as soon as possible.
@end cartouche
Your automated test result grade will be based on our test suite.
Each task has several tests, each of which has a name beginning with @file{tests}.
To completely test your submission, invoke @code{make check} from the task @file{build} directory.
This will build and run each test and print a ``pass'' or ``fail'' message for each one. When a test fails,
@command{make check} also prints some details of the reason for failure.
After running all the tests, @command{make check} also prints a summary of the test results.
You can run @command{make grade} to see the automated test results output in the same format as will be presented to the markers.
You can also run individual tests one at a time.
A given test @file{@var{t}} writes its output to @file{@var{t}.output},
then a script scores the output as ``pass'' or ``fail'' and writes the verdict to @file{@var{t}.result}.
To run and grade a single test,
@command{make} the @file{.result} file explicitly from the @file{build} directory, e.g.@: @code{make tests/devices/alarm-multiple.result}.
If @command{make} says that the test result is up-to-date, but you want to re-run it anyway,
either run @code{make clean} or delete the @file{.output} file by hand.
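For example, to force the alarm test to be re-run and re-graded from the @file{build} directory:
@example
rm tests/devices/alarm-multiple.output
make tests/devices/alarm-multiple.result
@end example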
By default, each test provides feedback only at completion, not during its run.
If you prefer, you can observe the progress of each test by specifying @option{VERBOSE=1} on the @command{make} command line,
as in @code{make check VERBOSE=1}.
You can also provide arbitrary options to the @command{pintos} run by the tests with @option{PINTOSOPTS='@dots{}'}.
All of the tests and related files can be found in @file{pintos/src/tests}.
Before we test your submission, we will replace the contents of this directory by a pristine, unmodified copy, to ensure that the correct tests are used.
Thus, you can freely modify any of the tests if that helps in your debugging, but we will run our automated tests on the originals.
All software has bugs, so it is possible that some of our tests may be flawed.
If you think a test failure is a bug in the test, not a bug in your code, please point it out.
We will look at it and fix it if necessary.
Please don't try to take advantage of our generosity in giving out the full test suite.
Your code has to work properly in the general case and not just for the test cases we supply.
We will be asking questions about the general case during the code review sessions, so you won't be able to get away with it.
For example, it would be unacceptable to explicitly base the kernel's behaviour on the name of the running test case.
Such attempts to side-step the test cases will be spotted during the code review process and will receive no credit.
If you think your solution may be in a gray area here, please ask us about it.
@menu
* Debugging versus Testing::
@end menu
@comment ----------------------------------------------------------------------
@node Debugging versus Testing
@subsection Debugging versus Testing
The QEMU simulator you will be using to run PintOS only supports real-time
simulations. This has ramifications with regards to both testing and debugging.
Whilst reproducibility is in general extremely useful for debugging, running PintOS in QEMU is not necessarily deterministic.
You should keep this in mind when testing for bugs in your code.
In each run, timer interrupts will come at irregularly spaced intervals, meaning that bugs may appear and disappear with repeated tests.
Therefore, it's very important that you run your tests at least a few times.
No number of runs can guarantee that your synchronisation is perfect,
but the more you do, the more confident you can be that your code doesn't have major flaws.
@cartouche
@noindent@strong{Important:} the PintOS kernel is written for a single-cored CPU,
which helps to limit the possible interleavings of concurrently executing threads.
However, as you have no control over the occurrence of timer interrupts,
you will still need to consider the implications of your code being interrupted at almost any arbitrary point.
Much of our assessment will be conducted as if under a ``demonic'' scheduler that chooses the ``worst-case'' possibilities.
@end cartouche
@comment ----------------------------------------------------------------------
@node Submission
@section Submission
As you work, you should @code{add}, @code{commit} and @code{push} your changes to your git repository.
Your @code{GitLab} repository should contain the source code, header files and make files for your OS.
Prior to submission, you should check the state of your @code{GitLab} repository using the @code{LabTS} webpages at
@file{https://teaching.doc.ic.ac.uk/labts}.
If you click through to the @code{pintos} exercise you will see a list of the different versions of your work that you have pushed to the master branch of your repository.
Next to each commit you will see a link to that commit on @code{GitLab} as well as a button to submit that version of your code for assessment.
You should submit the version of your code that you consider to be "final" for each task.
You can change this later, as usual, by submitting a different version of your code.
The submission button on LabTS will be replaced with a green confirmation message if the submission has been successful.
For each @code{pintos} task you will also need to submit a design document (@code{designT#.pdf}) directly to Scientia.
Your submission must be signed off as a group on Scientia in the usual way.
@comment ----------------------------------------------------------------------
@node Grading
@section Grading
We will grade each @code{pintos} task across three categories:
@itemize
@item @strong{automated tests}: your score from the automated test results.
@item @strong{code review}: an assessment of your design quality and efficiency.
@item @strong{design document}: your answers to the task's design document questions.
@end itemize
The marks for each @code{pintos} task will contribute to both your @value{coursenumber} Operating Systems coursework mark and your @value{labnumber} Computing Practical 2 exercises mark.
For @code{pintos} task 0, part A will make up all of the task's Operating Systems coursework grade, while part B will make up all of the task's Computing Practical 2 grade.
For all other @code{pintos} tasks, the automated tests will count for 40% of each task's Computing Practical 2 grade with the code review making up the other 60%.
The design document will count for 100% of the Operating Systems coursework grade for each of these tasks.
Note that some of the automated tests may be zero-weighted. These tests help us to identify likely design issues in your code and will probably affect your code review mark.
The weighting of the @code{pintos} tasks is 10%, 20%, 30% and 40% for each of task 0, task 1, task 2 and task 3 respectively.
JMC students are not assessed on task 3, but have the same relative weighting for task 0, task 1 and task 2.
@menu
* Design::
@end menu
@comment ----------------------------------------------------------------------
@node Design
@subsection Design
We will judge your design based on the design document and the source code that you submit.
We will read your entire design document and much of your source code.
@cartouche
@noindent@strong{Important:} Don't forget that design quality and efficiency will account for 60% of each task's @value{labnumber} Computing Practical 2 grade
and that the design documents will make up your entire @value{coursenumber} Operating Systems coursework mark.
It is, therefore, better to spend a day or two writing a good design document and thinking about the efficiency and edge-cases of your code,
than it is to spend that time trying to get the last 5% of the points for the automated tests
and then having to rush through writing the design document in the last 15 minutes.
@end cartouche
@menu
* Design Document::
* Source Code::
@end menu
@comment ----------------------------------------------------------------------
@node Design Document
@subsubsection Design Document
We will provide you with a design document template for each task.
For each significant part of a task, the template asks questions in four areas:
@table @strong
@item Data Structures
The instructions for this section are always the same:
@quotation
Copy here the declaration of each new or changed @code{struct} or @code{struct} member, global or static variable, @code{typedef}, or enumeration.
Identify the purpose of each in roughly 25 words.
@end quotation
The first part is mechanical.
Just copy new or modified declarations into the design document to highlight for us the actual changes to data structures.
Each declaration should include the comment that should accompany it in the source code (see below).
We also ask for a very brief description of the purpose of each new or changed data structure.
The suggestion of 25 words is a guideline intended to save your time and avoid duplication with later areas of the design document.
@item Algorithms
This is where you tell us how your code works, through questions that probe your understanding of your code.
We might not be able to easily figure it out from the code alone, because many creative solutions exist for most OS problems.
Help us out a little.
Your answers should be at a level below the high level description of requirements given in the assignment.
We have read the assignment too, so it is unnecessary to repeat or rephrase what is stated there.
On the other hand, your answers should be at a level above the low level of the code itself.
Don't give a line-by-line run-down of what your code does.
Instead, use your answers to explain how your code works to implement the requirements.
@item Synchronization
An operating system kernel is a complex, multithreaded program, in which synchronizing multiple threads can be difficult.
This section asks about how you chose to synchronize this particular type of activity.
@item Rationale
Whereas the other sections primarily ask ``what'' and ``how,'' the rationale section concentrates on ``why''.
This is where we ask you to justify some of your design decisions, by explaining why the choices you made are better than alternatives you considered.
You may be able to state these justifications in terms of time and space complexity, which can be made as rough or informal arguments (formal language or proofs are unnecessary).
@end table
Any incomplete, evasive, or non-responsive answers to design document questions or those that stray from the provided template without good reason may be penalised.
Additionally, any design document that does not match the reality of your implementation may be penalised unless any discrepancies are clearly stated and explained.
Incorrect capitalization, punctuation, spelling, or grammar may also cost points if this impedes our reading of your design document.
@xref{Task Documentation}, for an example design document for a fictitious task.
@cartouche
@noindent@strong{Important:} You should carefully read the design document for a task before you begin writing any code.
The questions we ask should help you identify some of the tricky corner cases that your implementation will be expected to handle.
@end cartouche
@comment ----------------------------------------------------------------------
@node Source Code
@subsubsection Source Code
Your design will also be judged by reviewing your source code with you during interactive code review sessions.
We will typically look at the differences between the original PintOS source tree and your submission,
based on the output of a command like @code{diff -urpb pintos.orig pintos.submitted} or reviewing the Git commits directly on @code{GitLab}.
We will try to match up your description of the design with the code submitted.
Important discrepancies between the description and the actual code will be penalised, as will be any bugs we find by spot checks during the code review sessions.
The most important aspects of source code design are those that specifically relate to the operating system issues at stake in the task.
It is important that you consider the efficiency of your operating system design choices, but other issues are much more important.
For example, multiple PintOS design problems call for a ``priority queue'', that is,
a dynamic collection from which the minimum (or maximum) item can quickly be extracted.
Fast priority queues can be implemented many ways, but we do not expect you to build a fancy data structure even if it might improve performance.
Instead, you are welcome to use a linked list (and PintOS even provides one with convenient functions for sorting and finding minimums and maximums).
PintOS is written in a consistent style.
Your additions and modifications do not have to be in the same style as the existing PintOS source files,
but you should ensure that your code style is self-consistent.
There should not be a patchwork of different styles that makes it obvious that three or four different people wrote the code.
Use horizontal and vertical white space to make code readable.
Add a brief comment on every structure, structure member, global or static variable, typedef, enumeration, and function definition.
Use phase-level comments within functions to help explain longer, or more complicated, behaviour.
Update existing comments as you modify code.
Don't comment out or use the preprocessor to ignore blocks of code; instead, remove them entirely (remember that you have Git if you need to get them back).
Use assertions to document key invariants.
Decompose code into functions for clarity.
Code that is difficult to understand because it violates these or other ``common sense'' software engineering practices will be penalised during the code review sessions.
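For example, the assertion advice above can be put into practice like this (a sketch only, assuming code that must run with interrupts disabled):
@example
#include <debug.h>
#include "threads/interrupt.h"
@dots{}
ASSERT (intr_get_level () == INTR_OFF);
@end example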
In the end, remember your audience.
Code is written primarily to be read by humans.
It has to be acceptable to the compiler too, but the compiler doesn't care about how it looks or how well it is written.
@xref{Coding Standards} for additional guidance.
@comment ----------------------------------------------------------------------
@page
@node Legal and Ethical Issues
@section Legal and Ethical Issues
PintOS is distributed under a liberal license that allows free use, modification, and distribution of this material.
Students and others who work on PintOS own the code that they write and may use it for any purpose.
PintOS comes with NO WARRANTY, not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
@xref{License}, for details of the license and lack of warranty.
@localhonorcodepolicy{}
@comment ----------------------------------------------------------------------
@node Acknowledgements
@section Acknowledgements
The PintOS core kernel and this documentation were originally written by Ben
Pfaff @email{blp@@cs.stanford.edu}.
Additional features were contributed by Anthony Romano
@email{chz@@vt.edu}.
The GDB macros supplied with PintOS were written by Godmar Back
@email{gback@@cs.vt.edu}, and their documentation is adapted from his
work.
The original structure and form of PintOS was inspired by the Nachos
instructional operating system from the University of California,
Berkeley (@bibref{Christopher}).
The PintOS tasks and documentation originated with those designed for
Nachos by current and former CS 140 teaching assistants at Stanford
University, including at least Yu Ping, Greg Hutchins, Kelly Shaw, Paul
Twohey, Sameer Qureshi, and John Rector.
Example code for monitors (@pxref{Monitors}) is
from classroom slides originally by Dawson Engler and updated by Mendel
Rosenblum.
@localcredits{}
@comment ----------------------------------------------------------------------
@node Trivia
@section Trivia
PintOS originated as a replacement for Nachos with a similar design.
Since then PintOS has greatly diverged from the Nachos design.
PintOS differs from Nachos in two important ways:
@itemize
@item First, PintOS runs on real or simulated 80@var{x}86 hardware, but Nachos runs as a process on a host operating system.
@item Second, PintOS is written in C like most real-world operating systems, but Nachos is written in C++.
@end itemize
@noindent@strong{So, why the name ``PintOS''?}
@itemize
@item First, like nachos, pinto beans are a common Mexican food.
@item Second, PintOS is small and a ``pint'' is a small amount.
@item Third, like drivers of the eponymous car, students are likely to have trouble with blow-ups.
@end itemize

[File: doc/license.texi]
@node License
@appendix License
PintOS, including its documentation, is subject to the following
license:
@quotation
Copyright @copyright{} 2004, 2005, 2006 Board of Trustees, Leland
Stanford Jr.@: University. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
``Software''), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@end quotation
A few individual files in PintOS were originally derived from other
projects, but they have been extensively modified for use in PintOS.
The original code falls under the original license, and modifications
for PintOS are additionally covered by the PintOS license above.
In particular, code derived from Nachos is subject to the following
license:
@quotation
Copyright @copyright{} 1992-1996 The Regents of the University of California.
All rights reserved.
Permission to use, copy, modify, and distribute this software
and its documentation for any purpose, without fee, and
without written agreement is hereby granted, provided that the
above copyright notice and the following two paragraphs appear
in all copies of this software.
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO
ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE
AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA
HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN ``AS IS''
BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATION TO
PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR
MODIFICATIONS.
@end quotation

[File: (name not shown)]
@c
@c Instructions on how to set up a group environment, permissions,
@c Git repository, dealing with issues etc.
@c
@c While some of the discussion may apply to more than one environment,
@c no attempt was made to untangle and split the discussion.
@c
@menu
* Setting Up Git::
* Using Git::
@end menu
@node Setting Up Git
@subsection Setting Up Git
To make your Git logs easier to read, you should set the user.name and
user.email variables in your .gitconfig file:
@verbatim
[user]
name = Firstname Surname
email = example@doc.ic.ac.uk
@end verbatim
Note that we will be checking your Git logs as part of your submission.
You should ensure that you use meaningful commit messages and make it clear
who was responsible for each commit (especially if you are using pair-programming).
To work on the source code, you must create a clone of your group's provided @file{GitLab} repository.
This can be done by running the following command:
@example
git clone @value{localgitpath}
@end example
replacing @code{<gnum>} with your group number, which can be found on the @code{GitLab} website.
@node Using Git
@subsection Using Git
Once you've cloned the repository, you can start working in your clone
straight away. At any point you can see what files you've modified with
@samp{git status}, and check a file in greater detail with
@samp{git diff @var{filename}}. You can view more detailed information using
tools such as @samp{tig}.
Git uses an intermediary area between the working filesystem and the actual
repository, known as the staging area (or index). This allows you to perform
tasks such as committing only a subset of your changes, without modifying your
copy of the filesystem. Whilst the uses of the staging area are outside the
scope of this guide, it is important that you are aware of its existence.
When you want to place your modifications into the repository, you must
first update the staging area with your changes (@samp{git add @var{filename}}),
and then use this to update the repository, using @samp{git commit}. Git
will open a text editor when committing, allowing you to provide a description
of your changes. This can be useful later for reviewing the repository,
so be sensible with your commit messages.
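A typical cycle looks something like this (the file chosen is just an illustration; Git will then prompt you for the commit message):
@example
git add devices/timer.c
git commit
@end example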
When you want to share your changes with the rest of your group you will need to
run @samp{git push} to send your commits back to the shared @code{GitLab} repository.
Note that the very first time you push you will need to run the command:
@example
git push origin master
@end example
to tell Git which branch of your local repository to push to which remote repository
(in this case from branch @code{master} to the @code{origin} repository).
Sometimes your group members may make conflicting changes to your repository,
which Git is unable to resolve automatically.
These problems can be solved using @samp{git mergetool},
but its use is outside the scope of this discussion.
You can view the history of a file @var{foo} in your working directory,
including the log messages, with @samp{git log @var{foo}}.
You can give a particular set of file versions a name called a
@dfn{tag}. Simply execute @samp{git tag @var{name}}. It's best
to have no local changes in the working copy when you do this, because
the tag will not include uncommitted changes. To recover the tagged
commit later, simply execute @samp{git checkout @var{tag}}.
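For example (the tag name here is just an illustration):
@example
git tag task0-final
git checkout task0-final
@end example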
If you add a new file to the source tree, you'll need to add it to the
repository with @samp{git add @var{file}}. This command does not have
lasting effect until the file is committed later with @samp{git
commit}.
To remove a file from the source tree, first remove it from the file
system with @samp{git rm @var{file}}. Again, only @samp{git commit}
will make the change permanent.
To discard your local changes for a given file, without committing
them, use @samp{git checkout @var{file} -f}.
For more information, visit the @uref{https://www.git-scm.com/, , Git
home page}.

[File: doc/localsettings.texi]
@c Local settings
@set coursenumber COMP50004
@set labnumber COMP50007/500012
@set localpintosgitpath /vol/lab/secondyear/osexercise/pintos.git
@set localpintosbindir /vol/lab/secondyear/bin/
@set localgitpath https://gitlab.doc.ic.ac.uk/lab2425_autumn/pintos_<gnum>.git
@set localindivgitpath https://gitlab.doc.ic.ac.uk/lab2425_autumn/pintos_task0_<login>.git
@set recommendvnc
@clear recommendcygwin
@macro localmachines{}
The machines officially supported for PintOS development are
the Linux machines in the labs managed by CSG, as described on
the @uref{http://www.doc.ic.ac.uk/csg/facilities/lab/workstations, ,
CSG webpage}.
@end macro
@macro localpathsetup{}
The PintOS utilities can be located at @value{localpintosbindir} on CSG-run
lab machines.
@end macro
@macro localcrossbuild{}
Watch the commands executed during the build.
On the Linux machines, the ordinary system tools are used.
@end macro
@macro localhonorcodepolicy{}
Please respect the college's plagiarism policy by refraining from reading any coursework solutions available online or elsewhere.
You must also refrain from posting any of your code or solutions publicly online (such as on GitHub) or sharing them with your classmates.
Reading the source code for other operating system kernels, such as Linux or FreeBSD, is allowed,
but do not copy code from them literally.
You must cite any code that inspired your own in your design documentation.
@end macro
@macro localcredits{}
Additional modifications have been made to the documentation, code and task structure when adapting and developing the material
for use at Imperial College London by Mark Rutland, Feroz Abdul Salam, Mark Wheelhouse, Fabio Luporini and Fidelis Perkonigg.
A number of DoC students have also made valuable contributions to the ongoing development of the @code{pintos}
project@footnote{If you have suggestions for improving the @code{pintos} tasks, spot and fix a bug or contribute new material to the project, then you too could be added to the above list. If interested, please discuss with Dr Mark Wheelhouse.},
and we thank them here:
@itemize
@item Dragos Dumitrache (2015) - created the first version of the ``pintos for Mac'' guide.
@item Nandor Licker (2015) - spotted an upcoming change to QEMU in 2015 that would have interfered with the @code{pintos} shutdown code.
@item Levente Kurusa (2017) - updated the ``pintos for Mac'' guide for more recent versions of MacOS and created the original Mac installation patch files (now merged into the main repo).
@item Emma Gospodinova (2019) - spotted the need to add the @code{-fno-pic} and @code{-fno-pie} flags to @code{gcc} calls in the @code{pintos} makefiles, which would otherwise lead to key parts of the user-programs code being optimised out.
@item Moritz Langenstein (2019) - identified the need for additional explicit type conversions in numerous test cases to support stronger @code{gcc} warning flags.
@item Alex Tasos (2021) - provided a linker script @code{loaderfix.ld} to fix a bug in linking the @code{pintos} loader (@code{loader.bin}) on Arch Linux systems.
@item Bartlomiej Cieslar (2021) - spotted a potential double-free in the @code{load_segment} function in @code{src/userprog/process.c}.
@item Charlie Lidbury (2021) - spotted a potential issue with setting the writable flag for overlapping pages in the @code{load_segment} function in @code{src/userprog/process.c}.
@item Chun Wong (2023) - spotted a potential race-condition on @code{ready_list} in the @code{threads_ready} function in @code{src/threads/thread.c}.
@item Luke Moran (2023) - suggested adding div-by-zero checks to Task 2 (including @code{multi-oom}).
@item Reuben Cartwright (2023) - spotted a counter increment bug in the @code{pt-overflowstk} test.
@end itemize
@end macro
@macro localgitpolicy{}
Instead, we recommend integrating your team's changes early and often
using Git (@pxref{Git}).
This is less likely to produce surprises, because everyone can see everyone else's code as it is written, instead of just when it is finished.
Version control also makes it possible to review changes and, when a change introduces a bug, drop back to working versions of code.
@end macro
@macro localdevelopmenttools{}
@c Descriptions of additional, local development tools can be inserted here
@end macro

[File: doc/pintos-ic.texi]
\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename pintos-ic.info
@settitle PintOS Tasks
@c %**end of header
@c @bibref{} macro
@iftex
@macro bibref{cite}
[\cite\]
@end macro
@afourpaper
@end iftex
@ifinfo
@ifnotplaintext
@macro bibref{cite}
@ref{\cite\}
@end macro
@end ifnotplaintext
@ifplaintext
@macro bibref{cite}
[\cite\]
@end macro
@end ifplaintext
@end ifinfo
@ifhtml
@macro bibref{cite}
[@ref{\cite\}]
@end macro
@end ifhtml
@macro func{name}
@code{\name\()}
@end macro
@macro struct{name}
@code{struct \name\}
@end macro
@finalout
@titlepage
@title PintOS (Imperial College Edition)
Version 2.5.1
@author Originally by Ben Pfaff
@end titlepage
@shortcontents
@contents
@ifnottex
@node Top, Introduction, (dir), (dir)
@top PintOS Tasks
@end ifnottex
@menu
* Introduction::
* Task 0--Codebase::
* Task 1--Threads::
* Task 2--User Programs::
* Task 3--Virtual Memory::
* Reference Guide::
* 4.4BSD Scheduler::
* Coding Standards::
* Task Documentation::
* Debugging Tools::
* Development Tools::
* Installing PintOS::
* Bibliography::
* License::
@end menu
@c institution-local settings
@include localsettings.texi
@include intro.texi
@include codebase.texi
@include threads.texi
@include userprog.texi
@include vm.texi
@include reference.texi
@include 44bsd.texi
@include standards.texi
@include doc.texi
@include debug.texi
@include devel.texi
@include installation.texi
@include bibliography.texi
@include license.texi
@bye

[File: doc/pintos-t2h.init]
sub T2H_InitGlobals
{
# Set the default body text, inserted between <BODY ... >
$T2H_BODYTEXT = '';
# text inserted after <BODY ...>
$T2H_AFTER_BODY_OPEN = '';
#text inserted before </BODY>
$T2H_PRE_BODY_CLOSE = '';
# this is used in footer
$T2H_ADDRESS = "<I>$T2H_USER</I> " if $T2H_USER;
$T2H_ADDRESS .= "on <I>$T2H_TODAY</I>";
# this is added inside <HEAD></HEAD> after <TITLE> and some META NAME stuff
# can be used for <style> <script>, <meta> tags
$T2H_EXTRA_HEAD = "<LINK REL=\"stylesheet\" HREF=\"pintos.css\">";
}
1;

[File: doc/pintos.css]
body {
background: white;
color: black;
padding: 0em 1em 0em 3em;
margin: 0;
margin-left: auto;
margin-right: auto;
max-width: 8in;
text-align: justify
}
body>p {
margin: 0pt 0pt 0pt 0em;
text-align: justify
}
body>p + p {
margin: .75em 0pt 0pt 0pt
}
H1 {
font-size: 150%;
margin-left: -1.33em
}
H2 {
font-size: 125%;
font-weight: bold;
margin-left: -.8em
}
H3 {
font-size: 100%;
font-weight: bold;
margin-left: -.5em }
H4 {
font-size: 100%;
margin-left: 0em
}
H1, H2, H3, H4, H5, H6 {
font-family: sans-serif;
color: blue
}
H1, H2 {
text-decoration: underline
}
html {
margin: 0;
font-weight: lighter
}
tt, code {
font-family: sans-serif
}
b, strong {
font-weight: bold
}
a:link {
color: blue;
text-decoration: none;
}
a:visited {
color: gray;
text-decoration: none;
}
a:active {
color: black;
text-decoration: none;
}
a:hover {
text-decoration: underline
}
address {
font-size: 90%;
font-style: normal
}
HR {
display: none
}

[File: doc/reference.texi (contents omitted)]

[File: doc/sample.tmpl]
+-----------------+
| CS 140 |
| SAMPLE TASK |
| DESIGN DOCUMENT |
+-----------------+
---- GROUP ----
Ben Pfaff <blp@stanford.edu>
---- PRELIMINARIES ----
>> If you have any preliminary comments on your submission, notes for
>> the TAs, or extra credit, please give them here.
(This is a sample design document.)
>> Please cite any offline or online sources you consulted while
>> preparing your submission, other than the PintOS documentation,
>> course text, and lecture notes.
None.
JOIN
====
---- DATA STRUCTURES ----
>> Copy here the declaration of each new or changed `struct' or `struct'
>> member, global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in 25 words or less.
A "latch" is a new synchronization primitive. Acquires block
until the first release. Afterward, all ongoing and future
acquires pass immediately.
/* Latch. */
struct latch
{
bool released; /* Released yet? */
struct lock monitor_lock; /* Monitor lock. */
struct condition rel_cond; /* Signaled when released. */
};
Added to struct thread:
/* Members for implementing thread_join(). */
struct latch ready_to_die; /* Release when thread about to die. */
struct semaphore can_die; /* Up when thread allowed to die. */
struct list children; /* List of child threads. */
list_elem children_elem; /* Element of `children' list. */
---- ALGORITHMS ----
>> Briefly describe your implementation of thread_join() and how it
>> interacts with thread termination.
thread_join() finds the joined child on the thread's list of
children and waits for the child to exit by acquiring the child's
ready_to_die latch. When thread_exit() is called, the thread
releases its ready_to_die latch, allowing the parent to continue.
---- SYNCHRONIZATION ----
>> Consider parent thread P with child thread C. How do you ensure
>> proper synchronization and avoid race conditions when P calls wait(C)
>> before C exits? After C exits? How do you ensure that all resources
>> are freed in each case? How about when P terminates without waiting,
>> before C exits? After C exits? Are there any special cases?
C waits in thread_exit() for P to die before it finishes its own
exit, using the can_die semaphore "down"ed by C and "up"ed by P as
it exits. Regardless of whether C has terminated, there
is no race on wait(C), because C waits for P's permission before
it frees itself.
Regardless of whether P waits for C, P still "up"s C's can_die
semaphore when P dies, so C will always be freed. (However,
freeing C's resources is delayed until P's death.)
The initial thread is a special case because it has no parent to
wait for it or to "up" its can_die semaphore. Therefore, its
can_die semaphore is initialized to 1.
---- RATIONALE ----
>> Critique your design, pointing out advantages and disadvantages in
>> your design choices.
This design has the advantage of simplicity. Encapsulating most
of the synchronization logic into a new "latch" structure
abstracts what little complexity there is into a separate layer,
making the design easier to reason about. Also, all the new data
members are in `struct thread', with no need for any extra dynamic
allocation, etc., that would require extra management code.
On the other hand, this design is wasteful in that a child thread
cannot free itself before its parent has terminated. A parent
thread that creates a large number of short-lived child threads
could unnecessarily exhaust kernel memory. This is probably
acceptable for implementing kernel threads, but it may be a bad
idea for use with user processes because of the larger number of
resources that user processes tend to own.

[File: doc/standards.texi]
@node Coding Standards
@appendix Coding Standards
All of you should be familiar with good coding standards by now.
This project will be much easier to complete and grade if you maintain a consistent code style and employ sensible variable naming policies.
Code style makes up a significant part of your final grade for this work, and will be scrutinised carefully.
We want to stress that aside from the fact that we are explicitly basing part of your grade on this,
good coding practices will also improve the quality of your code.
This makes it easier for your team-mates to interact with it and, ultimately, will improve your chances of having a good working program.
The rest of this appendix will discuss the coding standards used in the existing PintOS codebase and how we expect you to interact with this.
@menu
* Coding Style::
* C99::
* Unsafe String Functions::
@end menu
@comment ----------------------------------------------------------------------
@node Coding Style
@section Style
Style, for the purposes of our grading, refers to how readable your
code is. At minimum, this means that your code is well formatted, your
variable names are descriptive and your functions are decomposed and
well commented. Any other factors which make it hard (or easy) for us
to read or use your code will be reflected in your style grade.
The existing PintOS code is written in the GNU style and largely
follows the @uref{http://www.gnu.org/prep/standards_toc.html, , GNU
Coding Standards}. We encourage you to follow the applicable parts of
them too, especially chapter 5, ``Making the Best Use of C.'' Using a
different style won't cause actual problems so long as you are self-consistent in your additions.
It is ugly to see gratuitous differences in style from one function to another.
If your code is too ugly, it will cost you points.
Please limit C source file lines to at most 80 characters long.
PintOS comments sometimes refer to external standards or
specifications by writing a name inside square brackets, like this:
@code{[IA32-v3a]}. These names refer to the reference names used in
this documentation (@pxref{Bibliography}).
If you remove existing PintOS code, please delete it from your source
file entirely. Don't just put it into a comment or a conditional
compilation directive, because that makes the resulting code hard to
read. Version control software will allow you to recover the code if
necessary later.
We're only going to do a compile in the directory for the task being
submitted. You don't need to make sure that the previous tasks also
compile.
Task code should be written so that all of the subproblems for the
task function together, that is, without the need to rebuild with
different macros defined, etc.
If you decide to do any work beyond the spec that
changes normal PintOS behaviour so as to interfere with grading, then
you must implement it so that it only acts that way when given a
special command-line option of the form @option{-@var{name}}, where
@var{name} is a name of your choice. You can add such an option by
modifying @func{parse_options} in @file{threads/init.c}.
The introduction section (@pxref{Source Code}) describes some additional high-level coding style requirements.
@comment ----------------------------------------------------------------------
@page
@node C99
@section C99
The PintOS source code uses a few features of the ``C99'' standard
library that were not in the original 1989 standard for C.
Many programmers are unaware of these features, so we will describe them.
The new features used in PintOS are mostly in new headers:
@table @file
@item <stdbool.h>
Defines macros @code{bool}, a Boolean type that takes on only the values
0 and 1, @code{true}, which expands to 1, and @code{false}, which
expands to 0.
@item <stdint.h>
On systems that support them, this header defines types
@code{int@var{n}_t} and @code{uint@var{n}_t} for @var{n} = 8, 16, 32,
64, and possibly other values. These are 2's complement signed and unsigned
types, respectively, with the given number of bits.
On systems where it is possible, this header also defines types
@code{intptr_t} and @code{uintptr_t}, which are integer types big
enough to hold a pointer.
On all systems, this header defines types @code{intmax_t} and
@code{uintmax_t}, which are the system's signed and unsigned integer
types with the widest ranges.
For every signed integer type @code{@var{type}_t} defined here, as well
as for @code{ptrdiff_t} defined in @file{<stddef.h>}, this header also
defines macros @code{@var{TYPE}_MAX} and @code{@var{TYPE}_MIN} that
give the type's range. Similarly, for every unsigned integer type
@code{@var{type}_t} defined here, as well as for @code{size_t} defined
in @file{<stddef.h>}, this header defines a @code{@var{TYPE}_MAX}
macro giving its maximum value.
@item <inttypes.h>
@file{<stdint.h>} provides no straightforward way to format
the types it defines with @func{printf} and related functions. This
header provides macros to help with that. For every
@code{int@var{n}_t} defined by @file{<stdint.h>}, it provides macros
@code{PRId@var{n}} and @code{PRIi@var{n}} for formatting values of
that type with @code{"%d"} and @code{"%i"}. Similarly, for every
@code{uint@var{n}_t}, it provides @code{PRIo@var{n}},
@code{PRIu@var{n}}, @code{PRIx@var{n}}, and @code{PRIX@var{n}}.
You use these something like this, taking advantage of the fact that
the C compiler concatenates adjacent string literals:
@example
#include <inttypes.h>
@dots{}
int32_t value = @dots{};
printf ("value=%08"PRId32"\n", value);
@end example
@noindent
(note that the @code{%08} in the format string above zero-pads the output to a field width of 8 characters).
The @samp{%} is not supplied by the @code{PRI} macros.
As shown above, you supply it yourself and follow it by any flags, field width, etc.
@item <stdio.h>
The @func{printf} function has some new type modifiers for printing
standard types:
@table @samp
@item j
For @code{intmax_t} (e.g.@: @samp{%jd}) or @code{uintmax_t} (e.g.@:
@samp{%ju}).
@item z
For @code{size_t} (e.g.@: @samp{%zu}).
@item t
For @code{ptrdiff_t} (e.g.@: @samp{%td}).
@end table
PintOS @func{printf} also implements a nonstandard @samp{'} flag that
groups large numbers with commas to make them easier to read.
@end table
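As a short sketch of the modifiers described above (the variable and the values are illustrative):
@example
#include <stdio.h>
@dots{}
size_t used = 1000 * sizeof (int);
printf ("allocated %zu bytes\n", used);
printf ("%'d bytes free\n", 4 * 1024 * 1024);
@end example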
@comment ----------------------------------------------------------------------
@page
@node Unsafe String Functions
@section Unsafe String Functions
A few of the string functions declared in the standard
@file{<string.h>} and @file{<stdio.h>} headers are notoriously unsafe.
The worst offenders are intentionally not included in the PintOS C
library:
@table @code
@item strcpy()
When used carelessly this function can overflow the buffer reserved
for its output string. Use @func{strlcpy} instead. Refer to
comments in its source code in @code{lib/string.c} for documentation.
@item strncpy()
This function can leave its destination buffer without a null string
terminator. It also has performance problems. Again, use
@func{strlcpy}.
@item strcat()
Same issue as @func{strcpy}. Use @func{strlcat} instead.
Again, refer to comments in its source code in @code{lib/string.c} for
documentation.
@item strncat()
The meaning of its buffer size argument is surprising.
Again, use @func{strlcat}.
@item strtok()
Uses global data, so it is unsafe in threaded programs such as
kernels. Use @func{strtok_r} instead, and see its source code in
@code{lib/string.c} for documentation and an example.
@item sprintf()
Same issue as @func{strcpy}. Use @func{snprintf} instead. Refer
to comments in @code{lib/stdio.h} for documentation.
@item vsprintf()
Same issue as @func{strcpy}. Use @func{vsnprintf} instead.
@end table
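As a minimal sketch of the safe replacements in use (the buffer and source variables are illustrative):
@example
char buf[128];
strlcpy (buf, src, sizeof buf);        /* instead of strcpy (buf, src) */
strlcat (buf, suffix, sizeof buf);     /* instead of strcat (buf, suffix) */
snprintf (buf, sizeof buf, "%d", x);   /* instead of sprintf (buf, "%d", x) */
@end example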
If you try to use any of these functions, the error message will give
you a hint by referring to an identifier like
@code{dont_use_sprintf_use_snprintf}.

[File: doc/task0_questions.texi]
@strong{Part A - Codebase Preview: Questions to Prepare for the MCQ AnswerBook Test}
@enumerate
@item Which Git command should you run to retrieve a copy of your individual repository for PintOS Task 0 in your local directory?
(@dfn{Hint: be specific to this task and think about ease of use.})
@item Why is using the strcpy() function to copy strings usually a bad idea?
(@dfn{Hint: be sure to clearly identify the problem.})
@item If test @file{src/tests/devices/alarm-multiple} fails, where would you find its output and result logs? Provide both paths and file names.
(@dfn{Hint: you might want to run this test to find out.})
@item In PintOS, a thread is characterized by a struct and an execution stack.
(a) What are the limitations on the size of these data structures?
(b) Explain how this relates to stack overflow and how PintOS identifies it.
@item Explain how thread scheduling in PintOS currently works in roughly 300 words. Include the chain of execution of function calls.
(@dfn{Hint: we expect you to at least mention which functions participate in a context switch, how they interact, how and when the thread state is modified and the role of interrupts.})
@item In PintOS, what is the default length (in ticks and in seconds) of a scheduler time slice?
(@dfn{Hint: read the Task 0 documentation carefully.})
@item In PintOS, how would you print an unsigned 64 bit @code{int}?
(Consider that you are working with C99). Don't forget to state any inclusions needed by your code.
@item Explain the property of @strong{reproducibility} and how the lack of reproducibility will affect debugging.
@item In PintOS, locks are implemented on top of semaphores.
(a) How do the functions in the API of locks relate to those of semaphores?
(b) What extra property do locks have that semaphores do not?
@item Define what is meant by a @strong{race-condition}. Why is the test @code{if (x != NULL)} insufficient to prevent a segmentation fault from occurring on an attempted access to a structure through the pointer @code{x}?
(@dfn{Hint: you should assume that the pointer variable is correctly typed, that the structure was successfully initialised earlier in the program and that there are other threads running in parallel.})
@end enumerate
@strong{Part B - The Alarm Clock}
Reimplement @code{timer_sleep()}, defined in @file{devices/timer.c}.
(@b{30 marks})
Although a working implementation of @code{timer_sleep} is provided, it "busy waits", that is, it spins in a loop checking the current time and calling @code{thread_yield()} until enough time has gone by.
You need to reimplement it to avoid busy waiting.
Further instructions and hints can be found in the PintOS manual.
The marks for this question are awarded as follows:
Passing the Automated Tests (@b{8 marks}).
Performance in the Code Review (@b{12 marks}).
Answering the Design Document Questions below (@b{10 marks}).
@itemize @w{}
@item @b{Data Structures}
A1: Copy here the declaration of each new or changed `@code{struct}' or `@code{struct}' member, global or static variable, `@code{typedef}', or enumeration. Identify the purpose of each in roughly 25 words. (@b{2 marks})
@item @b{Algorithms}
A2: Briefly describe what happens in a call to @code{timer_sleep()}, including the actions performed by the timer interrupt handler on each timer tick. (@b{2 marks})
A3: What steps are taken to minimize the amount of time spent in the timer interrupt handler? (@b{2 marks})
@item @b{Synchronization}
A4: How are race conditions avoided when multiple threads call @code{timer_sleep()} simultaneously? (@b{1 mark})
A5: How are race conditions avoided when a timer interrupt occurs during a call to @code{timer_sleep()}? (@b{1 mark})
@item @b{Rationale}
A6: Why did you choose this design?
In what ways is it superior to another design you considered? (@b{2 marks})
@end itemize

[File: doc/task0_sheet.texi]
\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename task0_sheet.info
@settitle PintOS Task 0
@c %**end of header
@chapter PintOS-IC Task 0 Questions
@include task0_questions.texi
@bye

[File: doc/texi2html (contents omitted)]
[File: doc/texinfo.tex (contents omitted)]
[File: doc/threads.texi]
@node Task 1--Threads
@chapter Task 1: Scheduling
In this assignment, we give you a minimally functional thread system.
Your job is to extend the functionality of this system to gain a
better understanding of synchronization problems.
You will be working primarily in the @file{threads} directory for
this assignment. Compilation should be done in the @file{threads} directory.
Before you read the description of this task, you should read all of
the following sections: @ref{Introduction}, @ref{Task 0--Codebase}, @ref{Coding Standards},
@ref{Debugging Tools}, and @ref{Development Tools}. You should at least
skim the material from @ref{PintOS Loading} through @ref{Memory
Allocation}, especially @ref{Synchronization}. To complete this task
you will also need to read @ref{4.4BSD Scheduler}.
You must build task 1 on top of a working task 0 submission
as some of the task 1 tests rely on a non-busy-waiting implementation of @code{timer_sleep()}.
@menu
* Background::
* Development Suggestions::
* Task 1 Requirements::
* Task 1 FAQ::
@end menu
@node Background
@section Background
Now that you've become familiar with PintOS and its thread package, it's time to work on one of the most critical components of an operating system: the scheduler.
Working on the scheduler requires you to have grasped the main concepts of both the threading system and synchronization primitives. If you still feel uncertain about these topics, you are warmly invited to refer back to @ref{Understanding Threads} and @ref{Synchronization} and to carefully read the code in the corresponding source files.
@node Development Suggestions
@section Development Suggestions
In the past, many groups divided the assignment into pieces, then each
group member worked on his or her piece until just before the
deadline, at which time the group reconvened to combine their code and
submit. @strong{This is a bad idea. We do not recommend this
approach.} Groups that do this often find that two changes conflict
with each other, requiring lots of last-minute debugging. Some groups
who have done this have turned in code that did not even compile or
boot, much less pass any tests.
@localgitpolicy{}
You should expect to run into bugs that you simply don't understand
while working on this and subsequent tasks. When you do,
reread the appendix on debugging tools, which is filled with
useful debugging tips that should help you to get back up to speed
(@pxref{Debugging Tools}). Be sure to read the section on backtraces
(@pxref{Backtraces}), which will help you to get the most out of every
kernel panic or assertion failure.
@node Task 1 Requirements
@section Requirements
@menu
* Task 1 Design Document::
* Setting and Inspecting Priorities::
* Priority Scheduling::
* Priority Donation::
* Advanced Scheduler::
@end menu
@node Task 1 Design Document
@subsection Design Document
When you submit your work for task 1, you must also submit a completed copy of
@uref{threads.tmpl, , the task 1 design document}.
You can find a template design document for this task in @file{pintos/doc/threads.tmpl} and also on CATe.
You are free to submit your design document as either a @file{.txt} or @file{.pdf} file.
We recommend that you read the design document template before you start working on the task.
@xref{Task Documentation}, for a sample design document that goes along with a fictitious task.
@node Setting and Inspecting Priorities
@subsection Setting and Inspecting Priorities
Implement the following functions that allow a thread to examine and modify its own priority.
Skeletons for these functions are provided in @file{threads/thread.c}.
@deftypefun int thread_get_priority (void)
Returns the current thread's effective priority.
@end deftypefun
@deftypefun void thread_set_priority (int @var{new_priority})
Sets the current thread's priority to @var{new_priority}.
If the current thread no longer has the highest priority, yields.
@end deftypefun
@node Priority Scheduling
@subsection Priority Scheduling
Implement priority scheduling in PintOS.
When a thread is added to the ready list that has a higher priority
than the currently running thread, the current thread should
immediately yield the processor to the new thread. Similarly, when
threads are waiting for a lock, semaphore, or condition variable, the
highest priority waiting thread should be awakened first. A thread
may raise or lower its own priority at any time, but lowering its
priority such that it no longer has the highest priority must cause it
to immediately yield the CPU. In both the priority scheduler and the
advanced scheduler you will write later, the running thread should
be that with the highest priority.
Thread priorities range from @code{PRI_MIN} (0) to @code{PRI_MAX} (63).
Lower numbers correspond to lower priorities, so that priority 0
is the lowest priority and priority 63 is the highest.
The initial thread priority is passed as an argument to
@func{thread_create}. If there's no reason to choose another
priority, new threads should use @code{PRI_DEFAULT} (31). The @code{PRI_} macros are
defined in @file{threads/thread.h}, and you should not change their
values.
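For example, a new thread with the default priority could be created as follows (a sketch only; @code{worker_func} is a hypothetical @code{thread_func}):
@example
tid_t tid = thread_create ("worker", PRI_DEFAULT, worker_func, NULL);
@end example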
@node Priority Donation
@subsection Priority Donation
One issue with priority scheduling is ``priority inversion''. Consider
high, medium, and low priority threads @var{H}, @var{M}, and @var{L},
respectively. If @var{H} needs to wait for @var{L} (for instance, for a
lock held by @var{L}), and @var{M} is on the ready list, then @var{H}
will never get the CPU because the low priority thread will not get any
CPU time. A partial fix for this problem is for @var{H} to ``donate''
its priority to @var{L} while @var{L} is holding the lock, then recall
the donation once @var{L} releases (and thus @var{H} acquires) the lock.
Implement priority donation. You will need to account for all different
situations in which priority donation is required. In particular, be sure to handle:
@itemize
@item @strong{multiple donations}: multiple priorities can be donated to a single thread.
@item @strong{nested donations}: if @var{H} is waiting on
a lock that @var{M} holds and @var{M} is waiting on a lock that @var{L}
holds, then both @var{M} and @var{L} should be boosted to @var{H}'s
priority. If necessary, you may impose a reasonable limit on depth of
nested priority donation, such as 8 levels.
@end itemize
You must implement priority donation for locks.
You do not need to implement priority donation for the other PintOS synchronization constructs.
However, you do need to implement priority scheduling in all cases.
Finally, you should review your implementations of @code{thread_get_priority} and @code{thread_set_priority}
to make sure that they exhibit the correct behaviour in the presence of donations.
In particular, in the presence of priority donations @code{thread_get_priority} must return the highest donated priority.
You do not need to provide any interface to allow a thread to directly modify other threads' priorities.
The priority scheduler is also not used in any later task.
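For example, here is a minimal sketch of how @func{thread_get_priority} could compute the effective priority under donations. It assumes @struct{thread} has been extended with a @code{base_priority} field, a @code{donations} list of donor threads, and a @code{donation_elem} list element; all of these names are our own, and synchronization is omitted.
@verbatim
#include <list.h>
#include "threads/thread.h"

/* Sketch only: effective priority = max of the base priority and all
   priorities donated by threads waiting on locks this thread holds.
   base_priority, donations and donation_elem are hypothetical
   additions to struct thread; synchronization is omitted. */
int
thread_get_priority (void)
{
  struct thread *t = thread_current ();
  int priority = t->base_priority;
  struct list_elem *e;

  for (e = list_begin (&t->donations); e != list_end (&t->donations);
       e = list_next (e))
    {
      struct thread *donor = list_entry (e, struct thread, donation_elem);
      if (donor->priority > priority)
        priority = donor->priority;
    }
  return priority;
}
@end verbatim
Many designs instead cache the effective priority in the existing @code{priority} field and recompute it whenever a donation is added or removed; either approach can satisfy the requirements above.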
@node Advanced Scheduler
@subsection Advanced Scheduler
Implement a multilevel feedback queue scheduler similar to the
4.4@acronym{BSD} scheduler to
reduce the average response time for running jobs on your system.
@xref{4.4BSD Scheduler}, for detailed requirements.
Like the priority scheduler, the advanced scheduler chooses the thread
to run based on priorities. However, the advanced scheduler does not do
priority donation. Thus, we recommend that you have the priority
scheduler working, except possibly for priority donation, before you
start work on the advanced scheduler.
You must write your code to allow us to choose a scheduling algorithm
policy at PintOS startup time. By default, the priority scheduler
must be active, but we must be able to choose the 4.4@acronym{BSD}
scheduler
with the @option{-mlfqs} kernel option. Passing this
option sets @code{thread_mlfqs}, declared in @file{threads/thread.h}, to
true when the options are parsed by @func{parse_options}, which happens
early in @func{main}.
When the 4.4@acronym{BSD} scheduler is enabled, threads no longer
directly control their own priorities. The @var{priority} argument to
@func{thread_create} should be ignored, as well as any calls to
@func{thread_set_priority}, and @func{thread_get_priority} should return
the thread's current priority as set by the scheduler.
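For instance, a minimal sketch of how @func{thread_set_priority} might honour this requirement when the 4.4@acronym{BSD} scheduler is active:
@verbatim
/* Sketch only: under the 4.4BSD scheduler the scheduler itself sets
   priorities, so a manual request is silently ignored. */
void
thread_set_priority (int new_priority)
{
  if (thread_mlfqs)
    return;

  /* ... normal priority-scheduler behaviour goes here ... */
}
@end verbatim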
The advanced scheduler is not used in any later task.
@node Task 1 FAQ
@section FAQ
@table @b
@item How much code will I need to write?
Here's a summary of our reference solution, produced by the
@command{diffstat} program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.
@verbatim
threads/fixed-point.h | 120 ++++++++++++++++++
threads/synch.c | 88 ++++++++++++-
threads/thread.c | 196 ++++++++++++++++++++++++++----
threads/thread.h | 19 ++
4 files changed, 397 insertions(+), 26 deletions(-)
@end verbatim
The reference solution represents just one possible solution. Many
other solutions are also possible and many of those differ greatly from
the reference solution. Some excellent solutions may not modify all the
files modified by the reference solution, and some may modify files not
modified by the reference solution.
@file{fixed-point.h} is a new file added by the reference solution.
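As an illustration, here is a minimal sketch of the kind of abstraction such a header might provide, assuming a 17.14 fixed-point representation (@pxref{Fixed-Point Real Arithmetic}); all of the names below are our own.
@verbatim
#include <stdint.h>

/* Sketch only: 17.14 fixed-point arithmetic.  A real number x is
   stored as the integer x * F, where F = 2**14. */
typedef int fixed_point;

#define F (1 << 14)

#define INT_TO_FP(n)     ((n) * F)               /* Integer to fixed point. */
#define FP_TO_INT(x)     ((x) / F)               /* Truncate toward zero. */
#define FP_ROUND(x)      ((x) >= 0 ? ((x) + F / 2) / F \
                                   : ((x) - F / 2) / F) /* Round to nearest. */
#define FP_ADD(x, y)     ((x) + (y))             /* x + y, both fixed point. */
#define FP_ADD_INT(x, n) ((x) + (n) * F)         /* x + n, n an integer. */
#define FP_MUL(x, y)     ((fixed_point) (((int64_t) (x)) * (y) / F))
#define FP_DIV(x, y)     ((fixed_point) (((int64_t) (x)) * F / (y)))
@end verbatim
Inline functions give the same behaviour as these macros with better type checking, and either style can make the scheduler formulas much easier to read.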
@item Do we need a working Task 0 to implement Task 1?
Yes.
@item How do I update the @file{Makefile}s when I add a new source file?
@anchor{Adding Source Files}
To add a @file{.c} file, edit the top-level @file{Makefile.build}.
Add the new file to variable @samp{@var{dir}_SRC}, where
@var{dir} is the directory where you added the file. For this
task, that means you should add it to @code{threads_SRC} or
@code{devices_SRC}. Then run @code{make}. If your new file
doesn't get
compiled, run @code{make clean} and then try again.
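For example, if you added a hypothetical file @file{threads/fixed-point.c}, the corresponding line in @file{Makefile.build} would look roughly like this:
@verbatim
# Hypothetical addition to the threads_SRC variable in Makefile.build.
threads_SRC += threads/fixed-point.c    # Fixed-point arithmetic.
@end verbatim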
When you modify the top-level @file{Makefile.build} and re-run
@command{make}, the modified
version should be automatically copied to
@file{threads/build/Makefile}. The converse is
not true, so any changes will be lost the next time you run @code{make
clean} from the @file{threads} directory. Unless your changes are
truly temporary, you should prefer to edit @file{Makefile.build}.
A new @file{.h} file does not require editing the @file{Makefile}s.
@item What does @code{warning: no previous prototype for `@var{func}'} mean?
It means that you defined a non-@code{static} function without
preceding it by a prototype. Because non-@code{static} functions are
intended for use by other @file{.c} files, for safety they should be
prototyped in a header file included before their definition. To fix
the problem, add a prototype in a header file that you include, or, if
the function isn't actually used by other @file{.c} files, make it
@code{static}.
@item What is the interval between timer interrupts?
Timer interrupts occur @code{TIMER_FREQ} times per second. You can
adjust this value by editing @file{devices/timer.h}. The default is
100 Hz.
We don't recommend changing this value, because any changes are likely
to cause many of the tests to fail.
@item How long is a time slice?
There are @code{TIME_SLICE} ticks per time slice. This macro is
declared in @file{threads/thread.c}. The default is 4 ticks.
We don't recommend changing this value, because any changes are likely
to cause many of the tests to fail.
@item How do I run the tests?
@xref{Testing}.
@item Why do I get a test failure in @func{pass}?
@anchor{The pass function fails}
You are probably looking at a backtrace that looks something like this:
@example
0xc0108810: debug_panic (lib/kernel/debug.c:32)
0xc010a99f: pass (tests/threads/tests.c:93)
0xc010bdd3: test_mlfqs_load_1 (...threads/mlfqs-load-1.c:33)
0xc010a8cf: run_test (tests/threads/tests.c:51)
0xc0100452: run_task (threads/init.c:283)
0xc0100536: run_actions (threads/init.c:333)
0xc01000bb: main (threads/init.c:137)
@end example
This is just confusing output from the @command{backtrace} program. It
does not actually mean that @func{pass} called @func{debug_panic}. In
fact, @func{fail} called @func{debug_panic} (via the @func{PANIC}
macro). GCC knows that @func{debug_panic} does not return, because it
is declared @code{NO_RETURN} (@pxref{Function and Parameter
Attributes}), so it doesn't include any code in @func{fail} to take
control when @func{debug_panic} returns. This means that the return
address on the stack looks like it is at the beginning of the function
that happens to follow @func{fail} in memory, which in this case happens
to be @func{pass}.
@xref{Backtraces}, for more information.
@item How do interrupts get re-enabled in the new thread following @func{schedule}?
Every path into @func{schedule} disables interrupts. They eventually
get re-enabled by the next thread to be scheduled. Consider the
possibilities: the new thread is running in @func{switch_thread} (but
see below), which is called by @func{schedule}, which is called by one
of a few possible functions:
@itemize @bullet
@item
@func{thread_exit}, but we'll never switch back into such a thread, so
it's uninteresting.
@item
@func{thread_yield}, which immediately restores the interrupt level upon
return from @func{schedule}.
@item
@func{thread_block}, which is called from multiple places:
@itemize @minus
@item
@func{sema_down}, which restores the interrupt level before returning.
@item
@func{idle}, which enables interrupts with an explicit assembly STI
instruction.
@item
@func{wait} in @file{devices/intq.c}, whose callers are responsible for
re-enabling interrupts.
@end itemize
@end itemize
There is a special case when a newly created thread runs for the first
time. Such a thread calls @func{intr_enable} as the first action in
@func{kernel_thread}, which is at the bottom of the call stack for every
kernel thread but the first.
@item What should I expect from the Task 1 code-review?
The code-review for this task will be conducted with each group in-person.
Our Task 1 code-review will cover @strong{four} main areas:
functional correctness, efficiency, design quality and general coding style.
@itemize @bullet
@item For @strong{functional correctness}, we will be looking to see if your implementation of priority scheduling strictly obeys the rule that "the highest priority ready thread will always be running" and that all cases of priority inversion are being correctly handled by your system for priority donations.
We will also be checking if your updated code for locks is free of any race conditions, paying specific attention to the @func{lock_acquire} and @func{lock_release} functions, as well as the interplay between them.
@item For @strong{efficiency}, we will be looking at the complexity characteristics of your modified code for semaphores, as well as
the steps you have taken to minimise the time spent inside your timer interrupt handler.
@item For @strong{design quality}, we will be looking at the stability and robustness of any changes you have made to the core PintOS kernel (e.g. @func{thread_block}, @func{thread_unblock} and @func{thread_yield}) and the accuracy of the priority updates in your BSD scheduler.
@item For @strong{general coding style}, we will be paying attention to all of the usual elements of good style
that you should be used to from last year (e.g. consistent code layout, appropriate use of comments, avoiding magic numbers, etc.)
as well as your use of git (e.g. commit frequency and commit message quality).
In this task, we will be paying particular attention to the readability of your fixed-point mathematics abstraction within your BSD scheduler.
@end itemize
@end table
@menu
* Priority Scheduling FAQ::
* Advanced Scheduler FAQ::
@end menu
@node Priority Scheduling FAQ
@subsection Priority Scheduling FAQ
@table @b
@item Doesn't priority scheduling lead to starvation?
Yes, strict priority scheduling can lead to starvation
because a thread will not run if any higher-priority thread is runnable.
The advanced scheduler introduces a mechanism for dynamically
changing thread priorities.
Strict priority scheduling is valuable in real-time systems because it
offers the programmer more control over which jobs get processing
time. High priorities are generally reserved for time-critical
tasks. It's not ``fair,'' but it addresses other concerns not
applicable to a general-purpose operating system.
@item What thread should run after a lock has been released?
When a lock is released, the highest priority thread waiting for that
lock should be unblocked and put on the list of ready threads.
The scheduler should then run the highest priority thread on the ready list.
@item If the highest-priority thread yields, does it continue running?
Yes. If there is a single highest-priority thread, it continues
running until it blocks or finishes, even if it calls
@func{thread_yield}.
If multiple threads have the same highest priority,
@func{thread_yield} should switch among them in ``round robin'' order.
@item What happens to the priority of a donating thread?
Priority donation only changes the priority of the donee
thread. The donor thread's priority is unchanged.
Priority donation is not additive: if thread @var{A} (with priority 5) donates
to thread @var{B} (with priority 3), then @var{B}'s new priority is 5, not 8.
@item Can a thread's priority change while it is on the ready queue?
Yes. Consider a ready, low-priority thread @var{L} that holds a lock.
High-priority thread @var{H} attempts to acquire the lock and blocks,
thereby donating its priority to ready thread @var{L}.
@item Can a thread's priority change while it is blocked?
Yes. While a thread that has acquired lock @var{L} is blocked for any
reason, its priority can increase by priority donation if a
higher-priority thread attempts to acquire @var{L}. This case is
checked by the @code{priority-donate-sema} test.
@item Can a thread added to the ready list preempt the processor?
Yes. If a thread added to the ready list has higher priority than the
running thread, the correct behaviour is to immediately yield the
processor. It is not acceptable to wait for the next timer interrupt.
The highest priority thread should run as soon as it is runnable,
preempting whatever thread is currently running.
@item How does @func{thread_set_priority} affect a thread receiving donations?
It sets the thread's base priority. The thread's effective priority
becomes the higher of the newly set priority or the highest donated
priority. When the donations are released, the thread's priority
becomes the one set through the function call. This behaviour is checked
by the @code{priority-donate-lower} test.
@item Doubled test names in output make them fail.
Suppose you are seeing output in which some test names are doubled,
like this:
@example
(alarm-priority) begin
(alarm-priority) (alarm-priority) Thread priority 30 woke up.
Thread priority 29 woke up.
(alarm-priority) Thread priority 28 woke up.
@end example
What is happening is that output from two threads is being
interleaved. That is, one thread is printing @code{"(alarm-priority)
Thread priority 29 woke up.\n"} and another thread is printing
@code{"(alarm-priority) Thread priority 30 woke up.\n"}, but the first
thread is being preempted by the second in the middle of its output.
This problem indicates a bug in your priority scheduler. After all, a
thread with priority 29 should not be able to run while a thread with
priority 30 has work to do.
Normally, the implementation of the @code{printf()} function in the
PintOS kernel attempts to prevent such interleaved output by acquiring
a console lock for the duration of the @code{printf} call and
releasing it afterwards. However, the output of the test name,
e.g., @code{(alarm-priority)}, and the message following it is output
using two calls to @code{printf}, resulting in the console lock being
acquired and released twice.
@end table
@node Advanced Scheduler FAQ
@subsection Advanced Scheduler FAQ
@table @b
@item How does priority donation interact with the advanced scheduler?
It doesn't have to. We won't test priority donation and the advanced
scheduler at the same time.
@item Can I use one queue instead of 64 queues?
Yes. In general, your implementation may differ from the description,
as long as its behaviour is the same.
@item Some scheduler tests fail and I don't understand why. Help!
If your implementation mysteriously fails some of the advanced
scheduler tests, try the following:
@itemize
@item
Read the source files for the tests that you're failing, to make sure
that you understand what's going on. Each one has a comment at the
top that explains its purpose and expected results.
@item
Double-check your fixed-point arithmetic routines and your use of them
in the scheduler routines.
@item
Consider how much work your implementation does in the timer
interrupt. If the timer interrupt handler takes too long, then it
will take away most of a timer tick from the thread that the timer
interrupt preempted. When it returns control to that thread, it
therefore won't get to do much work before the next timer interrupt
arrives. That thread will therefore get blamed for a lot more CPU
time than it actually got a chance to use. This raises the
interrupted thread's recent CPU count, thereby lowering its priority.
It can cause scheduling decisions to change. It also raises the load
average.
@end itemize
@end table

doc/threads.tmpl Normal file

@@ -0,0 +1,108 @@
+----------------------+
| OS 211 |
| TASK 1: SCHEDULING |
| DESIGN DOCUMENT |
+----------------------+
---- GROUP ----
>> Fill in the names and email addresses of your group members.
FirstName LastName <email@domain.example>
FirstName LastName <email@domain.example>
FirstName LastName <email@domain.example>
---- PRELIMINARIES ----
>> If you have any preliminary comments on your submission, or notes for the
>> markers, please give them here.
>> Please cite any offline or online sources you consulted while preparing your
>> submission, other than the PintOS documentation, course text, lecture notes
>> and course staff.
PRIORITY SCHEDULING
===================
---- DATA STRUCTURES ----
>> A1: (2 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
>> A2: (4 marks)
>> Draw a diagram that illustrates a nested donation in your structure and
>> briefly explain how this works.
---- ALGORITHMS ----
>> A3: (3 marks)
>> How do you ensure that the highest priority waiting thread wakes up first for
>> a (i) semaphore, (ii) lock, or (iii) condition variable?
>> A4: (3 marks)
>> Describe the sequence of events when a call to lock_acquire() causes a
>> priority donation.
>> How is nested donation handled?
>> A5: (3 marks)
>> Describe the sequence of events when lock_release() is called on a lock that
>> a higher-priority thread is waiting for.
---- SYNCHRONIZATION ----
>> A6: (2 marks)
>> How do you avoid a race condition in thread_set_priority() when a thread
>> needs to recompute its effective priority, but the donated priorities
>> potentially change during the computation?
>> Can you use a lock to avoid the race?
---- RATIONALE ----
>> A7: (3 marks)
>> Why did you choose this design?
>> In what ways is it superior to another design you considered?
ADVANCED SCHEDULER
==================
---- DATA STRUCTURES ----
>> B1: (2 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> B2: (3 marks)
>> Suppose threads A, B, and C have nice values 0, 1, and 2 and each has a
>> recent_cpu value of 0.
>> Fill in the table below showing the scheduling decision, the priority and the
>> recent_cpu values for each thread after each given number of timer ticks:
timer    recent_cpu      priority     thread
ticks    A    B    C    A    B    C   to run
-----   ---  ---  ---  ---  ---  ---  ------
  0
  4
  8
 12
 16
 20
 24
 28
 32
 36
>> B3: (2 marks)
>> Did any ambiguities in the scheduler specification make values in the table
>> uncertain?
>> If so, what rule did you use to resolve them?
---- RATIONALE ----
>> B4: (3 marks)
>> Briefly critique your design, pointing out advantages and disadvantages in
>> your design choices.

doc/userprog.texi Normal file

File diff suppressed because it is too large.

doc/userprog.tmpl Normal file

@@ -0,0 +1,116 @@
+-------------------------+
| OS 211 |
| TASK 2: USER PROGRAMS |
| DESIGN DOCUMENT |
+-------------------------+
---- GROUP ----
>> Fill in the names and email addresses of your group members.
FirstName LastName <email@domain.example>
FirstName LastName <email@domain.example>
FirstName LastName <email@domain.example>
---- PRELIMINARIES ----
>> If you have any preliminary comments on your submission, or notes for the
>> markers, please give them here.
>> Please cite any offline or online sources you consulted while preparing your
>> submission, other than the PintOS documentation, course text, lecture notes
>> and course staff.
ARGUMENT PASSING
================
---- DATA STRUCTURES ----
>> A1: (1 mark)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> A2: (2 marks)
>> How does your argument parsing code avoid overflowing the user's stack page?
>> What are the efficiency considerations of your approach?
---- RATIONALE ----
>> A3: (2 marks)
>> PintOS does not implement strtok() because it is not thread safe.
>> Explain the problem with strtok() and how strtok_r() avoids this issue.
>> A4: (3 marks)
>> In PintOS, the kernel separates commands into an executable name and arguments.
>> In Unix-like systems, the shell does this separation.
>> Identify three advantages of the Unix approach.
SYSTEM CALLS
============
---- DATA STRUCTURES ----
>> B1: (6 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> B2: (2 marks)
>> Describe how your code ensures safe memory access of user provided data from
>> within the kernel.
>> B3: (3 marks)
>> Suppose that we choose to verify user provided pointers by validating them
>> before use (i.e. using the first method described in the spec).
>> What is the least and the greatest possible number of inspections of the page
>> table (e.g. calls to pagedir_get_page()) that would need to be made in the
>> following cases?
>> a) A system call that passes the kernel a pointer to 10 bytes of user data.
>> b) A system call that passes the kernel a pointer to a full page
>> (4,096 bytes) of user data.
>> c) A system call that passes the kernel a pointer to 4 full pages
>> (16,384 bytes) of user data.
>> You must briefly explain the checking tactic you would use and how it applies
>> to each case to generate your answers.
>> B4: (2 marks)
>> When an error is detected during a system call handler, how do you ensure
>> that all temporarily allocated resources (locks, buffers, etc.) are freed?
>> B5: (8 marks)
>> Describe your implementation of the "wait" system call and how it interacts
>> with process termination for both the parent and child.
---- SYNCHRONIZATION ----
>> B6: (2 marks)
>> The "exec" system call returns -1 if loading the new executable fails, so it
>> cannot return before the new executable has completed loading.
>> How does your code ensure this?
>> How is the load success/failure status passed back to the thread that calls
>> "exec"?
>> B7: (5 marks)
>> Consider parent process P with child process C.
>> How do you ensure proper synchronization and avoid race conditions when:
>> i) P calls wait(C) before C exits?
>> ii) P calls wait(C) after C exits?
>> iii) P terminates, without waiting, before C exits?
>> iv) P terminates, without waiting, after C exits?
>> Additionally, how do you ensure that all resources are freed regardless of
>> the above case?
---- RATIONALE ----
>> B8: (2 marks)
>> Why did you choose to implement safe access of user memory from the kernel in
>> the way that you did?
>> B9: (2 marks)
>> What advantages and disadvantages can you see to your design for file
>> descriptors?

doc/vm.texi Normal file

@@ -0,0 +1,835 @@
@node Task 3--Virtual Memory
@chapter Task 3: Virtual Memory
By now you should have some familiarity with the inner workings of
PintOS. Your OS can properly handle multiple threads of execution with proper
synchronization, and can load multiple user programs at once. However,
the number and size of programs that can run is limited by the machine's
main memory size. In this assignment, you will remove that limitation.
You will build this assignment on top of the last one. Test programs
from task 2 should also work with task 3. You should take care to
fix any bugs in your task 2 submission before you start work on
task 3, because those bugs will most likely cause the same problems
in task 3.
You will continue to handle PintOS disks and file systems the same way
you did in the previous assignment (@pxref{Using the File System}).
@menu
* Task 3 Background::
* Task 3 Suggested Order of Implementation::
* Task 3 Requirements::
* Task 3 FAQ::
@end menu
@node Task 3 Background
@section Background
@menu
* Task 3 Source Files::
* Memory Terminology::
* Resource Management Overview::
* Managing the Supplemental Page Table::
* Managing the Frame Table::
* Accessed and Dirty Bits::
* Managing the Table of File Mappings::
* Managing the Swap Partition::
* Managing Memory Mapped Files Back::
@end menu
@node Task 3 Source Files
@subsection Source Files
You will work in the @file{vm} directory for this task. The
@file{vm} directory contains only @file{Makefile}s. The only
change from @file{userprog} is that this new @file{Makefile} turns on
the setting @option{-DVM}. All code you write will be in new
files or in files introduced in earlier tasks.
You will probably be encountering just a few files for the first time:
@table @file
@item devices/block.h
@itemx devices/block.c
Provides sector-based read and write access to block devices.
@item devices/swap.h
@itemx devices/swap.c
Provides page-based read and write access to the swap partition.
You will use this interface to access the swap partition, as a wrapper around a block device.
@end table
@node Memory Terminology
@subsection Memory Terminology
Careful definitions are needed to keep discussion of virtual memory from
being confusing. Thus, we begin by presenting some terminology for
memory and storage. Some of these terms should be familiar from task
2 (@pxref{Virtual Memory Layout}), but much of it is new.
@menu
* Pages::
* Frames::
* Page Tables::
* Swap Slots::
@end menu
@node Pages
@subsubsection Pages
A @dfn{page}, sometimes called a @dfn{virtual page}, is a continuous
region of virtual memory 4,096 bytes (the @dfn{page size}) in length. A
page must be @dfn{page-aligned}, that is, start on a virtual address
evenly divisible by the page size. Thus, a 32-bit virtual address can
be divided into a 20-bit @dfn{page number} and a 12-bit @dfn{page
offset} (or just @dfn{offset}), like this:
@example
@group
31 12 11 0
+-------------------+-----------+
| Page Number | Offset |
+-------------------+-----------+
Virtual Address
@end group
@end example
Each process has an independent set of @dfn{user virtual pages}, which
are those pages below virtual address @code{PHYS_BASE}, typically
@t{0xc0000000} (3 GB). The set of @dfn{kernel virtual pages}, on the
other hand, is global, remaining the same regardless of what thread or
process is active. The kernel may access both user virtual and kernel virtual pages,
but a user process may access only its own user virtual pages. @xref{Virtual
Memory Layout}, for more information.
PintOS provides several useful functions for working with virtual
addresses. @xref{Virtual Addresses}, for details.
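For instance, here is a small sketch that splits an address into its page and offset using @func{printf} and helpers declared in @file{threads/vaddr.h}:
@verbatim
#include <stdio.h>
#include "threads/vaddr.h"

/* Sketch only: print the page and offset components of VA using the
   helpers declared in threads/vaddr.h. */
static void
show_page_split (const void *va)
{
  printf ("%p is offset 0x%x within the page starting at %p\n",
          va, pg_ofs (va), pg_round_down (va));
}
@end verbatim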
@node Frames
@subsubsection Frames
A @dfn{frame}, sometimes called a @dfn{physical frame} or a @dfn{page
frame}, is a continuous region of physical memory. Like pages, frames
must be page-size and page-aligned. Thus, a 32-bit physical address can
be divided into a 20-bit @dfn{frame number} and a 12-bit @dfn{frame
offset} (or just @dfn{offset}), like this:
@example
@group
31 12 11 0
+-------------------+-----------+
| Frame Number | Offset |
+-------------------+-----------+
Physical Address
@end group
@end example
The 80@var{x}86 doesn't provide any way to directly access memory at a
physical address. PintOS works around this by mapping kernel virtual
memory directly to physical memory: the first page of kernel virtual
memory is mapped to the first frame of physical memory, the second page
to the second frame, and so on. Thus, frames can be accessed through
kernel virtual memory.
PintOS provides functions for translating between physical addresses and
kernel virtual addresses. @xref{Virtual Addresses}, for details.
@node Page Tables
@subsubsection Page Tables
In PintOS, a @dfn{page table} is a data structure that the CPU uses to
translate a virtual address to a physical address, that is, from a page
to a frame. The page table format is dictated by the 80@var{x}86
architecture. PintOS provides page table management code in
@file{pagedir.c} (@pxref{Page Table}).
The diagram below illustrates the relationship between pages and frames.
The virtual address, on the left, consists of a page number and an
offset. The page table translates the page number into a frame number,
which is combined with the unmodified offset to obtain the physical
address, on the right.
@example
@group
+----------+
.--------------->|Page Table|-----------.
/ +----------+ |
31 | 12 11 0 31 | 12 11 0
+---------+----+ +---------+----+
|Page Nr | Ofs| |Frame Nr | Ofs|
+---------+----+ +---------+----+
Virt Addr | Phys Addr |
\_______________________________________/
@end group
@end example
@node Swap Slots
@subsubsection Swap Slots
A @dfn{swap slot} is a continuous, page-size region of disk space in the
swap partition. Although hardware limitations dictating the placement of
slots are looser than for pages and frames, swap slots should be
page-aligned because there is no downside in doing so.
@node Resource Management Overview
@subsection Resource Management Overview
You will need to design the following data structures:
@table @asis
@item Supplemental page table
Enables page fault handling by supplementing the page table.
@xref{Managing the Supplemental Page Table}.
@item Frame table
Allows efficient implementation of eviction policy.
@xref{Managing the Frame Table}.
@item Table of file mappings
Processes may map files into their virtual memory space. You need a
table to track which files are mapped into which pages.
@end table
You do not necessarily need to implement three completely distinct data
structures: it may be convenient to wholly or partially merge related
resources into a unified data structure.
For each data structure, you need to determine what information each
element should contain. You also need to decide on the data structure's
scope, either local (per-process) or global (applying to the whole
system), and how many instances are required within its scope.
To simplify your design, you may store these data structures in
non-pageable memory (i.e. kernel space).
That means that you can be sure that pointers among them will remain valid.
Possible choices of data structures include arrays, lists, bitmaps, and
hash tables. An array is often the simplest approach, but a sparsely
populated array wastes memory. Lists are also simple, but traversing a
long list to find a particular position wastes time. Both arrays and
lists can be resized, but lists more efficiently support insertion and
deletion in the middle.
PintOS includes a bitmap data structure in @file{lib/kernel/bitmap.c}
and @file{lib/kernel/bitmap.h}. A bitmap is an array of bits, each of
which can be true or false. Bitmaps are typically used to track usage
in a set of (identical) resources: if resource @var{n} is in use, then
bit @var{n} of the bitmap is true. PintOS bitmaps are fixed in size,
although you could extend their implementation to support resizing.
PintOS also includes a hash table data structure (@pxref{Hash Table}).
PintOS hash tables efficiently support insertions and deletions over a
wide range of table sizes.
Although more complex data structures may yield performance or other
benefits, they may also needlessly complicate your implementation.
Thus, we do not recommend implementing any advanced data structure
(e.g.@: a balanced binary tree) as part of your design.
@node Managing the Supplemental Page Table
@subsection Managing the Supplemental Page Table
The @dfn{supplemental page table} extends the page table with
additional data about each page. It is needed because of the
limitations imposed by the page table's format. Such a data structure
is also often referred to as a ``page table''; we add the word ``supplemental''
to reduce confusion.
The supplemental page table is used for at least two purposes. Most
importantly, on a page fault, the kernel looks up the virtual page that
faulted in the supplemental page table to find out what data should be
there. Second, the kernel consults the supplemental page table when a
process terminates, to decide what resources to free.
You may organize the supplemental page table as you wish. There are at
least two basic approaches to its organization: in terms of segments or
in terms of pages. Optionally, you may use the page table itself as an
index to track the members of the supplemental page table. You will
have to modify the PintOS page table implementation in @file{pagedir.c}
to do so. We recommend this approach for advanced students only.
@xref{Page Table Entry Format}, for more information.
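As a concrete illustration, here is a minimal sketch of a per-process supplemental page table built on the PintOS hash table. Apart from the hash-table API itself, every name below is our own invention, and only a lookup operation is shown.
@verbatim
#include <debug.h>
#include <hash.h>
#include "filesys/file.h"
#include "filesys/off_t.h"
#include "threads/vaddr.h"

/* Sketch only: one supplemental page table entry per user page. */
enum page_location { PAGE_ZERO, PAGE_FILE, PAGE_SWAP, PAGE_FRAME };

struct spt_entry
  {
    void *upage;                 /* User virtual address (page-aligned), key. */
    enum page_location location; /* Where the page's data currently lives. */
    struct file *file;           /* Backing file, if location == PAGE_FILE. */
    off_t offset;                /* Offset of the data within the file. */
    size_t read_bytes;           /* Bytes to read; the rest is zeroed. */
    bool writable;               /* May the page be written? */
    struct hash_elem elem;       /* Element in the per-process hash table. */
  };

/* Hash and comparison functions keyed on the user page address. */
static unsigned
spt_hash (const struct hash_elem *e, void *aux UNUSED)
{
  const struct spt_entry *p = hash_entry (e, struct spt_entry, elem);
  return hash_bytes (&p->upage, sizeof p->upage);
}

static bool
spt_less (const struct hash_elem *a, const struct hash_elem *b,
          void *aux UNUSED)
{
  return hash_entry (a, struct spt_entry, elem)->upage
         < hash_entry (b, struct spt_entry, elem)->upage;
}

/* Looks up the entry for the page containing UADDR in SPT, or
   returns NULL if there is none. */
static struct spt_entry *
spt_find (struct hash *spt, const void *uaddr)
{
  struct spt_entry key;
  struct hash_elem *e;

  key.upage = pg_round_down (uaddr);
  e = hash_find (spt, &key.elem);
  return e != NULL ? hash_entry (e, struct spt_entry, elem) : NULL;
}
@end verbatim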
The most important user of the supplemental page table is the page fault
handler. In task 2, a page fault always indicated a bug in the
kernel or a user program. In task 3, this is no longer true. Now, a
page fault might only indicate that the page must be brought in from a
file or swap. You will have to implement a more sophisticated page
fault handler to handle these cases. Your page fault handler, which you
should implement by modifying @func{page_fault} in
@file{userprog/exception.c}, needs to do roughly the following:
@enumerate 1
@item
Locate the page that faulted in the supplemental page table. If the
memory reference is valid, use the supplemental page table entry to
locate the data that goes in the page, which might be in the file
system, or in a swap slot, or it might simply be an all-zero page. When
you implement sharing, the page's data might even already be in a page
frame, but not in the page table.
If the supplemental page table indicates that the user process should
not expect any data at the address it was trying to access, or if the
page lies within kernel virtual memory, or if the access is an attempt
to write to a read-only page, then the access is invalid. Any invalid
access terminates the process and thereby frees all of its resources.
@item
Obtain a frame to store the page. @xref{Managing the Frame Table}, for
details.
When you implement sharing, the data you need may already be in a frame,
in which case you must be able to locate that frame.
@item
Fetch the data into the frame, by reading it from the file system or
swap, zeroing it, etc.
When you implement sharing, the page you need may already be in a frame,
in which case no action is necessary in this step.
@item
Point the page table entry for the faulting virtual address to the frame.
You can use the functions in @file{userprog/pagedir.c}.
@end enumerate
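Put together, the core of such a handler might look roughly like the sketch below. Only @func{pagedir_set_page} is part of the provided code; @code{spt_find}, @code{frame_alloc}, @code{spt_load_data} and the @code{spt} member of @struct{thread} are hypothetical names for pieces you would design yourself (see the supplemental page table sketch above).
@verbatim
#include "threads/thread.h"
#include "userprog/pagedir.h"

/* Hypothetical helpers; see the sketches in this chapter. */
static struct spt_entry *spt_find (struct hash *spt, const void *uaddr);
static void *frame_alloc (void);
static void spt_load_data (struct spt_entry *p, void *kpage);

/* Sketch only: handle a page fault at FAULT_ADDR for the current
   process.  Returns true on success; on failure the caller should
   terminate the process. */
static bool
handle_user_fault (void *fault_addr)
{
  struct thread *t = thread_current ();

  /* 1. Look the faulting page up in the supplemental page table. */
  struct spt_entry *p = spt_find (&t->spt, fault_addr);
  if (p == NULL)
    return false;                       /* Invalid access. */

  /* 2. Obtain a frame to hold the page. */
  void *kpage = frame_alloc ();
  if (kpage == NULL)
    return false;                       /* Out of frames and swap. */

  /* 3. Fetch the data (file, swap or zeroes) into the frame. */
  spt_load_data (p, kpage);

  /* 4. Point the page table entry at the new frame. */
  return pagedir_set_page (t->pagedir, p->upage, kpage, p->writable);
}
@end verbatim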
@node Managing the Frame Table
@subsection Managing the Frame Table
The @dfn{frame table} contains one entry for each frame that contains a
user page. Each entry in the frame table contains a pointer to the
page, if any, that currently occupies it, and other data of your choice.
The frame table allows PintOS to efficiently implement an eviction
policy, by choosing a page to evict when no frames are free.
The frames used for user pages should be obtained from the ``user
pool,'' by calling @code{palloc_get_page(PAL_USER)}. You must use
@code{PAL_USER} to avoid allocating from the ``kernel pool,'' which
could cause some test cases to fail unexpectedly (@pxref{Why
PAL_USER?}). If you modify @file{palloc.c} as part of your frame table
implementation, be sure to retain the distinction between the two pools.
The most important operation on the frame table is obtaining an unused
frame. This is easy when a frame is free. When none is free, a frame
must be made free by evicting some page from its frame.
If no frame can be evicted without allocating a swap slot, but swap is
full, panic the kernel. Real OSes apply a wide range of policies to
recover from or prevent such situations, but these policies are beyond
the scope of this task.
The process of eviction comprises roughly the following steps:
@enumerate 1
@item
Choose a frame to evict, using your page replacement algorithm. The
``accessed'' and ``dirty'' bits in the page table, described below, will
come in handy.
@item
Remove references to the frame from any page table that refers to it.
Until you have implemented sharing, only a single page should refer to
a frame at any given time.
@item
If necessary, write the page to the file system or to swap.
@end enumerate
The evicted frame may then be used to store a different page.
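For example, here is a minimal sketch of a frame table entry and of obtaining a frame, falling back to eviction when the user pool is exhausted. The names @code{struct frame}, @code{frame_list} and @func{evict_frame} are our own; only @func{palloc_get_page} is part of the provided code.
@verbatim
#include <list.h>
#include "threads/palloc.h"
#include "threads/thread.h"

/* Sketch only: one entry per frame that holds a user page.  Entries
   are added to frame_list when a page is installed (not shown). */
struct frame
  {
    void *kpage;                /* Kernel virtual address of the frame. */
    void *upage;                /* User page currently stored here. */
    struct thread *owner;       /* Process that owns the page. */
    struct list_elem elem;      /* Element in frame_list. */
  };

static struct list frame_list;  /* All frames holding user pages. */

static void evict_frame (void); /* Hypothetical: frees up one frame. */

/* Obtains a frame for a user page, evicting a page if none is free.
   Returns NULL only if eviction is impossible (e.g. swap is full). */
static void *
frame_alloc (void)
{
  void *kpage = palloc_get_page (PAL_USER);
  if (kpage == NULL)
    {
      evict_frame ();
      kpage = palloc_get_page (PAL_USER);
    }
  return kpage;
}
@end verbatim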
@node Accessed and Dirty Bits
@subsubsection Accessed and Dirty Bits
80@var{x}86 hardware provides some assistance for implementing page
replacement algorithms, through a pair of bits in the page table entry
(PTE) for each page. On any read or write to a page, the CPU sets the
@dfn{accessed bit} to 1 in the page's PTE, and on any write, the CPU
sets the @dfn{dirty bit} to 1. The CPU never resets these bits to 0,
but the OS may do so.
You need to be aware of @dfn{aliases}, that is, two (or more) pages that
refer to the same frame. When an aliased frame is accessed, the
accessed and dirty bits are updated in only one page table entry (the
one for the page used for access). The accessed and dirty bits for the
other aliases are not updated.
In PintOS, every user virtual page is aliased to its kernel virtual
page. You must manage these aliases somehow. For example, your code
could check and update the accessed and dirty bits for both addresses.
Alternatively, the kernel could avoid the problem by only accessing user
data through the user virtual address.
Other aliases should only arise once you implement sharing, or if there is a bug in your code.
@xref{Page Table Accessed and Dirty Bits}, for details of the functions
to work with accessed and dirty bits.
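For instance, the sketch below checks and clears the accessed bit through both aliases, using only the functions provided in @file{userprog/pagedir.c}; the function and argument names are ours.
@verbatim
#include <stdint.h>
#include "userprog/pagedir.h"

/* Sketch only: returns true if UPAGE (stored in the frame at kernel
   address KPAGE, under page directory PD) has been accessed through
   either alias, and clears both accessed bits for the next scan. */
static bool
page_recently_accessed (uint32_t *pd, void *upage, void *kpage)
{
  bool accessed = pagedir_is_accessed (pd, upage)
                  || pagedir_is_accessed (pd, kpage);
  pagedir_set_accessed (pd, upage, false);
  pagedir_set_accessed (pd, kpage, false);
  return accessed;
}
@end verbatim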
@node Managing the Table of File Mappings
@subsection Managing the Table of File Mappings
In order to implement sharing of read-only executable pages, you will need to track which files are mapped to which page.
We suggest that you create a table, or nested data-structure, to store this information.
This table only needs to store details about read-only executable pages.
Do not confuse this with memory-mapped files, which you will probably want to manage separately.
There are a couple of functions in @file{filesys/file.c} that you might find very helpful when implementing sharing.
The @func{file_compare} function can be used to check if two file structs (@code{file1} and @code{file2})
are referencing the same underlying file (i.e. inode).
The @func{file_hash} function is a hashing function that also operates on the internal underlying file representation.
@node Managing the Swap Partition
@subsection Managing the Swap Partition
PintOS provides a complete implementation of a swap partition manager that wraps around a block device.
This includes an internal swap table to track in-use and free swap slots.
The @func{swap_out} function can be used to pick an unused swap slot when evicting a page @code{vaddr} from its frame 'out' to the swap partition.
The @func{swap_in} function can be used to restore a page 'in' to memory at @code{vaddr} and free its swap @code{slot}.
You can also use the @func{swap_drop} function to free a swap @code{slot}, for example when the owning process is terminated.
Internally, the swap partition makes use of the @code{BLOCK_SWAP} block device for swapping.
It obtains the @struct{block} that represents it by calling @func{block_get_role}.
From the @file{vm/build} directory, use the command @code{pintos-mkdisk swap.dsk
--swap-size=@var{n}} to create a disk named @file{swap.dsk} that contains an @var{n}-MB swap partition.
Afterward, @file{swap.dsk} will automatically be attached as an extra disk when you run @command{pintos}.
Alternatively, you can tell @command{pintos} to use a temporary @var{n}-MB swap disk for a single
run with @option{--swap-size=@var{n}}.
Swap slots should be allocated lazily, that is, only when they are actually required by eviction.
Reading data pages from the executable and writing them to swap immediately at process startup is not lazy.
Swap slots should also not be reserved to store particular pages.
@node Managing Memory Mapped Files Back
@subsection Managing Memory Mapped Files
The file system is most commonly accessed with @code{read} and
@code{write} system calls. A secondary interface is to ``map'' the file
into virtual pages, using the @code{mmap} system call. The program can
then use memory instructions directly on the file data.
Suppose file @file{foo} is @t{0x1000} bytes (4 kB, or one page) long.
If @file{foo} is mapped into memory starting at address @t{0x5000}, then
any memory accesses to locations @t{0x5000}@dots{}@t{0x5fff} will access
the corresponding bytes of @file{foo}.
Here's a program that uses @code{mmap} to print a file to the console.
It opens the file specified on the command line, maps it at virtual
address @t{0x10000000}, writes the mapped data to the console (fd 1),
and unmaps the file.
@example
#include <stdio.h>
#include <syscall.h>
int main (int argc UNUSED, char *argv[])
@{
void *data = (void *) 0x10000000; /* @r{Address at which to map.} */
int fd = open (argv[1]); /* @r{Open file.} */
mapid_t map = mmap (fd, data); /* @r{Map file.} */
write (1, data, filesize (fd)); /* @r{Write file to console.} */
munmap (map); /* @r{Unmap file (optional).} */
return 0;
@}
@end example
A similar program with full error handling is included as @file{mcat.c}
in the @file{examples} directory, which also contains @file{mcp.c} as a
second example of @code{mmap}.
Your submission must be able to track what memory is used by memory
mapped files. This is necessary to properly handle page faults in the
mapped regions and to ensure that mapped files do not overlap any other
segments within the process.
@node Task 3 Suggested Order of Implementation
@section Suggested Order of Implementation
We suggest the following initial order of implementation:
@enumerate 1
@item
Frame table (@pxref{Managing the Frame Table}). Change @file{process.c}
to use your frame table allocator.
Do not implement swapping yet. If you run out of frames, fail the
allocator or panic the kernel.
After this step, your kernel should still pass all the task 2 test
cases.
@item
Supplemental page table and page fault handler (@pxref{Managing the Supplemental Page Table}).
Change @file{process.c} to lazy-load a process's executable
by recording the necessary information for each page in the supplemental page table during @code{load_segment}.
Implement the actual loading of these code and data segments in the page fault handler.
For now, consider only valid accesses.
After this step, your kernel should pass all of the task 2
functionality test cases, but only some of the robustness tests.
@item
From here, you can implement stack growth, mapped files, sharing and page
reclamation on process exit in parallel.
@item
The next step is to implement eviction (@pxref{Managing the Frame
Table}). Initially you could choose the page to evict randomly. At
this point, you need to consider how to manage accessed and dirty bits
and aliasing of user and kernel pages. Synchronization is also a
concern: how do you deal with it if process A faults on a page whose
frame process B is in the process of evicting?
@end enumerate
@node Task 3 Requirements
@section Requirements
This assignment is an open-ended design problem. We are going to say as
little as possible about how to do things. Instead we will focus on
what functionality we require your OS to support. We will expect
you to come up with a design that makes sense. You will have the
freedom to choose how to handle page faults, how to organize the swap
partition, how to implement paging, etc.
@menu
* Task 3 Design Document::
* Paging::
* Stack Growth::
* Memory Mapped Files::
@end menu
@node Task 3 Design Document
@subsection Design Document
When you submit your work for task 3, you must also submit a completed copy of
@uref{vm.tmpl, , the task 3 design document template}.
You can find a template design document for this task in @file{pintos/doc/vm.tmpl} and also on CATe.
You are free to submit your design document as either a @file{.txt} or @file{.pdf} file.
We recommend that you read the design document template before you start working on the task.
@xref{Task Documentation}, for a sample design document that goes along with a fictitious task.
@node Paging
@subsection Paging
Implement paging for segments loaded from executables. All of these
pages should be loaded lazily, that is, only as the kernel intercepts
page faults for them. Upon eviction, pages modified since load (e.g. data
segment pages), as indicated by the ``dirty bit'', should be written to swap.
Unmodified pages, including read-only pages, should never be written to
swap because they can always be read back from the executable.
Your design should allow for parallelism. If one page fault requires
I/O, in the meantime processes that do not fault should continue
executing and other page faults that do not require I/O should be able
to complete. This will require some synchronization effort.
You'll need to modify the core of the program loader, which is the loop
in @func{load_segment} in @file{userprog/process.c}. Each time around
the loop, @code{page_read_bytes} receives the number of bytes to read
from the executable file and @code{page_zero_bytes} receives the number
of bytes to initialize to zero following the bytes read. The two always
sum to @code{PGSIZE} (4,096). The handling of a page depends on these
variables' values:
@itemize @bullet
@item
If @code{page_read_bytes} equals @code{PGSIZE}, the page should be demand
paged from the underlying file on its first access.
@item
If @code{page_zero_bytes} equals @code{PGSIZE}, the page does not need to
be read from disk at all because it is all zeroes. You should handle
such pages by creating a new page consisting of all zeroes at the
first page fault.
@item
Otherwise, neither @code{page_read_bytes} nor @code{page_zero_bytes}
equals @code{PGSIZE}. In this case, an initial part of the page is to
be read from the underlying file and the remainder zeroed.
@end itemize
Watch out for executable segments that share a page in memory, and thus overlap in the page-table.
The provided code in @file{userprog/process.c} already handles this by checking during @code{load_segment} if any @code{upage} has already been installed. In such a case, rather than allocating/installing a new page of memory, the existing page is updated instead.
You will need to do something similar in your supplemental page table.
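Putting this together, the loop in @func{load_segment} might be reduced to recording an entry per page instead of reading data eagerly, roughly as sketched below. @code{spt_insert} is a hypothetical helper that records the information in the current process's supplemental page table; the remaining variables are the parameters and locals already present in @func{load_segment}.
@verbatim
/* Sketch only: lazy version of the load_segment() loop.  spt_insert()
   is a hypothetical helper; upage, ofs, read_bytes, zero_bytes and
   writable are the existing load_segment() parameters and locals. */
while (read_bytes > 0 || zero_bytes > 0)
  {
    size_t page_read_bytes = read_bytes < PGSIZE ? read_bytes : PGSIZE;
    size_t page_zero_bytes = PGSIZE - page_read_bytes;

    /* Record what this page should contain; the page fault handler
       will read it in on first access. */
    if (!spt_insert (upage, file, ofs, page_read_bytes, writable))
      return false;

    /* Advance. */
    read_bytes -= page_read_bytes;
    zero_bytes -= page_zero_bytes;
    ofs += page_read_bytes;
    upage += PGSIZE;
  }
@end verbatim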
@node Stack Growth
@subsection Stack Growth
Implement stack growth.
In task 2, the stack was limited to a single page at the top of the user virtual address space, and user programs would crash if they exceeded this limit.
Now, if the stack grows past its current size, you should allocate additional pages as necessary.
You should allocate additional stack pages only if the corresponding page fault ``appears'' to be a stack access.
To this end, you will need to devise a heuristic that attempts to distinguish stack accesses from other accesses.
User programs are buggy if they write to the stack below the stack
pointer, because typical real OSes may interrupt a process at any time
to deliver a ``signal,'' which pushes data on the stack.@footnote{This rule is
common but not universal. One modern exception is the
@uref{http://www.x86-64.org/documentation/abi.pdf, @var{x}86-64 System V
ABI}, which designates 128 bytes below the stack pointer as a ``red
zone'' that may not be modified by signal or interrupt handlers.}
However, the 80@var{x}86 @code{PUSH} instruction checks access
permissions before it adjusts the stack pointer, so it may cause a page
fault 4 bytes below the stack pointer. (Otherwise, @code{PUSH} would
not be restartable in a straightforward fashion.) Similarly, the
@code{PUSHA} instruction pushes 32 bytes at once, so it can fault 32
bytes below the stack pointer.
You will need to be able to obtain the current value of the user
program's stack pointer. Within a system call or a page fault generated
by a user program, you can retrieve it from the @code{esp} member of the
@struct{intr_frame} passed to @func{syscall_handler} or
@func{page_fault}, respectively. If you verify user pointers before
accessing them (@pxref{Accessing User Memory}), these are the only cases
you need to handle. On the other hand, if you depend on page faults to
detect invalid memory access, you will need to handle another case,
where a page fault occurs in the kernel. Since the processor only
saves the stack pointer when an exception causes a switch from user
to kernel mode, reading @code{esp} out of the @struct{intr_frame}
passed to @func{page_fault} would yield an undefined value, not the
user stack pointer. You will need to arrange another way, such as
saving @code{esp} into @struct{thread} on the initial transition
from user to kernel mode.
You should impose some absolute limit on stack size, as do most OSes.
Some OSes make the limit user-adjustable, e.g.@: with the
@command{ulimit} command on many Unix systems.
On many GNU/Linux systems, the default limit is 8 MB.
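For example, one common heuristic (sketched below) treats a fault as a stack access if the faulting address is no more than 32 bytes below the saved user stack pointer and lies within the region reserved for the stack. The constant @code{MAX_STACK_SIZE} is an arbitrary limit chosen here for illustration.
@verbatim
#include <stdbool.h>
#include <stdint.h>
#include "threads/vaddr.h"

/* Sketch only: a possible stack-growth heuristic.  ESP is the user
   stack pointer saved on entry to the kernel; MAX_STACK_SIZE is an
   arbitrary 8 MB limit chosen for illustration. */
#define MAX_STACK_SIZE (8 * 1024 * 1024)

static bool
is_stack_access (const void *fault_addr, const void *esp)
{
  const uint8_t *addr = fault_addr;
  return is_user_vaddr (fault_addr)
         && addr >= (const uint8_t *) esp - 32
         && addr >= (const uint8_t *) PHYS_BASE - MAX_STACK_SIZE;
}
@end verbatim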
The first stack page need not be allocated lazily. You can allocate
and initialize it with the command line arguments at load time, with
no need to wait for it to be faulted in.
All stack pages should be candidates for eviction. An evicted stack
page should be written to swap.
@node Memory Mapped Files
@subsection Memory Mapped Files
Implement memory mapped files, including the following system calls.
@deftypefn {System Call} mapid_t mmap (int @var{fd}, void *@var{addr})
Maps the file open as @var{fd} into the process's virtual address
space. The entire file is mapped into consecutive virtual pages
starting at @var{addr}.
Your VM system must lazily load pages in @code{mmap} regions and use the
@code{mmap}ed file itself as backing store for the mapping. That is,
evicting a page mapped by @code{mmap} writes it back to the file it was
mapped from.
If the file's length is not a multiple of @code{PGSIZE}, then some
bytes in the final mapped page ``stick out'' beyond the end of the
file. Set these bytes to zero when the page is faulted in from the
file system,
and discard them when the page is written back to disk.
If successful, this function returns a ``mapping ID'' that
uniquely identifies the mapping within the process. On failure,
it must return -1, which otherwise should not be a valid mapping id,
and the process's mappings must be unchanged.
A call to @code{mmap} may fail if the file open as @var{fd} has a
length of zero bytes. It must fail if @var{addr} is not page-aligned
or if the range of pages mapped overlaps any existing set of mapped
pages, including the space reserved for the stack
or pages mapped at executable load time.
It must also fail if @var{addr} is 0, because some PintOS code assumes
virtual page 0 is not mapped. Finally, file descriptors 0 and 1,
representing console input and output, are not mappable.
@end deftypefn
@deftypefn {System Call} void munmap (mapid_t @var{mapping})
Unmaps the mapping designated by @var{mapping}, which must be a
mapping ID returned by a previous call to @code{mmap} by the same
process that has not yet been unmapped.
@end deftypefn
All mappings are implicitly unmapped when a process exits, whether via
@code{exit} or by any other means. When a mapping is unmapped, whether
implicitly or explicitly, all pages written to by the process are
written back to the file, and pages not written must not be. The pages
are then removed from the process's list of virtual pages.
Closing or removing a file does not unmap any of its mappings. Once
created, a mapping is valid until @code{munmap} is called or the process
exits, following the Unix convention. @xref{Removing an Open File}, for
more information. You should use the @code{file_reopen} function to
obtain a separate and independent reference to the file for each of
its mappings.
If two or more processes map the same file, there is no requirement that
they see consistent data. Unix handles this by making the two mappings
share the same physical page, but the Unix @code{mmap} system call also has
an argument allowing the client to specify whether the page is shared or
private (i.e.@: copy-on-write).
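As an illustration, here is a sketch of the validity checks that @code{mmap} must perform before creating a mapping. The helper @code{fd_length} is hypothetical and stands for however your system call layer looks up the length of an open file; the overlap check is only indicated by a comment because it depends on your page table design.
@verbatim
#include <stdbool.h>
#include "filesys/off_t.h"
#include "threads/vaddr.h"

/* Hypothetical: returns the length of the file open as FD in the
   current process, or -1 if FD is not a valid file descriptor. */
static off_t fd_length (int fd);

/* Sketch only: the failure conditions required of mmap. */
static bool
mmap_args_valid (int fd, void *addr)
{
  if (fd == 0 || fd == 1)                  /* Console fds are not mappable. */
    return false;
  if (addr == NULL || pg_ofs (addr) != 0)  /* Must be non-zero, page-aligned. */
    return false;
  if (fd_length (fd) <= 0)                 /* Bad fd or zero-length file. */
    return false;
  /* The requested range must also not overlap the stack, the
     executable's segments or any existing mapping (not shown). */
  return true;
}
@end verbatim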
@subsection Accessing User Memory
You will need to adapt your code to access user memory (@pxref{Accessing
User Memory}) while handling a system call. Just as user processes may
access pages whose content is currently in a file or in swap space, so
can they pass addresses that refer to such non-resident pages to system
calls. Moreover, unless your kernel takes measures to prevent this,
a page may be evicted from its frame even while it is being accessed
by kernel code. If kernel code accesses such non-resident user pages,
a page fault will result.
While accessing user memory, your kernel must either be prepared to handle
such page faults, or it must prevent them from occurring. The kernel
must prevent such page faults while it is holding resources it would
need to acquire to handle these faults. In PintOS, such resources include
locks acquired by the device driver(s) that control the device(s) containing
the file system and swap space. As a concrete example, you must not
allow page faults to occur while a device driver accesses a user buffer
passed to @code{file_read}, because you would not be able to invoke
the driver while handling such faults.
Preventing such page faults requires cooperation between the code within
which the access occurs and your page eviction code. For instance,
you could extend your frame table to record when a page contained in
a frame must not be evicted. (This is also referred to as ``pinning''
or ``locking'' the page in its frame.) Pinning restricts your page
replacement algorithm's choices when looking for pages to evict, so be
sure to pin pages no longer than necessary, and avoid pinning pages when
it is not necessary.
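For instance, here is a sketch of pinning a user buffer around a read from the file system. @func{frame_pin} and @func{frame_unpin} are hypothetical helpers over your frame table that fault the pages in and mark their frames as not evictable; only @func{file_read} is part of the provided code.
@verbatim
#include "filesys/file.h"

/* Hypothetical helpers: fault in and pin, or unpin, the frames backing
   SIZE bytes of user memory starting at UADDR. */
static void frame_pin (void *uaddr, unsigned size);
static void frame_unpin (void *uaddr, unsigned size);

/* Sketch only: keep the frames behind BUFFER resident while the file
   system copies data into them. */
static off_t
read_into_user_buffer (struct file *file, void *buffer, unsigned size)
{
  frame_pin (buffer, size);
  off_t bytes_read = file_read (file, buffer, size);
  frame_unpin (buffer, size);
  return bytes_read;
}
@end verbatim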
@node Task 3 FAQ
@section FAQ
@table @b
@item How much code will I need to write?
Here's a summary of our reference solution, produced by the
@command{diffstat} program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.
@verbatim
Makefile.build | 4 +-
threads/init.c | 5 +
threads/interrupt.c | 2 +
threads/thread.c | 26 +-
threads/thread.h | 37 ++-
userprog/exception.c | 12 +-
userprog/pagedir.c | 10 +-
userprog/process.c | 355 +++++++++++++-----
userprog/syscall.c | 612 ++++++++++++++++++++++++++++++-
userprog/syscall.h | 1 +
vm/frame.c | 162 +++++++++
vm/frame.h | 23 +
vm/page.c | 293 ++++++++++++++++
vm/page.h | 51 ++
14 files changed, 1489 insertions(+), 104 deletions(-)
@end verbatim
This summary is relative to the PintOS base code, but the reference
solution for task 3 starts from the reference solution to task 2.
@xref{Task 2 FAQ}, for the summary of task 2.
The reference solution represents just one possible solution. Many
other solutions are also possible and many of those differ greatly from
the reference solution. Some excellent solutions may not modify all the
files modified by the reference solution, and some may modify files not
modified by the reference solution.
@item Do we need a working Task 2 to implement Task 3?
Yes.
@item How complex does our page replacement algorithm need to be?
@anchor{VM Extra Credit}
If you implement an advanced page replacement algorithm,
such as the ``second chance'' or the ``clock'' algorithms,
then you will get more marks for this part of the task.
You should also implement sharing: when multiple processes are created that use
the same executable file, share read-only pages among those processes
instead of creating separate copies of read-only segments for each
process. If you carefully designed your data structures,
sharing of read-only pages should not make this part significantly
harder.
@item Do we need to handle paging for both user virtual memory and kernel virtual memory?
No, you only need to implement paging for user virtual memory.
One of the golden rules of OS development is ``Don't page out the paging code!''
@item How do we resume a process after we have handled a page fault?
Returning from @func{page_fault} resumes the current user process
(@pxref{Internal Interrupt Handling}).
It will then retry the instruction to which the instruction pointer points.
@item Why do user processes sometimes fault above the stack pointer?
You might notice that, in the stack growth tests, the user program faults
on an address that is above the user program's current stack pointer,
even though the @code{PUSH} and @code{PUSHA} instructions would cause
faults 4 and 32 bytes below the current stack pointer.
This is not unusual. The @code{PUSH} and @code{PUSHA} instructions are
not the only instructions that can trigger user stack growth.
For instance, a user program may allocate stack space by decrementing the
stack pointer using a @code{SUB $n, %esp} instruction, and then use a
@code{MOV ..., m(%esp)} instruction to write to a stack location within
the allocated space that is @var{m} bytes above the current stack pointer.
Such accesses are perfectly valid, and your kernel must grow the
user program's stack to allow those accesses to succeed.
@item Does the virtual memory system need to support data segment growth?
No. The size of the data segment is determined by the linker. We still
have no dynamic allocation in PintOS (although it is possible to
``fake'' it at the user level by using memory-mapped files). Supporting
data segment growth should add little additional complexity to a
well-designed system.
@item Why should I use @code{PAL_USER} for allocating page frames?
@anchor{Why PAL_USER?}
Passing @code{PAL_USER} to @func{palloc_get_page} causes it to allocate
memory from the user pool, instead of the main kernel pool. Running out
of pages in the user pool just causes user programs to page, but running
out of pages in the kernel pool will cause many failures because so many
kernel functions need to obtain memory.
You can layer some other allocator on top of @func{palloc_get_page} if
you like, but it should be the underlying mechanism.
Also, you can use the @option{-ul} kernel command-line option to limit
the size of the user pool, which makes it easy to test your VM
implementation with various user memory sizes.
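
For instance, a thin frame allocator layered over @func{palloc_get_page}
might look like the following sketch; @code{choose_victim} and
@code{evict_frame} stand in for your own (hypothetical) eviction code.

@verbatim
/* Obtains a frame from the user pool for a user page, evicting an
   existing frame if the pool is exhausted.  Returns the kernel
   virtual address of the frame. */
static void *
frame_alloc (enum palloc_flags flags)
{
  void *kpage = palloc_get_page (PAL_USER | flags);

  while (kpage == NULL)
    {
      /* No free user-pool pages: write a victim out to swap or to
         its backing file, free it, and retry the allocation. */
      evict_frame (choose_victim ());
      kpage = palloc_get_page (PAL_USER | flags);
    }
  return kpage;
}
@end verbatim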
@item What should we do if the stack grows into a @code{mmap} file?
This should not be possible.
The specification of @code{mmap} rules out creating mappings within the possible stack space,
so you should reject any attempt to create a mapping there.
The stack should also not be able to grow beyond its reserved stack space,
which rules out any such overlap from the other direction.
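
One possible check, assuming a hypothetical @code{page_lookup} function
over the current process's supplemental page table and the same
@code{STACK_LIMIT} constant as above, is to verify every page the new
mapping would touch before creating it:

@verbatim
/* Returns true if a LENGTH-byte mapping starting at ADDR would
   collide with an already-used page or with the reserved stack
   region, in which case the mmap request should be rejected. */
static bool
mapping_overlaps (const void *addr, size_t length)
{
  const char *stack_bottom = (char *) PHYS_BASE - STACK_LIMIT;
  size_t ofs;

  if ((const char *) addr + length > stack_bottom)
    return true;                      /* Would reach into stack space. */

  for (ofs = 0; ofs < length; ofs += PGSIZE)
    if (page_lookup (pg_round_down ((const char *) addr + ofs)) != NULL)
      return true;                    /* Page already mapped or loaded. */

  return false;
}
@end verbatim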
@item What should I expect from the Task 3 code-review?
The code-review for this task will be conducted with each group in-person.
Our Task 3 code-review will cover @strong{four} main areas:
functional correctness, efficiency, design quality and general coding style.
@itemize @bullet
@item For @strong{functional correctness}, we will be looking to see if your code for page sharing accurately tracks the accessed/dirty status of each shared page and if your stack-fault heuristic is correct.
We will also be checking if your code for page allocation, page fault handling and memory mapping/unmapping is free of any race conditions.
@item For @strong{efficiency}, we will be making sure that you only load executable code segments on demand (lazy loading).
We will also be checking to make sure that you have made efficient use of the swap space and that your code is free of any memory leaks.
@item For @strong{design quality}, we will be looking to see if you have implemented an advanced eviction algorithm
(i.e. an algorithm that considers the properties of the pages in memory in order to choose a good eviction candidate).
@item For @strong{general coding style}, we will be paying attention to all of the usual elements of good style
that you should be used to from last year (e.g. consistent code layout, appropriate use of comments, avoiding magic numbers, etc.)
as well as your use of git (e.g. commit frequency and commit message quality).
In this task, we will be paying particular attention to any additional efficiency improvements you have made to your eviction algorithm (e.g. encouraging fairness) or the system's overall use of memory (e.g. increased page sharing).
We will also be looking at any use of hash tables, specifically checking that your hash functions are chosen to avoid frequent collisions.
@end itemize
@end table

doc/vm.tmpl (new file, 130 lines)
+--------------------------+
|          OS 211          |
|  TASK 3: VIRTUAL MEMORY  |
|      DESIGN DOCUMENT     |
+--------------------------+
---- GROUP ----
>> Fill in the names and email addresses of your group members.
FirstName LastName <email@domain.example>
FirstName LastName <email@domain.example>
FirstName LastName <email@domain.example>
---- PRELIMINARIES ----
>> If you have any preliminary comments on your submission, or notes for the
>> markers, please give them here.
>> Please cite any offline or online sources you consulted while preparing your
>> submission, other than the PintOS documentation, course text, lecture notes
>> and course staff.
PAGE TABLE MANAGEMENT AND LAZY LOADING
======================================
---- DATA STRUCTURES ----
>> A1: (2 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration that relates to your
>> supplemental page table and table of file mappings.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> A2: (2 marks)
>> Describe your code for finding the location of a given page that is not
>> currently loaded into memory.
>> A3: (2 marks)
>> How have you implemented sharing of read only pages?
>> A4: (2 marks)
>> When a process P obtains a frame that was previously used by a process Q,
>> how do you adjust the page directory of process Q (and any other data
>> structures) to reflect the frame Q no longer has?
---- SYNCHRONIZATION ----
>> A5: (2 marks)
>> Explain how you handle access to user pages that are not present when a
>> system call is made.
---- RATIONALE ----
>> A6: (2 marks)
>> Why did you choose the data structure(s) that you did for representing the
>> supplemental page table and table of file mappings?
FRAME TABLE MANAGEMENT AND EVICTION
===================================
---- DATA STRUCTURES ----
>> B1: (1 mark)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration that relates to your
>> frame table.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> B2: (2 marks)
>> When a frame is required but none is free, some frame must be evicted.
>> Describe your code for choosing a frame to evict.
---- SYNCHRONIZATION ----
>> B3: (2 marks)
>> When two user processes both need a new frame at the same time, how are
>> races avoided?
>> You should consider both when there are and are not free frames
>> available in memory.
>> B4: (2 marks)
>> A page fault in process P can cause another process Q's frame to be evicted.
>> How do you ensure that Q cannot access or modify that page during the
>> eviction process?
>> B5: (2 marks)
>> A page fault in process P can cause another process Q's frame to be evicted.
>> How do you avoid a race between P evicting Q's frame and Q faulting the page
>> back in?
>> B6: (2 marks)
>> Explain how your synchronization design prevents deadlock.
>> (You may want to refer to the necessary conditions for deadlock.)
---- RATIONALE ----
>> B7: (2 marks)
>> There is an obvious trade-off between parallelism and the complexity of your
>> synchronisation methods.
>> Explain where your design falls along this continuum and why you chose to
>> design it this way.
MEMORY MAPPED FILES
===================
---- DATA STRUCTURES ----
>> C1: (2 marks)
>> Copy here the declaration of each new or changed `struct' or `struct' member,
>> global or static variable, `typedef', or enumeration that relates to your
>> file mapping table.
>> Identify the purpose of each in roughly 25 words.
---- ALGORITHMS ----
>> C2: (2 marks)
>> Explain how you determine whether a new file mapping overlaps with any
>> existing segment and how you handle such a case.
---- RATIONALE ----
>> C3: (1 mark)
>> Mappings created with "mmap" have similar semantics to those of data
>> demand-paged from executables.
>> How does your code-base take advantage of this?