@node Task 3--Virtual Memory
@chapter Task 3: Virtual Memory

By now you should have some familiarity with the inner workings of
PintOS.  Your OS can handle multiple threads of execution with proper
synchronization, and can load multiple user programs at once.  However,
the number and size of programs that can run is limited by the machine's
main memory size.  In this assignment, you will remove that limitation.

You will build this assignment on top of the last one.  Test programs
from task 2 should also work with task 3.  You should take care to
fix any bugs in your task 2 submission before you start work on
task 3, because those bugs will most likely cause the same problems
in task 3.

You will continue to handle PintOS disks and file systems the same way
you did in the previous assignment (@pxref{Using the File System}).

@menu
* Task 3 Background::
* Task 3 Suggested Order of Implementation::
* Task 3 Requirements::
* Task 3 FAQ::
@end menu

@node Task 3 Background
@section Background

@menu
* Task 3 Source Files::
* Memory Terminology::
* Resource Management Overview::
* Managing the Supplemental Page Table::
* Managing the Frame Table::
* Accessed and Dirty Bits::
* Managing the Table of File Mappings::
* Managing the Swap Partition::
* Managing Memory Mapped Files Back::
@end menu

@node Task 3 Source Files
@subsection Source Files

You will work in the @file{vm} directory for this task.  The
@file{vm} directory contains only @file{Makefile}s.  The only
change from @file{userprog} is that this new @file{Makefile} turns on
the setting @option{-DVM}.  All code you write will be in new
files or in files introduced in earlier tasks.

You will probably be encountering just a few files for the first time:

@table @file
@item devices/block.h
@itemx devices/block.c
Provides sector-based read and write access to block devices.

@item devices/swap.h
@itemx devices/swap.c
Provides page-based read and write access to the swap partition.
You will use this interface, a wrapper around a block device, to access the swap partition.
@end table

@node Memory Terminology
@subsection Memory Terminology

Careful definitions are needed to keep discussion of virtual memory from
being confusing.  Thus, we begin by presenting some terminology for
memory and storage.  Some of these terms should be familiar from task
2 (@pxref{Virtual Memory Layout}), but much of it is new.

@menu
* Pages::
* Frames::
* Page Tables::
* Swap Slots::
@end menu

@node Pages
@subsubsection Pages

A @dfn{page}, sometimes called a @dfn{virtual page}, is a continuous
region of virtual memory 4,096 bytes (the @dfn{page size}) in length.  A
page must be @dfn{page-aligned}, that is, start on a virtual address
evenly divisible by the page size.  Thus, a 32-bit virtual address can
be divided into a 20-bit @dfn{page number} and a 12-bit @dfn{page
offset} (or just @dfn{offset}), like this:

@example
@group
               31               12 11        0
              +-------------------+-----------+
              |    Page Number    |   Offset  |
              +-------------------+-----------+
                       Virtual Address
@end group
@end example

Each process has an independent set of @dfn{user virtual pages}, which
are those pages below virtual address @code{PHYS_BASE}, typically
@t{0xc0000000} (3 GB).  The set of @dfn{kernel virtual pages}, on the
other hand, is global, remaining the same regardless of what thread or
process is active.  The kernel may access both user virtual and kernel virtual pages,
but a user process may access only its own user virtual pages.  @xref{Virtual
Memory Layout}, for more information.

PintOS provides several useful functions for working with virtual
addresses.  @xref{Virtual Addresses}, for details.
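
As a minimal sketch of the split described above, the helpers below
extract the page number and offset from a 32-bit virtual address and
round an address down to its page boundary.  The names
@code{page_number}, @code{page_offset}, @code{page_round_down}, and
@code{make_vaddr} are hypothetical and chosen only for illustration;
they are not the functions PintOS provides.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12u                   /* Width of the offset field. */
#define PAGE_SIZE (1u << PAGE_BITS)     /* 4,096 bytes. */
#define PAGE_MASK (PAGE_SIZE - 1u)      /* Low 12 bits set. */

/* Upper 20 bits of a 32-bit virtual address. */
uint32_t
page_number (uint32_t vaddr)
{
  return vaddr >> PAGE_BITS;
}

/* Lower 12 bits of a 32-bit virtual address. */
uint32_t
page_offset (uint32_t vaddr)
{
  return vaddr & PAGE_MASK;
}

/* Address of the first byte of the page that contains VADDR. */
uint32_t
page_round_down (uint32_t vaddr)
{
  return vaddr & ~PAGE_MASK;
}

/* Recombines a page number and an offset into a virtual address. */
uint32_t
make_vaddr (uint32_t page_no, uint32_t offset)
{
  assert (page_no < (1u << 20));
  assert (offset < PAGE_SIZE);
  return (page_no << PAGE_BITS) | offset;
}
```

An address is page-aligned exactly when its offset is zero, which is the
same as saying that rounding it down leaves it unchanged.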

@node Frames
@subsubsection Frames

A @dfn{frame}, sometimes called a @dfn{physical frame} or a @dfn{page
frame}, is a continuous region of physical memory.  Like pages, frames
must be page-size and page-aligned.  Thus, a 32-bit physical address can
be divided into a 20-bit @dfn{frame number} and a 12-bit @dfn{frame
offset} (or just @dfn{offset}), like this:

@example
@group
               31               12 11        0
              +-------------------+-----------+
              |    Frame Number   |   Offset  |
              +-------------------+-----------+
                       Physical Address
@end group
@end example

The 80@var{x}86 doesn't provide any way to directly access memory at a
physical address.  PintOS works around this by mapping kernel virtual
memory directly to physical memory: the first page of kernel virtual
memory is mapped to the first frame of physical memory, the second page
to the second frame, and so on.  Thus, frames can be accessed through
kernel virtual memory.

PintOS provides functions for translating between physical addresses and
kernel virtual addresses.  @xref{Virtual Addresses}, for details.
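
The direct mapping amounts to adding or subtracting a constant.  The
sketch below illustrates the arithmetic only, assuming that kernel
virtual memory starts at @t{0xc0000000} as stated earlier.  The names
@code{KERNEL_BASE}, @code{phys_to_kvirt}, and @code{kvirt_to_phys} are
hypothetical stand-ins, not the PintOS functions themselves.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_BASE 0xc0000000u     /* Assumed value of PHYS_BASE. */

/* Kernel virtual address through which physical address PADDR can be
   accessed.  PADDR must be small enough for the sum to fit in 32 bits. */
uint32_t
phys_to_kvirt (uint32_t paddr)
{
  assert (paddr <= UINT32_MAX - KERNEL_BASE);
  return paddr + KERNEL_BASE;
}

/* Physical address that kernel virtual address KVADDR is mapped to. */
uint32_t
kvirt_to_phys (uint32_t kvaddr)
{
  assert (kvaddr >= KERNEL_BASE);
  return kvaddr - KERNEL_BASE;
}
```

Because the offset within the page is untouched by the addition, the
first kernel page lands on the first frame, the second page on the
second frame, and so on.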

@node Page Tables
@subsubsection Page Tables

In PintOS, a @dfn{page table} is a data structure that the CPU uses to
translate a virtual address to a physical address, that is, from a page
to a frame.  The page table format is dictated by the 80@var{x}86
architecture.  PintOS provides page table management code in
@file{pagedir.c} (@pxref{Page Table}).

The diagram below illustrates the relationship between pages and frames.
The virtual address, on the left, consists of a page number and an
offset.  The page table translates the page number into a frame number,
which is combined with the unmodified offset to obtain the physical
address, on the right.

@example
@group
                         +----------+
        .--------------->|Page Table|-----------.
       /                 +----------+            |
   31  |  12 11 0                            31  |  12 11 0
  +---------+----+                          +---------+----+
  |Page Nr  | Ofs|                          |Frame Nr | Ofs|
  +---------+----+                          +---------+----+
   Virt Addr   |                             Phys Addr   |
                \_______________________________________/
@end group
@end example

@node Swap Slots
@subsubsection Swap Slots

A @dfn{swap slot} is a continuous, page-size region of disk space in the
swap partition.  Although hardware limitations dictating the placement of
slots are looser than for pages and frames, swap slots should be
page-aligned because there is no downside in doing so.

@node Resource Management Overview
@subsection Resource Management Overview

You will need to design the following data structures:

@table @asis
@item Supplemental page table

Enables page fault handling by supplementing the page table.
@xref{Managing the Supplemental Page Table}.

@item Frame table

Allows efficient implementation of eviction policy.
@xref{Managing the Frame Table}.

@item Table of file mappings

Processes may map files into their virtual memory space.  You need a
table to track which files are mapped into which pages.
@end table

You do not necessarily need to implement three completely distinct data
structures: it may be convenient to wholly or partially merge related
resources into a unified data structure.

For each data structure, you need to determine what information each
element should contain.  You also need to decide on the data structure's
scope, either local (per-process) or global (applying to the whole
system), and how many instances are required within its scope.

To simplify your design, you may store these data structures in
non-pageable memory (i.e.@: kernel space).
That means that you can be sure that pointers among them will remain valid.

Possible choices of data structures include arrays, lists, bitmaps, and
hash tables.  An array is often the simplest approach, but a sparsely
populated array wastes memory.  Lists are also simple, but traversing a
long list to find a particular position wastes time.  Both arrays and
lists can be resized, but lists more efficiently support insertion and
deletion in the middle.

PintOS includes a bitmap data structure in @file{lib/kernel/bitmap.c}
and @file{lib/kernel/bitmap.h}.  A bitmap is an array of bits, each of
which can be true or false.  Bitmaps are typically used to track usage
in a set of (identical) resources: if resource @var{n} is in use, then
bit @var{n} of the bitmap is true.  PintOS bitmaps are fixed in size,
although you could extend their implementation to support resizing.
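
To illustrate the idea of tracking usage with one bit per resource, here
is a small standalone sketch.  It deliberately does not use the PintOS
bitmap interface; the names @code{slot_in_use}, @code{slot_acquire}, and
@code{slot_release}, and the count of 64 resources, are assumptions made
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 64u              /* Hypothetical number of resources. */
#define SLOT_NONE ((size_t) -1)     /* Returned when nothing is free. */

/* Bit n is 1 if resource n is in use. */
static uint8_t slot_bits[SLOT_COUNT / 8];

/* Returns true if resource N is currently in use. */
bool
slot_in_use (size_t n)
{
  assert (n < SLOT_COUNT);
  return (slot_bits[n / 8] >> (n % 8)) & 1u;
}

/* Marks the lowest-numbered free resource as in use and returns its
   index, or SLOT_NONE if every resource is taken. */
size_t
slot_acquire (void)
{
  for (size_t n = 0; n < SLOT_COUNT; n++)
    if (!slot_in_use (n))
      {
        slot_bits[n / 8] |= (uint8_t) (1u << (n % 8));
        return n;
      }
  return SLOT_NONE;
}

/* Marks resource N, which must be in use, as free again. */
void
slot_release (size_t n)
{
  assert (slot_in_use (n));
  slot_bits[n / 8] &= (uint8_t) ~(1u << (n % 8));
}
```

A kernel version would also need a lock around these operations, which
the sketch omits.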

PintOS also includes a hash table data structure (@pxref{Hash Table}).
PintOS hash tables efficiently support insertions and deletions over a
wide range of table sizes.

Although more complex data structures may yield performance or other
benefits, they may also needlessly complicate your implementation.
Thus, we do not recommend implementing any advanced data structure
(e.g.@: a balanced binary tree) as part of your design.

@node Managing the Supplemental Page Table
@subsection Managing the Supplemental Page Table

The @dfn{supplemental page table} extends the page table with
additional data about each page.  It is needed because of the
limitations imposed by the page table's format.  Such a data structure
is also often referred to as a ``page table''; we add the word ``supplemental''
to reduce confusion.

The supplemental page table is used for at least two purposes.  Most
importantly, on a page fault, the kernel looks up the virtual page that
faulted in the supplemental page table to find out what data should be
there.  Second, the kernel consults the supplemental page table when a
process terminates, to decide what resources to free.

You may organize the supplemental page table as you wish.  There are at
least two basic approaches to its organization: in terms of segments or
in terms of pages.  Optionally, you may use the page table itself as an
index to track the members of the supplemental page table.  You will
have to modify the PintOS page table implementation in @file{pagedir.c}
to do so.  We recommend this approach for advanced students only.
@xref{Page Table Entry Format}, for more information.

The most important user of the supplemental page table is the page fault
handler.  In task 2, a page fault always indicated a bug in the
kernel or a user program.  In task 3, this is no longer true.  Now, a
page fault might only indicate that the page must be brought in from a
file or swap.  You will have to implement a more sophisticated page
fault handler to handle these cases.  Your page fault handler, which you
should implement by modifying @func{page_fault} in
@file{userprog/exception.c}, needs to do roughly the following:

@enumerate 1
@item
Locate the page that faulted in the supplemental page table.  If the
memory reference is valid, use the supplemental page table entry to
locate the data that goes in the page, which might be in the file
system, or in a swap slot, or it might simply be an all-zero page.  When
you implement sharing, the page's data might even already be in a page
frame, but not in the page table.

If the supplemental page table indicates that the user process should
not expect any data at the address it was trying to access, or if the
page lies within kernel virtual memory, or if the access is an attempt
to write to a read-only page, then the access is invalid.  Any invalid
access terminates the process and thereby frees all of its resources.

@item
Obtain a frame to store the page.  @xref{Managing the Frame Table}, for
details.

When you implement sharing, the data you need may already be in a frame,
in which case you must be able to locate that frame.

@item
Fetch the data into the frame, by reading it from the file system or
swap, zeroing it, etc.

When you implement sharing, the page you need may already be in a frame,
in which case no action is necessary in this step.

@item
Point the page table entry for the faulting virtual address to the frame.
You can use the functions in @file{userprog/pagedir.c}.
@end enumerate
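
The first step above, deciding whether a fault is valid and where the
page's data lives, can be sketched as a pure function.  This is only one
possible shape for it.  The entry layout, the enumerations, and the
function @code{classify_fault} are hypothetical, and the lookup of the
entry itself is left to whatever data structure you choose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KERNEL_BASE 0xc0000000u     /* Assumed value of PHYS_BASE. */

/* Where the contents of a non-resident page live (hypothetical). */
enum page_source { SRC_FILE, SRC_SWAP, SRC_ZERO };

/* One supplemental page table entry (hypothetical layout). */
struct spt_entry
  {
    uint32_t upage;                 /* Page-aligned user virtual address. */
    enum page_source source;        /* Where to fetch the data from. */
    bool writable;                  /* False for read-only pages. */
  };

/* What the fault handler should do next. */
enum fault_action { FAULT_KILL, FAULT_READ_FILE, FAULT_READ_SWAP, FAULT_ZERO };

/* Decides how to service a fault at FAULT_ADDR.  ENTRY is the result of
   looking up the faulting page, or NULL if no entry exists for it. */
enum fault_action
classify_fault (const struct spt_entry *entry, uint32_t fault_addr,
                bool is_write)
{
  if (fault_addr >= KERNEL_BASE)    /* Kernel virtual memory. */
    return FAULT_KILL;
  if (entry == NULL)                /* No data expected at this address. */
    return FAULT_KILL;
  assert (entry->upage == (fault_addr & ~0xfffu));
  if (is_write && !entry->writable) /* Write to a read-only page. */
    return FAULT_KILL;

  switch (entry->source)
    {
    case SRC_FILE: return FAULT_READ_FILE;
    case SRC_SWAP: return FAULT_READ_SWAP;
    case SRC_ZERO: return FAULT_ZERO;
    }
  return FAULT_KILL;
}
```

Keeping the decision separate from the I/O makes it easier to test and
keeps the remaining steps (obtaining a frame, filling it, installing the
mapping) in one place.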

@node Managing the Frame Table
@subsection Managing the Frame Table

The @dfn{frame table} contains one entry for each frame that contains a
user page.  Each entry in the frame table contains a pointer to the
page, if any, that currently occupies it, and other data of your choice.
The frame table allows PintOS to efficiently implement an eviction
policy, by choosing a page to evict when no frames are free.

The frames used for user pages should be obtained from the ``user
pool,'' by calling @code{palloc_get_page(PAL_USER)}.  You must use
@code{PAL_USER} to avoid allocating from the ``kernel pool,'' which
could cause some test cases to fail unexpectedly (@pxref{Why
PAL_USER?}).  If you modify @file{palloc.c} as part of your frame table
implementation, be sure to retain the distinction between the two pools.

The most important operation on the frame table is obtaining an unused
frame.  This is easy when a frame is free.  When none is free, a frame
must be made free by evicting some page from its frame.

If no frame can be evicted without allocating a swap slot, but swap is
full, panic the kernel.  Real OSes apply a wide range of policies to
recover from or prevent such situations, but these policies are beyond
the scope of this task.

The process of eviction comprises roughly the following steps:

@enumerate 1
@item
Choose a frame to evict, using your page replacement algorithm.  The
``accessed'' and ``dirty'' bits in the page table, described below, will
come in handy.

@item
Remove references to the frame from any page table that refers to it.

Until you have implemented sharing, only a single page should refer to
a frame at any given time.

@item
If necessary, write the page to the file system or to swap.
@end enumerate

The evicted frame may then be used to store a different page.
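
The steps above leave the choice of victim open.  As one possibility,
the following standalone sketch shows the ``clock'' algorithm over a
small array of frames.  The array, its size, and the names
@code{frame_touch} and @code{clock_pick_victim} are hypothetical, and
the accessed bit is modelled as a plain field rather than read from the
page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FRAME_COUNT 4u              /* Hypothetical number of user frames. */

/* One frame table entry (hypothetical layout).  A real kernel would read
   and clear the accessed bit through the page table instead. */
struct frame
  {
    bool accessed;
  };

static struct frame frames[FRAME_COUNT];
static size_t clock_hand;           /* Next frame the hand will inspect. */

/* Models the CPU setting the accessed bit when frame I is used. */
void
frame_touch (size_t i)
{
  assert (i < FRAME_COUNT);
  frames[i].accessed = true;
}

/* Advances the hand until it finds a frame whose accessed bit is clear,
   clearing the bit of every frame it passes.  Returns the victim's index. */
size_t
clock_pick_victim (void)
{
  for (;;)
    {
      size_t i = clock_hand;
      clock_hand = (clock_hand + 1) % FRAME_COUNT;
      if (!frames[i].accessed)
        return i;
      frames[i].accessed = false;   /* Give the page a second chance. */
    }
}
```

The loop always terminates: after one full sweep every accessed bit has
been cleared, so the second sweep finds a victim at the latest.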

@node Accessed and Dirty Bits
@subsubsection Accessed and Dirty Bits

80@var{x}86 hardware provides some assistance for implementing page
replacement algorithms, through a pair of bits in the page table entry
(PTE) for each page.  On any read or write to a page, the CPU sets the
@dfn{accessed bit} to 1 in the page's PTE, and on any write, the CPU
sets the @dfn{dirty bit} to 1.  The CPU never resets these bits to 0,
but the OS may do so.

You need to be aware of @dfn{aliases}, that is, two (or more) pages that
refer to the same frame.  When an aliased frame is accessed, the
accessed and dirty bits are updated in only one page table entry (the
one for the page used for access).  The accessed and dirty bits for the
other aliases are not updated.

In PintOS, every user virtual page is aliased to its kernel virtual
page.  You must manage these aliases somehow.  For example, your code
could check and update the accessed and dirty bits for both addresses.
Alternatively, the kernel could avoid the problem by only accessing user
data through the user virtual address.

Other aliases should only arise once you implement sharing, or if there is a bug in your code.

@xref{Page Table Accessed and Dirty Bits}, for details of the functions
to work with accessed and dirty bits.
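
One way to handle the user and kernel aliases is to consult both page
table entries whenever you inspect or reset the bits.  The sketch below
models the two entries with a simplified structure; @code{struct
pte_bits} and both function names are hypothetical and stand in for
whatever page table accessors your design uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Accessed and dirty bits of one page table entry (simplified model). */
struct pte_bits
  {
    bool accessed;
    bool dirty;
  };

/* A frame is dirty if it was written through either alias. */
bool
frame_is_dirty (const struct pte_bits *user_pte,
                const struct pte_bits *kernel_pte)
{
  assert (user_pte != NULL && kernel_pte != NULL);
  return user_pte->dirty || kernel_pte->dirty;
}

/* Clears the accessed bit of both aliases and returns whether either
   of them was set beforehand. */
bool
frame_test_and_clear_accessed (struct pte_bits *user_pte,
                               struct pte_bits *kernel_pte)
{
  assert (user_pte != NULL && kernel_pte != NULL);
  bool was_accessed = user_pte->accessed || kernel_pte->accessed;
  user_pte->accessed = false;
  kernel_pte->accessed = false;
  return was_accessed;
}
```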

@node Managing the Table of File Mappings
@subsection Managing the Table of File Mappings

In order to implement sharing of read-only executable pages, you will need to track which files are mapped to which pages.
We suggest that you create a table, or nested data structure, to store this information.
This table only needs to store details about read-only executable pages.
Do not confuse this with memory-mapped files, which you will probably want to manage separately.

There are a couple of functions in @file{filesys/file.c} that you might find very helpful when implementing sharing.
The @func{file_compare} function can be used to check whether two file structs (@code{file1} and @code{file2})
refer to the same underlying file (i.e.@: inode).
The @func{file_hash} function is a hashing function that also operates on the internal underlying file representation.

@node Managing the Swap Partition
@subsection Managing the Swap Partition

PintOS provides a complete implementation of a swap partition manager that wraps around a block device.
This includes an internal swap table to track in-use and free swap slots.
The @func{swap_out} function picks an unused swap slot and writes the page at @code{vaddr} out to it, for use when evicting that page from its frame.
The @func{swap_in} function restores a page from its swap @code{slot} to memory at @code{vaddr} and frees the slot.
You can also use the @func{swap_drop} function to free a swap @code{slot}, for example when the owning process terminates.

Internally, the swap partition makes use of the @code{BLOCK_SWAP} block device for swapping.
It obtains the @struct{block} that represents it by calling @func{block_get_role}.
From the @file{vm/build} directory, use the command @code{pintos-mkdisk swap.dsk
--swap-size=@var{n}} to create a disk named @file{swap.dsk} that contains an @var{n}-MB swap partition.
Afterward, @file{swap.dsk} will automatically be attached as an extra disk when you run @command{pintos}.
Alternatively, you can tell @command{pintos} to use a temporary @var{n}-MB swap disk for a single
run with @option{--swap-size=@var{n}}.

Swap slots should be allocated lazily, that is, only when they are actually required by eviction.
Reading data pages from the executable and writing them to swap immediately at process startup is not lazy.
Swap slots should also not be reserved to store particular pages.

@node Managing Memory Mapped Files Back
@subsection Managing Memory Mapped Files

The file system is most commonly accessed with @code{read} and
@code{write} system calls.  A secondary interface is to ``map'' the file
into virtual pages, using the @code{mmap} system call.  The program can
then use memory instructions directly on the file data.

Suppose file @file{foo} is @t{0x1000} bytes (4 kB, or one page) long.
If @file{foo} is mapped into memory starting at address @t{0x5000}, then
any memory accesses to locations @t{0x5000}@dots{}@t{0x5fff} will access
the corresponding bytes of @file{foo}.
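
The correspondence between addresses and file bytes is a simple
subtraction.  The sketch below shows it for a single mapping, assuming a
hypothetical @code{struct mapping} that records where the file starts in
memory and how long it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One file mapping (hypothetical layout). */
struct mapping
  {
    uint32_t base;      /* Page-aligned address where the file starts. */
    uint32_t length;    /* File length in bytes. */
  };

/* Returns true if ADDR refers to a byte of the mapped file. */
bool
mapping_contains (const struct mapping *m, uint32_t addr)
{
  return addr >= m->base && addr - m->base < m->length;
}

/* File offset of the byte at ADDR, which must lie inside the mapping. */
uint32_t
mapping_offset (const struct mapping *m, uint32_t addr)
{
  assert (mapping_contains (m, addr));
  return addr - m->base;
}
```

With @code{base} set to @t{0x5000} and @code{length} set to
@t{0x1000}, address @t{0x5abc} corresponds to byte @t{0xabc} of
@file{foo}, and @t{0x6000} is outside the mapping.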

Here's a program that uses @code{mmap} to print a file to the console.
It opens the file specified on the command line, maps it at virtual
address @t{0x10000000}, writes the mapped data to the console (fd 1),
and unmaps the file.

@example
#include <stdio.h>
#include <syscall.h>
int main (int argc UNUSED, char *argv[])
@{
  void *data = (void *) 0x10000000;     /* @r{Address at which to map.} */

  int fd = open (argv[1]);              /* @r{Open file.} */
  mapid_t map = mmap (fd, data);        /* @r{Map file.} */
  write (1, data, filesize (fd));       /* @r{Write file to console.} */
  munmap (map);                         /* @r{Unmap file (optional).} */
  return 0;
@}
@end example

A similar program with full error handling is included as @file{mcat.c}
in the @file{examples} directory, which also contains @file{mcp.c} as a
second example of @code{mmap}.

Your submission must be able to track what memory is used by memory
mapped files.  This is necessary to properly handle page faults in the
mapped regions and to ensure that mapped files do not overlap any other
segments within the process.

@node Task 3 Suggested Order of Implementation
@section Suggested Order of Implementation

We suggest the following initial order of implementation:

@enumerate 1
@item
Frame table (@pxref{Managing the Frame Table}).  Change @file{process.c}
to use your frame table allocator.

Do not implement swapping yet.  If you run out of frames, fail the
allocator or panic the kernel.

After this step, your kernel should still pass all the task 2 test
cases.

@item
Supplemental page table and page fault handler (@pxref{Managing the Supplemental Page Table}).
Change @file{process.c} to lazy-load a process's executable
by recording the necessary information for each page in the supplemental page table during @code{load_segment}.
Implement the actual loading of these code and data segments in the page fault handler.
For now, consider only valid accesses.

After this step, your kernel should pass all of the task 2
functionality test cases, but only some of the robustness tests.

@item
From here, you can implement stack growth, mapped files, sharing and page
reclamation on process exit in parallel.

@item
The next step is to implement eviction (@pxref{Managing the Frame
Table}).  Initially you could choose the page to evict randomly.  At
this point, you need to consider how to manage accessed and dirty bits
and aliasing of user and kernel pages.  Synchronization is also a
concern: how do you deal with it if process A faults on a page whose
frame process B is in the process of evicting?
@end enumerate

@node Task 3 Requirements
@section Requirements

This assignment is an open-ended design problem.  We are going to say as
little as possible about how to do things.  Instead we will focus on
what functionality we require your OS to support.  We will expect
you to come up with a design that makes sense.  You will have the
freedom to choose how to handle page faults, how to organize the swap
partition, how to implement paging, etc.

@menu
* Task 3 Design Document::
* Paging::
* Stack Growth::
* Memory Mapped Files::
@end menu

@node Task 3 Design Document
@subsection Design Document

When you submit your work for task 3, you must also submit a completed copy of
@uref{vm.tmpl, , the task 3 design document template}.
You can find a template design document for this task in @file{pintos/doc/vm.tmpl} and also on CATe.
You are free to submit your design document as either a @file{.txt} or @file{.pdf} file.
We recommend that you read the design document template before you start working on the task.
@xref{Task Documentation}, for a sample design document that goes along with a fictitious task.

@node Paging
@subsection Paging

Implement paging for segments loaded from executables.  All of these
pages should be loaded lazily, that is, only as the kernel intercepts
page faults for them.  Upon eviction, pages modified since load (e.g.@: data
segment pages), as indicated by the ``dirty bit'', should be written to swap.
Unmodified pages, including read-only pages, should never be written to
swap because they can always be read back from the executable.

Your design should allow for parallelism.  If one page fault requires
I/O, in the meantime processes that do not fault should continue
executing and other page faults that do not require I/O should be able
to complete.  This will require some synchronization effort.

You'll need to modify the core of the program loader, which is the loop
in @func{load_segment} in @file{userprog/process.c}.  Each time around
the loop, @code{page_read_bytes} receives the number of bytes to read
from the executable file and @code{page_zero_bytes} receives the number
of bytes to initialize to zero following the bytes read.  The two always
sum to @code{PGSIZE} (4,096).  The handling of a page depends on these
variables' values:

@itemize @bullet
@item
If @code{page_read_bytes} equals @code{PGSIZE}, the page should be demand
paged from the underlying file on its first access.

@item
If @code{page_zero_bytes} equals @code{PGSIZE}, the page does not need to
be read from disk at all because it is all zeroes.  You should handle
such pages by creating a new page consisting of all zeroes at the
first page fault.

@item
Otherwise, neither @code{page_read_bytes} nor @code{page_zero_bytes}
equals @code{PGSIZE}.  In this case, an initial part of the page is to
be read from the underlying file and the remainder zeroed.
@end itemize
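
The three cases can be told apart from the two byte counts alone.  The
following sketch shows one way to record the distinction when you fill
in a supplemental page table entry.  The enumeration and both function
names are hypothetical, and @code{PGSIZE} is redefined here only to keep
the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define PGSIZE 4096u                /* Redefined for a standalone example. */

/* How a page's initial contents are produced (hypothetical). */
enum page_kind { PAGE_FROM_FILE, PAGE_ALL_ZERO, PAGE_PARTIAL };

/* Classifies a page from the two per-page byte counts. */
enum page_kind
classify_page (size_t page_read_bytes, size_t page_zero_bytes)
{
  assert (page_read_bytes + page_zero_bytes == PGSIZE);
  if (page_read_bytes == PGSIZE)
    return PAGE_FROM_FILE;          /* Demand paged from the file. */
  if (page_zero_bytes == PGSIZE)
    return PAGE_ALL_ZERO;           /* No disk read needed. */
  return PAGE_PARTIAL;              /* Read a prefix, zero the rest. */
}

/* Number of bytes to read for the next page when READ_BYTES of the
   segment remain to be read from the file. */
size_t
next_page_read_bytes (size_t read_bytes)
{
  return read_bytes < PGSIZE ? read_bytes : PGSIZE;
}
```

For a segment with 5,000 bytes left to read, the next page reads 4,096
bytes and the one after reads the remaining 904, with 3,192 bytes
zeroed.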

Watch out for executable segments that share a page in memory, and thus overlap in the page table.
The provided code in @file{userprog/process.c} already handles this by checking during @code{load_segment} if any @code{upage} has already been installed. In such a case, rather than allocating/installing a new page of memory, the existing page is updated instead.
You will need to do something similar in your supplemental page table.

@node Stack Growth
@subsection Stack Growth

Implement stack growth.
In task 2, the stack was limited to a single page at the top of the user virtual address space and user programs would crash if they exceeded this limit.
Now, if the stack grows past its current size, you should allocate additional pages as necessary.

You should allocate additional stack pages only if the corresponding page fault ``appears'' to be a stack access.
To this end, you will need to devise a heuristic that attempts to distinguish stack accesses from other accesses.

User programs are buggy if they write to the stack below the stack
pointer, because typical real OSes may interrupt a process at any time
to deliver a ``signal,'' which pushes data on the stack.@footnote{This rule is
common but not universal.  One modern exception is the
@uref{http://www.x86-64.org/documentation/abi.pdf, @var{x}86-64 System V
ABI}, which designates 128 bytes below the stack pointer as a ``red
zone'' that may not be modified by signal or interrupt handlers.}
However, the 80@var{x}86 @code{PUSH} instruction checks access
permissions before it adjusts the stack pointer, so it may cause a page
fault 4 bytes below the stack pointer.  (Otherwise, @code{PUSH} would
not be restartable in a straightforward fashion.)  Similarly, the
@code{PUSHA} instruction pushes 32 bytes at once, so it can fault 32
bytes below the stack pointer.

You will need to be able to obtain the current value of the user
program's stack pointer.  Within a system call or a page fault generated
by a user program, you can retrieve it from the @code{esp} member of the
@struct{intr_frame} passed to @func{syscall_handler} or
@func{page_fault}, respectively.  If you verify user pointers before
accessing them (@pxref{Accessing User Memory}), these are the only cases
you need to handle.  On the other hand, if you depend on page faults to
detect invalid memory access, you will need to handle another case,
where a page fault occurs in the kernel.  Since the processor only
saves the stack pointer when an exception causes a switch from user
to kernel mode, reading @code{esp} out of the @struct{intr_frame}
passed to @func{page_fault} would yield an undefined value, not the
user stack pointer.  You will need to arrange another way, such as
saving @code{esp} into @struct{thread} on the initial transition
from user to kernel mode.

You should impose some absolute limit on stack size, as do most OSes.
Some OSes make the limit user-adjustable, e.g.@: with the
@command{ulimit} command on many Unix systems.
On many GNU/Linux systems, the default limit is 8 MB.
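
Putting the observations above together, a stack-access heuristic might
look like the sketch below.  It is one possible heuristic, not a
required one.  The 8 MB limit, the value of @code{KERNEL_BASE}, and the
function name @code{looks_like_stack_access} are assumptions made for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_BASE 0xc0000000u             /* Assumed value of PHYS_BASE. */
#define STACK_LIMIT (8u * 1024u * 1024u)    /* Hypothetical 8 MB cap. */
#define PUSHA_BYTES 32u                     /* PUSHA can fault this far below esp. */

static_assert (STACK_LIMIT < KERNEL_BASE, "stack region must fit below the kernel");

/* Returns true if a fault at FAULT_ADDR, taken while the user stack
   pointer was ESP, plausibly belongs to the stack. */
bool
looks_like_stack_access (uint32_t fault_addr, uint32_t esp)
{
  if (fault_addr >= KERNEL_BASE)                /* Not a user address. */
    return false;
  if (fault_addr < KERNEL_BASE - STACK_LIMIT)   /* Beyond the size limit. */
    return false;

  /* Accept accesses at or above the stack pointer, and accesses up to
     32 bytes below it, which covers both PUSH and PUSHA. */
  return fault_addr >= esp || esp - fault_addr <= PUSHA_BYTES;
}
```

The check on @code{STACK_LIMIT} is what enforces the absolute limit on
stack size; the comparison against @code{esp} is what separates likely
stack accesses from stray pointers.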

The first stack page need not be allocated lazily.  You can allocate
and initialize it with the command line arguments at load time, with
no need to wait for it to be faulted in.

All stack pages should be candidates for eviction.  An evicted stack
page should be written to swap.

@node Memory Mapped Files
@subsection Memory Mapped Files

Implement memory mapped files, including the following system calls.

@deftypefn {System Call} mapid_t mmap (int @var{fd}, void *@var{addr})
Maps the file open as @var{fd} into the process's virtual address
space.  The entire file is mapped into consecutive virtual pages
starting at @var{addr}.

Your VM system must lazily load pages in @code{mmap} regions and use the
@code{mmap}ed file itself as backing store for the mapping.  That is,
evicting a page mapped by @code{mmap} writes it back to the file it was
mapped from.

If the file's length is not a multiple of @code{PGSIZE}, then some
bytes in the final mapped page ``stick out'' beyond the end of the
file.  Set these bytes to zero when the page is faulted in from the
file system,
and discard them when the page is written back to disk.

If successful, this function returns a ``mapping ID'' that
uniquely identifies the mapping within the process.  On failure,
it must return -1, which otherwise should not be a valid mapping id,
and the process's mappings must be unchanged.

A call to @code{mmap} may fail if the file open as @var{fd} has a
length of zero bytes.  It must fail if @var{addr} is not page-aligned
or if the range of pages mapped overlaps any existing set of mapped
pages, including the space reserved for the stack
or pages mapped at executable load time.
It must also fail if @var{addr} is 0, because some PintOS code assumes
virtual page 0 is not mapped.  Finally, file descriptors 0 and 1,
representing console input and output, are not mappable.
@end deftypefn
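
Several of the conditions above can be checked with simple arithmetic
before any page is touched.  The sketch below shows those checks,
together with the page count and the per-page read size for a file whose
length is not a multiple of @code{PGSIZE}.  All function names are
hypothetical, @code{PGSIZE} is redefined for a standalone example, and
rejecting zero-length files is a choice the text permits rather than
requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096u                /* Redefined for a standalone example. */

/* Number of pages needed to map LENGTH bytes. */
size_t
mmap_page_count (size_t length)
{
  return (length + PGSIZE - 1) / PGSIZE;
}

/* Bytes to read from the file into page PAGE_IDX of the mapping.  The
   remainder of the page, if any, is zero-filled. */
size_t
mmap_page_read_bytes (size_t length, size_t page_idx)
{
  assert (page_idx < mmap_page_count (length));
  size_t remaining = length - page_idx * PGSIZE;
  return remaining < PGSIZE ? remaining : PGSIZE;
}

/* Checks the arguments that can be validated without consulting the
   process's existing pages. */
bool
mmap_args_ok (int fd, uint32_t addr, size_t length)
{
  if (fd == 0 || fd == 1)           /* Console input and output. */
    return false;
  if (addr == 0)                    /* Virtual page 0 must stay unmapped. */
    return false;
  if (addr % PGSIZE != 0)           /* Must be page-aligned. */
    return false;
  if (length == 0)                  /* Chosen here: reject empty files. */
    return false;
  return true;
}

/* Returns true if two page ranges share at least one page. */
bool
pages_overlap (uint32_t a, size_t a_pages, uint32_t b, size_t b_pages)
{
  uint64_t a_end = (uint64_t) a + (uint64_t) a_pages * PGSIZE;
  uint64_t b_end = (uint64_t) b + (uint64_t) b_pages * PGSIZE;
  return a < b_end && b < a_end;
}
```

A complete implementation would run @code{pages_overlap} against every
range already in use by the process, including the stack region and the
pages recorded at executable load time.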

@deftypefn {System Call} void munmap (mapid_t @var{mapping})
Unmaps the mapping designated by @var{mapping}, which must be a
mapping ID returned by a previous call to @code{mmap} by the same
process that has not yet been unmapped.
@end deftypefn

All mappings are implicitly unmapped when a process exits, whether via
@code{exit} or by any other means.  When a mapping is unmapped, whether
implicitly or explicitly, all pages written to by the process are
written back to the file, and pages not written must not be.  The pages
are then removed from the process's list of virtual pages.

Closing or removing a file does not unmap any of its mappings.  Once
created, a mapping is valid until @code{munmap} is called or the process
exits, following the Unix convention.  @xref{Removing an Open File}, for
more information.  You should use the @code{file_reopen} function to
obtain a separate and independent reference to the file for each of
its mappings.

If two or more processes map the same file, there is no requirement that
they see consistent data.  Unix handles this by making the two mappings
share the same physical page, but the Unix @code{mmap} system call also has
an argument allowing the client to specify whether the page is shared or
private (i.e.@: copy-on-write).

@subsection Accessing User Memory

You will need to adapt your code to access user memory (@pxref{Accessing
User Memory}) while handling a system call.  Just as user processes may
access pages whose content is currently in a file or in swap space, so
can they pass addresses that refer to such non-resident pages to system
calls.  Moreover, unless your kernel takes measures to prevent this,
a page may be evicted from its frame even while it is being accessed
by kernel code.  If kernel code accesses such non-resident user pages,
a page fault will result.

While accessing user memory, your kernel must either be prepared to handle
such page faults, or it must prevent them from occurring.  The kernel
must prevent such page faults while it is holding resources it would
need to acquire to handle these faults.  In PintOS, such resources include
locks acquired by the device driver(s) that control the device(s) containing
the file system and swap space.  As a concrete example, you must not
allow page faults to occur while a device driver accesses a user buffer
passed to @code{file_read}, because you would not be able to invoke
the driver while handling such faults.

Preventing such page faults requires cooperation between the code within
which the access occurs and your page eviction code.  For instance,
you could extend your frame table to record when a page contained in
a frame must not be evicted.  (This is also referred to as ``pinning''
or ``locking'' the page in its frame.)  Pinning restricts your page
replacement algorithm's choices when looking for pages to evict, so be
sure to pin pages no longer than necessary, and avoid pinning pages when
it is not necessary.
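
As a rough illustration of pinning, the sketch below keeps a pin count
per frame and skips pinned frames when searching for something to evict.
The table, its size, and the names @code{frame_pin}, @code{frame_unpin},
and @code{first_evictable} are hypothetical, and the synchronization a
real frame table needs is left out.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_COUNT 4u              /* Hypothetical number of user frames. */
#define NO_FRAME ((size_t) -1)      /* Returned when every frame is pinned. */

/* One frame table entry (hypothetical layout). */
struct frame_entry
  {
    unsigned pin_count;             /* Nonzero while eviction is forbidden. */
  };

static struct frame_entry frame_entries[FRAME_COUNT];

/* Forbids eviction of frame I until a matching frame_unpin call. */
void
frame_pin (size_t i)
{
  assert (i < FRAME_COUNT);
  frame_entries[i].pin_count++;
}

/* Undoes one earlier frame_pin call on frame I. */
void
frame_unpin (size_t i)
{
  assert (i < FRAME_COUNT && frame_entries[i].pin_count > 0);
  frame_entries[i].pin_count--;
}

/* Returns the first unpinned frame at or after START, wrapping around,
   or NO_FRAME if every frame is pinned. */
size_t
first_evictable (size_t start)
{
  for (size_t n = 0; n < FRAME_COUNT; n++)
    {
      size_t i = (start + n) % FRAME_COUNT;
      if (frame_entries[i].pin_count == 0)
        return i;
    }
  return NO_FRAME;
}
```

A counter rather than a single flag lets nested or concurrent users pin
the same frame independently.  The @code{NO_FRAME} case shows why pins
should be held briefly: if every frame is pinned, eviction cannot make
progress.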

@node Task 3 FAQ
@section FAQ

@table @b
@item How much code will I need to write?

Here's a summary of our reference solution, produced by the
@command{diffstat} program.  The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.

@verbatim
 Makefile.build       |    4 +-
 threads/init.c       |    5 +
 threads/interrupt.c  |    2 +
 threads/thread.c     |   26 +-
 threads/thread.h     |   37 ++-
 userprog/exception.c |   12 +-
 userprog/pagedir.c   |   10 +-
 userprog/process.c   |  355 +++++++++++++-----
 userprog/syscall.c   |  612 ++++++++++++++++++++++++++++++-
 userprog/syscall.h   |    1 +
 vm/frame.c           |  162 +++++++++
 vm/frame.h           |   23 +
 vm/page.c            |  293 ++++++++++++++++
 vm/page.h            |   51 ++
 14 files changed, 1489 insertions(+), 104 deletions(-)
@end verbatim

This summary is relative to the PintOS base code, but the reference
solution for task 3 starts from the reference solution to task 2.
@xref{Task 2 FAQ}, for the summary of task 2.

The reference solution represents just one possible solution.  Many
other solutions are also possible and many of those differ greatly from
the reference solution.  Some excellent solutions may not modify all the
files modified by the reference solution, and some may modify files not
modified by the reference solution.

@item Do we need a working Task 2 to implement Task 3?

Yes.

@item How complex does our page replacement algorithm need to be?
@anchor{VM Extra Credit}
If you implement an advanced page replacement algorithm,
such as the ``second chance'' or the ``clock'' algorithms,
then you will get more marks for this part of the task.

You should also implement sharing: when multiple processes are created that use
the same executable file, share read-only pages among those processes
instead of creating separate copies of read-only segments for each
process.  If you carefully designed your data structures,
sharing of read-only pages should not make this part significantly
harder.
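
To make the ``clock'' idea concrete, here is a minimal, self-contained sketch of one way the victim search might look.  It is not the reference solution.  The @code{struct clock_frame} type and the @code{clock_pick_victim} function are hypothetical, and the @code{accessed} field stands in for the hardware accessed bit that a real kernel would query and clear through @func{pagedir_is_accessed} and @func{pagedir_set_accessed}.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-frame state used only for this illustration. */
struct clock_frame
  {
    bool accessed;   /* Stand-in for the page's hardware accessed bit. */
    bool pinned;     /* True if the frame must not be evicted. */
  };

/* Advances the clock hand *HAND over FRAMES[0..CNT) and returns the
   index of the chosen victim, or CNT if every frame is pinned.
   A recently accessed frame gets a second chance: its accessed flag
   is cleared and the hand moves on. */
size_t
clock_pick_victim (struct clock_frame *frames, size_t cnt, size_t *hand)
{
  /* Two sweeps suffice: the first clears accessed flags, so the
     second must stop at the first unpinned frame, if there is one. */
  for (size_t step = 0; step < 2 * cnt; step++)
    {
      size_t i = *hand;
      *hand = (*hand + 1) % cnt;

      if (frames[i].pinned)
        continue;
      if (frames[i].accessed)
        frames[i].accessed = false;
      else
        return i;
    }
  return cnt;
}
```

Because the hand position persists between calls, frames that were just given a second chance are the last to be reconsidered, which is what distinguishes this approach from always scanning from the start of the table.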

@item Do we need to handle paging for both user virtual memory and kernel virtual memory?

No, you only need to implement paging for user virtual memory.
One of the golden rules of OS development is ``Don't page out the paging code!''

@item How do we resume a process after we have handled a page fault?

Returning from @func{page_fault} resumes the current user process
(@pxref{Internal Interrupt Handling}).
It will then retry the instruction to which the instruction pointer points.

@item Why do user processes sometimes fault above the stack pointer?

You might notice that, in the stack growth tests, the user program faults
on an address that is above the user program's current stack pointer,
even though the @code{PUSH} and @code{PUSHA} instructions would cause
faults 4 and 32 bytes below the current stack pointer.

This is not unusual.  The @code{PUSH} and @code{PUSHA} instructions are
not the only instructions that can trigger user stack growth.
For instance, a user program may allocate stack space by decrementing the
stack pointer using a @code{SUB $n, %esp} instruction, and then use a
@code{MOV ..., m(%esp)} instruction to write to a stack location within
the allocated space that is @var{m} bytes above the current stack pointer.
Such accesses are perfectly valid, and your kernel must grow the
user program's stack to allow those accesses to succeed.
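
One possible heuristic consistent with the cases above is sketched below.  It treats a fault as a stack access if the address lies in user space, falls inside an assumed maximum stack region, and is no more than 32 bytes (the @code{PUSHA} case) below the stack pointer.  The function name, the constant names and the 8 MB stack limit are assumptions made for this illustration; your design may choose different limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PHYS_BASE_ADDR 0xC0000000u        /* Default PHYS_BASE in PintOS. */
#define STACK_LIMIT (8u * 1024 * 1024)    /* Assumed maximum stack size. */
#define PUSHA_SLACK 32u                   /* PUSHA faults 32 bytes below esp. */

/* Returns true if a fault at FAULT_ADDR, taken while the user stack
   pointer was ESP, looks like a legitimate stack access. */
bool
is_stack_access (uint32_t fault_addr, uint32_t esp)
{
  /* Kernel addresses are never part of the user stack. */
  if (fault_addr >= PHYS_BASE_ADDR)
    return false;

  /* Refuse to grow the stack past the assumed limit. */
  if (fault_addr < PHYS_BASE_ADDR - STACK_LIMIT)
    return false;

  /* Accept anything at or above ESP - 32.  This covers PUSH and PUSHA
     as well as MOV accesses above the stack pointer. */
  uint32_t lowest = esp >= PUSHA_SLACK ? esp - PUSHA_SLACK : 0;
  return fault_addr >= lowest;
}
```

Note that the value passed as @code{esp} must be the user program's stack pointer at the time of the fault, which is not necessarily the value found in the interrupt frame if the fault occurred in kernel code.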

@item Does the virtual memory system need to support data segment growth?

No.  The size of the data segment is determined by the linker.  We still
have no dynamic allocation in PintOS (although it is possible to
``fake'' it at the user level by using memory-mapped files).  Supporting
data segment growth should add little additional complexity to a
well-designed system.

@item Why should I use @code{PAL_USER} for allocating page frames?
@anchor{Why PAL_USER?}

Passing @code{PAL_USER} to @func{palloc_get_page} causes it to allocate
memory from the user pool, instead of the main kernel pool.  Running out
of pages in the user pool just causes user programs to page, but running
out of pages in the kernel pool will cause many failures because so many
kernel functions need to obtain memory.
You can layer some other allocator on top of @func{palloc_get_page} if
you like, but it should be the underlying mechanism.

Also, you can use the @option{-ul} kernel command-line option to limit
the size of the user pool, which makes it easy to test your VM
implementation with various user memory sizes.

@item What should we do if the stack grows into a @code{mmap} file?

This should not be possible.
The specification of @code{mmap()} rules out creating mappings within the possible stack space,
so you should abort any attempt to create a mapping within that space.
The stack should also not be able to grow beyond its reserved stack space,
thus ruling out the possibility of any such overlap.
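
A check along these lines could reject such mappings up front.  This is only a sketch: the function name is hypothetical and the 8 MB reserved stack region is an assumption, so substitute whatever limit your design uses.  It covers only the overlap with the stack region, not the other conditions under which a mapping must fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PHYS_BASE_ADDR 0xC0000000u        /* Default PHYS_BASE in PintOS. */
#define STACK_LIMIT (8u * 1024 * 1024)    /* Assumed reserved stack size. */
#define STACK_BOTTOM (PHYS_BASE_ADDR - STACK_LIMIT)

/* Returns true if the range [START, START + LENGTH) lies entirely
   below the reserved stack region, and false otherwise. */
bool
mmap_avoids_stack (uint32_t start, uint32_t length)
{
  /* Compare against the remaining room rather than computing
     START + LENGTH, which could wrap around 32 bits. */
  return start < STACK_BOTTOM && length <= STACK_BOTTOM - start;
}
```

Since the stack region sits at the top of user virtual memory, requiring the mapping to end at or below the bottom of that region also guarantees that it does not extend into kernel addresses.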

@item What should I expect from the Task 3 code-review?

The code-review for this task will be conducted with each group in person.
Our Task 3 code-review will cover @strong{four} main areas:
functional correctness, efficiency, design quality and general coding style.

@itemize @bullet
@item For @strong{functional correctness}, we will be looking to see if your code for page sharing accurately tracks the accessed/dirty status of each shared page and if your stack-fault heuristic is correct.
We will also be checking if your code for page allocation, page fault handling and memory mapping/unmapping is free of any race conditions.

@item For @strong{efficiency}, we will be making sure that you only load executable code segments on demand (lazy loading).
We will also be checking to make sure that you have made efficient use of the swap space and that your code is free of any memory leaks.

@item For @strong{design quality}, we will be looking to see if you have implemented an advanced eviction algorithm
(i.e. an algorithm that considers the properties of the pages in memory in order to choose a good eviction candidate).

@item For @strong{general coding style}, we will be paying attention to all of the usual elements of good style
that you should be used to from last year (e.g. consistent code layout, appropriate use of comments, avoiding magic numbers, etc.)
as well as your use of git (e.g. commit frequency and commit message quality).
In this task, we will be paying particular attention to any additional efficiency improvements you have made to your eviction algorithm (e.g. encouraging fairness) or the system's overall use of memory (e.g. increased page sharing).
We will also be looking at any use of hash tables, specifically checking that your hash functions are chosen to avoid frequent collisions.
@end itemize
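
As an illustration of the hash function point, consider a table keyed by user page address.  Page addresses are page-aligned, so their low 12 bits are always zero, and a hash table that selects a bucket from the low bits of the hash value would place every page in the same bucket if the address itself were used as the hash.  The sketch below contrasts that with hashing the page number using FNV-1a.  Both function names are hypothetical; in your kernel you would more likely call the helpers declared in @file{lib/kernel/hash.h}, such as @func{hash_int} or @func{hash_bytes}.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 kB pages: the low 12 address bits are the offset. */

/* Poor choice: the low 12 bits of every page address are zero, so any
   bucket index taken from the low bits is always 0. */
uint32_t
naive_page_hash (uint32_t upage)
{
  return upage;
}

/* Better choice: discard the page offset, then mix the four bytes of
   the page number with 32-bit FNV-1a. */
uint32_t
page_hash (uint32_t upage)
{
  uint32_t vpn = upage >> PAGE_SHIFT;
  uint32_t h = 2166136261u;             /* FNV offset basis. */
  for (int i = 0; i < 4; i++)
    {
      h ^= (vpn >> (8 * i)) & 0xffu;
      h *= 16777619u;                   /* FNV prime. */
    }
  return h;
}
```

With 16 buckets selected by the low four bits, the naive version maps the adjacent pages @code{0x08048000} and @code{0x08049000} to the same bucket, whereas the mixed version separates them.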

@end table