In this lab you will explore page tables and modify them to speed up certain system calls and to detect which pages have been accessed.
Before you start coding, read Chapter 3 of the xv6 book, and related files:
In this lab, there are several questions for you to answer. Questions are in boxes with a light orange background. Write each question and its answer in your notebook. Take photo(s) of your questions/answers and submit the photo(s) on Canvas.
The Linux grep command can be helpful on some questions. For example, suppose a question asks you about the macro TRAPFRAME. You can discover the definition and uses of the macro TRAPFRAME by issuing the following Linux grep command in the kernel directory.
$ grep TRAPFRAME *.h memlayout.h:// TRAPFRAME (p->trapframe, used by the trampoline) memlayout.h:#define TRAPFRAME (TRAMPOLINE - PGSIZE) memlayout.h:#define USYSCALL (TRAPFRAME - PGSIZE) % grep TRAPFRAME *.c proc.c: if(mappages(pagetable, TRAPFRAME, PGSIZE, proc.c: uvmunmap(pagetable, TRAPFRAME, 1, 0); proc.c: uvmunmap(pagetable, TRAPFRAME, 1, 0);
In the directory of your xv6-labs, create two files: answers-pgtbl.txt and time.txt that I may use if I run your code using your zip file submission. The answers-pgtbl.txt is a blank file. The time.txt file contains the number of hours you spent on the lab.
$ echo > answers-pgtbl.txt $ echo 8 > time.txt
To start the lab, switch to the pgtbl branch:
$ git fetch $ git checkout pgtbl $ make clean
The first six questions are warmup. Refer to Figure 3-2 of our Xv6 Textbook.
1. Draw a diagram of a RISC-V virtual address (VA).
2. Draw a diagram of a RISC-V physical address (PA).
3. Draw a diagram of a RISC-V page table entry? Underneath of your diagram, write the meanings of the PTE fields/bits? For example, the V bit means this PTE refers to a valid page that is part of the proc's address page. Another way to say this is the VA is mapped to a PA.
4. How many bits is a physical address (PA)?
5. How many bits is a virtual address (VA)?
6. How many bits is a physical page number?
Some operating systems (e.g., Linux) speed up certain system calls by sharing data in a read-only region between userspace and the kernel. This eliminates the need for kernel crossings when performing these system calls. To help you learn how to insert mappings into a page table, your first task is to implement this optimization for the getpid() system call in xv6.
When each process is created, map one read-only page at USYSCALL (a virtual address defined in memlayout.h). At the start of this page, store a struct usyscall (also defined in memlayout.h), and initialize it to store the PID of the current process. This initial mapping is done in the kernel, which creates processes. You must also map this page in the proc's pagetable. For this lab, ugetpid() has been provided on the userspace side and will automatically use the USYSCALL mapping. You will receive full credit for this part of the lab if the ugetpid test case passes when running pgtbltest.
You should study how allocproc() allocates the TRAPFRAME page to get ideas for how to allocate the USYSCALL page.
7. How does allocproc() allocate the TRAPFRAME page?
8. How does kalloc() allocate a page? What is the data structure that contains the free pages?
The file memlayout.h defines the virtual memory layout of the kernel with various macros such as KERNBASE, PHYSTOP, MAXVA, TRAMPOLINE, and TRAPFRAME. The TRAMPOLINE page is mapped to the top of the virtual address space, and TRAPFRAME is the page immediately below TRAMPOLINE. You can study memlayout.h as you examine Figure 3.3 of our xv6 text book.
As mentioned above, the file kernel/memlayout.h contains two pieces of information that you will use - USYSCALL and struct usyscall. You must define these in kernel/memlayout.h.
#define USYSCALL (TRAPFRAME - PGSIZE)
The struct usyscall is defined specifically for this problem. usyscall has one member, int pid.
struct usyscall { int pid; // Process ID };
9. What are the values of the macros TRAPFRAME and PGSIZE?
10. Explain how the expression defining the macro MAXVA relates the the format of the VA.
The file user/ulib.c contains a user library function ugetpid(), which is called from the function ugetpid_test in user/pgtbltest.c. The function is already present in user/ulib.c. The code is shown here for easy study.
int ugetpid(void) { struct usyscall *u = (struct usyscall *)USYSCALL; return u->pid; }
Choose permission bits that allow userspace to only read the page. The file riscv.h defines page protection attributes. The PTE access bit are shown as follows. The USYSCALL page should have PTE_U and PTE_R set.
#define PTE_V (1L << 0) // valid #define PTE_R (1L << 1) #define PTE_W (1L << 2) #define PTE_X (1L << 3) #define PTE_U (1L << 4) // user can access
There are several things that need to be done over the lifecycle of a new page, which must be deallocated as well as allocated. For inspiration, understand the trapframe handling in kernel/proc.c. Search for trapframe to get yourself going. You will discover that the struct proc has a member trapframe.
11. The above blue box with this problem's specification states that ugetpid() has been provided on the userspace side. Where is ugetpid() located and how is is included in the executable test program pgtbltest? Explain how it works.
12. Draw a diagram of kalloc's data structure of free pages.
We provide the user program pgtbltest.c. It tests the USYSCALL page with the pid defined in the problem and the pgaccess() system call defined in the third problem of this lab. pgtbltest has two testing functions.
int ugetpid(void) { struct usyscall *u = (struct usyscall *)USYSCALL; return u->pid; }
To help you visualize RISC-V page tables, and perhaps to aid future debugging, your second task is to write a function that prints the contents of a page table.
Define a function called vmprint(). It should take a pagetable_t argument, and print that pagetable in the format described below. Insert
if(p->pid==1) vmprint(p->pagetable)
in exec.c just before the return argc, to print the first process's page table.
Now when you start xv6 it should print output like this, describing the page table of the first process at the point when it has just finished exec()ing init:
page table 0x0000000087f6b000 ..0: pte 0x0000000021fd9c01 pa 0x0000000087f67000 .. ..0: pte 0x0000000021fd9801 pa 0x0000000087f66000 .. .. ..0: pte 0x0000000021fda01b pa 0x0000000087f68000 .. .. ..1: pte 0x0000000021fd9417 pa 0x0000000087f65000 .. .. ..2: pte 0x0000000021fd9007 pa 0x0000000087f64000 .. .. ..3: pte 0x0000000021fd8c17 pa 0x0000000087f63000 ..255: pte 0x0000000021fda801 pa 0x0000000087f6a000 .. ..511: pte 0x0000000021fda401 pa 0x0000000087f69000 .. .. ..509: pte 0x0000000021fdcc13 pa 0x0000000087f73000 .. .. ..510: pte 0x0000000021fdd007 pa 0x0000000087f74000 .. .. ..511: pte 0x0000000020001c0b pa 0x0000000080007000 init: starting sh
13. Why do all of the physical addresses end with three hexadecimal zeros?
Your code might emit different physical addresses than those shown above. The number of entries and the virtual addresses should be the same.
uint64 child = PTE2PA(pte);The macro PTE2PA is defined in riscv.h, and it converts a page table entry to a physical address. PTE2PA is given by the following.
#define PTE2PA(pte) (((pte) >> 10) << 12)
14. Explain how the macro PTE2PA converts a PTE into a PA.
... MORE from file not shown ... // vm.c void vmprint(pagetable_t); ← Add this line void kvminit(void); void kvminithart(void); ... MORE from file not shown ...
void _vmprint(pagetable_t pagetable, int depth) { for(int i=0; i<512; i++){ if((pagetable[i] & PTE_V) == 0) continue; uint64 pa = PTE2PA(pagetable[i]); for(int j=0; j<depth; j++){ printf(" .."); } printf("%d: pte %p pa %p\n", i, pagetable[i], pa); if((pagetable[i] & (PTE_R|PTE_W|PTE_X)) == 0){ _vmprint((pagetable_t)pa, depth+1); } } } void vmprint(pagetable_t pagetable) { printf("page table %p\n", pagetable); _vmprint(pagetable, 1); }
... MORE code that is not shown ... proc_freepagetable(oldpagetable, oldsz); if(p->pid==1) vmprint(p->pagetable); ← Add this line return argc; // this ends up in a0, the first argument to main(argc, argv) bad: if(pagetable) proc_freepagetable(pagetable, sz); if(ip){ iunlockput(ip); end_op(); } return -1; }
15. For every leaf page in the vmprint output, explain what it logically contains and what its permission bits are. Figure 3.4 in the xv6 book might be helpful, although note that the figure might have a slightly different set of pages than the init process that's being inspected here.
16. Xv6 does not have address space randomization. Search for this concept of address space randomization, explain it, and why is it important in today's systems?
17. Unix implementations of exec traditionally include special handling for shell scripts. If the file to execute begins with the text #!, then the first line is taken to be a program to run to interpret the file. For example, if exec is called to run myprog arg1 and myprog's first line is #!/interp, then exec runs /interp with command line /interp myprog arg1. Discuss how you would Implement support for this convention in xv6.
18. Examine the code in user/init.c. What process does the init process create? What does the init process do for standard input, output, and error?
If you have completed all of the above changes, when you start xv6 it should print output like this, describing the page table of the first process at the point when it has just finished exec()ing init:
page table 0x0000000087f6b000 ..0: pte 0x0000000021fd9c01 pa 0x0000000087f67000 .. ..0: pte 0x0000000021fd9801 pa 0x0000000087f66000 .. .. ..0: pte 0x0000000021fda01b pa 0x0000000087f68000 .. .. ..1: pte 0x0000000021fd9417 pa 0x0000000087f65000 .. .. ..2: pte 0x0000000021fd9007 pa 0x0000000087f64000 .. .. ..3: pte 0x0000000021fd8c17 pa 0x0000000087f63000 ..255: pte 0x0000000021fda801 pa 0x0000000087f6a000 .. ..511: pte 0x0000000021fda401 pa 0x0000000087f69000 .. .. ..509: pte 0x0000000021fdcc13 pa 0x0000000087f73000 .. .. ..510: pte 0x0000000021fdd007 pa 0x0000000087f74000 .. .. ..511: pte 0x0000000020001c0b pa 0x0000000080007000 init: starting sh
Do not get yourself caught in the details of implementation so that you do not understand what you have done. Think about virtual addresses, physical addresses, page tables, page table entries, and how this code walks the three-level page table to print the PTEs and physical addresses.
Some garbage collectors (a form of automatic memory management) can benefit from information about which pages have been accessed (read or write). In this part of the lab, you will add a new feature to xv6 that detects and reports this information to userspace by inspecting the access bits in the RISC-V page table. The RISC-V hardware page walker marks these bits in the PTE whenever it resolves a TLB miss.
Your job is to implement a system call, pgaccess(), that reports which pages have been accessed. The system call takes three arguments. The first argument is the starting virtual address of the first user page to check. The second argument is number of pages to check. The third argument is a user address of a 32-bit variable to store the results into a bitmask (a datastructure that uses one bit per page and where the first page corresponds to the least significant bit).
See Reference: Xv6 System Calls for a reminder on how to implement a system call.
char a[3*PGSIZE]; unsigend int x; ... some code here that may access a ... pgaccess(a, 3, &x);If a[0] has changed, bit 0 of x will be set. If a[4096] has changed, bit 1 of x will be set. If a[8192] has changed, bit 2 of x will be set.
19. The above code snippet will not work in Xv6, because it defines the variable a as a local variable that would be allocated on the stack. Why will this NOT work on Xv6?
20. How does the function pgaccess_test() in pgtbltest.c define its variable that corresponds to a?
The following shows the page table entry bits as defined in kernel/riscv.h, along with some other lines in the file. The macro defines include how to define PTE_A. Notice these macro definitions make use of C shifts and masks. These low-level C features allow us to manipulate the page table entries and virtual addresses as defined by the RISC-V architecture.
#define PTE_V (1L << 0) // valid #define PTE_R (1L << 1) #define PTE_W (1L << 2) #define PTE_X (1L << 3) #define PTE_U (1L << 4) // user can access #define PTE_A (1L << 6) ← Add this line // shift a physical address to the right place for a PTE. #define PA2PTE(pa) ((((uint64)pa) >> 12) << 10) #define PTE2PA(pte) (((pte) >> 10) << 12) #define PTE_FLAGS(pte) ((pte) & 0x3FF) // extract the three 9-bit page table indices from a virtual address. #define PXMASK 0x1FF // 9 bits #define PXSHIFT(level) (PGSHIFT+(9*(level))) #define PX(level, va) ((((uint64) (va)) >> PXSHIFT(level)) & PXMASK) // one beyond the highest possible virtual address. // MAXVA is actually one bit less than the max allowed by // Sv39, to avoid having to sign-extend virtual addresses // that have the high bit set. #define MAXVA (1L << (9 + 9 + 9 + 12 - 1))
Version 1 - does not have line numbers.
int pgaccess(uint64 va_start, int npage, uint64 res_addr) { if(va_start > MAXVA) { panic("pgaccess: too large VA"); } if(npage > 32){ panic("pgaccess: too many pages"); } uint res = 0; struct proc *p = myproc(); for(int i=0; i<npage; i++){ uint64 va = va_start + i * PGSIZE; pte_t *pte = walk(p->pagetable, va, 0); if((*pte & PTE_V) && (*pte & PTE_A)) { res |= (1 << i); *pte ^= PTE_A; } } return copyout(p->pagetable, res_addr, (char*)&res, sizeof(res)); }
Version 2 - has line numbers.
1 int 2 pgaccess(uint64 va_start, int npage, uint64 res_addr) 3 { 4 if(va_start > MAXVA) { 5 panic("pgaccess: too large VA"); 6 } 7 if(npage > 32){ 8 panic("pgaccess: too many pages"); 9 } 10 11 uint res = 0; 12 struct proc *p = myproc(); 13 for(int i=0; i<npage; i++){ 14 uint64 va = va_start + i * PGSIZE; 15 pte_t *pte = walk(p->pagetable, va, 0); 16 if((*pte & PTE_V) && (*pte & PTE_A)) { 17 res |= (1 << i); 18 *pte ^= PTE_A; 19 } 20 } 21 return copyout(p->pagetable, res_addr, (char*)&res, sizeof(res)); 22 }
21. Explain the if statement on lines 16 through 19 of the function pgaccess() given above.
22. Suppose you have the following page table information, which is formatted like the Print a page table problem. Explain how you convert the PTE value (e.g., 0x021fd9c01) to its PA (e.g., 0x87f67000).
page table 0x0000000087f6b000 ..0: pte 0x0000000021fd9c01 pa 0x0000000087f67000 .. ..0: pte 0x0000000021fd9801 pa 0x0000000087f66000 .. .. ..0: pte 0x0000000021fda01b pa 0x0000000087f68000
23. The function walk() in vm.c "walks" a RISC-V Sv39 page table. The Sv39 indicates a 39 bit VA as shown in Xv6 Textbook, Figure 3.2. The succinct code for walk() is given as follows.
pte_t * walk(pagetable_t pagetable, uint64 va, int alloc) { if(va >= MAXVA) panic("walk"); for(int level = 2; level > 0; level--) { pte_t *pte = &pagetable[PX(level, va)]; if(*pte & PTE_V) { pagetable = (pagetable_t)PTE2PA(*pte); } else { if(!alloc || (pagetable = (pde_t*)kalloc()) == 0) return 0; memset(pagetable, 0, PGSIZE); *pte = PA2PTE(pagetable) | PTE_V; } } return &pagetable[PX(0, va)]; }
A. Explain the PX macro. Give an example.
B. Explain the PTE2PA macro.
C. Pick a level and explain the assignment statement.
pte_t *pte = &pagetable[PX(level, va)];
D. Explain the PA2PTE macro.
E. Explain how walk adds a PTE to a pagetable when the argument alloc is not 0.
24. Given the diagram below for a Translation Lookaside Buffer, search the Internet to help your answer the following.
A. Explain how a TLB makes paging more efficient. This one you should know without searching, but search if you must.
B. What does Xv6 do when it must change to a new page table? This one will require you to examine Xv6.
C. Explain how an OS would use the ASID field in the TLB to make paging more efficient. Search for ASID and how it is used. ASID is Address Space Identifier.
We provide the user program pgtbltest.c. It tests the USYSCALL page with the pid defined in the problem and the pgaccess() system call defined in the third problem of this lab. pgtbltest has two testing functions.
int ugetpid(void) { struct usyscall *u = (struct usyscall *)USYSCALL; return u->pid; }
>
This completes the lab.
Read Lab Submissions for instructions on how
to submit your lab.
Submit the lab