CPSC 405 Lecture: Coordination (sleep&wakeup) # plan re-emphasize a few points about xv6 thread switching sequence coordination sleep & wakeup lost wakeup problem termination why hold p->lock across swtch()? this point affects many situations in xv6 yield(): scheduler(): acquire(&p->lock); p->state = RUNNABLE; swtch(); swtch(); release(&p->lock); two reasons to hold p->lock across swtch(): 1. prevent another core's scheduler from seeing p->state == RUNNABLE until after the original core has stopped executing the thread and after the original core has stopped using the thread's stack 2. prevent timer interrupt from calling yield() during swtch() (remember: acquire() turns off interrupts) 2nd swtch() would overwrite already-saved registers in context why does xv6 never hold a spinlock when yielding the CPU? (other than p->lock) on a single-core machine, imagine this: P1 P2 acq(Lx) sched() acq(Ly) acq(Lx) P2 will wait forever: P2 will spin waiting for P1 to release Lx P2 holds Ly, so it must keep interrupts off No clock interrupts, so P1 won't run So Lx won't ever be released possible even on multi-core, with more locks/threads solution: forbid holding lock when yielding the CPU! (other than p->lock) # topic: sequence coordination threads often wait for specific events or conditions: wait for disk read to complete (event is from an interrupt) wait for pipe writer to produce data (event is from a thread) wait for a child to exit # coordination is a fundamental building-block for thread programming. but subject to rules that can present puzzles. # why not just have a while-loop that spins until event happens? pipe read: while buffer is empty { } pipe write: put data in buffer # better solution: coordination primitives that yield the CPU there are a bunch e.g. barriers, semaphores, event queues. xv6 uses sleep & wakeup used by many systems similar to "condition variables" # example: uartwrite() and uartintr() in uart.c recall: the UART is the device hardware that connects to the console these are simplified versions of xv6 code see "code" link on schedule page the basic idea: the UART h/w accepts one byte of output at a time (really a few) h/w takes a long time to send each byte, perhaps a millisecond processes writing the console must wait until UART sends prev char the UART h/w interrupts after it has sent each character writing thread should give up the CPU until then write() calls uartwrite() uartwrite() writes first byte (if it can) uartwrite() calls sleep() to wait for the UART's interrupt uartintr() calls wakeup() the "&tx_chan" argument serves to link the sleep and wakeup simple and flexible: sleep/wakeup don't need to understand what you're waiting for no need to allocate explicit coordination objects # why the lock argument to sleep()? sadly you cannot design sleep() as cleanly as you might hope sleep() cannot simply be "wait for this event" the problem is called "lost wakeups" it lurks behind all sequence coordination schemes, and is a pain here's the story # suppose just sleep(chan); how would we implement? here's a BROKEN sleep/wakeup broken_sleep(chan) sleeps on a "channel", a number/address identifies the condition/event we are waiting for p->state = SLEEPING; p->chan = chan; sched(); wakeup(chan) wakeup wakes up all threads sleeping on chan may wake up more than one thread for each p: if p->state == SLEEPING && p->chan == chan: p->state = RUNNABLE (I've omitted p->lock, which both need) # how would uart code use this (broken) sleep/wakeup? int busy int chan uartwrite(buf): for each char c: while busy: sleep(&chan) send c busy = 1 uartintr(): busy = 0 wakeup(&chan) busy==0 is the condition we're waiting for &chan is the sleep channel (a dummy variable) # but what about locking? - driver's data structures e.g. busy - UART hardware both uartwrite()() and uartintr() need to lock should uartwrite()() hold a lock for the whole sequence? no: then uartintr() can't get lock and clear busy maybe uartwrite() could release the lock before sleep()? let's try it -- modify uart.c to call broken_sleep() release(&uart_tx_lock); broken_sleep(&tx_chan); acquire(&uart_tx_lock); make qemu ; cat README what goes wrong when uartwrite() releases the lock before broken_sleep()? uartwrite() saw that the previous character wasn't yet done being sent interrupt occurred after release(), before broken_sleep() uartwrite() went to sleep EVEN THOUGH UART TX WAS DONE now there is nothing to wake up uartwrite(), it will sleep forever really, until next UART interrupt, due to input this is the "lost wakeup" problem. we need to eliminate the window between uartwrite()'s check of the condition, and sleep() marking the process as asleep. we'll use locks to prevent wakeup() from running during the entire window. we'll change the sleep() interface and the way it's used. we'll require that there be a lock that protects the condition, and require that the callers of both sleep() and wakeup() hold this "condition lock" sleep(chan, lock) caller must hold lock sleep releases lock, re-acquires before returning wakeup(chan) caller must hold lock (repair uart.c) let's look at wakeup(chan) in proc.c it scans the process table, looking for SLEEPING and chan it grabs each p->lock remember also that caller acquired condition lock before calling wakeup() so wakeup() holds BOTH the condition lock and each p->lock Q: why is it enough to just set p->state = RUNNABLE? why does that cause the thread to run? let's look at sleep() in proc.c sleep *must* release the condition lock since we can't hold locks when calling swtch(), other than p->lock Q: how to prevent wakeup() from running after sleep() releases the condition lock? A: acquire p->lock before releasing condition lock since wakeup() holds *both* locks, it's enough for sleep() to hold *either* in order to force wakeup() to spin rather than look at this process now wakeup() can't proceed until after swtch() completes so wakeup() is guaranteed to see p->state==SLEEPING and p->chan==chan thus: no lost wakeups! note that uartwrite() wraps the sleep() in a loop i.e. re-checks the condition after sleep() returns, may sleep again two reasons: 1. maybe multiple waiters, another thread might have consumed the event 2. kill() wakes up processes even when condition isn't true all uses of sleep are wrapped in a loop, so they re-check # Another example: piperead() in pipe.c the condition is data waiting to be read (nread != nwrite) pipewrite() calls wakeup() at the end what is the race if piperead() used broken_sleep()? lost wakeups are not just about interrupts! note the the loop around sleep() multiple processes may be reading the same pipe why the wakeup() at the end of piperead()? # the sleep/wakeup interface/rules are a little complex in particular, sleep() needs the condition lock but, sleep() doesn't need to understand the condition sleep/wakeup is pretty flexible, though low-level there are other schemes that are cleaner but perhaps less general-purpose e.g. the counting semaphore in today's reading all have to cope with lost wakeups, one way or another ************* # another coordination challenge -- how to terminate a thread? # problem: thread X cannot just destroy thread Y what if Y is executing on another core? what if Y holds locks? what if Y is in the middle of a complex update to important data structures? # problem: a thread cannot free all of its own resources e.g. its own stack, which it is still using e.g. its struct context, which it may need to call swtch() # xv6 has two ways to get rid of processes: exit() and kill() # ordinary case: process voluntarily quits with exit() system call the strategy: exit() frees some things, but not proc slot or stack parent's wait() does the final frees the goal: p->state = UNUSED, so a future fork() can use this proc[] slot exit() in proc.c: some cleanup wake up wait()ing parent p->state = ZOMBIE ZOMBIE means ready for parent's wait() not UNUSED -- so won't be allocated by a fork() won't run again (note stack and proc[] entry are still allocated...) swtch() to scheduler wait() in proc.c (parent will eventually call): sleep()s waiting for any child exit() scans proc[] table for children with p->state==ZOMBIE calls freeproc() (p->lock held...) trapframe, pagetable, ..., p->state=UNUSED thus: wait() is not just for app convenience, but for O/S as well every process must be wait()ed for some complexity due to wait-then-exit versus exit-then-wait what if parent has already exited? # what about kill(pid)? problem: may not be safe to forcibly terminate a process it might be executing in the kernel using its kernel stack, page table, proc[] entry, trapframe it might hold locks e.g. in the middle of fork()ing a new process and must finish to restore invariants so: kill() can't directly destroy the target! solution: kill() sets p->killed flag, nothing else the target process itself checks for p->killed and calls exit() itself look for "if(p->killed) exit(-1);" in usertrap() not in the middle of anything here, and no locks are held so it's safe to exit() # what if kill() target is sleep()ing? in that case it doesn't hold locks, and isn't executing! is it OK for kill() to destroy the target right away? no: what if waiting for disk midway through file creation? # xv6 solution to kill() of sleep()ing process see kill() in proc.c changes SLEEPING to RUNNABLE -- like wakeup() so sleep() will return, probably before condition is true some sleep loops check for p->killed e.g. piperead(), consoleread() otherwise read could hang indefinitely for a killed process some sleep loops don't check p->killed e.g. virtio_disk.c OK not to check p->killed since disk reads are pretty quick so a kill()ed process may continue for a while but usertrap() will exit() after the system call finishes # xv6 spec for kill if target is in user space will die next time it makes a system call or takes a timer interrupt if target is in the kernel target will never execute another user instruction but may spend a while yet in the kernel # Summary sleep/wakeup let threads wait for specific events concurrency means we have to worry about lost wakeups termination is a pain in threading systems