[xv6] Buffer Cache

프로젝트/운영체제 만들기 2023. 7. 24. 23:53

글의 참고

- https://github.com/mit-pdos/xv6-public/tree/master

- xv6 - DRAFT as of September 4, 2018

글의 전제

- 밑줄로 작성된 글은 강조 표시를 의미한다.

- 그림 출처는 항시 그림 아래에 표시했다.

글의 내용

- Buffer cache layer

: `xv6` 에서 버퍼 캐시는 2가지 역할을 한다. 메모리와 디스크이 동기화를 맞추고, 실제 퍼포먼스를 위해 자주 사용되는 블락들을 캐쉬한다. 각 버퍼 캐시는 자신만의 `슬립-락`을 가지고 있어서, 어느 한 시점에 하나의 스레드만 접근할 수 있다는 것을 보장한다. `bread`를 통해서 `락-버퍼 캐시`를 반환받고, `brelse` 함수를 통해서 해당 락을 푼다.

The buffer cache has two jobs: (1) `synchronize access to disk blocks` to ensure that only one copy of a block is in memory and that only one kernel thread at a time uses that copy; (2) `cache popular blocks` so that they don’t need to be re-read from the slow disk. The code is in `bio.c`

The main interface exported by the buffer cache consists of `bread` and `bwrite`; the former obtains a buf containing a copy of a block which can be read or modified in memory, and the latter writes a modified buffer to the appropriate block on the disk. A kernel thread must release a buffer by calling `brelse` when it is done with it. The buffer cache uses a per-buffer sleep-lock to ensure that only one thread at a time uses each buffer (and thus each disk block); `bread` returns a locked buffer, and `brelse` releases the lock.

Let’s return to the buffer cache. The buffer cache has a `fixed number of buffers` to hold disk blocks, which means that if the file system asks for a block that is not already in the cache, the buffer cache must recycle a buffer currently holding some other block. The buffer cache recycles the `least recently used` buffer for the new block. The assumption is that the least recently used buffer is the one least likely to be used again soon

: 버퍼 캐시의 개수는 고정되어 있다(30). 그래서 파일 시스템에서 캐쉬에 없는 버퍼를 요구할 경우, 버퍼 캐시는 반드시 `교체 알고리즘`을 통해서 이미 사용중인 버퍼 캐시를 재활용해야 한다. `xv6` 버퍼 캐시는 교체 알고리즘으로 `LRU` 알고리즘을 사용한다.

: 아래는 `xv6` 파일 시스템이 디스크 포맷을 나타낸 것이다.

- 코드 분석

: `main` 함수에서 파일 시스템 관련 제일 먼저 호출되는 함수는 `binit`이다. 이 함수가 버퍼 캐시를 초기화하는 함수다.

// main.c
...
...

// Bootstrap processor starts running C code here.
// Allocate a real stack and switch to it, first
// doing some setup required for memory allocator to work.
int main(void)
{
  kinit1(end, P2V(4*1024*1024)); // phys page allocator
  kvmalloc();      // kernel page table
  mpinit();        // detect other processors
  lapicinit();     // interrupt controller
  seginit();       // segment descriptors
  picinit();       // disable pic
  ioapicinit();    // another interrupt controller
  consoleinit();   // console hardware
  uartinit();      // serial port
  pinit();         // process table
  tvinit();        // trap vectors
  binit();         // buffer cache
  fileinit();      // file table
  ideinit();       // disk
  startothers();   // start other processors
  kinit2(P2V(4*1024*1024), P2V(PHYSTOP)); // must come after startothers()
  userinit();      // first user process
  mpmain();        // finish this processor's setup
}

...
...

####################################################################################################
// bio.c
...
...

struct {
  struct spinlock lock;
  struct buf buf[NBUF];

  // Linked list of all buffers, through prev/next.
  // head.next is most recently used.
  struct buf head;
} bcache;

...
...

void binit(void)
{
  struct buf *b;

  initlock(&bcache.lock, "bcache");

//PAGEBREAK!
  // Create linked list of buffers
  bcache.head.prev = &bcache.head;
  bcache.head.next = &bcache.head;
  for(b = bcache.buf; b < bcache.buf+NBUF; b++) {
    b->next = bcache.head.next;
    b->prev = &bcache.head;
    initsleeplock(&b->lock, "buffer");
    bcache.head.next->prev = b;
    bcache.head.next = b;
  }
}

: `binit` 함수는 전역 변수`bcache.head`를 초기화한다. 전역 변수 `bcache` 내부적으로 `buf`라는 배열이 있는데, 이걸 배열 그대로 사용하기 보다는 `buf.refcnt` 기준으로 `bcache.head`에 연결한다. 즉, `buf` 라는 배열은 그저 `struct buf`를 여러 개 할당받기 위한 저장소에 불과하다. 비유하면, `스레드 풀`과 같다. 즉, `버퍼 캐시 풀` 이라고 생각하면 될 것 같다. `binit`에서는 `bcache.head`에 `bcache.buf` 값들을 `list_add` 방식이 아닌, `list_add_head` 방식으로 추가한다. 즉, 새로운 노드를 연결 리스트 뒤쪽에 추가하는 게 아니라, 헤드 바로 뒤에 추가하고 있다.

: `xv6`의 버퍼 캐시 교체 알고리즘은 `LRU`를 사용한다. `LRU`를 이해하기 위해서는 `brelse` 함수를 분석해야 한다. `LRU`는 `Least Recently Used`의 약자로, `최근에 가장 적게 사용되는` 이라는 뜻이다. 즉, `최근에 가장 적게 사용된 버퍼 캐시를 새로운 버퍼 캐시로 교체한다`는 것이다. `brelse` 함수는 호출될 때마다, 인자로 전달된 버퍼의 `refcnt`를 하나씩 감소시킨다. 그리고 실제로 이 버퍼가 어느곳에서도 쓰이지 않을 때(`b->refcnt == 0 `), `LRU` 알고리즘을 적용한다. 이 `LRU`가 적용되기 위해서는 반드시 `b->refcnt == 0` 임을 주의하자.

// bio.c
...
// Release a locked buffer.
// Move to the head of the MRU list.
void brelse(struct buf *b)
{
  if(!holdingsleep(&b->lock))
    panic("brelse");

  releasesleep(&b->lock);

  acquire(&bcache.lock);
  b->refcnt--;
  if (b->refcnt == 0) {
    // no one is waiting for it.
    b->next->prev = b->prev;  // --- 1
    b->prev->next = b->next;  // --- 1
    
    b->next = bcache.head.next;  // --- 2
    b->prev = &bcache.head;  // --- 2
    bcache.head.next->prev = b;  // --- 2
    bcache.head.next = b;  // --- 2
  }
  
  release(&bcache.lock);
}
...

: `xv6` LRU 알고리즘은 아래와 같이 2가지 과정으로 이루어져 있다.

1" LRU 알고리즘에 의해 선택된 버퍼 캐시를 연결 리스트(bcache)에서 제거한다.
2" 앞에서 제거 한 버퍼 캐시를 헤드(bcache.head) 바로 뒤에 연결시킨다.

: 아래 `bget`에서 보겠지만, 새로운 버퍼 캐시를 찾는 과정에서 버퍼 캐시 연결 리스트를 앞에서 순차 탐색을 진행한다. 그러나, `brelse` 함수에 적용된 `LRU` 덕분에 FREE 버퍼 캐시들은 헤드 바로 뒤쪽에 배치되므로, 연결 리스트의 마지막 노드까지 탐색할 일은 거의 없다.

: 그리고 main 함수에서 파일 시스템 관련 진행되는 초기화 함수는 `fileinit` 함수다. 이 함수에서는 전역 변수인 `ftable`의 동기화 변수를 초기화한다.

// file.c

struct {
  struct spinlock lock;
  struct file file[NFILE];
} ftable;

...

void fileinit(void)
{
  initlock(&ftable.lock, "ftable");
}

...

: 최초의 유저 프로세스(`initproc`)가 생성되고, 컨택스트 스위칭으로 인해 `initproc`은 `forkret` 함수를 호출한다. `initproc`은 순서대로 `iinit`과 `initlog`를 호출하게 된다. 먼저 `iinit` 에 대해 알아보자.

// fs.c

...
...

// there should be one superblock per disk device, but we run with
// only one device
struct superblock sb;

// Read the super block.
void readsb(int dev, struct superblock *sb)
{
  struct buf *bp;

  bp = bread(dev, 1);
  memmove(sb, bp->data, sizeof(*sb));
  brelse(bp);
}

...
...

struct {
  struct spinlock lock;
  struct inode inode[NINODE]; // NINODE == 50
} icache;

...
...

void iinit(int dev)
{
  int i = 0;

  initlock(&icache.lock, "icache");
  for(i = 0; i < NINODE; i++) {
    initsleeplock(&icache.inode[i].lock, "inode");
  }

  readsb(dev, &sb);
  cprintf("sb: size %d nblocks %d ninodes %d nlog %d logstart %d\
inodestart %d bmap start %d\n", sb.size, sb.nblocks,
          sb.ninodes, sb.nlog, sb.logstart, sb.inodestart,
          sb.bmapstart);
}
...
...
####################################################################################################
// fs.h
...
...

// in-memory copy of an inode
struct inode {
  uint dev;           // Device number
  uint inum;          // Inode number
  int ref;            // Reference count
  struct sleeplock lock; // protects everything below here
  int valid;          // inode has been read from disk?

  short type;         // copy of disk inode
  short major;
  short minor;
  short nlink;
  uint size;
  uint addrs[NDIRECT+1];
};

...
...

####################################################################################################
param.h

...
...

#define NOFILE       16  // open files per process
#define NFILE       100  // open files per system
#define NINODE       50  // maximum number of active i-nodes
#define NDEV         10  // maximum major device number
#define ROOTDEV       1  // device number of file system root disk
#define MAXARG       32  // max exec arguments
#define MAXOPBLOCKS  10  // max # of blocks any FS op writes
#define LOGSIZE      (MAXOPBLOCKS*3)  // max data blocks in on-disk log
#define NBUF         (MAXOPBLOCKS*3)  // size of disk block cache
#define FSSIZE       1000  // size of file system in blocks

...
...

: `readsb`는 디스크 드라이브에서 `superblock(슈퍼블락)`을 읽어들인다. 슈퍼블락은 현재 디스크에 적용된 파일 시스템에 대한 모든 메타 정보를 갖고 있는 블락을 의미한다. 그리고 `dev` 인자는 디스크 드라이브 번호와 같다고 보면 된다. 버퍼 캐시에서 이 `dev`라는 인자가 항상 전달되는데 반드시 필요한 인자일까?

: `xv6`는 2개의 이미지 파일을 사용한다. 아래의 내용은 `xv6`의 `Makefile` 파일 내용의 일부이다. 여기서 QEMU 옵션을 보면 ,`drive`가 2개인 것을 확인할 수 있다.

Drive 0" xv6.img - 부트 로더 및 커널. 즉, 부팅 디스크 이미지
Drive 1" fs.img - 유저 프로그램. 즉, 파일 시스템 이미지

QEMUOPTS = -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=disk,format=raw -smp $(CPUS) -m 512 $(QEMUEXTRA)

qemu: fs.img xv6.img
	$(QEMU) -serial mon:stdio $(QEMUOPTS)

qemu-memfs: xv6memfs.img
	$(QEMU) -drive file=xv6memfs.img,index=0,media=disk,format=raw -smp $(CPUS) -m 256

qemu-nox: fs.img xv6.img
	$(QEMU) -nographic $(QEMUOPTS)

: IDE 인터페이스를 사용하므로, 각 드라이브에 접근하는 방법은 `Device Register(DEV[4])`를 설정하면 된다. IDE 인터페이스에서 디바이스 레지스터 어떤 디스크 드라이브를 설정할 지를 고를 수 있다.

//ide.c
...
outb(0x1f6, 0xe0 | ((b->dev&1)<<4) | ((sector>>24)&0x0f));
...

: 위의 내용은 `idestart` 함수의 일부이다. `idestart` 함수는 `(b->dev&1) << 4`를 통해서 어떤 디스크에서 내용을 읽을 지 혹은 쓸지를 결정한다.

: `readsb`의 두 번째 인자는 섹터 번호를 나타낸다. 그런데, `1`을 던지는 의미가 뭘까? 아래는 `xv6`의 파일 시스템 구조다. `xv6`는 두 번째(섹터 번호 1) 섹터에 `슈퍼 블락(super)`가 온다. 그래서 슈퍼블락을 읽기 위해 `bread(dev, 1)`을 호출하는 것이다.

Block 1 is called the `superblock`; it contains metadata about the file system (the file system size in blocks, the number of data blocks, the number of inodes, and the number of blocks in the log).

: 슈퍼 블락(섹터 번호 1, 하나의 블락을 차지)을 정상적으로 읽어올 경우, `fs.c` 파일에 전역적으로 선언되어 있는 `struct superblock sb`에 디스크로 읽어온 슈퍼 블락의 내용을 복사한다. 그리고 `bread`로 읽을 경우, 버퍼 캐시를 사용하게 된다. 그런데, 슈퍼 블락의 내용은 이미 전역 변수인 `struct superblock sb` 에 복사됬으니, 버퍼 캐시에서 제거하는것이 맞다. 그래서 `brelse`를 호출한다.

: `if((b->flags & B_VALID) == 0)`의 의미는 디스크에서 메모리를 데이터를 읽어온 적이 없는 것이다. 최초에 디스크로부터 데이터를 읽어 버퍼 캐시에 썻다면, 해당 버퍼 캐시의 플래그에는 `B_VALID` 표시가 된다. `iderw`는 버퍼와 디스크의 싱크를 맞추는 함수다. 아래에서 다시 설명한다.

// buf.h
...
...

struct buf {
int flags;
  uint dev;
  uint blockno;
  struct sleeplock lock;
  uint refcnt;
  struct buf *prev; // LRU cache list
  struct buf *next;
  struct buf *qnext; // disk queue
  uchar data[BSIZE];
};

#define B_VALID 0x2  // buffer has been read from disk
#define B_DIRTY 0x4  // buffer needs to be written to disk

...
...
####################################################################################################
// fs.c
...
...

// there should be one superblock per disk device, but we run with
// only one device
struct superblock sb;

// Read the super block.
void readsb(int dev, struct superblock *sb)
{
  struct buf *bp;

  bp = bread(dev, 1);
  memmove(sb, bp->data, sizeof(*sb));
  brelse(bp);
}

...
...

####################################################################################################
// bio.c
...
...

// Return a locked buf with the contents of the indicated block.
struct buf* bread(uint dev, uint blockno)
{
  struct buf *b;

  b = bget(dev, blockno);
  if((b->flags & B_VALID) == 0) {
    iderw(b);
  }

  return b;
}

...
...

// Look through buffer cache for block on device dev.
// If not found, allocate a buffer.
// In either case, return locked buffer.
static struct buf* bget(uint dev, uint blockno)
{
  struct buf *b;

  acquire(&bcache.lock);

  // Is the block already cached?
  for(b = bcache.head.next; b != &bcache.head; b = b->next){
    if(b->dev == dev && b->blockno == blockno){ // --- `1 영역`
      b->refcnt++;

      release(&bcache.lock);
      acquiresleep(&b->lock);

      return b;
    }
  }

  // Not cached; recycle an unused buffer.
  // Even if refcnt==0, B_DIRTY indicates a buffer is in use
  // because log.c has modified it but not yet committed it.
  for(b = bcache.head.prev; b != &bcache.head; b = b->prev){
    if(b->refcnt == 0 && (b->flags & B_DIRTY) == 0) { // --- `2 영역`
      b->dev = dev;
      b->blockno = blockno;
      b->flags = 0;
      b->refcnt = 1;

      release(&bcache.lock);
      acquiresleep(&b->lock);

      return b;
    }
  }

  panic("bget: no buffers");

}

: `for(b = bcache.head.next; b != &bcache.head; b = b->next)` 은 `순환 양방향 연결 리스트`에서 사용하는 루프문이다. 루프문의 조건이 `b != &bcache.head` 인데, `순환 양방향 연결 리스트`는 마지막 노드의 NEXT 노드가 HEAD를 바라보기 때문에 저런 조건이 된 것이다.즉, 마지막 노드가 HEAD면 모든 연결 리스트를 탐색하게 되는 것이다.

: `1 영역`은 먼저 요청한 섹터가 버퍼 캐시에 존재하는지를 탐색한다. 버퍼 캐시의 존재 여부는 `if(b->dev == dev && b->blockno == blockno)` 이다. `blockno`는 섹터 번호를 의미하며, `dev`는 모든 객체들을 식별할 수 있는 ID와 같다고 보면 된다. 즉, 각 객체(디바이스)마다 여러 개의 버퍼를 가질 수 있다는 뜻이고, 각 버퍼는 무조건 특정 디바이스에 속해야 한다는 것이다.

: `2 영역`은 요청한 버퍼 캐시가 없을 경우, 새로운 버퍼 캐시를 만들게 된다. 그리고 빈 버퍼 캐시를 찾기 위해 또 루프를 돌게 된다. FREE 버퍼 캐시를 찾는 조건은 `if(b->refcnt == 0 && (b->flags & B_DIRTY) == 0)` 이다. 당연히, 버퍼 캐시 중 현재 사용중이지 않으면서 (`b->refcnt == 0`), 버퍼 캐시와 디스크와의 싱크가 맞는 버퍼 캐시를 찾아야 한다. `b->refcnt == 0` 조건은 이해가 가는데, `(b->flags & B_DIRTY) == 0` 조건은 직관적으로 와닿지 않을 수 있다.

: `(b->flags & B_DIRTY) == 0` 가 조건에 들어가는 이유를 한 번 반대로 생각해보자. 조건문이 `b->refcnt == 0`만 있다고 해보자. 새로운 버퍼 캐시 요청을 했는데, FREE 버퍼 캐시를 찾는 조건문이 `b->refcnt == 0` 면 될까? 찾은 버퍼 캐시가 만약, 인-메모리와 디스크의 싱크가 맞지 않다면? 이 버퍼 캐시를 그대로 사용한다면, 디스크에 써야 하는 내용들이 사라질 수 있다. 그러므로, 반드시 `(b->flags & B_DIRTY) == 0` 조건도 필요하다.

: 이 부분을 좀 더 최적화 해 볼 수 없을까? 예를 들어, `b->refcnt == 0` 인 버퍼 캐시가 있다면, 실제 사용 중인 프로세스가 없다는 뜻이다. 그런데, `xv6`는 `(b->flags & B_DIRTY) == 0` 을 충족하지 못할 경우, 해당 버퍼 캐시를 재활용하지 않는다. 결국, 다음 버퍼 캐시를 찾게된다. 저 시점에 버퍼 캐시에 있는 데이터들을 디스크에 쓴 다음에 재활용하면 안될까? 사실 이 시점에 쓰는 것 보다도 `brelse` 함수가 호출된 시점에 분명히 `b->refcnt == 0` 을 만족했기 때문에, LRU 알고리즘에 의해 bcache의 앞쪽에 배치되었을 것이다. 이 시점에 버퍼 캐시가 DIRTY 였다면, 디스크에 써줄 수는 없었을까? 알아보니, `fs.c` 파일에서는 `brelse` 함수가 호출되기 전에 거의 항상 `log_write` 함수가 호출된다. 이 때, 버퍼 캐시의 내용을 log 영역에 쓰는 것으로 보인다. 그리고, `log.c` 에서는 `brelse` 함수를 호출하기전에, `bwrite` 함수를 호출한다.

: 만약, 디스크에 쓰는 시간 동안은 아직 해당 버퍼 캐시의 DIRTY를 0으로 할 수 없어서 즉각적으로 bcache에 새로 추가한다거나 `bget`에서 즉각적으로 반환하지 못한다면, 별도의 대기 및 펜딩 버퍼를 두고 거기에 버퍼 캐시 데이터를 쓰고 버퍼 캐시는 즉각적으로 반환하면 안될까?

: 그리고, `b->refcnt == 0 && (b->flags & B_DIRTY) == 0` 조건문만으로 충분한가? 버퍼마다 `슬립-락`을 가지고 있는데, 이 락도 잡혀있지는 않은지 검사하지 않아도 되는건가?

`b->refcnt`는 해당 블록이 참조된 횟수를 의미한다. 이 값은 나중에 버퍼 교체 알고리즘 `LRU` 에서 사용된다. `LRU` 알고리즘에서 이 값이 가장 적은 버퍼 캐시는 디스크로 강제 써지고, 새로운 데이터로 교체된다.

: `iderw` 에서 가장 중요한 부분은 `iderw`를 호출했다고 해서 즉각적으로 버퍼 캐시의 내용이 디스크로 써지는 게 아니다. 아래에서 `if(idequeue == b) idestart(b);` 의 의미는 현재 idequeue의 맨 앞에 요청이 현재 요청한 버퍼 캐시라면, 디스크에 쓴다는 것이다. 즉, idequeue가 맨 앞에 있는 데이터가 현재이 들어온 버퍼 캐시가 아니라면, 디스크에 바로 쓰지 않는 것이다. 정리해서, `iderw`를 호출했는데, idequeue가 비어있으면 요청한 버퍼 캐시를 바로 디스크에 쓴다. 그러나, idequeue가 비어있지 않다면, idequeue에 들어있는 요청들이 모두 마무리되고 나서 호출된다(`sleep(b, &ideclock)`).

// ide.c
...
...

// idequeue points to the buf now being read/written to the disk.
// idequeue->qnext points to the next buf to be processed.
// You must hold idelock while manipulating queue.

static struct buf *idequeue;

...
...

// Start the request for b.  Caller must hold idelock.
static void idestart(struct buf *b)
{
  if(b == 0)
    panic("idestart");

  if(b->blockno >= FSSIZE)
    panic("incorrect blockno");

  int sector_per_block =  BSIZE/SECTOR_SIZE;
  int sector = b->blockno * sector_per_block;
  int read_cmd = (sector_per_block == 1) ? IDE_CMD_READ :  IDE_CMD_RDMUL;
  int write_cmd = (sector_per_block == 1) ? IDE_CMD_WRITE : IDE_CMD_WRMUL;

  if (sector_per_block > 7) panic("idestart");

  idewait(0);
  outb(0x3f6, 0);  // generate interrupt
  outb(0x1f2, sector_per_block);  // number of sectors
  outb(0x1f3, sector & 0xff);
  outb(0x1f4, (sector >> 8) & 0xff);
  outb(0x1f5, (sector >> 16) & 0xff);
  outb(0x1f6, 0xe0 | ((b->dev&1)<<4) | ((sector>>24)&0x0f));

  if(b->flags & B_DIRTY){
    outb(0x1f7, write_cmd);
    outsl(0x1f0, b->data, BSIZE/4);
  } else {
    outb(0x1f7, read_cmd);
  }

}

// Interrupt handler.
void ideintr(void)
{
  struct buf *b;

  // First queued buffer is the active request.
  acquire(&idelock);

  if((b = idequeue) == 0){
    release(&idelock);
    return;
  }
  idequeue = b->qnext;

  // Read data if needed.
  if(!(b->flags & B_DIRTY) && idewait(1) >= 0)
    insl(0x1f0, b->data, BSIZE/4);

  // Wake process waiting for this buf.
  b->flags |= B_VALID;
  b->flags &= ~B_DIRTY;
  wakeup(b);

  // Start disk on next buf in queue.
  if(idequeue != 0)
    idestart(idequeue);

  release(&idelock);
}

//PAGEBREAK!
// Sync buf with disk.
// If B_DIRTY is set, write buf to disk, clear B_DIRTY, set B_VALID.
// Else if B_VALID is not set, read buf from disk, set B_VALID.
void iderw(struct buf *b)
{
  struct buf **pp;

  if(!holdingsleep(&b->lock))
    panic("iderw: buf not locked");

  if((b->flags & (B_VALID|B_DIRTY)) == B_VALID)
    panic("iderw: nothing to do");

  if(b->dev != 0 && !havedisk1)
    panic("iderw: ide disk 1 not present");

  acquire(&idelock);  //DOC:acquire-lock

  // Append b to idequeue.
  b->qnext = 0;
  for(pp=&idequeue; *pp; pp=&(*pp)->qnext)  //DOC:insert-queue
    ;
  *pp = b;

  // Start disk if necessary.
  if(idequeue == b)
    idestart(b);

  // Wait for request to finish.
  while((b->flags & (B_VALID|B_DIRTY)) != B_VALID) {
    sleep(b, &idelock);
  }

  release(&idelock);

}

: 외부에서 보면 `iderw` 함수는 디스크와 버퍼 캐시를 동기화해주는 함수이다. 그런데, 이 함수를 내부적으로 보면 전달받은 버퍼 캐시와 디스크를 동기화 한다기 보다는, `idequeue`에 버퍼 캐시를 삽입하는 코드만 들어있다. 일단, 먼저 이 코드부터 분석해보자. 앞쪽에는 예외 처리 관련한 코드들이 들어있다. 항상 주의할 부분은 멀티 프로세서 동기화다. `xv6` 모든 버퍼 캐시는 `bread` 함수를 통해 자신만의 `슬립-락`에 대한 소유권이 가진 채로 반환된다. 그래서 `if(!holdingsleep(&b->lock))` 조건문을 통해서 `슬립-락`이 되어있는지를 판단한다. `슬립-락`이 되어있지 않다면, `bread`에서 애초에 잘못된 버퍼 캐시를 제공해준 셈인 것이다.

: `b->dev != 0 && !havedisk1` 이 조건문은 디스크 드라이브가 존재하지 않을 경우에 대한 조건문인 것 같다. 그런데, b->dev != 0`의 의미가 뭘까?

`b->qnext = 0 `을 통해서 아래의 루프문이 종료 조건을 만들어준다. 위의 `b->qnext = 0 `을 작성하지 않으면, 루프문의 종료되지 않을 수 있다. 그 이하의 코드들은 예시를 통해서 알아보자.

0" 여기서 2가지 케이스로 보자. idequeue가 NULL인 경우와, NULL이 아닌 경우로 나누자.

1.0" idequeue가 비어있다고 가정하자. 그러면, `pp`는 `&idequeue`가 대입된다. 조건문이 `*pp`이므로, 종료가 된다. 왜냐면, 최초 `iderw`가 호출되는 시점에는 idequeue에는 아무값도 들어있지 않기 때문이다.
1.1" `if(idequeue == b) idestart(b)` 조건을 충족해서 곧바로 디스크에 써지면 된다.
1.2" `sleep(b, &idelock)`에 들어선다.
1.3" 디스크 작업이 완료되면, `ideintr`이 호출된다. 그러면, `b->flags |= B_VALID; b->flags &= ~B_DIRTY; wakeup(b);` 를 통해서, 위에 있는 `1.2` 에서 탈출하게 된다.

2.0" idequeue가 비어있지 않다고 가정하자. `pp`는 `(&idequeue)->next`가 대입된다. 여기서 이 `next`는 NULL이면 루프는 종료한다. 즉, 버퍼 캐시를 `idequeue`의 어느 위치에 넣어야 할지를 발견한 것이다. 만약, NULL이 아니면, 버퍼 캐시 위치를 삽입할 위치를 찾을 때 까지 루프를 돈다. idequeue에 삽입할 버퍼 캐시 위치를 찾을 수 있는 이유는 이전에 삽입된 `b->qnext = 0` 때문이다. 이 걸 통해, 루프문에서 다음 버퍼 캐시 위치를 찾을 수 있게 된다. 루프문에서는 다음 버퍼 캐시의 위치를 찾고, `*pp = b`를 통해 idequeue에 다음 버퍼 캐시를 할당하게 된다.
2.1" 그리고 `if(idequeue == b)` 조건을 충족하지 못할 경우, `sleep(b, &idelock)`에 들어선다.
2.2" 디스크 완료 인터럽트 핸들러(`ideintr`)을 통해서 `디스크-대기-루프문`에서 빠져 나올 수 있다.

: 사실, 2번 케이스는 나오기가 힘든 케이스인 것 같다. 즉, `iderw` 함수가 호출됬을 때, `idequeue`에 버퍼 캐시가 1개 이상 있는 경우가 많지는 않을 것 같다. 왜냐면, `idestart`함수와 `ideintr` 함수는 서로가 서로를 호출해주는 관계이다. 즉, `idequeue`를 실제 처리하는 함수는 `idestart` 함수다. 그런데, `idestart` 함수를 호출하는 함수는 2개밖에 없다. 바로 `iderw`와 `ideintr` 함수다. 여기서 `iderw` 함수는 `idequeue`에 버퍼 캐시를 추가하는 역할이고, `ideintr`은 `idequeue`에 버퍼 캐시가 남아있으면, `idestart` 함수를 다시 호출하는 함수다. 즉, `ideintr` 함수는 `idequeue`에서 버퍼 캐시를 제거하는 역할이라고 볼 수 있다.

: 만약, 2번 케이스가 존재한다면, 이런 상황일 것 같다. `iderw`가 호출됬다. `idequeue`에 버퍼 캐시가 추가됬다. 최초의 버퍼 캐시이므로, `idestart` 함수가 호출된다. `idequeue`에 있는 버퍼 캐시를 처리한다. `sleep` 함수를 호출해서 인터럽트가 날라올 때 까지 대기한다. 이 때, `sleep` 함수가 호출되면서 `idelock`에 대한 인터럽트가 비활성화된다. 이 시점에 만약 디스크 인터럽트전에 `iderw`를 호출하는 무언가가 있다면? `idequeue`에 새로운 버퍼 캐시가 추가될 것 이다. 근데, `idequeue`에서 이전에 처리된 버퍼 캐시가 실제로 비워지는 시점은 `idestart`에서가 아니라, `ideintr` 에서 지워진다(`idequeue = b->qnext`).

: 그래서 `iderw` 함수에서 `idestart` 함수를 호출하지 못하고, 바로 `sleep`에 빠지게 되는 경우가 있을 수 있다. 그런데, 이런 경우가 많은지는 잘 모르겠다.

: 참고로, `iderw`에서 `디스크-대기-루프문`을 빠져나오는 조건은 `ideintr` 함수에서 충족된다. 이 핸들러에서는 디스크 드라이브에 요청한 버퍼가 제대로 읽거나 써졌다면, 아래의 버퍼 캐시의 플래그들을 설정한다.

b->flags |= B_VALID;
b->flags &= ~B_DIRTY;

: 버퍼 데이터가 디스크와 동기화가 되었으니, B_VALID를 SET하고, B_DIRTY를 CLEAR 하게 된다. 이러한 이유 때문에 `iderw` 함수의 `디스크-대기-루프문`을 통과할 수 있게 된다.

저작자표시 비영리 변경금지

'프로젝트 > 운영체제 만들기' 카테고리의 다른 글

[xv6] - sbrk (0)	2023.07.26
[xv6] Log (0)	2023.07.25
[xv6] inode (0)	2023.07.24
[xv6] Process (0)	2023.07.24
[xv6] Sleeplock (0)	2023.07.23

ABOUT ME

Ease is the greatest threat Ease is the greatest threat

글의 참고

글의 전제

글의 내용

- Buffer cache layer

- 코드 분석

'프로젝트 > 운영체제 만들기' 카테고리의 다른 글

티스토리툴바

ABOUT ME

글의 참고

글의 전제

글의 내용

- Buffer cache layer

- 코드 분석

'프로젝트 > 운영체제 만들기' 카테고리의 다른 글

관련글 관련글 더보기

티스토리툴바