-
[xv6] File system카테고리 없음 2023. 7. 16. 20:22
글의 참고
- https://github.com/mit-pdos/xv6-public/tree/master
- xv6 - DRAFT as of September 4, 2018
글의 전제
- 밑줄로 작성된 글은 강조 표시를 의미한다.
- 그림 출처는 항시 그림 아래에 표시했다.
글의 내용
- File system
The purpose of a file system is to organize and store data. File systems typically support sharing of data among users and applications, as well as persistence so that data is still available after a reboot.
The xv6 file system provides `Unix-like` files, directories, and pathnames (see Chapter 0), and stores its data on an IDE disk for persistence (see Chapter 3). The file system addresses several challenges:
• The file system needs on-disk data structures to represent the tree of named directories and files, to record the identities of the blocks that hold each file’s content, and to record which areas of the disk are free.
• The file system must support `crash recovery`. That is, if a crash (e.g., power failure) occurs, the file system must still work correctly after a restart. The risk is that a crash might interrupt a sequence of updates and leave inconsistent on-disk data structures (e.g., a block that is both used in a file and marked free).
• Different processes may operate on the file system at the same time, so the file system code must coordinate to maintain invariants.
• Accessing a disk is orders of magnitude slower than accessing memory, so the file system must maintain an in-memory cache of popular blocks.
The rest of this chapter explains how xv6 addresses these challenges.- Overview
The xv6 file system implementation is organized in seven layers, shown in Figure 6-1. The disk layer reads and writes blocks on an IDE hard drive. The buffer cache layer caches disk blocks and synchronizes access to them, making sure that only one kernel process at a time can modify the data stored in any particular block. The logging layer allows higher layers to wrap updates to several blocks in a transaction, and ensures that the blocks are updated atomically in the face of crashes (i.e., all of them are updated or none). The inode layer provides individual files, each represented as an inode with a unique i-number and some blocks holding the file’s data. The directory layer implements each directory as a special kind of inode whose content is a sequence of directory entries, each of which contains a file’s name and i-number. The pathname layer provides hierarchical path names like /usr/rtm/xv6/fs.c, and resolves them with recursive lookup. The file descriptor layer abstracts many Unix resources (e.g., pipes, devices, files, etc.) using the file system interface, simplifying the lives of application programmers.
- Buffer cache layer
- Inode layer
The term inode can have one of two related meanings. It might refer to the ondisk data structure containing a file’s size and list of data block numbers. Or ‘‘inode’’ might refer to an in-memory inode, which contains a copy of the on-disk inode as well as extra information needed within the kernel.
The on-disk inodes are packed into a contiguous area of disk called the inode blocks. Every inode is the same size, so it is easy, given a number n, to find the nth inode on the disk. In fact, this number n, called the inode number or i-number, is how inodes are identified in the implementation.
The on-disk inode is defined by a struct dinode (4078). The type field distinguishes between files, directories, and special files (devices). A type of zero indicates that an on-disk inode is free. The nlink field counts the number of directory entries that refer to this inode, in order to recognize when the on-disk inode and its data blocks should be freed. The size field records the number of bytes of content in the file. The addrs array records the block numbers of the disk blocks holding the file’s content.
The kernel keeps the set of active inodes in memory; `struct inode` (4162) is the in-memory copy of a `struct dinode` on disk. The kernel stores an inode in memory only if there are C pointers referring to that inode. The ref field counts the number of C pointers referring to the in-memory inode, and the kernel discards the inode from memory if the reference count drops to zero. The iget and iput functions acquire and release pointers to an inode, modifying the reference count. Pointers to an inode can come from file descriptors, current working directories, and transient kernel code such as exec.- 개요
: 이제 `initlog` 에 대해 알아봐야 하는데, 그전에 `xv6` 파일 시스템 포맷이 어떻게 진행되는지 부터 알아보자. 왜냐면, 파일 시스템 구조를 알기 위해서는 먼저 해당 파일 시스템의 메타 데이터가 어떻게 이루어졌는지를 파악하는게 우선 이기 때문이다. 그러므로, `mkfs.c` 파일의 main 함수를 먼저 분석해본다. (`mkfs` 파일은 유저 프로그램 파일이다. 그러므로, 여기서는 `xv6` 파일 시스템 구조만 알아보고, 자세한 `mkfs`의 자세한 분석은 다른 글에서 다루도록 한다)
: 먼저 특정 영역의 개수들은 고정된 값을 갖는다. 예를 들어, bitmap
param.h
...
...
#define NOFILE 16 // open files per process
#define NFILE 100 // open files per system
#define NINODE 50 // maximum number of active i-nodes
#define NDEV 10 // maximum major device number
#define ROOTDEV 1 // device number of file system root disk
#define MAXARG 32 // max exec arguments
#define MAXOPBLOCKS 10 // max # of blocks any FS op writes
#define LOGSIZE (MAXOPBLOCKS*3) // max data blocks in on-disk log
#define NBUF (MAXOPBLOCKS*3) // size of disk block cache
#define FSSIZE 1000 // size of file system in blocks
...
...
####################################################################################################
// mkfs.c
...
...
int nbitmap = FSSIZE/(BSIZE*8) + 1;
int ninodeblocks = NINODES / IPB + 1;
int nlog = LOGSIZE;
int nmeta; // Number of meta blocks (boot, sb, nlog, inode, bitmap)
int nblocks; // Number of data blocks
int fsfd;
struct superblock sb;
char zeroes[BSIZE];
uint freeinode = 1;
uint freeblock;
...
...
int main(int argc, char *argv[])
{
int i, cc, fd;
uint rootino, inum, off;
struct dirent de;
char buf[BSIZE];
struct dinode din;
static_assert(sizeof(int) == 4, "Integers must be 4 bytes!");
if(argc < 2){
fprintf(stderr, "Usage: mkfs fs.img files...\n");
exit(1);
}
assert((BSIZE % sizeof(struct dinode)) == 0);
assert((BSIZE % sizeof(struct dirent)) == 0);
fsfd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
if(fsfd < 0){
perror(argv[1]);
exit(1);
}
// 1 fs block = 1 disk sector
nmeta = 2 + nlog + ninodeblocks + nbitmap;
nblocks = FSSIZE - nmeta;
sb.size = xint(FSSIZE);
sb.nblocks = xint(nblocks);
sb.ninodes = xint(NINODES);
sb.nlog = xint(nlog);
sb.logstart = xint(2);
sb.inodestart = xint(2+nlog);
sb.bmapstart = xint(2+nlog+ninodeblocks);
printf("nmeta %d (boot, super, log blocks %u inode blocks %u, bitmap blocks %u) blocks %d total %d\n",
nmeta, nlog, ninodeblocks, nbitmap, nblocks, FSSIZE);
freeblock = nmeta; // the first free block that we can allocate
for(i = 0; i < FSSIZE; i++)
wsect(i, zeroes);
memset(buf, 0, sizeof(buf));
memmove(buf, &sb, sizeof(sb));
wsect(1, buf);
rootino = ialloc(T_DIR);
assert(rootino == ROOTINO);
bzero(&de, sizeof(de));
de.inum = xshort(rootino);
strcpy(de.name, ".");
iappend(rootino, &de, sizeof(de));
bzero(&de, sizeof(de));
de.inum = xshort(rootino);
strcpy(de.name, "..");
iappend(rootino, &de, sizeof(de));
for(i = 2; i < argc; i++){
assert(index(argv[i], '/') == 0);
if((fd = open(argv[i], 0)) < 0){
perror(argv[i]);
exit(1);
}
// Skip leading _ in name when writing to file system.
// The binaries are named _rm, _cat, etc. to keep the
// build operating system from trying to execute them
// in place of system binaries like rm and cat.
if(argv[i][0] == '_')
++argv[i];
inum = ialloc(T_FILE);
bzero(&de, sizeof(de));
de.inum = xshort(inum);
strncpy(de.name, argv[i], DIRSIZ);
iappend(rootino, &de, sizeof(de));
while((cc = read(fd, buf, sizeof(buf))) > 0)
iappend(inum, buf, cc);
close(fd);
}`initlog` 함수는 `xv6`의 파일 시스템 7개의 레이어 중 3번째 레이어를 나타낸다. 그리고 `xv6`의 파일 시스템에서 2번째 블락부터 `로그` 블락을 나타낸다. 위의 `그림 6-2`를 참고하자.