Implementation of a Grid-wide File System

Student: Sasha Ruppert
Title: Implementation of a Grid-wide File System
Type: diploma thesis
Advisors: Veldema, R.; Philippsen, M.
State: submitted on April 10, 2006

Grid computing is, in short, the use of multiple clusters of machines to solve a problem or compute a result.

The problem is that when using multiple clusters, the resources at each site are likely to be heterogeneous (different CPUs, networks, and file systems at each site) and sometimes unreliable (your network or CPU reservation ran out, a hard disk failed, etc.), and the CPUs that you get may not be exclusively yours to use.

All this makes grid systems hard to use. Our current research therefore focuses on making cluster and grid computing easier. Our research vehicle is Jackal, a software DSM that allows you to use a cluster or grid as if it were a single computer. The programmer writes a normal multi-threaded Java program, and our system handles fault tolerance and heterogeneity and hence presents a single system image (SSI).

Our current prototype, however, has one big problem that is best sketched by the following example:

Say you have access to two accounts: one login at the RRZE in Erlangen as 'foo' and one login at a Computing Center in Amsterdam, the Netherlands. These logins are simple user accounts, and you are unable to convince the local system administrators at either site to disable firewalls, insert kernel modules, export NFS file systems, etc.

Furthermore, a path to a file will, in general, not be valid on both clusters. We need to hide these paths from the Java programmer who uses our system by presenting a unified or completely new file system.

Another problem is the use of file descriptors. Say a thread running in Erlangen opens a file and obtains a file descriptor. That file descriptor is stored in an object, and the object is sent to Amsterdam. A thread running on a machine in Amsterdam that then uses the file descriptor will get an exception, because a traditional file descriptor is valid only on the machine in Erlangen that opened the file. To present an SSI, we need grid-wide portable file descriptors.
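As a rough illustration of what "portable" could mean here, the following C sketch keeps only location-independent state in the descriptor (a grid-wide name, an offset, and open flags) and converts it to a fixed byte layout that any machine can decode. The type grid_fd, its fields, and all function names are assumptions made for this example; the actual descriptor structure is exactly what the thesis is meant to determine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GRID_PATH_MAX 256
/* Wire form: path bytes, then 8-byte offset and 4-byte flags, big-endian. */
#define GRID_FD_WIRE_SIZE (GRID_PATH_MAX + 8 + 4)

/* Hypothetical portable descriptor: no kernel handle, no local path. */
typedef struct {
    char     grid_path[GRID_PATH_MAX]; /* name in the grid-wide namespace */
    uint64_t offset;                   /* current read/write position */
    uint32_t flags;                    /* open mode bits */
} grid_fd;

/* Store the low nbytes of v in big-endian order. */
static void put_be(unsigned char *out, uint64_t v, int nbytes) {
    for (int i = nbytes - 1; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Read nbytes in big-endian order. */
static uint64_t get_be(const unsigned char *in, int nbytes) {
    uint64_t v = 0;
    for (int i = 0; i < nbytes; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Returns 0 on success, -1 if the path does not fit. */
int grid_fd_init(grid_fd *fd, const char *grid_path, uint32_t flags) {
    if (strlen(grid_path) >= GRID_PATH_MAX)
        return -1;
    memset(fd, 0, sizeof *fd);          /* zero-pads the path buffer */
    strcpy(fd->grid_path, grid_path);
    fd->flags = flags;
    return 0;
}

/* Encode the descriptor into GRID_FD_WIRE_SIZE bytes. */
void grid_fd_pack(const grid_fd *fd, unsigned char *wire) {
    assert(fd != NULL && wire != NULL);
    memcpy(wire, fd->grid_path, GRID_PATH_MAX);
    put_be(wire + GRID_PATH_MAX, fd->offset, 8);
    put_be(wire + GRID_PATH_MAX + 8, fd->flags, 4);
}

/* Decode on the receiving machine; returns -1 on an unterminated path. */
int grid_fd_unpack(grid_fd *fd, const unsigned char *wire) {
    if (memchr(wire, '\0', GRID_PATH_MAX) == NULL)
        return -1;
    memcpy(fd->grid_path, wire, GRID_PATH_MAX);
    fd->offset = get_be(wire + GRID_PATH_MAX, 8);
    fd->flags = (uint32_t)get_be(wire + GRID_PATH_MAX + 8, 4);
    return 0;
}
```

With a descriptor of this kind, the receiving machine would resolve grid_path against its own replica and open a local file on first use, instead of relying on a handle that exists only on the sender.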

Finally, assume that a file is created on one machine and that machine exports the file descriptor. After a while, another machine tries to access the file, but the machine that created it has crashed in the meantime. To present a fault-tolerant SSI, files need to be replicated.


Goals: create a plugin library that replaces the 'open', 'close', 'read', 'write', 'readdir', etc. functions in the runtime system of our Java system. Our runtime system is written in C, and the replacement functions would have more or less the same syntax as the UNIX read, write, etc. functions.

These replacement functions would operate on portable file descriptors, whose structure would need to be determined. An efficient method of replicating files would also need to be worked out. For example, a simple scheme would broadcast the parameters of each 'write' to a number of machines. The question is to which machines and how often (you could buffer writes to a file for a certain amount of time in the hope of aggregating them).
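The buffering idea can be sketched in C as follows. This is only one possible scheme, assuming that writes extending the buffered region are merged and that anything else triggers a flush; the names write_aggregator, agg_write, agg_flush, and the counting sender agg_count_send are invented for the example, and the callback stands in for whatever broadcast mechanism the library would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AGG_CAPACITY 4096

/* Hypothetical callback that ships one aggregated write to the replicas. */
typedef void (*agg_send_fn)(size_t offset, const char *data, size_t len,
                            void *ctx);

typedef struct {
    char        data[AGG_CAPACITY];
    size_t      start;  /* file offset of data[0] */
    size_t      len;    /* number of buffered bytes */
    agg_send_fn send;
    void       *ctx;    /* passed through to send */
} write_aggregator;

void agg_init(write_aggregator *a, agg_send_fn send, void *ctx) {
    assert(send != NULL);
    a->start = 0;
    a->len = 0;
    a->send = send;
    a->ctx = ctx;
}

/* Send whatever is buffered as a single message. */
void agg_flush(write_aggregator *a) {
    if (a->len > 0) {
        a->send(a->start, a->data, a->len, a->ctx);
        a->len = 0;
    }
}

/* Buffer a write; merge it with the pending run when it is contiguous. */
void agg_write(write_aggregator *a, size_t offset, const char *buf,
               size_t len) {
    /* Flush first if the write does not extend the run or would not fit. */
    if (a->len > 0 &&
        (offset != a->start + a->len || a->len + len > AGG_CAPACITY))
        agg_flush(a);
    if (len > AGG_CAPACITY) {           /* too large to buffer: send as is */
        a->send(offset, buf, len, a->ctx);
        return;
    }
    if (a->len == 0)
        a->start = offset;
    memcpy(a->data + a->len, buf, len);
    a->len += len;
}

/* Stand-in sender for experiments: counts messages instead of sending. */
typedef struct { int messages; size_t bytes; } agg_stats;

void agg_count_send(size_t offset, const char *data, size_t len, void *ctx) {
    agg_stats *s = (agg_stats *)ctx;
    (void)offset;
    (void)data;
    s->messages++;
    s->bytes += len;
}
```

In this sketch the time limit mentioned above is left to the caller, who would invoke agg_flush from a timer and on 'close'. Two adjacent three-byte writes followed by a flush result in one message of six bytes rather than two messages.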

To present a unified file system, you could imagine that each site runs a 'directory' server that maintains a copy of the entire grid-wide directory structure. Directories would be created and manipulated with special command-line versions of 'mkdir', 'rm', and 'ls' that call your library functions. The library functions would contact the 'directory' server and read or change its internal data structures. For example, the 'directory' server would contain instances of 'GridDirectory' and 'GridFile'. Each GridDirectory contains an in-core list of GridFile instances. Each GridFile contains the path to the local replica, on the local disk, of the file it represents. A 'read' on a GridFile would read from the local replica, while a 'write' is either broadcast, or performed locally and followed by an invalidation of the corresponding GridFile instances on all other directory servers (globally).
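The data structures described above might look roughly like this in C. Only the names GridDirectory and GridFile come from the text; the fields, the fixed-size arrays, and the functions are assumptions, and the directory servers of several sites are modelled as elements of one in-process array so that the invalidating variant of 'write' can be shown without any networking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_FILES 16

typedef struct {
    char name[64];          /* file name within the grid directory */
    char replica_path[128]; /* path of this site's replica on local disk */
    bool valid;             /* false once another site has written */
} GridFile;

typedef struct {
    GridFile files[MAX_FILES]; /* in-core list of GridFile instances */
    size_t   count;
} GridDirectory;

void grid_dir_init(GridDirectory *d) {
    d->count = 0;
}

/* Returns the entry called name, or NULL if there is none. */
GridFile *grid_dir_lookup(GridDirectory *d, const char *name) {
    for (size_t i = 0; i < d->count; i++)
        if (strcmp(d->files[i].name, name) == 0)
            return &d->files[i];
    return NULL;
}

/* Registers a local replica; returns NULL when the directory is full. */
GridFile *grid_dir_add(GridDirectory *d, const char *name,
                       const char *replica_path) {
    if (d->count == MAX_FILES)
        return NULL;
    GridFile *f = &d->files[d->count++];
    snprintf(f->name, sizeof f->name, "%s", name);
    snprintf(f->replica_path, sizeof f->replica_path, "%s", replica_path);
    f->valid = true;
    return f;
}

/* Local write at site `writer`: its replica stays valid, others do not. */
void grid_write_invalidate(GridDirectory sites[], size_t nsites,
                           size_t writer, const char *name) {
    assert(writer < nsites);
    for (size_t s = 0; s < nsites; s++) {
        GridFile *f = grid_dir_lookup(&sites[s], name);
        if (f != NULL)
            f->valid = (s == writer);
    }
}
```

A 'read' would then check the valid flag first and fetch a fresh copy from a site with a valid replica before reading locally. How that copy is obtained is left open in this sketch.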
