Nordita AFS User Guide
AFS is an acronym for the Andrew File System, developed
at Carnegie Mellon University, Pittsburgh, under sponsorship from IBM. For
years, AFS was maintained and sold by the Transarc Corporation, now part
of IBM. AFS has since been released as open source by IBM, and is now mainly
maintained and distributed by the OpenAFS consortium.
Files and directories stored in the AFS file system are stored in logical entities
called volumes. A volume physically resides in parts of a file server
machine's physical storage (i.e. disks) called partitions. The information
about which volume is located where and on which file server is managed by
volume location server processes running on servers inside the AFS
cell (see below).
The mapping information of files/directories to volumes ("mount points")
is contained in the file system itself and
is therefore the same for any client traversing the AFS file space. Through
adequate construction of mount points it is possible to reference each file in
AFS by a path name which will be the same for every AFS client machine in the
world.
In order to keep path names unique, an early component of the path name is
the cell name. An AFS cell is a collection of AFS servers and
clients. The cell is an administratively independent domain and, among other things,
the boundary for the validity of AFS userids (see below). A typical
path name of an AFS file is
/afs/physto.se/home/r/rahman/public/afsug.ps
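Because the cell name is always the second component of such a path, it can be recovered with plain string handling. The following is a minimal sketch; afs_cell is a hypothetical helper name, not an AFS command, and it relies only on POSIX shell parameter expansion.

```sh
# afs_cell is a hypothetical helper, not part of AFS.
# It prints the cell name of an absolute AFS path and fails for other paths.
afs_cell() {
    case "$1" in
        /afs/?*) ;;
        *) return 1 ;;
    esac
    rest=${1#/afs/}                 # strip the leading /afs/
    printf '%s\n' "${rest%%/*}"     # keep everything up to the next slash
}

afs_cell /afs/physto.se/home/r/rahman/public/afsug.ps    # prints physto.se
```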
Access to AFS files is location transparent. System processes resolve each
file's physical location automatically based on information contained within the
file system. Normally the user does not even know how many servers are in a
cell. Files can sometimes exist in multiple copies on several servers for load
balancing and availability purposes, with the AFS cache manager choosing a copy
based on availability and topological distance in the network.
AFS uses Access Control Lists stored with each directory, which allow fine-grained specification of access rights.
Users are identified to AFS using Kerberos authentication. AFS user
identifications are global to the whole cell, and not necessarily related to the
local authentication on the user's workstation. Authenticated users hold a
Kerberos ticket for the service 'afs', which AFS calls a token.
Simply presenting a valid token is already proof of authenticity. In
order to limit the damage in case of accidental disclosure, tickets are therefore
protected by an expiry date: after that date the ticket or token is no
longer accepted. The usual lifetime for AFS is 25 hours, which forces the user
to re-authenticate every day.
Based on the user's authentication, connections to AFS file servers are
automatically established when needed, using complex Kerberos mutual
authentication. Although data is exchanged in clear, control information between
a user's workstation and a fileserver is encrypted using a key shared only
between the workstation and the fileserver, making it virtually impossible for
another node on the network to masquerade as the user, or to insert itself into
an ongoing connection and act on behalf of the user.
At Nordita, users can apply for an AFS account via
the secretary.
The account creation implies
creation of a reasonably sized AFS volume.
Every user's home directory is implemented as an AFS volume that is
accessible through the path /afs/physto.se/home/initial/userid,
where initial is the first character of userid.
Example: /afs/physto.se/home/r/rahman.
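The naming rule can be written down as a small shell function. This is only a sketch: afs_home is a hypothetical helper name, and the cell name physto.se is taken from this guide.

```sh
# afs_home is a hypothetical helper; it applies the rule
# /afs/physto.se/home/<first character of userid>/<userid>.
afs_home() {
    userid=$1
    initial=$(printf '%.1s' "$userid")    # first character of the userid
    printf '/afs/physto.se/home/%s/%s\n' "$initial" "$userid"
}

afs_home rahman    # prints /afs/physto.se/home/r/rahman
```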
The default access rights in the AFS home directory are set up such
that the locally authenticated users (hence physto.se users) have "l" lookup
rights, i.e. are allowed to see which files are contained in it, but not to
read or write any of them. (The rationale behind this will become clear from
what follows).
The following subdirectories
are set up, with the owner again having
complete control over them:
- The private subdirectory is not accessible to
anybody except the owner. It is meant as the part of the tree which the user can
completely hide from everybody, not even granting the right to list the files in
there.
- Files in the public subdirectory are readable by everybody. Here a
user would typically deposit files which should be accessible to other
users.
- private_html is the web home page directory. It has the same
rights as public.
Every night an AFS backup clone is created from all physto.se user AFS
volumes.
The net effect is that accidentally modified or deleted AFS files can easily
be recovered from the AFS backup volume, provided the error is discovered
before the clone is re-created the following night.
User AFS backup volumes are automatically mounted under the path
/afs/physto.se/homebackup/initial/userid
where initial is the first character of userid.
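Since the backup tree mirrors the home tree, the backup copy of a file can be located by rewriting the path prefix. A minimal sketch, assuming a hypothetical helper called backup_path; it only manipulates strings and does not check that the file exists:

```sh
# backup_path is a hypothetical helper that maps a path under
# /afs/physto.se/home to the same path in the nightly backup clone.
backup_path() {
    case "$1" in
        /afs/physto.se/home/*)
            printf '/afs/physto.se/homebackup/%s\n' "${1#/afs/physto.se/home/}" ;;
        *)
            echo "not under /afs/physto.se/home: $1" >&2
            return 1 ;;
    esac
}

backup_path /afs/physto.se/home/r/rahman/public/afsug.ps
# prints /afs/physto.se/homebackup/r/rahman/public/afsug.ps
```

To restore a file, one would copy it from the printed location back into the home directory, for example with cp -p.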
A number of architecture
dependent subdirectories are set up automatically upon AFS account creation,
together with a symbolic link bin -> .@sys/bin.
The directories are
named .alpha_dux40, .ppc_darwin_60, .i386_redhat73, etc. Because of the leading
'.' (dot) they will not appear in a normal 'ls' command.
Symbolic links among directories for different architectures are set up in
cases where programs generated (linked) for one architecture would normally run
unchanged in the other. The list of directories created upon account creation
will be adjusted regularly, since it is not possible to foresee all
architectures which will be supported in the future. If in doubt, consult your
current architecture string (fs sysname) and check your home directory
with ls -al.
Several Unix system components require that files be located in the user's
home directory, typically the so-called dot files, e.g. files like
.profile or .login, .netrc, .forward,
.Xdefaults, .newsrc, .rhosts, as well as directories and normal files
like Mail or mail, .nn or dead.letter.
Many of these files are not suitable to be world-readable!
.netrc, .rhosts and .Xauthority are highly
security-sensitive. The dead.letter file is created when you ctrl-C out
of some mail agents and could cause embarrassment if somebody reads a mail that
you actually decided not to send.
In AFS, access restrictions apply to all files in a directory,
regardless of the usual Unix access
bits. You can therefore either protect all files or none in
your home directory.
The default access rights for AFS home directories are therefore system:authuser
l, which allows anybody to look up files in a directory and therefore
to proceed further in the directory tree into subdirectories governed by their
own ACL, but not to read their contents.
However, some files have to be consulted by system processes running without
knowledge of the user's password (which makes them unprivileged regardless of
the privileges with which they run on the local system). For example the
.forward file has to be read by the sendmail daemon when mail
arrives.
Such files have to be moved to a publicly accessible location. When they are
expected to reside in the user's home directory, symbolic links can be set up,
such as .forward -> public/.forward.
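The move-and-link step can be tried out safely first. The sketch below runs in a throwaway directory that stands in for the home directory; the forwarding address is a made-up placeholder.

```sh
# Demonstration in a throwaway directory that stands in for $HOME,
# so nothing in a real home directory is touched.
demo_home=$(mktemp -d)
cd "$demo_home" || exit 1
mkdir public
echo 'rahman@example.org' > .forward    # hypothetical forwarding address

mv .forward public/.forward             # move the file into the readable directory
ln -s public/.forward .forward          # leave a relative symbolic link behind

ls -l .forward
cat .forward                            # the content is still reachable through the link
```

A relative link target is used so that the link stays valid regardless of the path through which the home directory is reached.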
In order to make AFS data publicly available, the access control lists for
the directories containing the file(s) have to grant access to one of the following:
- system:anyuser, i.e. the whole (AFS-) world,
- system:authuser, i.e. everybody logged in to the physto.se cell,
- hosts:all, i.e. every machine physically at Fysikum.
Files and directories created under the $HOME/public tree will automatically
be "cell-listable", i.e. system:authuser l.
What if existing files in a subdirectory other than public, e.g. $HOME/work, are
to be read by others? One way (not recommended) is to copy the
files recursively into the public path, e.g. cp -r work public/work,
and to remove work.
A better alternative is to leave work where it is
and change its ACL to grant access to the world:
cd ~/work
find . -type d -print -exec fs setacl {} system:anyuser rl \;
The above example walks down the subdirectory tree of work
(if any) and gives the world access to all subdirectories.
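Before changing ACLs on a whole tree it can be useful to see which commands would be issued. The following dry run is a sketch: putting echo in front of fs means the commands are only printed, and a throwaway tree with made-up subdirectory names stands in for ~/work.

```sh
# Dry run: because of the leading "echo", no ACL is changed;
# the fs setacl commands are only printed.
demo=$(mktemp -d)
mkdir -p "$demo/work/src/lib" "$demo/work/doc"    # hypothetical directory layout
cd "$demo/work" || exit 1
find . -type d -exec echo fs setacl {} system:anyuser rl \;
```

Removing the word echo turns the dry run into the real command shown above.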
Contribution from Johan Lundberg:
How to use PBS with my AFS account?
Firstly you need to find out if you
really have an AFS account. Type pwd, the answer should begin with /afs/physto.se/.
All Fysikum AFS users have their own publicly readable and writable scratch area,
to be used with PBS. It is available through the environment variable
$SCRATCH, so to go there just type cd $SCRATCH.
When using PBS, you have to make sure your job only tries to write
to files in $SCRATCH. How much work this takes depends on
what your scripts do. At a minimum, copy
your PBS scripts to $SCRATCH:
cp -p afs_testjob.sh $SCRATCH/
change current working directory to the scratch area:
cd $SCRATCH
and run your script with qsub as usual.
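What such a job script might look like is sketched below. The content of afs_testjob.sh is hypothetical, and the sketch assumes that $SCRATCH is also defined inside the job; the fallback to a temporary directory exists only so that the script can be tried outside the batch system.

```sh
#!/bin/sh
#PBS -N afs_testjob
#PBS -j oe
# Hypothetical afs_testjob.sh. It assumes $SCRATCH is set inside the job;
# the fallback below only allows a trial run outside the batch system.
workdir=${SCRATCH:-${TMPDIR:-/tmp}}
cd "$workdir" || exit 1

# Write results with relative paths so that they land in the scratch area.
echo "job ran on $(uname -n)" > afs_testjob.out
```

The explicit cd matters because a batch job does not necessarily start in the directory from which it was submitted.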
The most important commands are
- klog: obtains an AFS token
- fs: interacts with the host's AFS cache manager; this command is used to
manage Access Control Lists, obtain information about AFS directories (e.g.
mount points, AFS volume where the directory resides, quota), and
control/configure various variables in the cache manager (e.g. cache size,
server preferences)
- tokens: lists all AFS tokens the user currently holds
- unlog: destroys AFS tokens
Example: in order to find out on which AFS file server a file resides,
issue the command:
fs whereis filename
Example: an alternative and more verbose method is:
fs examine filename
vos examine 'AFS volume', with the 'AFS volume' name returned by the previous command.
All commands provide online help about command arguments,
displayed when using the -help switch.
Access Control Lists (ACLs) are used by AFS to control access to files
residing in the AFS file system. ACLs are defined for a directory and apply to
all files immediately residing in that directory. An ACL is a list of
userid-access right pairs. A userid is either an individual user
or a named collection of users called a protection
group, in which case the rights apply to all users in the group.
AFS defines the following access rights:
- r: reading of files
- w: writing to files
- k: locking of files (flock() system call)
- l: the right to search for files in the directory, i.e. list which files it
contains
- i: creation of new files or directories
- d: deletion of files
- a: the right to change the ACL
Mnemonic rights formed from those above are
- all: r+w+k+l+i+d+a
- none: entry deleted from access list. This does not mean that the user has no
rights, since other ACL entries may still apply
- read: r+l
- write: r+w+k+l+i+d, i.e. everything except 'a'
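The mnemonics are pure shorthand, which the following sketch makes explicit. acl_rights is a hypothetical helper, not an AFS command; the letters are printed in the order in which fs listacl displays them.

```sh
# acl_rights is a hypothetical helper that prints the individual rights
# behind a mnemonic, following the list above.
acl_rights() {
    case "$1" in
        all)   echo rlidwka ;;
        write) echo rlidwk ;;
        read)  echo rl ;;
        none)  echo "" ;;
        *)     echo "$1" ;;     # already a string of single-letter rights
    esac
}

acl_rights write    # prints rlidwk
acl_rights read     # prints rl
```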
Access rights for a directory can be examined and defined using the fs
listacl (or just fs la) and fs setacl (or fs sa) commands.
Examples:
-
> fs la /afs/physto.se/home/r/rahman/public
Access list for /afs/physto.se/home/r/rahman/public is
Normal rights:
system:anyuser rl
rahman rlidwka
> cd /afs/physto.se/home/r/rahman
> fs la ./Mail
Access list for ./Mail is
Normal rights:
rahman rlidwka
This means that anybody may read files in /afs/physto.se/home/r/rahman/public
but that only rahman can write to and create files there, whereas /afs/physto.se/home/r/rahman/Mail
is completely private to rahman.
-
> fs sa /afs/physto.se/home/r/rahman/public rub write
> fs la /afs/physto.se/home/r/rahman/public
Access list for /afs/physto.se/home/r/rahman/public is
Normal rights:
system:anyuser rl
rahman rlidwka
rub rlidwk
Here the user rub is granted write rights for the
directory /afs/physto.se/home/r/rahman/public, which enables him to create, alter
and erase files, but not to alter the ACL.
-
> fs sa /afs/physto.se/home/r/rahman/public rub none
> fs la /afs/physto.se/home/r/rahman/public
Access list for /afs/physto.se/home/r/rahman/public is
Normal rights:
system:anyuser rl
rahman rlidwka
Here the user rub loses all rights for the directory
/afs/physto.se/home/r/rahman/public. This is the right way to remove someone
from an ACL.
-
> fs la /afs/physto.se/common/scratch
Access list for /afs/physto.se/common/scratch is
Normal rights:
hosts:all rlidwka
system:authuser rlidwka
...
The /afs/physto.se/common/scratch
tree contains a scratch area that is accessible
only by machines at Fysikum (the hosts:all group) and/or users with a valid
physto.se AFS token.
Protection groups are defined by the system or can
be defined by users themselves using the pts creategroup command. Special
predefined protection groups are
- system:anyuser: just any AFS user in the world
- system:authuser: any AFS user with a valid Kerberos token for the local cell
- hosts:all: all machines at Fysikum (based on IP addresses)
- hosts:gg: all machines of the group 'gg'
- gg: AFS space administrators for group 'gg'
- physto:gg: all registered members of group gg
Users are encouraged to use protection groups whenever there is a need to
control access rights for even a small number of users. Any user can create a
protection group and then place the group on an ACL as though it were an
individual user. Using groups simplifies maintenance considerably since whenever
there is a change of membership of the group, the change needs only to be made
in the definition of the group and not on each directory concerned.
Using protection groups rather than individual users is also a security
feature.
Protection groups are managed with the pts command, for example
pts creategroup -name gaston:myfriends
fs sa ~gaston/photos gaston:myfriends rl
issued by user gaston would create a protection group called
gaston:myfriends, owned by gaston, and give its members read access to the directory photos.
Any user who is a member of the group xyz can create a protection
group with a name of the form xyz:abcd. For example
pts creategroup -name gg:production -owner gg
would create a protection group called gg:production, owned (and thus
modifiable) by members of the protection group gg. Remember that gg is
the protection group containing the AFS space administrators of the computing
group GG.
The membership of a protection group is modified with the commands pts
adduser and pts removeuser.
pts adduser -user rahman gaston gabri -group snova:members
would add the userids rahman, gaston and gabri to the
protection group called snova:members.
pts removeuser -user gaston -group snova:members
would remove userid gaston.
The pts membership command can be used either to list the members of a
group or to find out to which groups a user belongs. Note that nesting of protection groups
is not allowed: a protection group cannot be a member of another protection
group.
pts membership -name physto:mol
would list the members of protection group physto:mol.
pts membership -name gaston
would list the protection groups of which gaston is a member.
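The individual pts and fs steps from this section can be strung together. The sketch below is a dry run by default, because the commands need a live AFS cell and a valid token: the hypothetical wrapper run only prints each command unless DRY_RUN is set to 0. The function name share_dir, the directory path and the user names are illustrative.

```sh
# Dry-run sketch: run() prints each command instead of executing it
# unless DRY_RUN=0 is set in the environment.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# share_dir DIR GROUP USER... (hypothetical helper)
# Creates the group, adds the users, grants the group read access to DIR,
# and lists the resulting membership.
share_dir() {
    dir=$1
    group=$2
    shift 2
    run pts creategroup -name "$group"
    run pts adduser -user "$@" -group "$group"
    run fs setacl "$dir" "$group" rl
    run pts membership -name "$group"
}

share_dir /afs/physto.se/home/g/gaston/photos gaston:myfriends rahman gabri
```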
AFS volumes are containers for files and directories. A volume is a certain
amount of space allocated on a physical disk on some file server. A volume
shrinks and grows to accommodate the data it contains, but a volume has a quota,
i.e. a maximum size assigned beyond which it will not grow. Quotas are assigned
by system administrators. The current space usage and the quota can be examined
using the fs listquota command, which takes the directory pathname as an
argument:
> fs listquota /afs/physto.se/home/r/rub
Volume Name    Quota    Used   % Used   Partition
home.r.rub      5000      12       0%         82%
The command also shows how full the file server's
physical disk (partition) is. Indeed, an AFS user can run out of space in two
ways: his volume quota may be exceeded, or there might not be enough space
available on the disk. The latter can only happen if the disk has been
over-committed, i.e. the sum of all quotas for volumes residing on the disk
exceeds the amount of space available on it. In both cases there is not
much that he can do about it except start erasing files or enter
negotiations with the system administrator.
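To notice a nearly full volume before writes start failing, the fs listquota listing can be filtered. This is a sketch only: quota_warn is a hypothetical helper, the 90% default threshold is an arbitrary choice, and the second sample line is invented for illustration.

```sh
# quota_warn is a hypothetical filter. It reads "fs listquota" output on
# standard input and prints a warning for every volume whose usage has
# reached the given percentage of its quota (default 90).
quota_warn() {
    awk -v limit="${1:-90}" '
        NR > 1 && ($2 + 0) > 0 {
            pct = int($3 * 100 / $2)
            if (pct >= limit)
                printf "%s: %d%% of quota used\n", $1, pct
        }'
}

# Sample input modelled on the listing above plus one invented, nearly full volume.
# On an AFS client one would run: fs listquota "$HOME" | quota_warn
quota_warn <<'EOF'
Volume Name Quota Used % Used Partition
home.r.rub 5000 12 0% 82%
home.x.demo 2000 1900 95% 82%
EOF
```

The percentage is recomputed from the Quota and Used columns rather than parsed from the "% Used" column, so the filter does not depend on how that column is decorated.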
Given that volumes cannot exceed the physical partition size, the world would
be confined to one single disk if mount points did not exist: a mount point is a
directory entry leading to another AFS volume. The mount point contains the name
and cell of the AFS volume it points to. The volume can reside on a different
disk on the same fileserver, on another fileserver or even on a fileserver in an
entirely different cell. Whenever the mount point is crossed, i.e. is referenced
as part of a filename (or a cd command), the new volume is automatically
opened by the AFS cache manager, which contacts the volume location server
in the volume's cell to discover the network address of the file server housing
the volume.
Mount points can be examined using fs lsmount. However, in order to
just find out in which volume a file resides or on which fileserver, the
commands fs examine, fs whereis and vos examine are more
convenient:
> cd /afs/physto.se/home/r
> fs lsmount rahman
'rahman' is a mount point for volume '#home.r.rahman'
> fs examine /afs/physto.se/home/r/rahman
Volume status for vid = 536870978 named home.r.rahman
Current disk quota is 2000000
Current blocks used are 1807774
The partition has 15142888 blocks available out of 68811300
> fs whereis /afs/physto.se/home/r/rahman
File /afs/physto.se/home/r/rahman is on host afs48.physto.se
> vos examine home.r.rahman
home.r.rahman 536870978 RW 1807774 K On-line
afs48.physto.se /vicepa
RWrite 536870978 ROnly 0 Backup 536870980
MaxQuota 2000000 K
Creation Wed May 20 13:24:17 1992
Last Update Fri Mar 28 08:27:47 2003
1598 accesses in the past day (i.e., vnode references)
RWrite: 536870978 Backup: 536870980
number of sites -> 1
server afs48.physto.se partition /vicepa RW Site
The fs lsmount command returns not only the name of the volume the
mount point rahman points to, but also the type of the mount point:
a hash sign (#) preceding the volume name indicates that the
directory is a regular mount point, and
a percent sign (%) preceding the volume name indicates that the
directory is a ReadWrite mount point.
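A script can tell the two kinds of mount point apart by looking at that prefix character. A minimal sketch, assuming a hypothetical helper called mount_type that is handed one line of fs lsmount output:

```sh
# mount_type is a hypothetical helper that classifies one line of
# "fs lsmount" output by the character in front of the volume name.
mount_type() {
    case "$1" in
        *"volume '#"*) echo regular ;;
        *"volume '%"*) echo read-write ;;
        *)             echo unknown ;;
    esac
}

mount_type "'rahman' is a mount point for volume '#home.r.rahman'"    # prints regular
```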
In order to act on AFS files in a way that requires privileges, such as
writing to a file or reading files which are not readable by just anybody, a
user has to hold an AFS token. An AFS token can be obtained automatically during
login if the system has been set up accordingly, based on the user's AFS
password. In order to find out which tokens have been created, the tokens
command can be issued.
> tokens
Tokens held by the Cache Manager
User's (AFS ID 1704) tokens for afs@physto.se [Expires Feb 27 16:51]
--End of list--
Obviously this user holds a token as AFS user 1704. The token management uses
numeric user ids just like the Unix system, although they might be different.
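A script can use the same listing to decide whether klog is needed before it touches protected files. The sketch below is an assumption-laden example: has_token is a hypothetical helper, and it simply looks for the "[Expires" marker in the format shown above, fed here with the sample listing instead of a live tokens command.

```sh
# has_token is a hypothetical helper. It reads the output of "tokens" on
# standard input and succeeds if an entry with an expiry date is listed
# for the given cell.
has_token() {
    grep -qF "tokens for afs@$1 [Expires"
}

# Sample listing copied from above; on an AFS client one would run:
#   tokens | has_token physto.se
sample="Tokens held by the Cache Manager
User's (AFS ID 1704) tokens for afs@physto.se [Expires Feb 27 16:51]
--End of list--"

if printf '%s\n' "$sample" | has_token physto.se; then
    echo "token present for physto.se"
else
    echo "no token for physto.se, run klog"
fi
```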
A token expires automatically after a certain time, or the user may decide
that he wishes to change identity (assuming he owns several accounts): the
klog command can then be used to obtain a token.
> klog rdt
Password for rdt@PHYSTO.SE:
At this point the user is prompted to type his AFS password. The command will
not log him in to or out of the Unix system; it will only change his current AFS
token. A token can be discarded with the unlog command.
At any given time a user can hold only one AFS token per cell,
but it is possible to be authenticated to many cells, say, 'cern.ch' and 'physto.se',
at the same time.
In addition to requesting AFS tokens, the login, klog
and other authenticating commands support maintaining the Kerberos
ticket-granting ticket. This is similar to an AFS token (in fact, the AFS token is
derived from it), but enables the user to request Kerberos tickets for services
other than AFS.
AFS authentication is based upon passwords, and the user's identity is only
as safe as the password he chooses. It is therefore advisable to change the
password frequently, and to choose a password which is as difficult to guess as
possible yet not written down anywhere. The AFS password can be changed with the
command passwd.
On Digital Unix you will be prompted for the "KRB5" or "BSD"
password. Choose "KRB5".
The AFS file system differs from the standard Unix file system in several
aspects, with more or less impact on the average end user.
The most common problem users face is missing authentication. AFS
tokens are required for access to data that is not publicly accessible.
Tokens are stored in the operating system kernel
and are
normally linked to the user's process tree; however, the link between the user's
process and stored token(s) can be modified by commands like
klog and unlog, and, in some cases, even by an implicit (and
therefore difficult to detect) change of Unix UID.
Most of these problems can be avoided
by logging out regularly (e.g. before going home).
Tokens in remote sessions, e.g. established via telnet,
rlogin or xrsh will not be refreshed after unlocking the screen;
this has to be done manually using klog.
In case of permission problems, the tokens command should be used to
find out what tokens have been acquired and which are still valid. The pts
membership command tells to which groups a user belongs, and the fs
listacl command should be used to find out which permissions are required.
What can a user do when he cannot access a file because the token has expired?
The most appropriate action is to issue the klog command, which simply acquires
a new token. All windows in the current session, i.e. normally all windows
on an X screen except those in remote sessions, will use the new token.
When this happens in an editor session - don't panic! If another window
within the same session is open where you can issue the klog command,
this should help. Alternatively, many editors allow opening up a temporary shell
from within the editor; a klog issued in such a subshell will access the
same PAG and is therefore all that is required.
File access is determined by examining the Access Control List for the file's
directory. If the user accessing the file passes this test, the Unix access bits
for the owner are examined to determine the access granted. Group access
bits are ignored.
Setuid/setgid programs are supported in AFS, but the chown function to
set the owner/group of a file requires special privileges which users will
obviously not be granted. If they were, users could create setuid root
programs on their own workstation and become root on any other station within
the local cell on which they have an account.
If setuid/setgid programs are required, they should reside on the local file
system of the workstation where they are intended to be run. If required,
symbolic links can be set up within the AFS name space for ease of use.
Hard links inside AFS can only be created in the directory where the target
resides. It is not possible to create a hard link from an AFS file into another
AFS directory (nor into another file system).
In AFS it is not possible to execute programs that do not also have 'read'
access, i.e. files with only --x attributes.
The reason for this is obvious: the AFS fileserver cannot trust the
requesting machine's operating system (since it does not have an identity, only
the user has), and a request of the type 'please pass me that file, and I
shall execute it but I promise not to look at it' does not make sense.
AFS has been designed to be usable on machines of different system flavors
and even CPUs with different instruction sets. Consequently, users working
on machines of different architectures are faced with the need of having to
compile programs several times, and often to store them in appropriately named
places so that surrounding applications which are common to all architectures,
e.g. shell scripts, can easily locate them.
To this effect AFS introduces the special name @sys. If used as the
last characters of any but the last element of a path name (in other
words, it is followed by a slash!), it will automatically be replaced by the
current AFS architecture string.
Example: on a PC Linux box running RedHat7.3 the path $HOME/.@sys/bin
is equivalent to $HOME/.i386_redhat73/bin. The same path name
$HOME/.@sys/bin on a Mac OS X system would automatically translate
into $HOME/.ppc_darwin_60/bin. This can be used to access programs for different
architectures under the same name regardless of the architecture, e.g. by
setting up a directory per architecture and appropriate symbolic links:
> cd
> mkdir -p .alpha_dux40/bin
> mkdir -p .i386_redhat90/bin
> ln -s .@sys/bin bin
Now the following works on both Digital Unix and Linux, with nobody "stepping on
the other's feet":
- on Digital Unix:
> cc -o bin/helloworld helloworld.c
- on Linux:
> cc -o bin/helloworld helloworld.c
- on both:
> bin/helloworld
The architecture string is compiled into the AFS kernel and varies according
to the processor type and operating system characteristics.
The architecture string for the machine you are working on can always be
queried with the fs sysname command.
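The substitution itself is done by the cache manager, but its effect can be imitated in the shell to check where a path would lead on another architecture. A sketch under that assumption: expand_sys is a hypothetical helper, and its second argument plays the role of the string reported by fs sysname.

```sh
# expand_sys is a hypothetical helper that imitates the @sys substitution:
# every "@sys" that ends a non-final path element is replaced by the
# architecture string given as second argument.
expand_sys() {
    printf '%s\n' "$1" | sed "s|@sys/|$2/|g"
}

expand_sys '.@sys/bin' i386_redhat73    # prints .i386_redhat73/bin
expand_sys '.@sys/bin' ppc_darwin_60    # prints .ppc_darwin_60/bin
```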
AFS tokens are valid for all processes in a Process
Authentication Group (PAG), or, if no PAG has been established, for all
processes running under the same Unix UID as the last issuer of klog
outside a process authentication group.
Normally a PAG is established for every session initiated by login.
Processes sprung off the PAG-creating process (i.e. its children) inherit
the same PAG.
Any child can modify the AFS tokens within a PAG, and the modification
applies to the whole 'family'. Using klog on a graphics display with
twenty-three windows open in six different workspaces can therefore have
surprising effects, the reason for which is not immediately obvious.
However, the effect can be even more drastic and difficult to debug when
tokens are associated with Unix UIDs instead of a PAG: in addition to any
process running with the same UID being able to change the token (and not only
processes in the same 'session'), a setuid root command (such as
lpr on some systems) will not inherit the user's tokens (unless he runs
as root) and can therefore not access his files.
The klog command can be used to acquire a token for a different userid. However,
in a windowed system with several shells and processes running using the same
token, this approach can have significant side effects.
Some commonly used software packages, like MATLAB, Mathematica, root,
IDL, CERN Program Library, etc. are accessible through the physto.se AFS cell. The basic path names are
/afs/physto.se/system/@sys/pkg/package/pro
/afs/physto.se/system/@sys/cern/pro
Those packages are always included in the search path when you log in.
The directory trees consist of architecture (e.g. processor-type, operating system)
dependent and independent parts. The normal starting point is always an architecture
dependent subtree (e.g. hp700_ux90) which may contain symbolic links into
the architecture-independent subtree (i.e. share) for files which are common
amongst all architectures.
As of writing, the AFS documentation is available from the OpenAFS project.