INTRO
Today's topic gives a peek “under the covers” at the technical detail of how files are stored.
Linux supports a large number of different "filesystems", although on a server you'll typically be dealing with just ext3 or ext4, and perhaps btrfs. Today we won't be dealing with any of these, but with the layer of Linux that sits above them all: the Linux Virtual Filesystem (VFS).
The VFS is a key part of Linux, and an overview of it and some of the surrounding concepts is very useful for confidently administering a system.
THE NEXT LAYER DOWN
Linux has an extra layer between the filename and the file's actual data on the disk - this is the inode. This has a numerical value which you can see most easily in two ways:
The -i switch on the ls command:
ls -li /etc/hosts
35356766 -rw------- 1 root root 260 Nov 25 04:59 /etc/hosts
The stat command:
stat /etc/hosts
File: `/etc/hosts'
Size: 260 Blocks: 8 IO Block: 4096 regular file
Device: 2ch/44d Inode: 35356766 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2012-11-28 13:09:10.000000000 +0400
Modify: 2012-11-25 04:59:55.000000000 +0400
Change: 2012-11-25 04:59:55.000000000 +0400
Every file name "points" to an inode, which in turn points to the actual data on the disk. This means that several filenames could point to the same inode - and hence have exactly the same contents. In fact this is a standard technique - called a "hard link". The other important thing to note is that when we view the permissions, ownership and dates of filenames, these attributes are actually kept at the inode level, not the filename. Much of the time this distinction is just theoretical, but it can be very important.
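The relationship between names and inodes can be checked directly. The following is a minimal sketch, assuming GNU coreutils (for stat -c) and using made-up file names in a throwaway directory; it is an illustration, not part of the original exercise:

```shell
#!/bin/sh
# Work in a throwaway directory so nothing on the system is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "hello" > original.txt
ln original.txt second-name.txt      # hard link: a second name for the same inode

# Both names report the same inode number, and the link count is now 2.
ls -li original.txt second-name.txt
inode_a=$(stat -c %i original.txt)
inode_b=$(stat -c %i second-name.txt)
link_count=$(stat -c %h original.txt)
echo "inodes: $inode_a $inode_b, links: $link_count"

# Permissions are stored on the inode, so changing them through one
# name is visible through the other.
chmod 600 second-name.txt
perm_a=$(stat -c %a original.txt)
echo "permissions seen via original.txt: $perm_a"
```

Because the attributes belong to the inode, there is no way to give two hard links to the same file different permissions or owners.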
TWO SORTS OF LINKS
Work through the steps below to get familiar with hard and soft linking:
First move to your home directory with:
cd
Then use the ln ("link") command to create a "hard link", like this:
ln /etc/passwd link1
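If this is refused (for example with "Operation not permitted" or "Invalid cross-device link"), that is itself instructive: many current systems block hard links to files you don't own, and a hard link can never cross filesystems. In that case, create a file of your own in your home directory and hard-link to that instead.
Next, create a "symbolic link" (or "symlink"), like this: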
ln -s /etc/passwd link2
Now use ls -li to list the resulting files, and less or cat to view their contents.
Note that the permissions on a symlink generally show as allowing everything, but what matters is the permissions of the file it points to.
Both hard and symlinks are widely used in Linux, but symlinks are especially common - for example:
ls -ltr /etc/rc2.d/*
This directory holds all the scripts that start when your machine changes to "runlevel 2" (its normal running state), but you'll see that in fact most of them are symlinks to the real scripts in /etc/init.d.
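If /etc/rc2.d does not exist on your system, the same layout can be imitated safely. This is a sketch only: the init.d and rc2.d directories below are stand-ins created in a temporary directory, S01cron is a hypothetical script name, and readlink -f assumes GNU coreutils:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-ins for /etc/init.d and /etc/rc2.d (not the real directories).
mkdir init.d rc2.d
printf '#!/bin/sh\necho starting\n' > init.d/cron
ln -s ../init.d/cron rc2.d/S01cron   # relative target, resolved from inside rc2.d

# List only the symlinks and show the real file each one resolves to.
find rc2.d -type l | while read -r link; do
    printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
done
resolved=$(readlink -f rc2.d/S01cron)
```

Note that a relative symlink target is interpreted relative to the directory containing the link, not the directory you happen to be in.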
It's also very common to have something like:
prog
prog-v3
prog-v4
where the program "prog" is a symlink: originally to v3, but now pointing to v4 (and it could be pointed back if required).
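Switching versions this way might look like the sketch below. The prog-v3 and prog-v4 files here are tiny placeholder scripts created just for the demonstration, and the -n flag assumes GNU ln:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two placeholder "versions" of a program.
printf 'echo "version 3"\n' > prog-v3
printf 'echo "version 4"\n' > prog-v4

ln -s prog-v3 prog            # prog initially points to v3
before=$(sh ./prog)

# -f replaces the existing link; -n stops ln from following the old
# link, which matters when the link points at a directory.
ln -sfn prog-v4 prog
after=$(sh ./prog)

current_target=$(readlink prog)
echo "before: $before, after: $after, prog -> $current_target"
```

Rolling back is the same command with prog-v3 as the target; neither version file is modified at any point.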
Read up in the resources provided, and test on your server to gain a better understanding. In particular, see how permissions and file sizes work with symbolic links versus hard links or simple files.
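One way to make that comparison is sketched below, assuming GNU stat and using made-up file names in a temporary directory:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'some file contents\n' > target.txt
chmod 640 target.txt
ln target.txt hard.txt
ln -s target.txt soft.txt

# Without -L, stat describes the link itself; with -L it follows the link.
for name in target.txt hard.txt soft.txt; do
    stat -c '%n: type=%F size=%s perms=%A inode=%i' "$name"
done
stat -L -c '%n (followed): size=%s perms=%A' soft.txt

hard_size=$(stat -c %s hard.txt)     # size of the real data
soft_size=$(stat -c %s soft.txt)     # length of the path stored in the link
soft_perms=$(stat -c %A soft.txt)
followed_perms=$(stat -L -c %A soft.txt)
```

The hard link is indistinguishable from the original file, while the symlink is a separate small file whose "size" is simply the length of the target path it stores.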
THE DIFFERENCES
Hard links:
- Only link to a file, not a directory
- Can't reference a file on a different disk/volume
- Links still reference the file even if it is moved or renamed (within the same filesystem)
- Links reference inode/physical locations on the disk
Symbolic (soft) links:
- Can link to directories
- Can reference a file/folder on a different hard disk/volume
- Links remain if the original file is deleted, but become broken ("dangling")
- Links will NOT reference the file anymore if it is moved
- Links reference abstract filenames/directories and NOT physical locations.
- They have their own inode
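Several of these differences can be observed in one short session. The following is a hedged sketch with invented file names in a temporary directory, not a definitive test of every point above:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "data" > original.txt
ln original.txt hard.txt
ln -s original.txt soft.txt

mv original.txt renamed.txt          # move/rename the original name

hard_after_move=$(cat hard.txt)      # still readable: same inode
# -e follows the link, so it fails once the target name is gone.
if [ -e soft.txt ]; then soft_state="ok"; else soft_state="dangling"; fi

rm renamed.txt                       # remove the other remaining name
hard_after_delete=$(cat hard.txt)    # data survives while any hard link remains
# -L tests the link itself, which still exists without its target.
if [ -L soft.txt ]; then link_exists="yes"; else link_exists="no"; fi

# Directories: a hard link is refused, a symlink is accepted.
mkdir somedir
if ln somedir dir-hard 2>/dev/null; then dir_hard="allowed"; else dir_hard="refused"; fi
if ln -s somedir dir-soft; then dir_soft="allowed"; else dir_soft="refused"; fi

echo "symlink after move: $soft_state, still present: $link_exists"
echo "hard link to directory: $dir_hard, symlink to directory: $dir_soft"
```

The data is only released once the last hard link to the inode is removed, which is why hard.txt keeps working after both other names are gone.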
EXTENSION
RESOURCES