The Linux Filesystem Hierarchy
One of the things that made me curious when I first approached Linux was the long list of directories that are part of a basic installation. The ‘man’ command helps to make sense of it: look up the page named ‘hier’, which stands for hierarchy:
[carlo@dragon ~]$ man hier
Another good source of information for the file system hierarchy of Linux is available at The Linux Documentation Project web site:
http://tldp.org/LDP/Linux-Filesystem-Hierarchy/Linux-Filesystem-Hierarchy.pdf
This is a 113-page book in PDF format and a good reference for those who want to know more. For the rest of us, here is a list of the most important directories in the hierarchy.
Keep in mind that different Linux distributions may use some of the directories in a slightly different way and, sometimes, even change the path of some of the contents described in this article.
However, the principles remain the same, and you can always type ‘man hier’ on the command line to learn how your particular distribution behaves.
/
The so-called “root” directory. Everything starts from here; all other directory paths start from this root point.
/bin
This is a container for the programs that are used in command line mode. Mostly, these are the programs used in day-to-day chores: creating and removing directories, changing ownership, running searches, and even providing shell environments. All these programs are executable binaries, hence the name of the directory: bin[ary].
/sbin
The container for the system programs that are used to boot the system and to perform certain administrator tasks. Regular users usually don’t need to access the contents of this directory, although it is usually visible to everybody.
/boot
The boot loader directory. It contains the boot loader’s files and the kernel binary. When the system is powered on, the BIOS starts the boot loader, which reads its files from this directory. The boot loader may present a list of operating systems to start, like Linux, Windows, or OS X; it then loads the kernel into memory and hands control to it.
/dev
This directory contains entries for the devices that are part of the system. In Linux, every device is represented as a file, so each device has an entry in this directory that behaves like a file. Whenever you access a device, whether a disk, a USB device, a console terminal, or anything else, the OS treats it as a file located in this directory and accesses it as if it were a real file. Low-level drivers take care of converting the file access operations into actual commands for the device itself. If you have some knowledge of C++, you can easily see the concept by thinking of how all kinds of devices can be accessed through a stream, simply by using the iostream class or one of its derivatives.
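To make the “every device is a file” idea concrete, here is a minimal Python sketch. It assumes a Linux system where /dev/null and /dev/urandom exist; the helper names read_device and write_device are mine, made up for illustration. The point is that the same open, read, and write calls used for regular files work on device entries too.

```python
def read_device(path, count):
    """Read up to `count` bytes from a device entry using ordinary file I/O."""
    with open(path, "rb") as device:
        return device.read(count)


def write_device(path, data):
    """Write bytes to a device entry; returns the number of bytes accepted."""
    with open(path, "wb") as device:
        return device.write(data)


if __name__ == "__main__":
    # /dev/urandom hands out random bytes, so the output differs on every run.
    print(read_device("/dev/urandom", 8).hex())
    # /dev/null accepts whatever it receives and throws it away.
    print(write_device("/dev/null", b"gone"))
```

Nothing in the code knows that it is talking to a device rather than a file on disk; the driver behind each entry does the translation.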
/etc
The repository for most of the system configuration files. Some very large packages keep their configuration files in subdirectories of /etc; everything else sits directly at this level. Examples of configuration files stored in this directory are: bashrc, crontab, host.conf, inittab, netconfig, passwd, virc, yum.conf. Whenever a program is launched, it usually looks in this directory to see if there is a centralized configuration file it should use.
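Most files in /etc are plain text, so a program can read them with a few lines of code. As a small sketch, here is how the passwd file could be parsed in Python. Each entry has seven colon-separated fields (name, password placeholder, user id, group id, comment, home directory, shell). The function name parse_passwd is my own, and the sample values in the usage note are invented.

```python
def parse_passwd(text):
    """Map each user name to its ids, home directory, and login shell."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            continue  # not a standard seven-field entry
        name, _password, uid, gid, _comment, home, shell = fields
        if not (uid.isdigit() and gid.isdigit()):
            continue  # skip special entries without numeric ids
        users[name] = {"uid": int(uid), "gid": int(gid), "home": home, "shell": shell}
    return users


if __name__ == "__main__":
    with open("/etc/passwd", encoding="utf-8", errors="replace") as handle:
        for name, info in sorted(parse_passwd(handle.read()).items()):
            print(name, info["home"], info["shell"])
```

Given a line such as carlo:x:1000:1000:Carlo:/home/carlo:/bin/bash, the function records /home/carlo as the home directory for the user carlo.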
/etc/opt
This is a special subdirectory of /etc. While /etc stores configuration files for system applications, or applications that come with the operating system, /etc/opt stores configuration files for optional applications: those that can be installed from external repositories, or those that you create or build yourself. These are the applications that are normally stored in the /opt directory.
/home
The default home directory for all users. Each user usually has their own subdirectory right below /home. For example: /home/carlo
/root
The home directory of the root user. The root user or superuser is the main administrator of the system.
/lib
The libraries directory. This is the place where the shared libraries are stored. Shared libraries are binary routines that programs use but that are not part of the programs themselves. When a piece of functionality can be used by different programs, it is usually put in such a library. Then the programs don’t have to incorporate that code over and over again; they just link to the library that contains the code. As you can imagine, this architecture has the advantage that a bug fixed within a library is automatically and simultaneously fixed in all the programs that use that functionality through the library. It is very effective and widely used.
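Here is a small sketch of a program borrowing code from a shared library instead of carrying its own copy. It uses Python’s ctypes module to call strlen from the standard C library. I am assuming a Linux (or other POSIX) system here; the wrapper name c_strlen is mine.

```python
import ctypes
import ctypes.util

# Look up the C library by name. If the lookup returns None, CDLL(None)
# falls back to the symbols already loaded into the running process,
# which on POSIX systems include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the signature of strlen so ctypes converts values correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a byte string, computed by the C library."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```

The counting is done by code that lives in the library file, not in this script, and every other program that links to the same library shares that one copy.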
/media
Each storage device, before it can be used, needs to be mounted, otherwise the OS and the programs cannot access it. This directory is the place where removable media is mounted. CD-ROMs and USB disks, for example, are mounted in this directory and accessed from there.
/opt
As mentioned earlier, this is the directory where optional applications are installed. These are applications that are not specifically part of the distribution.
/proc
This is a directory containing files that describe the current status of the system and all its processes. These files are continuously updated by the OS, and for the most part we can only read them. Unfortunately, although the contents are written in text format, they are rather cryptic. However, several programs and commands exist just to read and decode the contents of the proc files into a more human-readable format. Commands like top and iostat, or applications like System Monitor, simply access the files in this directory and present their contents in a nice format.
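To give an idea of what such a decoding tool does, here is a minimal Python sketch that reads /proc/meminfo, a file made of lines like “MemTotal: 16316412 kB”. This is my own illustration, not the way top or System Monitor are actually written, and the function name parse_meminfo is invented.

```python
def parse_meminfo(text):
    """Turn the lines of /proc/meminfo into a {name: number} dictionary.

    Values are kept as printed by the kernel; most of them are in kB.
    """
    values = {}
    for line in text.splitlines():
        name, separator, rest = line.partition(":")
        if not separator:
            continue  # not a "name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


if __name__ == "__main__":
    with open("/proc/meminfo", encoding="ascii", errors="replace") as handle:
        info = parse_meminfo(handle.read())
    # MemAvailable is missing on old kernels, so use .get() for it.
    print("Total memory (kB):", info.get("MemTotal"))
    print("Available memory (kB):", info.get("MemAvailable"))
```

Running the script twice a few seconds apart usually shows different numbers, because the kernel regenerates the file every time it is read.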
/srv
This is a repository for certain data used by the system when it is used as a server. It usually comes empty with a new installation.
/tmp
A temporary container for everybody to use. Depending on the policy defined for the system, files in this directory can be deleted without notice, either because they are older than a certain age or at predetermined days and times. Certain programs use this directory to hold temporary data. Users can put files in it too, but they should be aware that those files are not going to stay there forever.
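An age-based policy like the one just described boils down to comparing each file’s modification time with a cutoff. The following Python sketch only lists the candidates and deletes nothing. It is my own illustration, not the cleanup tool that any particular distribution ships, and the function name and the 10-day limit in the usage note are arbitrary choices.

```python
import os
import time


def files_older_than(directory, max_age_days, now=None):
    """Return the regular files in `directory` last modified more than
    `max_age_days` days ago (subdirectories are not visited)."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    old_files = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                old_files.append(entry.path)
    return sorted(old_files)
```

Calling files_older_than("/tmp", 10) would return the files directly under /tmp that have not been modified in the last ten days, which are the ones such a policy would remove.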
/usr
This is the directory containing programs for regular users, their documentation, everything related to X, files used by compilers, libraries, and a lot of other interesting stuff. Talking about this directory alone would take a lot of space, so just go and browse it; you’ll be surprised by the amount of information in it. This directory was the original home directory for all users on UNIX systems, but its use has been restricted more and more over time, and today it is only used for files that support the users. The users’ home directories have moved to /home. However, some UNIX versions still use /usr the way it was originally intended.
/var
This directory is divided into several subdirectories. It is the repository for most of the log files generated in the system, along with some configuration files, caches, users’ mailboxes, spools for various programs, and a lot more.
Virtual Machines (1)
What is a Virtual Machine?
If you run a Google search for the definition of a Virtual Machine, you get something like this:
In computer science, a virtual machine (VM) is a software implementation of a machine (computer) that executes programs like a real machine.
A software emulation of a computer that runs in an isolated partition of a real computer; A computer system that is implemented in software …
A simulated computer in that it runs on a host computer but behaves as if it were a separate computer.
…. and several more definitions.
But what does that really mean for us? Simply put, think of the Virtual Machine as a program that runs on your computer and acts like a computer in itself, a computer where you can install an operating system, like MS-Windows or Linux, and where you can run programs for that operating system. And that operating system may even be different from the one installed on your actual computer.
OK, you might say, so what? I already have a computer and I already have my Windows 7 happily running on it. Why would I care to use a Virtual Machine to install another OS? Couldn’t I do that simply by dual booting my machine? I could install both Windows 7 and Ubuntu, for example, and when I turn on my computer I choose which one to use.
True, that’s a very good point. But think about this now: what if you want to go back and forth from one OS to the other? What if you are running an application on Windows, for example, and then you want to run another one from another OS? Ubuntu, for example, or Mac.
Do you start seeing the point? One of the great things about using a Virtual Machine is the possibility to run, side by side, programs that can only run on a specific platform. So, for example, you could run an application for a Mac computer and another one for a Windows computer at the same time, keep them side by side, and be more productive than if you had to reboot your computer every time you switch from one application to the other.
Or maybe you are one of those people who like testing every program that comes into their hands and, once you are done, you want to discard some of those programs from your computer, leaving no trace of them. Using a Virtual Machine to emulate a PC helps with exactly this. The Virtual Machine creates an isolated environment where you can do all the experiments you like. Then, once you’re done, you can remove the whole Virtual Machine from your computer and leave your PC exactly the way it was before you started experimenting.
And what about browsing the Internet with the constant fear of catching a virus that would infect your machine and damage it? Again, a Virtual Machine helps in this case, because the virus would be imprisoned in it, unable to spread into the host computer where the Virtual Machine runs. Stopping the Virtual Machine and simply deleting it would eliminate the virus from your computer with very little effort.
Finally, running a virtual machine on your computer would allow you to run that old program that you liked so much and that is no longer supported in your new version of Windows. How about that?
Am I intriguing you? I really hope so, because once you experience all the benefits that a VM can bring, you will start loving it.
So, follow me through the next posts, and I will show you how you can actually install a Virtual Machine Manager on your computer (any OS you are using will work), and how you can use it to safely browse the web, or run your old programs, or experiment with a new OS by installing it and all, without altering the setup of your real computer.
See you soon and … Happy browsing.
Online Safety for the Kids (2)
Here we are with the second part of this topic discussion.
This time I will talk about how I set up the proxy server on my Linux box to provide blacklist capability in my network. As explained in Part 1, the proxy server intercepts any web page request made from a browser and checks the requested URL against a blacklist. If the requested web site matches one on the blacklist, the proxy server responds to the request directly with an error page, preventing the actual web site from responding to the request itself. If there is no match, the request is forwarded to the actual web site, and the response goes back to the browser that made the request.
Seems complicated? Well, it really is more difficult to explain than to see it working. Fortunately, after a few steps to set it up, the program does everything by itself, without any further control on your part.
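For those who prefer code to words, the decision just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the way Squid or squidGuard are implemented; the function names and the domain names used in the usage note are made up.

```python
from urllib.parse import urlsplit


def is_blocked(url, blacklist):
    """Return True if the URL's host, or any parent domain of it,
    appears in the blacklist (a set of domain names)."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # For "www.bad.example" check "www.bad.example", "bad.example", "example".
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))


def handle_request(url, blacklist):
    """Describe what the proxy does with a request for `url`."""
    if is_blocked(url, blacklist):
        return "reply with the error page"
    return "forward the request to the web site"
```

With a blacklist containing only bad.example, a request for http://www.bad.example/page gets the error page, while a request for http://good.example/ is forwarded.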
Please be advised that the procedure I’m going to present reflects what I did on my Linux box, which is equipped with the Fedora 11 distro. If you have a different Linux distribution, the procedure may change a little bit, as different distributions use different ways to download packages and sometimes they store the configuration files in different places. So, please refer to the documentation of your distribution for further details. I also used Squid as my preferred proxy server. If you decide to use a different one, please use this discussion only as a high level reference and read the documentation of your proxy server for the details.
Along with Squid, I also installed squidGuard, which is the actual tool that handles the blacklists and runs under Squid.
Note also that there are two ways to make the browsers in your network use the proxy server for their web access. One way is to configure each browser to use the proxy. Another way is to configure the network so that all requests are automatically redirected to the proxy. Of the two solutions, I decided to go with the first one, which seemed the simplest to implement at the moment. However, keep in mind that this mechanism can be defeated if people change the browser configuration to bypass the proxy server. In that case, you may want to consult the proxy server documentation to implement the second solution. So far I haven’t had the need to do so.
And finally the installation and configuration procedure:
Download and install the Squid package on your Linux box. I easily accomplished that by using the Add/Remove Software tool available in my Fedora 11 Linux Distribution.
Create a Squid configuration file named squid.conf on the Linux box in the following directory: /etc/squid. You can download my copy of the configuration file here (right click and choose Save Link from the context menu). Note that my configuration file already contains the reference to squidGuard to redirect the browser request to the error message. If you don’t use my configuration file, please make sure you add the redirection instructions for squidGuard.
Download and install the squidGuard Package. Again, you can use the Add/Remove Software tool or the tool that comes with your Linux distribution.
If not already there, create the directory /var/squidGuard and copy the script shalla_update.sh there. My own copy of the script is available here for download.
Download the blacklists by running the script shalla_update.sh. Make sure you do so with root privileges. You’ll see that the new directory /var/squidGuard/blacklists is created. Note that, for squidGuard to work correctly, the mysql service must be running on your Linux box. I will assume here that you know how to do that but, in any case, post a comment about it and I will reply with the necessary information.
Create the configuration file for squidGuard. Believe it or not, it is named squidGuard.conf and needs to be located in the directory /etc/squid, along with squid.conf. A copy of my own version of this file can be downloaded here. You will have to edit this file to define the blacklists that you would like to use. Use those that I selected as an example of how to do it, and look under the directory /var/squidGuard/blacklists for the complete set of available blacklists.
At this point everything we need is installed and configured. We just need to learn how to actually start Squid. To do so, the easiest thing is to execute the following command as root:
service squid start
If everything was done correctly, Squid will start and keep running happily until you shut down the box.
And here comes a little problem: when you turn the box back on, Squid will not be running anymore! To avoid the inconvenience of manually starting Squid every time you reboot your machine, you’ll have to tell the computer to do so automatically. This is achieved by running the following command as root:
chkconfig --level 345 squid on
Once that is done, you don’t have to worry about starting Squid anymore. The computer will take care of that automatically every time you turn it on.
I’m sure you are now wondering how the blacklists are updated. After all, people keep adding new web sites and new pages all the time. How can we keep up with all the changes worldwide? Well, we don’t have to. The Shalla organization takes care of that for us. We only have to run the script shalla_update.sh again every now and then, so the blacklists on our computer get updated. I do so by running the script every night, to make sure I catch the most recent updates. You may choose to do the same, or do it once a week or once a month instead, depending on how long you feel comfortable waiting between updates. In any case, don’t waste your time updating more than once a day. The blacklists on the Shalla web site are updated only once a day, so there is no point in running the script more often than that.
That’s all, right? Hum… no, there is just one more thing: you have to instruct the web browsers on all your computers to point to the proxy server, so they forward the users’ requests to Squid rather than directly to the web sites. This procedure depends on the browser you are using. I will show you how to do it for Internet Explorer and for Firefox. Other browsers, like Opera or Chrome, use a similar procedure.
Setting the proxy server in Internet Explorer:
Open the Tools menu and select Internet Options. Click on the Connections tab. Now click on the LAN Settings button and, in the dialog that comes up, select Use a proxy server for your LAN. Then enter the IP address of your Linux box in the Address box and the number 8080 in the Port box (if you changed the port number in squid.conf, put your number here; otherwise 8080 will work just fine). Click OK to close all the dialogs and accept the configuration changes. You are ready to go.
Setting the proxy server in Mozilla Firefox:
Open the Tools menu and select Options. Click on the Network tab. Click on the Settings button. Click on the radio button for Manual proxy configuration. Now enter the IP address of the Linux box in the HTTP Proxy box and the number 8080 in the Port box (if you changed the port number in squid.conf, put your number here; otherwise 8080 will work just fine). Check the box Use this proxy server for all protocols and click OK to close all the dialogs.
OK, done. Now it is time to test the browser and make sure it works as expected. Try to request some web pages and make sure they are correctly retrieved. Then try to request a web page you know is in the blacklists and check that an error is reported and the page is not retrieved.
If everything works fine, you’re done. Otherwise, review all the previous steps and make sure you didn’t miss anything. If you still have problems, drop me a note and I’ll try to give you some extra advice.
Thank you all for following me through this long explanation. I hope it wasn’t too boring and that somebody may actually find it useful.
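If you prefer to run the same check from a script instead of a browser, the following Python sketch sends a request through the proxy using only the standard library. The address 192.168.1.10 in the usage note is a made-up example that you would replace with the IP address of your own Linux box, the helper names are mine, and 8080 is the port used in this article. What comes back for a blacklisted page depends on how your squidGuard error page is configured, so the script only reports what it receives.

```python
import urllib.error
import urllib.request


def make_proxy_opener(proxy_host, proxy_port=8080):
    """Build a URL opener that sends its requests through the given proxy."""
    proxy_url = f"http://{proxy_host}:{proxy_port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(opener, url, timeout=10):
    """Return (HTTP status, final URL) for a request made through the opener."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as error:
        # The proxy or the web site answered with an HTTP error status.
        return error.code, url
```

For example, after opener = make_proxy_opener("192.168.1.10"), calling fetch_status(opener, "http://example.com/") shows the status and final address for an ordinary page. Repeating the call with a page you know is blacklisted should show a different status or a final address pointing at the error page.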
Happy browsing and … see you next time.
Online Safety for the Kids (1)
Ever faced the problem of letting your kids browse the web without your supervision?
Here is the problem I had to solve recently: my daughter was about to turn 16 and she was eager to have her own computer on which to do her homework, handle her e-mail, socialize with her friends online, and do some research on the web. All of that without having to take turns with her brother on the family room computer.
Put like that, everything seems very innocent and safe. But we know very well what lurks on the Internet, ready to jump on its prey. Or maybe it is just some inappropriate site that you really don’t want your kids to see.
So, how to solve the problem of giving my daughter her own computer, installed in her own room, where neither my wife nor I can really supervise?
I’m sure many of you are already thinking of the many programs available on the market that deal with this kind of thing. Programs that you buy, and then you have to pay a subscription to keep the database of blacklisted web sites updated.
The point is, I don’t like those programs for several reasons:
I have little or no control over what is put on those blacklists.
I have to pay for the program.
I have to pay the annual subscription, or the database that comes with the program quickly becomes obsolete and basically useless.
What to do then?
Well, I happen to have an old computer that I use to experiment with Linux (you’ve heard about it, right?). This is the kind of operating system that many web site providers use to run their servers. It is a very powerful OS, it is very stable (you can keep the computer on for weeks without ever needing to reboot it), and it is free. Since I had that, I thought: why don’t I use this computer (the Linux box, as the Linux community calls it) to run a program capable of intercepting and filtering all the web traffic on the home network? Since it is Linux, I can look through the huge list of open source software available on this platform, and surely I will find something that can be used for my purpose.
So, I started studying the problem and came up with a very simple, efficient, and absolutely free solution: a program that acts as a proxy, like a middle man that sits between each computer in the network and the Internet itself, filtering everything that goes back and forth, and selecting what can be viewed and what cannot. Its name: SQUID. Yeah, like the name of that very tasty mollusk.
So I set it up on my Linux box, made a few adjustments to the network configuration, installed its companion squidGuard, configured the blacklists the way I like them, and let it run.
It has now been running on my home network for about a month, smoothly and efficiently, and I have to say it really does the job the way I wanted.
Want to know the details? Keep watching this blog. Next time, I will describe all the details on how to set it up. And don’t be scared. It is not a difficult thing to do. If you ever had the need to install a program on your computer, then you are expert enough to handle this one too.
Hope to see you soon here again…