Ubuntu/Fedora Linux Notes on System Administration

A few notes on the use of Linux and Linux system administration

To Install a Debian package manually

Download the .deb file from the web and save it in the local directory, then install it with dpkg. This is useful for packages that can't be automatically downloaded and installed with aptitude.

sudo dpkg -i ./tth_3.67-4_i386.deb
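
Debian package files are conventionally named name_version_architecture.deb, so it can be worth checking the architecture field before installing. The following is only a small sketch using shell parameter expansion; the variable names are my own and nothing here touches dpkg itself.

```shell
#!/bin/sh
# Split a .deb file name of the form name_version_arch.deb into its parts.
deb=tth_3.67-4_i386.deb

base=${deb%.deb}        # strip the extension: tth_3.67-4_i386
pkg=${base%%_*}         # everything before the first underscore
arch=${base##*_}        # everything after the last underscore
ver=${base#*_}          # drop the package name ...
ver=${ver%_*}           # ... and then the architecture

echo "package=$pkg version=$ver arch=$arch"
```

If the architecture printed here does not match the machine (for example i386 on a 64-bit-only system), dpkg will most likely refuse the package.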

Apply a Shell command to files through Find

$ find . -name '*.html' -exec grep -Hi 'mailto:foo@yahoo.com' {} \; 

Here we have asked find to start in the current directory, look for an html file, *.html, and execute -exec the grep command on the current file, {}. When using the -exec action, a semicolon, ;, is required, as it is for a few other actions when using find. The backslash, \, and quotes are needed to ensure that BASH passes these terms through so they are interpreted by the command rather than the shell.
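
To try the command without touching real files, here is a small sketch that builds a scratch directory first. The file names are made up for the test. It also uses the "{} +" form, which hands many files to a single grep process instead of starting one grep per file as "{} \;" does.

```shell
#!/bin/sh
# Build a small scratch tree so the command can be tried safely.
tmp=$(mktemp -d)
mkdir -p "$tmp/site/sub"
printf '<a href="mailto:foo@yahoo.com">mail</a>\n' > "$tmp/site/index.html"
printf '<p>no address here</p>\n'                  > "$tmp/site/sub/about.html"
printf 'mailto:foo@yahoo.com\n'                    > "$tmp/site/notes.txt"

# Same search as in the text, but with "{} +" instead of "{} \;".
# notes.txt is skipped because it does not match -name '*.html'.
matches=$(cd "$tmp/site" && find . -name '*.html' -exec grep -Hi 'mailto:foo@yahoo.com' {} +)
echo "$matches"

rm -rf "$tmp"
```

Only index.html should be reported, because about.html has no matching line.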

Add a new user with Linux command line

Add a new user named "newlogin" whose primary group is "webedit". A new group with the same name as the user's login is automatically created if -g is not specified. Then the user's temporary password can be set with "passwd". Type "sudo /usr/sbin/userdel -r newlogin" to remove the user and start over if there is a mistake.

$ sudo /usr/sbin/useradd newlogin -g webedit
$ sudo /usr/bin/passwd newlogin
Changing password for user newlogin
New UNIX password: ********
Retype new UNIX password: ********
passwd: all authentication tokens updated successfully.

Use the interactive script called "adduser" on Ubuntu Linux (version 6.10):

[U ~]% sudo /usr/sbin/adduser newlogin
Adding user `newlogin'...
Adding new group `newlogin' (1002).
Adding new user `newlogin' (1001) with group `newlogin'.
Creating home directory `/home/newlogin'.
Copying files from `/etc/skel'
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for newlogin
Enter the new value, or press ENTER for the default
        Full Name []: Firstname Lastname
        Room Number []:
        Work Phone []: 646.123.4567
        Home Phone []:
        Other []:
Is the information correct? [y/N] Y

Apt-get Key Problems

Once in a while "apt-get update" prints a warning that the Ubuntu key ID is invalid, like this:

W: GPG error: http://medibuntu.sos-sts.com edgy Release: The following 
signatures were invalid: BADSIG 2EBC26B60C5A2783 The Medibuntu Team 

W: You may want to run apt-get update to correct these problems

As far as I can tell, Ubuntu packages are signed by their distributors, and apt-get verifies the signatures against the public keys it knows (stored with the apt-key utility). This GPG error says that apt-get could not verify the signature on the distributor's Release file with the key it has. The following commands seem to clear the problem.

# Retrieve the ubuntu key with wwwkeys.eu.pgp.net, store it in my keyring
gpg --keyserver wwwkeys.eu.pgp.net --recv-keys 437D05B5
# Add the retrieved key along with other existing keys in the keyring
sudo apt-key add ~/.gnupg/pubring.gpg
# apt-get update without cache to force the use of the newly added key(s)
sudo apt-get update -o Acquire::http::No-Cache=True

The last step seems essential. I tried resetting back to the defaults with synaptic, but the warning did not clear until I bypassed the cache.

Apt-get Distribution Upgrade

To upgrade to a new ubuntu distribution, do the following:

sudo update-manager -c -d   # changes sources.list to edgy/feisty
sudo apt-get dist-upgrade   # upgrade to the next distribution

Without the first step, "sudo apt-get dist-upgrade" will only upgrade the packages within your current version of Ubuntu.

Sending email from Linux inside a firewall

I use mutt to manage my emails. In my setting, mutt works with other programs to compose (with emacs), send (with msmtp), and sync (with offlineimap) emails with a Microsoft Exchange mail server. Here I summarize how this is done.

These auxiliary software programs are defined in my ~/.mutt/muttrc file. In that file you define how mutt sends and receives emails. The `set editor' directive specifies which text editor to use to compose emails. Mine is `set editor ="/usr/local/bin/emacs -q -u yuelin -nw --load ~/.mutt/striphtml --load ~/.mutt/post --load ~/.mutt/mutt %s"'. It tells mutt to use /usr/local/bin/emacs as the default editor. The default editor can be changed to any text editor of your choice, like nano or vi. When I compose an email message, Mutt starts emacs and loads additional files. For example, the "--load ~/.mutt/striphtml" file contains EMACS-LISP commands that I wrote to strip HTML tags from emails. After the tags are removed, it is much easier to insert my responses in between the lines.

The "set sendmail=/usr/bin/msmtp" tells mutt to use the msmtp program to send emails to a SMTP server for delivery. The msmtp program reads from your ~/.msmtprc file to find information about the SMTP server. Mine is like this:

account default
host msxangeserver.myinstitution.org
from my_id@myinstitution.org
auth login
user my_id
password ************
dsn_notify failure,delay
# logfile ~/msmtp.log
# If your SMTP server supports TLS encryption, uncomment the next line
# tls

Here my actual login id and password are obscured. You have to write them down in the ~/.msmtprc file, so it is wise to change its file permission to 600 so that only you can view its contents. Each time you send an email, mutt calls the msmtp program, which contacts the server specified in the "host" entry and uses the login id and password. I don't think MS Exchange supports TLS encryption, so the password and login are transmitted in plain text.
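
The permission change can be sketched as below. To be safe, it works on a scratch file rather than the real ~/.msmtprc, and it assumes GNU stat (the "-c" option) as found on Linux.

```shell
#!/bin/sh
# Demonstrate the permission change on a scratch file, not the real ~/.msmtprc.
rcfile=$(mktemp)
printf 'account default\npassword not-a-real-password\n' > "$rcfile"

chmod 644 "$rcfile"                 # simulate a file created with a typical umask
chmod 600 "$rcfile"                 # owner may read and write; nobody else
mode=$(stat -c '%a' "$rcfile")      # GNU stat prints the octal mode
echo "mode of $rcfile is $mode"

rm -f "$rcfile"
```

For the real file the only command needed is "chmod 600 ~/.msmtprc".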

mutt mailcap

rtfreader (http://www.fiction.net/blong/programs/#rtf) converts Rich Text Format (RTF) files into plain text. I downloaded the source (written in C by Brandon Long, I think), unpacked it, used "make" to compile the program, and saved it in /usr/local/rtfreader. To get mutt to use it, add this line to the mailcap file.

application/ms-rtf;     rtfreader ; copiousoutput

Compile mutt from source

Mutt can be compiled from source. The latest version is 1.5.17 (as of November, 2007). The source code of mutt can be downloaded from http://www.mutt.org/download.html. I download the tar.gz file, untar it, and run the following command to configure mutt.

./configure --with-curses --with-regex --enable-locales-fix \
   --enable-pop --enable-imap --enable-smtp --with-sasl --enable-hcache \
   --with-ssl --with-charmaps --without-wc-funcs \
   --mandir=/usr/local/man | tee configure.log

The --with-curses option tells mutt to work with the libncurses libraries to handle terminal displays, such as the drawing of arrows in message threads (e.g., the 'set sort = reverse-threads' setting in muttrc). Both the libncurses and libncurses-dev packages are needed. On my ubuntu machine I have the libncurses5 packages installed (see below for installed and not installed ncurses packages):

v   libncurses-dev                  -                                           
i   libncurses5                     - Shared libraries for terminal handling    
p   libncurses5-dbg                 - Debugging/profiling libraries for ncurses 
i A libncurses5-dev                 - Developer's libraries and docs for ncurses
i   libncursesw5                    - Shared libraries for terminal handling (wi
p   libncursesw5-dbg                - Debugging/profiling libraries for ncurses 
i   libncursesw5-dev                - Developer's libraries for ncursesw   

The --enable-locales-fix and --without-wc-funcs options are needed to avoid international font problems. They are described briefly in the INSTALL file that comes with the source files. I still don't understand things involving international fonts, but by adding these options I can view international fonts with mutt.

Additional packages, such as sasl and openssl, are needed to compile the encryption options. I usually use configure to find out what packages are needed, in a loop of configure -> error -> add packages -> configure until there are no more errors. It is probably not an ideal way of compiling source code, but it works with minimal time spent figuring out exactly what packages are needed.

SSH Scriptkiddie Attacks

SSH gets many scriptkiddie attacks. Basically, someone attempts to gain entry to your system through port 22 by guessing passwords. You see many entries of failed log-in attempts in /var/log/secure. You can use iptables to drop all port-22 connection attempts unless they come from specific IP addresses. The first two lines tell iptables to accept all ssh connections from the 123.45.0.0 and 678.90.0.0 networks (these are placeholders; substitute your own networks). All other connections are dropped by the third line.

-A INPUT -i eth0 -s 123.45.0.0/255.255.0.0 -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth0 -s 678.90.0.0/255.255.0.0 -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 22 -j DROP
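
If the list of trusted networks changes often, the rules can be generated rather than typed. This is only a sketch: the function name ssh_rules is my own, the networks are documentation placeholders, and the output still has to be pasted into the iptables rules file by hand.

```shell
#!/bin/sh
# Print iptables rules that accept ssh from each listed network and drop the rest.
# The networks used below are documentation placeholders, not real hosts.
ssh_rules() {
    for net in "$@"; do
        printf '%s\n' "-A INPUT -i eth0 -s $net -p tcp -m tcp --dport 22 -j ACCEPT"
    done
    # The DROP rule must come last: rules are evaluated in order and the
    # first ACCEPT or DROP that matches decides the packet's fate.
    printf '%s\n' "-A INPUT -i eth0 -p tcp -m tcp --dport 22 -j DROP"
}

rules=$(ssh_rules 192.0.2.0/24 198.51.100.0/24)
echo "$rules"
```

The order matters. If the DROP line came first, the ACCEPT lines would never be reached.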

PostgreSQL

I install PostgreSQL from source code. It only involves a few simple steps.

tar zxvf postgresql-8.1.3.tar.gz
cd postgresql-8.1.3
./configure
make 
make install  
adduser postgres
mkdir /usr/local/pgsql/data
chown postgres /usr/local/pgsql/data
su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data >logfile 2>&1 &
/usr/local/pgsql/bin/createdb test
/usr/local/pgsql/bin/psql test

By default postgresql is installed in /usr/local/pgsql. The default configuration file, postgresql.conf, is in a protected directory /usr/local/pgsql/data. To boost efficiency, add these to postgresql.conf.

listen_addresses='localhost'   # '*' allows access from the network
max_connections=7   # 150 is reasonable, 600 consumes lots of resources
shared_buffers
# pgadmin3 recommends that the autovacuum feature be turned on (default to off)
autovacuum=on
stats_start_collector=on
stats_row_level=on

To start PostgreSQL, you can use the "runuser" command:

sudo /sbin/runuser -l postgres -c "/usr/local/pgsql/bin/postmaster \
        -D /usr/local/pgsql/data > /var/log/pgsql.log 2>&1 & "

Note that /var/log/pgsql.log should already exist and be writable by user postgres. Alternatively, you can use "sudo -u postgres" to run postmaster as user postgres.

sudo -u postgres /usr/bin/nohup /usr/local/pgsql/bin/postmaster \
        -D /usr/local/pgsql/data > /var/log/pgsql.log 2>&1 & 

To stop postgresql,

 sudo -u postgres /usr/local/pgsql/bin/pg_ctl \
                  -D /usr/local/pgsql/data stop

Postgresql Output Into a CSV File

The result of an SQL query can be exported into a comma-delimited CSV file with the COPY command in PostgreSQL, but the COPY command differs between versions. In PostgreSQL 8.1, the following syntax creates a temporary table, and then the COPY command exports the temporary table into a CSV file. In version 8.2 (the latest version as of June, 2007), the COPY command can dump a query directly into a file. The syntax below should work in both versions 8.1 and 8.2.

BEGIN;
CREATE TEMP TABLE tempcsv AS
       SELECT * FROM weather WHERE city LIKE 'San%';
COPY tempcsv TO '/tmp/tempcsv.sql' DELIMITERS ',' CSV HEADER FORCE QUOTE city QUOTE AS '"';
ROLLBACK; 

It searches the table 'weather' for city names that begin with 'San'. So cities like 'San Francisco', 'San Diego', and 'Santorini' are selected. City names often contain a space, so it is important to quote them with double quotes in the CSV file. The exported entries contain '"San Francisco",', but not 'San Francisco,'. Generally, columns of text strings should be quoted with a 'FORCE QUOTE colname' option. The HEADER option states that the column (variable) names should be included in the first row of the CSV file. Many statistical packages automatically use the first row of the CSV file to name the variables.
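
For version 8.2 and later, the temporary table can be skipped because COPY accepts a query in parentheses. The sketch below only writes the statement to a script file, so it can be inspected before it is run. The output path /tmp/weather_san.csv is just an example, and the file is written on the server side, so the database user needs superuser rights.

```shell
#!/bin/sh
# Write the 8.2-style export to a script file; run it later with psql.
# The output path /tmp/weather_san.csv is only an example.
sqlfile=$(mktemp)
cat > "$sqlfile" <<'EOF'
COPY (SELECT * FROM weather WHERE city LIKE 'San%')
  TO '/tmp/weather_san.csv'
  WITH CSV HEADER FORCE QUOTE city;
EOF

cat "$sqlfile"
# To execute against the "test" database (needs a running 8.2+ server):
#   /usr/local/pgsql/bin/psql test -f "$sqlfile"
```

In CSV mode the delimiter defaults to a comma and the quote character to a double quote, so neither has to be spelled out.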

Postgresql Administration

On ubuntu, this command creates a superuser (-s) who gets a prompt for a new password (-P) and whose password is encrypted (-E). The command to create this user is echoed to stdout (-e). The -W option makes createuser prompt for a password when it connects to the server (the "Password:" line below).

% sudo -u postgres /usr/bin/createuser -P -s -E -W -e yuelin
Enter password for new role:********
Enter it again:********
Password:********
CREATE ROLE yuelin ENCRYPTED PASSWORD '*********' SUPERUSER CREATEDB
CREATEROLE INHERIT LOGIN;
CREATE ROLE

A password is only significant if the client authentication method requires the user to supply a password when connecting to the database. The password, md5, and crypt authentication methods make use of passwords. Database passwords are separate from operating system passwords. Specify a password upon role creation with CREATE ROLE name PASSWORD 'string'.

phpSurveyor

phpSurveyor authentication

<Directory "/var/www/html/phpsurveyor/admin">
   Options FollowSymLinks
   AllowOverride All
   DirectoryIndex admin.php
#### YL ##################
#     This Directory can only be accessed by a valid-user (defined in the
# .htpasswd file below).  The following settings can appear in many places,
# including in the httpd.conf file for the Apache web server, in the
# /etc/httpd/conf/access.conf, or in the .htaccess file under each
# individual subdirectory that needs security. In phpsurveyor,
# the security settings are in the file
# /var/www/html/phpsurveyor/admin/.htaccess.
#
#     In order for this to work, I changed in httpd.conf the default
# setting of "AllowOverride None" to "AllowOverride All"
# because a "None" setting tells Apache to ignore all .htaccess files.
# "AllowOverride All" tells Apache to use the security settings
# in .htaccess if it exists. Therefore, security is disabled if
# .htaccess is removed.
#
# The .htaccess file contains the following settings:
   # AuthType Basic
   # AuthName "Survey Access Requires Password Login"
   # AuthUserFile /var/www/html/phpsurveyor/admin/.htpasswd
   # AuthGroupFile /dev/null
   # Require valid-user
# For AuthType, only Basic and Digest are currently
# implemented. It must be accompanied by AuthName and Require
# directives, and directives such as AuthUserFile and AuthGroupFile
# to work.
</Directory>

phpESP

How to install/remove phpESP

1. Install these packages for mysql and php support

    mysql-common [mysql-client|server] mysql-query-browser-common
         php5-mysql php5-mysqli mysql-admin  

The mysql package comes with no password for the root user. Add a new password for user root using mysqladmin.

    sudo mysqladmin -u root -p password 

2. Download and untar phpESP-1.8.2.tar.gz, which expands to /var/www/phpESP. This should be done as root. If not, you will run into problems later when you change the file permission for phpESP.ini.php. For other details about the installation, see phpESP/docs/INSTALL.

3. Create the phpesp database and populate the tables

PhpESP uses MySQL to manage the surveys, items, users and survey data. The MySQL database is called "phpesp". The owner of this database is called "phpesp" by default. The default password for this owner is also "phpesp". These defaults are specified in an installation script called scripts/db/mysql_create.sql. As a security precaution, the default password of the database owner should be changed. Otherwise someone can connect to your MySQL server and take over your phpesp database manually. You can edit the file to change the default password to something more secure (like "SurvDBOwn3r"), as shown below:

-- ###### in mysql_create.sql ########
-- # Create a user called 'phpesp', with a password 'phpesp'. The
-- # 'localhost' limits this user to connecting to the database only
-- # via localhost. You may change the password by changing the
-- # PASSWORD('...') clause.
INSERT INTO user (host, user, password) VALUES ( 'localhost',
  'phpesp', PASSWORD('SurvDBOwn3r'));

The owner of the "phpesp" database is not to be confused with the default administrator of the phpESP login interface. The default administrator is "root" with a default password of "esp". When you use phpESP the first time, you should log in as "root" with the password "esp", not "phpesp" and "SurvDBOwn3r".

Run these scripts to create the "phpesp" database and to create the new database tables.

$ mysql -u root -p < scripts/db/mysql_create.sql
$ mysql -u root -p phpesp < scripts/db/mysql_populate.sql

4. Log in first time as root

Go to http://localhost/phpESP, log in as "root" with the default password of "esp" (see phpESP/docs/INSTALL). Change the default password to something else. After you have logged in, you will see a list of options for adding new users and creating new surveys.

5. Tighten up the security

Change the directory to admin/, and change the file ownership to "root:www-data". Apache2 runs as root and www-data, so a file permission of 0440 lets root and members of the www-data group read the phpESP.ini.php script, while all other users are denied access.

$ sudo chown root:www-data phpESP.ini.php
$ sudo chmod 0440 phpESP.ini.php

That concludes the installation of phpESP.

Sometimes you make a mistake during installation and you want to remove phpESP manually and rerun the steps above. The steps below will wipe out the phpesp database from MySQL.

$ sudo mysqladmin -u root -p drop phpesp 
$ mysql -u yuelin -p
mysql> use mysql;
mysql> DELETE from user WHERE User = "phpesp";
mysql> DELETE from db WHERE Db = "phpesp";

R and BLAS

You can install on your Linux system the enhanced BLAS (Basic Linear Algebra Subprograms) programs. You can ask R to use these instead of the ones that come with R. The "enhanced" BLAS may boost performance, although this is not explicitly stated in the R Installation and Administration Guide.

I compile R from source. I decided to give it a try with R-2.4.1. I have both the "refblas3" and "atlas" packages installed. I think you can skip refblas3 if you already have atlas. Using the "refblas3" packages alone seems to break the compilation, but having both seems fine.

sudo apt-get install refblas3 refblas3-dev refblas3-doc refblas3-test

sudo aptitude install atlas3-base atlas3-base-dev atlas3-doc \
    atlas3-headers atlas3-sse atlas3-sse-dev atlas3-sse2 \
    atlas3-sse2-dev atlas3-test

According to the "R Installation and Administration Guide", you need to tell R where to look for these external libraries. I manage two machines running Ubuntu linux, one with an Intel CPU (32-bit) and the other with an AMD Athlon CPU (64-bit). On the Intel machine, the atlas libraries are installed under /usr/lib/atlas. So configure should go with:

./configure --with-blas="-L/usr/lib/atlas -lf77blas -latlas"

On the AMD machine, I download the AMD Core Math Library from http://developer.amd.com/acml3.jsp, untar the file, and run the install*sh script to install the libraries in /usr/local/share/acml4.0.1. As shown in the R Installation and Administration Guide, configure should go with:

./configure --with-blas="-L/usr/local/share/acml4.0.1 -lacml_mp"

As far as I can see, on an Ubuntu system with atlas installed, the only customization you need is to edit the config.site file to change PAPERSIZE from a4 to letter, then run configure with --with-blas="-lf77blas -latlas". Then run these commands and you are done compiling R from source: make; make check (if you want to check the build); make dvi; make pdf; make info; sudo make install; sudo make install-dvi; sudo make install-info; and sudo make install-pdf.

R output from UTF-8 to ISO8859-1

R supports multiple languages in its output. On a Linux system, R consults the $LANG system variable (and others if $LANG is not set) during startup. For example, on my Ubuntu Gutsy machine the default $LANG is en_US.UTF-8. R therefore uses UTF-8 as the default language encoding in its output. This produces single quotes in UTF-8 instead of the latin1 encoding. These single quotes often appear as part of the printout for p-values. The problem is that they produce garbled output when they are formatted for printing (e.g., by a2ps, enscript, or mpage).

I use

recode UTF-8..ISO8859-1 ./output.Rout
a2ps -o output.Rout.ps ./output.Rout

to first recode the output.Rout file from UTF-8 to ISO8859-1 (the same as "latin1", see 'recode -l' for a list of formats). The recoded output is saved back into output.Rout. Then I use a2ps or enscript as usual (e.g., a2ps -o output.Rout.ps ./output.Rout).
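
Before recoding, it can help to check whether an output file contains any non-ASCII bytes at all. The following is a rough sketch; the function name count_non_ascii is my own, and the sample file merely stands in for output.Rout.

```shell
#!/bin/sh
# Count bytes outside the 7-bit ASCII range in a file.
# A count of 0 means the file is plain ASCII and needs no recoding.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Scratch file standing in for output.Rout: two UTF-8 curly quotes
# (U+2018 and U+2019, three bytes each) around the letter x.
sample=$(mktemp)
printf '\342\200\230x\342\200\231\n' > "$sample"

n=$(count_non_ascii "$sample")
echo "non-ASCII bytes: $n"
rm -f "$sample"
```

A non-zero count on a real .Rout file suggests that the UTF-8 quotes are present and that the recode step is worth running.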

Martin Maechler suggested a couple of alternative solutions (http://tolstoy.newcastle.edu.au/R/help/05/06/6200.html, accessed Dec 2007).

The UTF-8 encoding is part of the Unicode standard to implement a unified encoding system for multiple languages (http://unicode.org/).

RODBC and PostgreSQL

The RODBC package allows R to connect to a database server program that can communicate in ODBC (Open Data Base Connectivity). ODBC in essence provides a common language between different database computer programs. The ODBC specifications are used by Microsoft in its SQL Server (including ACCESS). If a database program knows how to communicate in the ODBC language, then it can easily work with Microsoft SQL server and related software programs. The RODBC package in R does not directly talk to an SQL database. Instead, it talks to unixodbc (www.unixodbc.org), which serves as a translator between RODBC and a database program that understands ODBC. The base PostgreSQL does not understand ODBC, so you need another program called psqlodbc to communicate between unixodbc and PostgreSQL. To get RODBC to work with PostgreSQL, the unixodbc program calls a driver from psqlodbc. Psqlodbc then translates the ODBC requests from unixodbc into something that PostgreSQL understands. These two layers may seem unnecessarily complicated, but they work seamlessly. Here is how to get them to work.

You need to install unixodbc and psqlodbc. An alternative to unixodbc is iodbc (www.iodbc.org), but I have not been able to get iodbc to work. You can either download the unixodbc and psqlodbc source code from their respective websites and compile them, or you can use the package management system of your Linux distribution to install them. I use Ubuntu linux, so I use "sudo aptitude install unixodbc psqlodbc" to install both. The pre-compiled packages on Ubuntu are usually up to date. But if you really want the latest version, you can always compile them from the source code. Two other optional packages can also be installed with unixodbc: unixodbc-bin (a GUI) and unixodbc-dev (if you want to write programs that directly talk to unixodbc).

The final step is to write the ~/.odbc.ini file to include the following directives:

[ODBC Data Sources]
psqlgaat = PostgreSQL driver

[psqlgaat]
Description = Postgresql database for the GAAT survey
Driver = /usr/lib/odbc/psqlodbcw.so
# Driver = /usr/local/lib/psqlodbcw.so
Host        = localhost
Port        = 5432
Server      = localhost
ServerName  = localhost
UserName    = Yuelin Li
Database    = gaat
User        = yuelin

Inside ~/.odbc.ini, the [ODBC Data Sources] is a tag to list the names of all ODBC-enabled databases. Here we have only one, a data source called "psqlgaat". The [psqlgaat] tag defines the details. Note that the 'Driver=/usr/lib/odbc/psqlodbcw.so' entry specifies the location of a driver. This psqlodbcw.so binary file is the one that comes with psqlodbc. If you compile psqlodbc from source code, then the driver is probably in /usr/local/lib/psqlodbcw.so.

The unixodbc program then uses the psqlodbcw.so driver to communicate with the PostgreSQL server program. On my Ubuntu machine, the PostgreSQL server program listens on port 5432. Because I am running both R and PostgreSQL on the same machine, I can say 'Host = localhost'. If I am retrieving data from a remote machine with an IP address of 10.2.23.11, then I can specify it with 'Host = 10.2.23.11'.

In R, when a call to the odbcConnect() function is made (see below), it looks up the 'psqlgaat' data source in ~/.odbc.ini, and makes a connection to port 5432 as defined in that file.

> library(RODBC)
> ch <- odbcConnect('psqlgaat', uid='yuelin', pwd='********', case='tolower')
> sqlQuery(ch, "select * from att ")

If the connection is successful, then a "channel" is open between R and postgresql through unixodbc and psqlodbc. This channel can be used to use R to directly retrieve, store, and process data.

This is all you need to do to get RODBC to retrieve data from your PostgreSQL server program. If you want to compile the unixodbc and psqlodbc programs from source, read on.

The psqlodbc program is found at http://pgfoundry.org/projects/psqlodbc/, but the download site is elsewhere, at http://www.postgresql.org/ftp/odbc/versions/. As of October, 2007, the stable release is in the file psqlodbc-08.02.0500.tar.gz. The pg_config command, as part of the stock distribution of the postgresql-devel package, is needed to compile psqlodbc. It can be installed by "sudo aptitude install postgresql-devel" (in Ubuntu) or "yum install postgresql-devel" (in Fedora Core).

Untar the downloaded source code (tar xzvf psqlodbc-08.02.0500.tar.gz), run 'configure --with-unixodbc' (or --with-iodbc), 'make', and 'make install' to install the program. This should take less than a couple of minutes. The final step is to write the ~/.odbc.ini file as described above.

On Fedora Core 5

My Fedora Core 5 system is installed on an AMD64 CPU. UnixODBC-2.2.11 has to be compiled from source if you are running AMD64 (requires Qt-2.2 or higher). Also, unless you are running kde (I run gnome), the X11-based graphical user interface does not work. So you should run "configure --enable-gui=no" to disable GUI support. Without kde, the configure script can't seem to find the regular X11 includes and libraries, even after I have manually set X_INCLUDES=/usr/include/X11 and LIB_X11=/usr/lib64 and verified with the log entries in config.log.

After unixODBC has been compiled (by running 'configure --enable-gui=no', 'make', and 'make install'), you can download and compile psqlodbc as shown above. Edit the ~/.odbc.ini file and you will be able to use RODBC to connect to PostgreSQL.

RODBC and ACCESS Databases on Windows

R and GGobi

GGobi is an open source program for data visualization. Like XGobi, GGobi can be used to view high-dimensional data in 3-d, rotating the data points to generate a vivid 3-d perception. I think the best use of packages like XGobi and GGobi is the "tour" function, which rotates the data in 3-d view to better see the clusters of data points.

In R, there is a library(rggobi) that bridges between R and the GGobi program. First, install.packages("RGtk2") must be run in R because GGobi depends on it. To install RGtk2, you need the libxml2-dev package in ubuntu (sudo aptitude install libxml2-dev).

GGobi has to be installed on the Linux system (compile from source, see instructions on www.ggobi.org):

$ bunzip2 ./ggobi-2.1.4.tar.bz2
$ tar xvf ggobi-2.1.4.tar
$ cd ggobi-2.1.4
$ ./configure --with-all-plugins
$ make
$ sudo make install
$ make ggobirc
$ sudo mkdir -p /etc/xdg/ggobi
$ sudo cp ggobirc /etc/xdg/ggobi/ggobirc

Then you run install.packages("rggobi") in R with root permissions to install the rggobi library. I am compiling GGobi from source because I compiled R from source. You can install the ubuntu ggobi package with 'apt-get install ggobi', but that package seems to be broken: it can't seem to find the plugins it needs. Also, it does not seem to include the GGobiAPI.h header file, which is needed during install.packages("rggobi"). R expects to find the GGobi program in /usr/local, while the ubuntu package is installed in /usr/.

Yum updates

Yum requires a GPG key from Red Hat to verify RPM packages built and signed by Red Hat. The following command should be run as root when Fedora Core is first installed.

sudo rpm --import /usr/share/doc/fedora-release-*/RPM-GPG-KEY*

xpdf paper size

When I install a fresh copy of ubuntu from CD, I often find a puzzling thing. I add the printers and set the paper size to letter, but xpdf, gv, and other tex/latex programs still print pages in the a4 format. This often causes the printer (e.g., HP laserjet 4200) to pause and ask me to insert a4 paper in Tray 1.

The following steps pretty much take care of the problem.

I run "sudo texconfig" to change the paper size in tex/latex. It seems to fix gv/gs as well. xpdf first looks at the paper size definition in /etc/papersize, which defaults to a4. You can override it with the psPaperSize definition in /etc/xpdf/xpdfrc. I have changed the only line in /etc/papersize from "a4" to "letter".

teTeX and TeXLive

On ubuntu, there are two flavors of TeX/LaTeX packages to choose from. The tetex package is the LaTeX distribution from CTAN. TeXLive is based on tetex, but contains more support for languages and BibTeX styles (e.g., apacite.sty). My choice is to stay with tetex because it is the "definitive distribution" on CTAN and also because other packages I often use (like prosper for making slides) still depend on tetex. I think the latest prosper works with TeXLive, but not yet seamlessly. I installed the prosper source to /usr/share/texmf-TeXLive/tex/latex and got it to work, but I got warnings. Perhaps later I will switch to TeXLive, when it works better with these packages.

Since apacite.sty does not come with tetex, it has to be installed manually. There are no automatic installation scripts, so you must be careful where you install the source files. First download the apacite.zip file, unzip it, and put the .bst and .bib files under /usr/share/texmf-tetex/bibtex/. The .bst file goes into /usr/share/texmf-tetex/bibtex/bst and the .bib file goes into ./bib. BibTeX will not find the .bst file under the bib subdirectory, even after texhash. The apacite.sty file goes under /usr/share/texmf-tetex/tex/latex.
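
The file placement described above can be sketched as a small script. Everything here runs in scratch directories, and the three empty files are placeholders standing in for the contents of apacite.zip. For a real install the target root would be /usr/share/texmf-tetex and the script would need root permissions.

```shell
#!/bin/sh
# Sort the unpacked apacite files into a texmf-style tree.
# texmf_root is a scratch directory here, not the real texmf tree.
texmf_root=$(mktemp -d)
src=$(mktemp -d)
# Empty placeholder files standing in for the contents of apacite.zip.
touch "$src/apacite.bst" "$src/apacite.bib" "$src/apacite.sty"

mkdir -p "$texmf_root/bibtex/bst" "$texmf_root/bibtex/bib" "$texmf_root/tex/latex"
for f in "$src"/*; do
    case $f in
        *.bst) cp "$f" "$texmf_root/bibtex/bst/" ;;   # BibTeX style files
        *.bib) cp "$f" "$texmf_root/bibtex/bib/" ;;   # BibTeX databases
        *.sty) cp "$f" "$texmf_root/tex/latex/"  ;;   # LaTeX packages
    esac
done

installed=$(cd "$texmf_root" && find . -type f | sort)
echo "$installed"
rm -rf "$src" "$texmf_root"
```

After copying into the real tree, texhash still has to be run so that TeX and BibTeX can find the new files.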

Run the /usr/bin/X11/texconfig command (why it goes under /usr/bin/X11 I have no idea) as root to set the default paper size and /usr/bin/X11/texhash to remap the installed files.

To change the default paper size, use these commands

sudo texconfig-sys paper letter
sudo texconfig-sys dvipdfm paper letter
sudo texconfig-sys xdvi paper us

Type "texconfig dvipdfm paper" with the last piece missing and you get a hint on how to complete the last piece. For example,

$ texconfig dvipdfm paper
Usage: texconfig dvipdfm paper PAPER

Valid PAPER settings:
  letter legal ledger tabloid a4 a3

Network

nmap can be used to scan the open ports of a host:

    nmap -P0 -sT -v idecide.mskcc.org

The options are: "-P0", do not ping the host before scanning it; "-sT", TCP connect() scan (the most basic form of TCP scanning); and "-v", verbose output.

Backup and Restoring

Remotely syncing a web development machine and a live server:

    rsync -auvze ssh /var/www/ yuelin@idecide.mskcc.org:/var/www/html/

The switches mean that it will be done in archive mode (-a), skip files that are newer on the receiver (-u), be verbose (-v), compress file data during the transfer (-z), and use ssh as the remote shell (-e ssh).

Using Patches

(From http://www.kegel.com/academy/opensource.html#patches.using) To use a patch -- that is, to automatically carry out the changes described in a patch file -- you run a program called patch. For instance, if you're trying to apply the patch 'blarg.patch' to a package called foobar-0.17, you might say

cd foobar-0.17; patch -p1 < ../blarg.patch

That would merge the changes from blarg.patch into your source tree. (The -p1 tells patch to ignore the first directory in filenames in the patch; that way a patch generated against the directory foobar-0.11 will still apply properly.)
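
What -p1 does to a file name can be mimicked with plain shell string handling. This is only an illustration of the path stripping, not a replacement for patch; the function name strip_components and the path src/main.c are made up.

```shell
#!/bin/sh
# Mimic what patch -pN does to a file name found in a patch file:
# remove the first N leading directory components.
strip_components() {
    n=$1
    path=$2
    while [ "$n" -gt 0 ]; do
        path=${path#*/}     # drop everything up to and including the first slash
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

p0=$(strip_components 0 foobar-0.11/src/main.c)
p1=$(strip_components 1 foobar-0.11/src/main.c)
echo "-p0: $p0"
echo "-p1: $p1"
```

With -p1 the name becomes src/main.c, which exists relative to the foobar-0.17 directory even though the patch was made against foobar-0.11.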

patch is part of the GNU Project; to learn more, read "Merging with Patch" in the GNU project's manual for diff.

PostgreSQL and MySQL

As of MySQL v.4.1, the mysqld server program no longer accepts outside requests to connect to port 3306. Only connections coming from the local machine (represented as 127.0.0.1) are allowed to connect to port 3306. This default is set in the /etc/mysql/my.cnf configuration file.

bind-address          = 127.0.0.1

This says that port 3306 is bound to localhost (127.0.0.1). Commenting out this entry makes mysqld listen on port 3306 on all network interfaces, so outside computers can use this port to communicate with the mysqld server program. This can mean compromised security, so iptables rules can be added to provide firewall protection. A different approach is needed when you are not allowed to change the MySQL server's settings. Since most Linux servers allow ssh connections, you can connect to mysqld on the remote server through an ssh tunnel.

This ssh-tunnel method applies to both MySQL and PostgreSQL. First, you run ssh to map local port 3307 (-L 3307:) to port 3306 on the destination host (remotehost:3306). When you then issue a connect request on the local computer to 127.0.0.1:3307, your connection goes from 127.0.0.1:3307, through the ssh tunnel, to the remote machine. Because the connection appears to come from the remote host itself, you are allowed to connect to port 3306 on the remote host.

ssh -N -f -L 3307:remotehost:3306 user@remotehost
mysql --user=user --password --host 127.0.0.1 -P 3307

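Note that the option is --host 127.0.0.1, not --host localhost. The mysql client treats the name "localhost" specially: it connects through the local Unix socket file and ignores the port number, so the forwarded port 3307 is only reached when the numeric address 127.0.0.1 is given.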
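The two commands can be generated from their parts, which makes it easier to see which port goes where. This is a dry-run sketch only: tunnel_commands is a hypothetical helper that prints the commands rather than running them, and the host, user, and port values are placeholders.

```shell
#!/bin/sh
# tunnel_commands LOCAL_PORT REMOTE_HOST REMOTE_PORT USER
# Print (do not run) the ssh and mysql commands for a forwarded port.
# Hypothetical helper; every value passed below is a placeholder.
tunnel_commands() (
    local_port=$1 remote_host=$2 remote_port=$3 user=$4
    # -N: no remote command, -f: go to background, -L: forward a local port
    printf 'ssh -N -f -L %s:%s:%s %s@%s\n' \
        "$local_port" "$remote_host" "$remote_port" "$user" "$remote_host"
    # 127.0.0.1 makes the client use TCP, so the forwarded port is reached
    printf 'mysql --user=%s --password --host 127.0.0.1 -P %s\n' \
        "$user" "$local_port"
)

tunnel_commands 3307 remotehost 3306 user
```

For PostgreSQL the same forwarding would apply with remote port 5432; the client command would differ.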

PostgreSQL open port 5432

In /usr/local/pgsql/data/postgresql.conf

listen_addresses = '*'    # means all ip addresses

In /usr/local/pgsql/data/pg_hba.conf, add this line to allow user "yuelin" to access database "gaat" from 192.168.2.* (the /24 in 192.168.2.1/24 means that only the first 24 bits of the address, 192.168.2, are compared).

# TYPE    DATABASE    USER        CIDR_ADDRESS          METHOD
host      gaat        yuelin      192.168.2.1/24        trust
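To check which client addresses a CIDR entry like this one covers, the comparison of the first 24 bits can be worked out with shell arithmetic. This is a minimal sketch; ip_to_int and in_cidr are made-up helper names, and it handles IPv4 only.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_cidr IP NETWORK/BITS: succeed if IP falls inside the CIDR range.
in_cidr() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Mask with the first BITS bits set (for /24 this is 255.255.255.0)
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
)

in_cidr 192.168.2.77 192.168.2.1/24 && echo "192.168.2.77 matches"
in_cidr 192.168.3.77 192.168.2.1/24 || echo "192.168.3.77 does not match"
```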

Stop and restart postmaster with the "-i" option to force postmaster to listen to tcp/ip connections.

/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data stop

/usr/local/pgsql/bin/postmaster -i -D /usr/local/pgsql/data > /var/log/pgsql.log 2>&1 &

Aqua Data Studio 4.7

Aqua Data Studio is a database tool that runs on different computer hardware platforms. It can be downloaded for free from www.aquafold.com. On Linux x86_64, you want the "java enabled os" package, not the x86 version, which seems to run only on an i386 platform.

To install, the downloaded package is saved by root under /usr/local/, then unpacked using either tar or unzip to /usr/local/datastudio. Then you will need to edit the datastudio.sh shell script so that it uses the desired java version on the system (on mine it is /usr/local/java/bin/java).

netstat -ntl

The "netstat -ntl" command shows the TCP ports that your computer is listening on. In the listing below, the third entry shows that the computer accepts port 3306 requests on all interfaces (0.0.0.0), that is, from all computers. Port 5432 (PostgreSQL), however, is only listening on the loopback address (127.0.0.1:5432).

[U yl_home]% netstat -ntl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.1:32769         0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:32770         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:139             0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8081            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:445             0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:7741            0.0.0.0:*               LISTEN
tcp6       0      0 :::80                   :::*                    LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
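A listing like this can be summarized with a short awk filter that reports each listening port and whether it is open to other machines. This is a sketch that assumes the column layout shown above; classify_listeners is a made-up name.

```shell
#!/bin/sh
# classify_listeners: read "netstat -ntl" output on stdin and print, for each
# listening TCP socket, the port and how widely it is bound.
classify_listeners() {
    awk '$1 ~ /^tcp/ && $NF == "LISTEN" {
        addr = $4
        port = addr
        sub(/.*:/, "", port)                 # keep the text after the last colon
        host = substr(addr, 1, length(addr) - length(port) - 1)
        if (host == "127.0.0.1" || host == "::1")    scope = "local-only"
        else if (host == "0.0.0.0" || host == "::")  scope = "all-interfaces"
        else                                         scope = "one-address"
        print port, scope
    }'
}

# Sample lines copied from the listing above
classify_listeners <<'EOF'
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
EOF
```

On a live system the same function would be fed with "netstat -ntl | classify_listeners".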

Use crontab for backup

Use this to edit user root's crontab. A new one is created if no existing one is found.

sudo crontab -u root -e

Every day at 1:53 a.m., back up files to the tape.

#ident  "@(#)root       1.19    98/07/06 SMI"   /* SVr4.0 1.1.3.1       */
# /usr/local/bin not in PATH, override or isync won't work, see man cron
PATH=/usr/local/bin:/bin:/usr/bin:/usr/sbin:/usr/local/sbin
# run "crontab ./yuelin" as user yuelin to put this to /var/spool/cron/yuelin
# crontab does not like blank lines and it complains about "unexpected
# end of line" in an e-mail sent to the user
# minutes  hours  day-of-month   month   weekday
#  0-59     0-23   1-31           1-12    0-6 (0 = Sunday)
# */3 means every 3 minutes
53 1 * * * /bin/tar cvzf /dev/st0 -C / etc var/www home/yuelin var/lib/mysql \
usr/local/pgsql/data 2>&1 > /dev/null

In the cron table the command is one long line stretching over 80 columns; it is split here with a backslash only for better appearance in a web browser. The "-C /" part tells tar to first cd to /, then archive the directories without absolute path names. Without the "-C /" option preceding the directories, they would have to be given as absolute paths, and tar would warn that it is "Removing leading '/' from member names".
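The effect of -C can be tried safely in a scratch directory rather than on the real root and tape drive. This is a small sketch; the scratch layout and file names are invented for the demonstration.

```shell
#!/bin/sh
# Demonstrate "tar -C" in a scratch directory instead of the real root.
# The directory and file names here are made up for the demonstration.
scratch=$(mktemp -d)
mkdir -p "$scratch/root/etc" "$scratch/root/home/yuelin"
echo "hello" > "$scratch/root/etc/motd"
echo "notes" > "$scratch/root/home/yuelin/notes.txt"

# -C changes into the directory first, so the members are stored with
# relative names (etc/..., home/...) and tar has no leading "/" to strip.
tar czf "$scratch/backup.tar.gz" -C "$scratch/root" etc home/yuelin

# List the archive to confirm that the stored names are relative
tar tzf "$scratch/backup.tar.gz"

# Remove the scratch directory with: rm -rf "$scratch"
```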

Building EMACS from source

The pre-packaged emacs you get from Ubuntu or Fedora Core uses standard X11 fonts, which look ugly compared to the nice anti-aliased fonts available in these Linux distributions. If compiled from the source code, emacs can use these anti-aliased fonts. This section describes how to compile emacs from source. Also see http://www.emacswiki.org/cgi-bin/wiki/XftGnuEmacs.

The latest source files (version 23.0.60.2, last tried: April, 2008) should be downloaded first:

  $ cvs -z3 -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/emacs co emacs
 

No other patches are needed to get it to work. Next, run

   $ ./configure --prefix=/usr/local --enable-font-backend --with-xft --with-freetype --with-x-toolkit=gtk --with-xaw3d
   $ make bootstrap 
   $ make 
   $ sudo make install

The "make bootstrap" step takes a while; the rest of the installation is quick. This installs emacs to /usr/local/emacs. The emacs binary command is installed in /usr/local/bin/emacs.bin. Type this to run emacs:

  $ emacs --enable-font-backend --font "Bitstream Vera Sans Mono-14"

Or by setting X resources.

  Emacs.font: Bitstream Vera Sans Mono-14

But the --enable-font-backend option cannot be set this way, so it has to be added through a shell alias such as "alias emacs='/usr/bin/emacs --enable-font-backend'" (the shell does not allow spaces around the "="), in my case in my ~/.zshrc file because I am running zsh. Then emacs knows to enable the font backend and looks up the Bitstream font defined in ~/.Xresources.

Emacs uses the xaw3dg widget set to draw the icons on the menu bar, as well as libpng and libjpeg. They should also be installed before you compile the source.

sudo aptitude install xaw3dg xaw3dg-dev

Installing ESS and Auctex after building EMACS

ESS (Emacs Speaks Statistics) is installed manually, by downloading the ess-.tgz file and untarring it to /usr/local/share/emacs/site-lisp/. In the $HOME/.emacs file, make sure you have an entry (require 'ess-site) or (load "ess-site"); note that load takes the file name as a string.

Auctex is installed with the usual ./configure, make, and sudo make install steps. Make sure beforehand that /usr/local/bin/emacs is the original emacs binary file. The steps install an auctex subdirectory under /usr/local/emacs/site-lisp.

Emacs byte-code

To compile byte-code for an emacs lisp file, use

/usr/local/bin/emacs -batch -q -no-site-file -f batch-byte-compile ./ess-site.el

This compiles the .el file into the .elc byte-code in emacs. Byte-code files load faster.
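When a directory holds many .el files, it helps to recompile only those whose byte-code is missing or out of date. The sketch below prints the compile command for each such file as a dry run; needs_compile and list_stale are hypothetical helper names, and the demo directory is a throwaway with empty files.

```shell
#!/bin/sh
# needs_compile FILE.el: succeed if FILE.elc is missing or older than FILE.el
needs_compile() {
    [ ! -e "${1}c" ] || [ -n "$(find "$1" -newer "${1}c")" ]
}

# list_stale DIR: print the batch-compile command for each stale .el file.
# (Dry run: remove "echo" to actually invoke emacs.)
list_stale() {
    for el in "$1"/*.el; do
        [ -e "$el" ] || continue
        if needs_compile "$el"; then
            echo /usr/local/bin/emacs -batch -q -no-site-file \
                -f batch-byte-compile "$el"
        fi
    done
}

# Throwaway demo files with fixed timestamps
demo=$(mktemp -d)
touch -t 200801010000 "$demo/a.el"                                        # never compiled
touch -t 200801010000 "$demo/b.el"; touch -t 200802010000 "$demo/b.elc"   # up to date
touch -t 200803010000 "$demo/c.el"; touch -t 200802010000 "$demo/c.elc"   # edited since
list_stale "$demo"
```

Only a.el and c.el are reported, because b.elc is newer than b.el.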

Compile R on an x86_64 system

On an i386 system, these variables are automatically set by the configure script to get tcl/tk support. Tcl/tk is necessary for some graphical user interface support such as the Bayesian network interface in the package "deal".

TCLTK_CPPFLAGS='-I/usr/include/tcl8.4 -I/usr/include/tcl8.4 -I/usr/X11R6/include'
TCLTK_LIBS='-L/usr/lib -ltcl8.4 -L/usr/lib -ltk8.4 -L/usr/X11R6/lib -lX11'
TCL_CONFIG='/usr/lib/tcl8.4/tclConfig.sh'

On an x86_64 system, R can be built to run in 64-bit or 32-bit mode. The "R Installation and Administration" document on www.r-project.org says that a 64-bit R is better for large data objects, but that it runs slower than a 32-bit build because of its larger system objects in memory. Also, tcl/tk support is sometimes only available in a 64-bit build. On Fedora Core 5, tcl/tk comes in 64-bit versions by default and is installed in the 64-bit library directory /usr/lib64, not in the 32-bit library directory /usr/lib. So the configure script cannot find tcl/tk and complains in config.log about not finding tkConfig.sh and tclConfig.sh. Setting these variables manually in config.site solves the problem.

TCLTK_CPPFLAGS='-I/usr/include -I /usr/include/tcl-private/generic -I/usr/X11R6/include '
TCLTK_LIBS='-L/usr/lib -ltcl8.4 -L/usr/share -ltk8.4 -L/usr/lib64 -L/usr/X11R6/lib -lX11'
TCL_CONFIG='/usr/lib64/tclConfig.sh'
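Before editing config.site it is worth checking where the two config scripts actually live on the machine. The following is a rough sketch; find_config is a made-up helper and the candidate directories are assumptions that should be adjusted to the local system.

```shell
#!/bin/sh
# find_config NAME DIR...: print the first DIR/NAME that exists.
# Hypothetical helper; it only looks in the directories it is given.
find_config() (
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0
        fi
    done
    exit 1
)

# Candidate directories are assumptions; adjust them to the local system.
for f in tclConfig.sh tkConfig.sh; do
    find_config "$f" /usr/lib64 /usr/lib /usr/lib/tcl8.4 /usr/lib/tk8.4 \
        || echo "$f not found in the candidate directories"
done
```

The path printed for tclConfig.sh is the value to use for TCL_CONFIG.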

iptables to block brute force attacks

The following iptables rules first use the "--set" option of the "recent" match to remember the IP address of a machine that recently tried to gain access to your machine. They then use the "--seconds 60" and "--hitcount 4" options to limit each machine to 3 new connection attempts per 60 seconds; a fourth attempt within that window is dropped. Thus repeated automatic ssh login attempts are dropped. This tip is from Debian Admin article 187.

sudo /sbin/iptables -I INPUT -p tcp --dport 22 -i eth0 -m state --state NEW -m recent --set
sudo /sbin/iptables -I INPUT -p tcp --dport 22 -i eth0 -m state --state NEW -m recent --update --seconds 60 --hitcount 4 -j DROP
sudo /sbin/service iptables save
sudo /sbin/service iptables restart

Syncing system time with a time server

NTPD is the default method to sync the system time of a Linux computer with a time server (e.g., time-a.nist.gov). But ntpd does not seem to work behind a firewall. So the alternative is to run the "rdate" command in a cron job (see "sudo crontab -e").

Rdate can be installed by running "aptitude install rdate". It is not part of the "openntpd" package. The rdate command below syncs the system time with the time-a.nist.gov time server.

sudo /usr/sbin/rdate -a time-a.nist.gov

The -a option uses the adjtime(2) system call to adjust the system time gradually, which avoids a jump in the system time. According to the openntpd.org website, rdate is not as accurate as ntpd, but it works behind a firewall.

SSH through a tunnel in a firewall

From home laptop

rsync -auvze "ssh middle ssh" machine_behind_firewall:~/source/ ~/laptop_dest/

Ubuntu on IBM Thinkpad T30: Wireless Driver

Add these lines to /etc/modprobe.d/blacklist to force the kernel to load the orinoco drivers. The kernel tries to load the driver that fits the hardware. As shown by lspci, the kernel probably tries to load a driver for the "Prism 2.5" hardware, but sometimes that driver fails to work, so the wireless connection (eth1) fails as well.

blacklist prism2
blacklist prism2_pci
blacklist hostap_pci
blacklist hostap

Change syslog

On most UNIX systems the activities of the system are logged by the syslogd daemon. On Linux, the log files can be found in /var/log. You can edit /etc/syslog.conf to control what gets logged and how much detail you want to keep in the logs. For example, I use the following entry to control the logging of cron jobs.

cron.!info                      -/var/log/cron.log

It means that I want the messages from 'cron' jobs to go to /var/log/cron.log.

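The '-' in '-/var/log/cron.log' means that I do not want to sync the cron.log file to disk every time a new log entry is recorded. The '!info' part in 'cron.!info' is a negation (the exclamation mark represents negation), intended to keep information entries out of the log so that only warnings, errors, and major problems are recorded. Note that, according to the sysklogd syslog.conf manual, '!info' ignores the info priority and every higher one, whereas '!=info' ignores info alone; the selector 'cron.warning' is the direct way to keep warnings and above.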

On my Linux system, I use offlineimap in a cron job to sync emails between my local Linux box and the remote email server. This generates a lot of activity because I sync the emails every minute between 6 a.m. and 11 p.m. (see the '*/1 6-23 * * *' entry below) and at minutes 0 and 59 of each hour between midnight and 5 a.m. (the '*/59' step matches only those two minutes). I do not want to log so many entries unless they are errors or warnings.

*/1 6-23 * * * /usr/bin/offlineimap -u Noninteractive.Basic 2>&1 > /dev/null
*/59 0-5 * * * /usr/bin/offlineimap -u Noninteractive.Basic 2>&1 > /dev/null
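The minutes at which a "*/N" field fires can be listed with a few lines of shell, which shows why '*/59' is not the same as "every 59 minutes". This is a minimal sketch of the step rule for the minutes field only; step_minutes is a made-up name.

```shell
#!/bin/sh
# step_minutes N: list the minutes (0-59) at which a "*/N" minute field fires.
# The step starts at 0 and advances by N until it passes 59.
step_minutes() (
    n=$1
    m=0
    out=
    while [ "$m" -le 59 ]; do
        out="$out${out:+ }$m"
        m=$((m + n))
    done
    printf '%s\n' "$out"
)

step_minutes 59   # prints: 0 59
step_minutes 20   # prints: 0 20 40
```

So the second entry runs twice per hour, one minute apart, during the night hours.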

So I tell syslog not to update the cron.log file every time I run offlineimap (the minus in '-/var/log/cron.log') because it takes time to update the file on the hard drive. Also, I tell syslog to record only the non-info entries.

To activate /etc/syslog.conf, on my Ubuntu Linux system I restart the syslog daemon by running 'sudo /etc/init.d/sysklogd restart'.

Mounting Shared Drive

Install smbfs with apt-get (it gives you the /sbin/mount.cifs support you need). Otherwise the following mount command cannot mount the shared drive properly, and the shared files end up "read-only".

sudo aptitude search smbfs   # smbfs has been installed?
sudo /bin/mount -t cifs \
     -o ip=123.123.123.123,username=WORK/myid,password=******,workgroup=WORK,uid=localid,gid=localgrp \
     //123.123.123.123/Shared /mnt/hdrv
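The -o option list must reach mount as a single comma-separated word with no spaces in it, which is easy to get wrong when it is long. One way to keep it readable is to build the string from separate items. This is a dry-run sketch: join_opts is a made-up helper, and the password is left out here on the assumption that mount.cifs prompts for it when it is not given.

```shell
#!/bin/sh
# join_opts OPT...: join the arguments with commas into one -o string,
# so the options can be written one per line without stray spaces.
join_opts() (
    IFS=,
    printf '%s\n' "$*"
)

opts=$(join_opts \
    ip=123.123.123.123 \
    username=WORK/myid \
    workgroup=WORK \
    uid=localid \
    gid=localgrp)

# Dry run: print the command instead of mounting (remove "echo" to run it)
echo sudo /bin/mount -t cifs -o "$opts" //123.123.123.123/Shared /mnt/hdrv
```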


sudo gconftool-2 --direct --config-source xml:readwrite:/etc/gconf/gconf.xml.defaults --type bool --set /apps/panel/global/tooltips_enabled false

Some of the Microsoft TrueType fonts can be installed with the msttcorefonts package. You need to reload X (e.g., quit GNOME and log in again) to access them. After a reload, both Firefox and OpenOffice should have access to these fonts.

[U ~]% sudo aptitude show msttcorefonts
Package: msttcorefonts
New: yes
State: installed
Automatically installed: no
Version: 1.2ubuntu3
Priority: optional
Section: multiverse/x11
Maintainer: Tollef Fog Heen 
Uncompressed Size: 168k
Depends: wget (>= 1.9.1-4), cabextract (>= 0.1-2), xutils (>= 4.0.2), debconf
         (>= 1.2.0), defoma, debianutils (>= 1.7)
Recommends: x-ttcidfont-conf
Description: Installer for Microsoft TrueType core fonts
 This package allows for easy installation of the Microsoft True Type Core Fonts
 for the Web including: 
 
  Andale Mono
  Arial Black
  Arial (Bold, Italic, Bold Italic)
  Comic Sans MS (Bold)
  Courier New (Bold, Italic, Bold Italic)
  Georgia (Bold, Italic, Bold Italic)
  Impact
  Times New Roman (Bold, Italic, Bold Italic)
  Trebuchet (Bold, Italic, Bold Italic)
  Verdana (Bold, Italic, Bold Italic)
  Webdings
 
 You will need an Internet connection to download these fonts if you don't
 already have them.

Index mbox archives with namazu

Namazu is a compact search engine. I use it to index my emails saved in gzipped archive files. It involves three steps: 1) create a directory, /tmp/mboxIndex, to hold the index, 2) index the email archive files with mknmz, 3) run 'namazu query whatIamlookingfor' to find the saved emails.

$ mkdir /tmp/mboxIndex  # where the indices are saved
$ mknmz --mailnews -O /tmp/mboxIndex ~/mbox/msk  # run index, takes time
Looking for indexing files...
19 files are found to be indexed.
1/19 - /.../.../mbox/msk/Mail_06.03.17.mail.gz [message/rfc822]
...

The mknmz index command skips large files by default. The size limits are set by two variables, FILE_SIZE_MAX (default 2000000 bytes, about 2 megabytes) and TEXT_SIZE_MAX (600000). These variables are defined in /etc/namazu/mknmzrc. To change these default values, I run 'cp /etc/namazu/mknmzrc ~/.mknmzrc' to make a personal copy of that file. Then I edit ~/.mknmzrc to increase the two size limits to 200 gigabytes because my largest email archive is approximately that size (due mostly to bulky email attachments in Word and PDF file formats, etc.).
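To find out in advance which archive files are above a given limit, the file sizes can be checked with a short loop. This is only a sketch: list_oversized is a made-up helper, the demo files are stand-ins rather than real mail archives, and it assumes the limit is compared with the size of the file on disk.

```shell
#!/bin/sh
# list_oversized LIMIT_BYTES DIR: print "SIZE PATH" for each regular file in
# DIR that is larger than LIMIT_BYTES.
list_oversized() (
    limit=$1
    for f in "$2"/*; do
        [ -f "$f" ] || continue
        size=$(( $(wc -c < "$f") ))      # arithmetic strips any padding
        if [ "$size" -gt "$limit" ]; then
            printf '%s %s\n' "$size" "$f"
        fi
    done
)

# Stand-in files in a throwaway directory (not real mail archives)
mail=$(mktemp -d)
printf 'short message\n' > "$mail/small.mail.gz"
dd if=/dev/zero of="$mail/big.mail.gz" bs=1000 count=3 2>/dev/null
list_oversized 2000 "$mail"
```

Against the real archive directory the call would look like "list_oversized 2000000 ~/mbox/msk".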






