Archive for the ‘Solution’ Category

Screenshot Highlights with the Gimp

Here’s my preferred method of drawing attention to screen elements in technical documentation.

Procedure:

  1. Copy window to clipboard with ALT+PrintScreen
  2. Paste as a new image into the Gimp with CTRL+SHIFT+V
  3. Use the rectangular selection tool to select the regions you want to draw attention to.
  4. Feather the selection for effect.
  5. Create a drop shadow if desired.
  6. Insert a new, totally black layer named mask.
  7. Keeping the selection in place, select the mask layer and delete the black pixels, creating a “hole” through the layer to the underlying image of the window.
  8. Set the mask layer’s transparency appropriately.
  9. Save the image, flattening the layers.
  10. Insert the image into your word processor of choice.

The embedded screencast was recorded with CamStudio, and the resulting AVI was converted into an H.264 AVC MP4 file using SUPER, the ffmpeg/x264 front end by eRightSoft. The embedded player is JW FLV Media Player. All of these tools are free to download.

 

LDAP Berkeley Database Recovery

We experienced a power outage today, caused by someone tripping the emergency power off relay to our server room. Unfortunately, emergency power off really means “power off” so our UPS did the right thing and completely cut power rather than fall back to battery backup.

It was a little bit stressful getting everything back up, but everything appears to be working fine now.

The one serious error message we ran into was the following, when bringing our OpenLDAP server back up:

[root@ldap ldap]# /etc/init.d/ldap restart
Stopping slapd:                                            [FAILED]
Checking configuration files for slapd:  bdb_db_open: unclean shutdown detected; attempting recovery.
bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.
bdb(dc=math,dc=ohio-state,dc=edu): PANIC: fatal region error detected; run recovery
bdb_db_open: Database cannot be opened, err -30974. Restore from backup!
bdb(dc=math,dc=ohio-state,dc=edu): DB_ENV->lock_id_free interface requires an environment configured for the locking subsystem
backend_startup_one: bi_db_open failed! (-30974)
slap_startup failed (test would succeed using the -u switch)
                                                           [FAILED]
stale lock files may be present in /var/lib/ldap           [WARNING]

Fortunately, the solution to this problem is easy enough. Just run slapd_db_recover -v in the Berkeley Database directory.

cd /var/lib/ldap
slapd_db_recover -v

Finding last valid log LSN: file: 4 offset 4818337
Recovery starting from [4][4815752]
Recovery complete at Wed Feb  6 15:33:42 2008
Maximum transaction ID 80000ba7 Recovery checkpoint [4][4818337]

After that, slapd should start up just fine.

[root@ldap lib]# /etc/init.d/ldap start
Checking configuration files for slapd:  bdb_db_open: unclean shutdown detected; attempting recovery.
bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.
config file testing succeeded
                                                           [  OK  ]
Starting slapd:                                            [  OK  ]
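
Before running recovery on a damaged environment, it may be worth keeping a copy of the database directory exactly as the outage left it. The following is a minimal sketch under that assumption, not part of the procedure I actually ran: recover_bdb, its arguments, and the /var/tmp backup location are hypothetical names, and slapd is assumed to be stopped already.

```shell
#!/bin/sh
# Hypothetical helper: archive a Berkeley DB directory, then run recovery in it.
# Usage: recover_bdb [db_dir] [backup_dir] [recover_cmd]
recover_bdb() {
    db_dir=${1:-/var/lib/ldap}
    backup_dir=${2:-/var/tmp}
    recover_cmd=${3:-slapd_db_recover}
    stamp=$(date +%Y%m%d-%H%M%S)
    backup="$backup_dir/ldap-bdb-$stamp.tar.gz"

    # Keep a copy of the environment before recovery touches it.
    tar -czf "$backup" -C "$db_dir" . || return 1
    echo "backup written to $backup"

    # Run recovery from inside the database directory, as shown above.
    ( cd "$db_dir" && "$recover_cmd" -v )
}
```

Calling recover_bdb with no arguments would archive /var/lib/ldap into /var/tmp and then run slapd_db_recover -v in place.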
 

Nifty Work Around for File Size Limitations of FAT32

I picked up a 250 GB Western Digital Passport portable hard drive to keep a backup copy of my FileVault home directory, among other things, while I travel next week, in the somewhat likely event that something disastrous happens to my laptop.

I really like how small and portable the drive is, along with its USB bus-powered interface. There’s no futzing around with wall warts and power supplies; it truly is plug and play.

I also really like that my PS3 recognizes the device, since I’ve transferred my entire iTunes library over to it (Huzzah, Option-Starting iTunes to select a library!). All of my H.264 AVC movies play right off of the drive on my Playstation 3 as well, which is really nice and convenient.

While copying some rather large files, specifically a 7 GB ASR Golden Master image of my demonstration PowerBook’s Leopard OS and the actual Leopard ISO image itself, I ran into the file size limitation of FAT32. Of course, I knew FAT32 didn’t support files larger than 4 GB, but I’ve just been spoiled in recent years by things like this “just working.”

I didn’t want to reformat the small drive, because that would surely mean my Playstation 3 would no longer recognize the file system, so instead I opted to create an HFS+ formatted sparsebundle disk image, exactly like I would do manually for Leopard FileVault images.

The end result is that each “band” in the sparse bundle image will satisfy the limitations of FAT32, while providing a nice, secure and robust HFS+J file system to store all of the “big files” I need to carry with me.
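
A sketch of how such an image might be created and checked follows. The mount point /Volumes/PASSPORT, the image name, and the 100g size are placeholders rather than the values I used, and oversized_files is a hypothetical helper that lists any file larger than the biggest file FAT32 can hold (2^32 - 1 bytes).

```shell
#!/bin/sh
# Placeholder paths: adjust to wherever the FAT32 drive is mounted.
fat32_volume=/Volumes/PASSPORT
image="$fat32_volume/BigFiles.sparsebundle"

# Create the sparse bundle on the FAT32 drive (Mac OS X only; not called here).
create_image() {
    hdiutil create -size 100g \
        -type SPARSEBUNDLE \
        -fs "HFS+J" \
        -volname "BigFiles" \
        "$image"
}

# List files under a directory that exceed a size limit given in bytes.
# The default limit is the FAT32 maximum file size: 2^32 - 1 bytes.
oversized_files() {
    find "$1" -type f -size +"${2:-4294967295}"c
}
```

After create_image, running oversized_files "$image" should print nothing, since each band file is far smaller than the FAT32 limit.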

Long live robust Disk Imaging Frameworks.

The only catch is that these files are only accessible on Mac OS X Leopard machines now, but that’s not a huge problem for me, especially while traveling to the Macworld conference.

 

TelePort NFS Home Directory

I usually compute with an n-tuple of Mac computers sitting in front of me. I have a strong aversion to clutter, despite the state of my apartment, and Teleport’s seamless, encrypted keyboard sharing, à la so-called “soft KVM” utilities, is a killer app for me.

Alas, I’ve found that Teleport does not work as expected when operating from an NFS Mounted Home Directory.

Trying to connect to my Laptop, nutburner (Yes, nutburner is the given name of my first generation MacBook Pro), I received the following error.

Teleport Keychain Access

UNKNOWN wants permission to sign using key “privateKey” in your keychain. Do you want to allow this?

On a working host, e.g. two machines with FileVault home folders, that “UNKNOWN” will actually display as “teleportd”. I suspect whatever logic Apple is using to verify the authenticity of program binaries doesn’t work as expected over NFS.

After clicking “Always Allow” twice, I get the following error:

Teleport Connection Error

I synchronize my login.keychain, so the private key and certificate are identical between these two hosts, leading me to believe a certificate algorithm mismatch is unlikely.

In any event, my solution was to simply redirect the teleport.prefPane to a local HFS+ volume using a symbolic link.

# /Scratch is a local HFS+ volume.
mkdir -p /Scratch/mccune/Library/PreferencePanes
mv ~/Library/PreferencePanes/teleport.prefPane \
  /Scratch/mccune/Library/PreferencePanes/
ln -s /Scratch/mccune/Library/PreferencePanes/teleport.prefPane \
  ~/Library/PreferencePanes/teleport.prefPane

Once teleport.prefPane resided on a local HFS+ volume, everything “just worked” perfectly.

As an alternative, you could deploy the prefPane to /Library/PreferencePanes to make teleport available to all users of the system.

 

Apache and strace /usr/sbin/httpd

Working with Apache today, I ran into an issue where the process would appear to start OK, returning a zero exit status, yet strace was showing a SIGCHLD being caught.

Needless to say, the server wasn’t actually running for any length of time, but I found the following strace command immensely helpful in figuring out the problem.

  strace -o /tmp/httpd.strace -ff /usr/sbin/httpd

Because Apache spawns a number of children, strace with -ff follows each child and records its system calls in /tmp/httpd.strace.$PID.

As it turns out, I was receiving the following error in the child processes:

    bind(5, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.0.0")}, 16) \
    = -1 EADDRINUSE (Address already in use)
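
With one trace file per process, grep can narrow things down quickly. A small sketch, assuming the /tmp/httpd.strace prefix used above; failed_syscalls is a hypothetical helper name.

```shell
#!/bin/sh
# Print every failed system call (a return value of -1 followed by an
# errno name) from the per-process files written by strace -ff -o PREFIX.
failed_syscalls() {
    prefix=${1:-/tmp/httpd.strace}
    grep -H -- ' = -1 E' "$prefix".*
}
```

Many failures in a trace are harmless, such as ENOENT from probing for optional files, so piping the result through grep -v ENOENT may help. Once EADDRINUSE shows up, something like lsof -i :443 or netstat -tlnp (run as root) should reveal which process already holds the port.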
 

Simplify Media - Rockin’ on the Road

As a system administrator with a very large music collection, I’ve always been mildly irritated at the difficulty accessing my “master” music library while away from home.

Enter Simplify Media, a free, small application that does just what the name promises.

My iTunes library back home just shows up in my shared iTunes listing, regardless of where I am. No firewall hackery, nothing to configure, it just works, and works well.

Simplify Media

The iTunes integration is fantastic.

 

Large Backups with Bacula: /tmp Overfilling

I’ve run into several problems backing up our central file servers with Bacula, mostly centered around the sheer number of files (~6 million) a single job must process and store into the MySQL catalog.

I ran into the following error last night, attempting to back up the entire 6TB array as a single job:

  07-Nov 18:10 backup-dir JobId 3: Fatal error: sql_create.c:732 sql_create.c:732 insert INSERT INTO batch VALUES (1580771,3,'/Volumes/0/export/users/kodama/Desktop/GAP/gap4r4/small/small2/','sml800.z','OAAAD DkeW IGk B ih C+ A KZn BAA BY BHLtzL 1sNQO BFnqZZ A A C','0') failed:
  Incorrect key file for table '/tmp/#sql2459_94_0.MYI'; try to repair it

After doing a bit of research, I’ve concluded that the /tmp volume, which is only a 256 MB tmpfs partition, is filling to capacity before the job is able to complete.

Restarting the job this morning confirms MySQL is spooling data into /tmp.

  [root@backup tmp]# ls -l /tmp/
  total 332
  -rw-rw---- 1 mysql mysql 319276 Nov  8 09:48 #sql511e_3_0.MYD
  -rw-rw---- 1 mysql mysql   1024 Nov  8 09:48 #sql511e_3_0.MYI
  -rw-rw---- 1 mysql mysql   8722 Nov  8 09:48 #sql511e_3_0.frm

My solution for the time being is to reconfigure MySQL to use /var/tmp for its temporary storage, rather than /tmp. This places the data on a much larger file system.

# /etc/my.cnf
[mysqld]
tmpdir=/var/tmp

I’m also planning to split the job into smaller jobs, using regular expressions to include only pieces of the home directory tree at a time. This will keep the number of files each job needs to handle under a reasonable threshold.
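
To confirm the tmpdir change took effect after restarting MySQL, and to keep an eye on free space while a job runs, something like the following may help. This is only a sketch: usage_percent and show_mysql_tmpdir are hypothetical helper names, and the mysql client call assumes credentials are already configured (for example in ~/.my.cnf).

```shell
#!/bin/sh
# Print how full the file system holding a directory is, as a bare number.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Ask the running server which temporary directory it is using.
# Not called automatically; it needs a reachable MySQL server.
show_mysql_tmpdir() {
    mysql -e "SHOW VARIABLES LIKE 'tmpdir'"
}
```

Running usage_percent /var/tmp periodically during a backup gives a rough idea of how much head room the temporary tables still have.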

 

User Level VPN with Leopard

One of the small, but incredibly useful features for me in Leopard is that ssh-agent is automatically running for each user account. This relatively small change allows me to log into remote machines without entering my password each time.

Using the SOCKS proxy built into ssh, we’re also able to set up a quick and easy secure tunnel. I wanted to check some sensitive information this morning, but I’m at a coffee shop that doesn’t pass VPN traffic, so I quickly hacked together the following:

Set up a new Location in the Network System Preference Pane to configure the SOCKS proxy at 127.0.0.1, port 4088. This connects most Apple applications to the secure and encrypted tunnel.

Network Preferences Socks ssh Proxy

Next, I configured ssh to automatically set up the SOCKS proxy whenever I type “ssh ford”, which is an alias for my workstation back at the office.

# ~/.ssh/config
host ford
  User mccune
  HostName ford.math.ohio-state.edu
  # Disable TCP keepalives so the session can survive sleep/wake
  # and brief network drops.
  TCPKeepAlive no
  Port 22
  # DynamicForward is a SOCKS proxy server.
  DynamicForward 4088
  ForwardX11 no

With this configuration, I’m able to load my SSH private key into the ssh-agent running by default on Leopard, type “ssh ford” to set up the encrypted SOCKS proxy, then change location to “SSH Socks Proxy” to automatically have Mail.app, iChat, Safari and Camino use the secure proxy.
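
The same pieces can also be driven from the command line. A sketch, assuming the ford host alias above, a key stored at ~/.ssh/id_rsa, and a curl binary on the machine; open_tunnel and via_proxy are hypothetical helper names, and the URL passed to via_proxy is whatever page you want to fetch.

```shell
#!/bin/sh
socks_port=4088

# Load the private key into the agent, then open the tunnel in the background.
# -N: run no remote command; -f: go to the background after authentication.
open_tunnel() {
    ssh-add ~/.ssh/id_rsa && ssh -f -N ford
}

# Fetch a URL through the SOCKS proxy. --socks5-hostname also sends DNS
# lookups through the tunnel instead of resolving names locally.
via_proxy() {
    curl --silent --max-time 15 \
        --socks5-hostname "127.0.0.1:$socks_port" "$1"
}
```

If via_proxy fails while the same URL loads directly, the tunnel is probably not up or is listening on a different port.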

An easy way to verify the proxy is working is to add an IP Address gadget to your personal google home page:

Google ip Address

Finally, with the Network Location module for Quicksilver, you can easily switch the encrypted proxy on and off.

Quicksilver SSH Network Location

 

Manually Migrate Tiger FileVault sparseimage to Leopard FileVault sparsebundle

So I’m finally running Mac OS X 10.5 Leopard on my portable. I’ve decided to migrate to the new sparsebundle style FileVault image, and here’s how I did it:

First, make sure you’ve created a FileVault master certificate by setting a master password in the Security preference pane.

Manually create the sparse bundle:

umask 077
export NAME="mccune"
hdiutil create -size 300g \
  -encryption -agentpass \
  -certificate /Library/Keychains/FileVaultMaster.cer \
  -uid 502 -gid 20 -mode 0700 \
  -fs "HFS+J" \
  -type SPARSEBUNDLE \
  -layout SPUD \
  -volname "$NAME" \
  "$NAME".sparsebundle;
chown -R "$NAME":staff "$NAME".sparsebundle

Make sure the password on the disk image is the same as the password used with the user account; otherwise the system won’t be able to decrypt the image from the loginwindow.

Mount the sparsebundle:

hdiutil mount -owners on -mountrandom /tmp -stdinpass "$NAME".sparsebundle

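Copy the contents of your home directory. The mount point under /tmp is randomly named, so substitute the path hdiutil printed when mounting: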

rsync -avxHE --progress /Users/mccune/ /tmp/dmg.TYSCwg/

After I did the initial pass with rsync, I logged out of my user account, and logged in using the administrator account in order to run the rsync process a second time, while my profile was in a steady state.
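
One way to check that the second pass left nothing behind is an rsync dry run, which lists what a further pass would still transfer. A sketch only: pending_changes is a hypothetical helper, and on Mac OS X you would likely add -E to compare extended attributes as in the copy above.

```shell
#!/bin/sh
# List what another rsync pass would copy, without copying anything.
# -n: dry run; -i: itemize each pending change, one per line.
pending_changes() {
    rsync -axHn -i "$1" "$2"
}
```

For example, pending_changes /Users/mccune/ /tmp/dmg.TYSCwg/ should print nothing once the two trees match.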

 

LVM Host Tagging with iSCSI

The quick problem and fix of the day deals with iSCSI storage, CentOS 5, RHEL5, and LVM. As previously mentioned, I’m using LVM tagging to arbitrate logical volume activation among a set of physical hosts all hitting the same storage. This has been working quite well, and appears to be a simple and effective solution to the clustered Xen host problem.

We recently installed a new iSCSI target, and my boss complained that its LVM logical volumes weren’t active on boot, despite being properly tagged. This is because all block devices are scanned for LVM signatures from within the initial ram disk, not later in the boot process. At this stage, there’s no networking, and the iSCSI initiator hasn’t been brought online yet.

Nothing necessary for boot lives on the iSCSI target; it’s really just a large pool of bits for our backup system, so I decided the simplest solution is to activate all volumes a second time from /etc/rc.local. This appears to work well and reliably.

  # Append to /etc/rc.local, executed after all other init scripts.
  # Activate all logical volumes tagged with the local machine's hostname.
  lvchange -ay @$(uname -n)
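
To see whether the second activation actually worked, the tagged volumes can be listed right after activating them. A sketch, with host_tag and activate_tagged as hypothetical helper names; it assumes, as above, that the tag on each volume is exactly the output of uname -n.

```shell
#!/bin/sh
# Build the tag selector LVM expects: "@" followed by this machine's node name.
host_tag() {
    printf '@%s\n' "$(uname -n)"
}

# Activate every logical volume carrying this host's tag, then list them.
# An "a" in the fifth position of the lv_attr column means the volume is active.
activate_tagged() {
    tag=$(host_tag)
    lvchange -ay "$tag" || return 1
    lvs -o vg_name,lv_name,lv_attr,lv_tags "$tag"
}
```

Calling activate_tagged from /etc/rc.local in place of the bare lvchange line would also leave a record of the volume states in the boot output.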