Posts Tagged ‘Linux’

Tomato and AT&T U-Verse Disconnects

I recently ran into an issue with my home network setup where my Linksys WRT54G router running Tomato 1.27 was dropping my long-running TCP connections every 10 minutes or so. Further investigation showed this to be a common issue: Tomato’s DHCP client performs a unicast DHCP renewal, and the firewall blocks or misroutes the reply.

A number of people have published similar reports, but none of the suggested solutions appeared to work reliably for me, so I decided to diagnose, troubleshoot and resolve the issue myself. Here’s how I solved the problem. The notes I gathered while working on this are also located at 2WIRE & Tomato – Google Docs.

If you’d like to stop reading and skip right to the payoff, simply add the following two lines to the firewall script, which is located in the web-based user interface under Administration, Scripts, in the Firewall tab:

iptables -t nat -I PREROUTING -p udp -i vlan1 --dport 68 --sport 67 -j ACCEPT
iptables -I INPUT -p udp -i vlan1 --dport 68 --sport 67 -j ACCEPT

These firewall rules allow DHCP replies (UDP from port 67 to port 68) arriving on the WAN interface to reach the Linksys router, regardless of whether the traffic is broadcast or unicast. Please let me know if these rules are not optimal or could be improved.
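The rules above hard-code vlan1, which is the WAN interface on my WRT54G. As a minimal sketch for other hardware, the helper below (dhcp_accept_rules is a name made up for illustration) prints the same two rules for whatever interface name it is given, so they can be reviewed before being pasted into the firewall script:

```shell
#!/bin/sh
# Print the two DHCP ACCEPT rules for a given WAN interface.
# This only prints the commands; it does not change the firewall.
dhcp_accept_rules() {
    wan_if="$1"
    echo "iptables -t nat -I PREROUTING -p udp -i $wan_if --dport 68 --sport 67 -j ACCEPT"
    echo "iptables -I INPUT -p udp -i $wan_if --dport 68 --sport 67 -j ACCEPT"
}

# Dry run for the interface used in this post.
dhcp_accept_rules vlan1
```

Printing the commands rather than running them keeps the helper safe to try on a desktop machine first.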

Here are some references to other reports of this issue:

My troubleshooting process follows.

I can see in the logs that udhcpc attempts a renewal right up until the lease expires:

Feb 17 15:14:26 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:16:56 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:18:11 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:18:48 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:19:06 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:19:15 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:19:19 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:19:21 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:19:22 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:19:22 tomato daemon.info udhcpc[285]: Lease lost, entering init state
Feb 17 15:19:22 tomato user.info kernel: vlan1: dev_set_allmulti(master, 1)
Feb 17 15:19:22 tomato user.info kernel: vlan1: dev_set_promiscuity(master, -1)
Feb 17 15:19:22 tomato user.info kernel: device vlan1 left promiscuous mode
Feb 17 15:19:22 tomato daemon.info udhcpc[285]: Sending discover...
Feb 17 15:19:22 tomato daemon.info udhcpc[285]: Sending select for 99.29.172.159...
Feb 17 15:19:22 tomato daemon.info udhcpc[285]: Lease of 99.29.172.159 obtained, lease time 600
Feb 17 15:19:22 tomato user.info kernel: vlan1: dev_set_allmulti(master, -1)
Feb 17 15:19:22 tomato daemon.info dnsmasq[12612]: exiting on receipt of SIGTERM
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: started, version 2.51 cachesize 150
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: compile time options: no-IPv6 GNU-getopt no-RTC no-DBus no-I18N DHCP no-scripts no-TFTP
Feb 17 15:19:22 tomato daemon.info dnsmasq-dhcp[13007]: DHCP, IP range 192.168.3.100 -- 192.168.3.149, lease time 1d
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: reading /etc/resolv.dnsmasq
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: using nameserver 192.168.4.254#53
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: using nameserver 8.8.4.4#53
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: using nameserver 8.8.8.8#53
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: read /etc/hosts - 0 addresses
Feb 17 15:19:22 tomato daemon.info dnsmasq[13007]: read /etc/hosts.dnsmasq - 16 addresses
Feb 17 15:19:25 tomato daemon.err miniupnpd[12649]: recv (state0): Connection reset by peer
Feb 17 15:19:27 tomato daemon.notice miniupnpd[12649]: received signal 15, good-bye
Feb 17 15:19:27 tomato daemon.notice miniupnpd[13043]: HTTP listening on port 5000
Feb 17 15:19:27 tomato daemon.notice miniupnpd[13043]: Listening for NAT-PMP traffic on port 5351
Feb 17 15:19:27 tomato user.info kernel: device br0 left promiscuous mode
Feb 17 15:19:27 tomato user.info kernel: vlan1: dev_set_allmulti(master, -1)
Feb 17 15:19:27 tomato user.info kernel: vlan1: del 01:00:5e:00:00:02 mcast address from master interface
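
The spacing of the renew attempts in the log above is easier to see with a small awk filter. This is a minimal sketch, assuming the syslog line format shown above (time of day in the third field) and a log that does not cross midnight; renew_gaps is a name made up for illustration, and the log path in the usage comment is an assumption:

```shell
#!/bin/sh
# Print the number of seconds between consecutive udhcpc "Sending renew"
# lines. Reads the named log files, or standard input if none are given.
renew_gaps() {
    awk '/udhcpc.*Sending renew/ {
        split($3, t, ":")                       # $3 is HH:MM:SS
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (seen) print now - prev
        prev = now
        seen = 1
    }' "$@"
}

# Usage (path is an assumption; use wherever your syslog is written):
#   renew_gaps /var/log/messages
```

For the excerpt above it prints 150, 75, 37, 18, 9, 4, 2 and 1: each retry waits half as long as the previous one until the 600-second lease runs out.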

Working with the solution mentioned in the forums, I added a firewall rule to the INPUT chain to allow DHCP traffic into the router itself. This worked well up until I enabled DMZ mode for my Xbox 360. Once I enabled DMZ mode, the DHCP renewal issue cropped back up and I kept getting dropped. Luckily, I have experience with netfilter and iptables, so I know that DMZ is probably implemented in Tomato as a catch-all PREROUTING rule that performs NAT on all unknown connections to a specified address. I also know the PREROUTING chain is processed before the INPUT chain, so a catch-all rule there would rewrite the destination of the DHCP reply before it ever reached my ACCEPT rule in the INPUT chain, trumping my fix. An ACCEPT rule inserted at the top of PREROUTING stops the packet from reaching the catch-all.
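
The ordering argument can be checked mechanically. The following is a rough heuristic sketch, not a description of how Tomato actually builds its rules: it reads nat-table rules in the `-A CHAIN ...` form that iptables-save prints and reports whether a DHCP ACCEPT rule comes before the first DNAT rule in PREROUTING. The function name dhcp_before_dnat is made up for illustration, and the availability of iptables-save on the router is an assumption.

```shell
#!/bin/sh
# Read rules in iptables-save format on standard input and report:
#   ok        a DHCP ACCEPT rule is seen in PREROUTING before any DNAT rule
#   shadowed  a DNAT rule is seen in PREROUTING first
#   no-match  PREROUTING contains neither kind of rule
dhcp_before_dnat() {
    awk '
        $1 == "-A" && $2 == "PREROUTING" {
            if (/--dport 68( |$)/ && /-j ACCEPT/) { print "ok"; found = 1; exit }
            if (/-j DNAT/)                        { print "shadowed"; found = 1; exit }
        }
        END { if (!found) print "no-match" }
    '
}

# Usage on the router (assumes iptables-save is installed there):
#   iptables-save -t nat | dhcp_before_dnat
```

Any DNAT rule counts as a potential catch-all here, so an ordinary port forward listed ahead of the DHCP rule would also be reported as "shadowed"; treat the result as a prompt to read the chain, not as a verdict.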

This can be verified with tcpdump and Wireshark. Luckily, there are pre-compiled versions of tcpdump for the MIPS architecture located at http://ipkg.nslu2-linux.org/feeds/unslung/wl500g/.

In order to get the tcpdump binary onto the router, I had to unpack the ipkg file:

wget http://ipkg.nslu2-linux.org/feeds/unslung/wl500g/tcpdump_3.9.7-1_mipsel.ipk
gzip -dc tcpdump_3.9.7-1_mipsel.ipk | tar xvf -
tar xvzf data.tar.gz
scp opt/bin/tcpdump fw:/tmp

Finally, capturing the data is easy and since we're dealing with DHCP traffic, there's not much worry about filling up the small /tmp filesystem on the router:

/tmp/tcpdump -w /tmp/renew.cap -v -i vlan1 -s 1500 port 67 or port 68

I copied the cap files back to my desktop and fired them up in Wireshark. Not too surprisingly, it's clear as day that the request packets are making it out, but the acknowledgement packets coming back from the DHCP server aren't making it to udhcpc.

Screen capture of wireshark displaying repeated attempts to renew the DHCP lease

With the explicit rule added to the PREROUTING and INPUT chains, the conversation looks much less confusing.

The logs tell a similar tale. Note the lack of the full re-initialization of dnsmasq, miniupnpd, and the firewall script itself.

Feb 17 15:29:31 tomato daemon.info udhcpc[285]: Sending renew...
Feb 17 15:29:31 tomato daemon.info udhcpc[285]: Lease of 99.29.172.159 obtained, lease time 600

 

Python 2.5.2 RedHat Enterprise Linux 5 RPM’s

In order to support the same version of Python across all of our servers, I’ve also built Python 2.5.2 RPMs for Red Hat Enterprise Linux 5 (Tikanga).

This build is far more straightforward than the build for RHEL4: the system X11 libraries link without patching Setup.dist, and RHEL5 comes with a supported version of expat, so statically linking the library into the pyexpat module isn’t required.

The SRPM: python25-2.5.2-1.el5.src.rpm

Build command:

rpmbuild --define '__python_ver 25' --define 'dist .el5' -ba ~/redhat/SPECS/python.spec

This package will not conflict with the system python package.  Scripts should use #!/usr/bin/env python25 to make sure the proper python is being used.
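
Since scripts depend on python25 being resolvable through env, it can be worth checking that the interpreter is on the PATH before deploying them. A minimal sketch, where require_interpreter is a name made up for illustration:

```shell
#!/bin/sh
# Print the full path of the named interpreter, or fail with a message
# on standard error if it cannot be found on the PATH.
require_interpreter() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Usage on a server where the package is installed:
#   require_interpreter python25
```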

 

Python 2.5.2 RedHat Enterprise Linux 4 RPM’s

I’ve successfully built Python 2.5.2 RPMs for Red Hat Enterprise Linux 4 (Nahant). The package is named python25 so as not to conflict with the system’s python package.

Other than some minor tweaks to the patch process to account for the location of X11 libraries and db4.2, the only major change is that the pyexpat module is statically linked against libexpat.a since expat version 1.95.8 is required and not available in RHEL4.  If you build my SRPM, you’ll need to download an SRPM for expat-1.95.8 then build and install expat-devel-1.95.8 or greater.  Once present, the python25 SRPM will statically link in the correct version of the library.

The SRPM: python25-2.5.2-1.el4.src.rpm

 

Apache and strace /usr/sbin/httpd

Working with Apache today, I ran into an issue where the process would appear to start OK, returning a zero exit status, yet strace showed a SIGCHLD being caught.

Needless to say, the server wasn’t actually running for any length of time, but I found the following strace command immensely helpful in figuring out the problem.

  strace -o /tmp/httpd.strace -ff /usr/sbin/httpd

Because Apache spawns a number of children, strace with -ff follows each child and records its system calls in /tmp/httpd.strace.$PID.

As it turns out, I was receiving the following error in the child processes:

    bind(5, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("0.0.0.0")}, 16) \
    = -1 EADDRINUSE (Address already in use)
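
With one trace file per process, finding the failing call by hand gets tedious. As a minimal sketch (calls_failing_with is a name made up for illustration), grep can pull out every call that failed with a given errno, prefixed with the file it came from:

```shell
#!/bin/sh
# Print every traced system call that failed with the given errno name,
# prefixed by the trace file it was found in.
# Usage: calls_failing_with ERRNO_NAME FILE...
calls_failing_with() {
    errno_name="$1"
    shift
    grep -H -- "= -1 $errno_name " "$@"
}

# Usage with the trace files written by the strace command above:
#   calls_failing_with EADDRINUSE /tmp/httpd.strace.*
```

Filtering on a specific errno keeps routine failures, such as ENOENT from probing for optional files, out of the output.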
 

DD-WRT replaces OpenWRT

Over the past few months, I’ve been getting fed up with stability issues plaguing my OpenWRT-based Linksys WRT54GS v2.0 router. Wireless under OpenWRT was very unreliable, often cutting out in the recent version of White Russian I was running.

On the advice of a friend, I’ve re-flashed my firmware to DD-WRT v23 SP2, and I must say, I’m quite impressed. The web interface is very slick and clean, and UPnP is working out of the box. QoS is present and configurable, though I haven’t tested it very much yet. The web interface allows SSH public keys to be configured easily and stores them in NVRAM variables, and my dynamic DNS host name is also easily configured there.

All in all, I’m finding DD-WRT to be much more developed and polished than OpenWRT. I’ll comment on this post after a week or so in the event I have stability issues.