Archive for Uncategorized

Barcamp Atlanta

Barcamp is coming up again October 17th - 18th. Check the website for additional information or to sign up. The first one was a blast and I am very much looking forward to doing it again. Barcamp is ~24 hours of discussion, interesting conversation, learning about all kinds of cool projects, drinking beer (optional), and just having a good time.

Last year a big production bug came up around 11pm and I spent 11pm - 6am fixing it, but even with that it was a good time.

My goal now is to figure out something more interesting for a session this time around. I tried a PostgreSQL tuning/admin session last time and only 2 or 3 people showed up for any portion of it. Perhaps something on Scala, as I am becoming a pretty huge fan of that language.

OpenSuSE No More

I’ve installed OpenSuSE on a dozen or so work servers, used it as my previous development environment for about a year, and generally have been a big fan.

However, the ‘official’ repositories seem to get more and more out of date (I am mostly running 10.2 and my impression is it has been left to rot), and I’ve grown increasingly frustrated with how slow yast has become at updating its caches of rpms and repositories whenever I want to install or update software. I generally load yast, wait 10 - 15 minutes, then do what I was looking to do. I have the machines set to update themselves every week. Do I need to toggle another setting to make them go ahead and update their software lists and repository caches?

That hasn’t been too big of a deal. The machines had been rock solid stable (300 day + uptime) so I didn’t want to fiddle with something that was working. I had a weird experience this week though when I realized what was happening when I rebooted some of the long stable machines.

The first case happened at the office for a machine that wasn’t very important. I rebooted and when the machine came back up there were dozens of errors related to runlevel 3 applications not being able to start because my /var partition wasn’t accessible, networking didn’t start correctly, and the keyboard did not work. For this machine I just blamed it on the HD and requested a replacement from Dell.

Then I went to the datacenter and rebooted a production application server to troubleshoot an amber light and the exact same thing happened. I did not expect that and could not write off that machine as it served several important roles.

I booted up with a live cd and all of the system partitions were fine. Everything could be mounted, fsck came back clean, and I could chroot into the SuSE system and stuff worked. I checked over my /boot partition, GRUB configuration, and inittab file and still had no idea why it would fail so utterly at boot time.

Our basic (non-Database) server is set up like this:

/boot - primary Linux ext3
swap - primary Linux swap
/ - primary LVM
/dev/system/root as '/' on the LVM partition as ext3
/dev/system/var as '/var' on the LVM partition as ext3

These machines were updating every week but kernel updates are not applied until a reboot, so my gut feeling was that perhaps something related to LVM had changed with the kernel upgrade. I spent literally 6 hours troubleshooting down this path and was inclined to believe this was the issue because there is in fact a lot of chatter on Google about kernel upgrades screwing with LVM. I even tried creating a non-LVM /var, copying the contents of the LVM /var there and booting. That almost worked but the network did not start and I still could not use the keyboard.

At about hour 5 I pulled up the Novell documentation for the init process of OpenSuSE and started working through it step by step chroot’ed in from the live cd.

What was the issue?

OpenSuSE deleted its own /etc/init.d/boot script.

Seems impossible, right? I’ve watched it happen 3 times now and still have a lot of machines to reboot that I fully expect to have the same problem. Perhaps a penalty for long uptime? I missed it completely when I was checking over inittab initially. I guess I just assumed the core script that kicks off EVERYTHING would not have been deleted by the official update/upgrade process of a mature Linux distribution. I managed to find it by progressively stepping back the initial run level passed to the kernel by GRUB until I could see far enough up the boot process to see the ‘file not found’ message. I didn’t see anyone else having this issue, so I am hoping that if someone else hits it they find this post instead of wasting half a day chasing false causes.
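For the machines still waiting on a reboot, a quick pre-reboot check along these lines might save some pain. This is only a minimal sketch: the list of files is my own assumption about what the init process needs (only /etc/init.d/boot comes from the failure described above), and `missing_boot_files` is a made-up helper name. The root argument lets it run against either the live system or a system mounted from a live cd.

```python
import os

# Hypothetical checklist of files the SysV-style boot depends on.
# Only etc/init.d/boot is taken from the failure described in this post;
# the other entries are assumptions and should be adjusted per machine.
CRITICAL_BOOT_FILES = [
    "etc/inittab",
    "etc/init.d/boot",
    "etc/init.d/rc",
]


def missing_boot_files(root, required=CRITICAL_BOOT_FILES):
    """Return the required paths that are not regular files under root."""
    return [
        path for path in required
        if not os.path.isfile(os.path.join(root, path))
    ]
```

Calling `missing_boot_files("/")` on a running box, or `missing_boot_files("/mnt/suse")` against a mounted install (the mount point is just an example), returns an empty list when everything on the checklist is present.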

So now the question is what distribution should go on our servers (a distribution that neuters itself during an upgrade cannot stay). I am pretty fond of Ubuntu server but Sean at the office has pointed me at Arch and I am really, really digging the way they do things. I hope to make a separate post about that at the company blog in the near future.

Zimbra Migration Postmortem

I posted a short while back about excitement surrounding a migration from Exchange 2003 to Zimbra for our company. The migration has had its ups and downs and now that it has happened and I have had a couple weeks to dig in as both a user and administrator I would like to share our experience.

The general takeaway is that Zimbra isn’t perfect. It does some things worse than Exchange and some things better, but the balance, in my opinion, slants heavily in Zimbra’s favor. I’ll break it up into migration and then administration/usage.

The Migration

The migration was a bruiser. It involved a couple nights of failed attempts and then a brutal 6pm - 4am effort to get everything finished well enough to go to sleep. I had a sysadmin helping me who knew his stuff, so the details of how to complete it aren’t here (he handled most of the work), just the headaches I saw. The issues included:

  • The bulk migration tool was not able to migrate calendars.
  • The individual .pst importing tool also was not able to migrate the calendars. It would just fail like crazy and then give up because the error count was too high. For users with 2k+ appointments the migration would fail after only a few dozen events. I eventually got these calendars over by doing .pst exports/imports with Outlook itself rather than trying to use server-side migration tools.
  • We had to run the bulk migration over 2 nights because it took a long time. This isn’t a huge surprise because we had hundreds of thousands of emails, events, and contacts to migrate, but the issue is that the second run re-imported everything imported in the first batch despite settings to the contrary. This essentially created duplicates of all emails and contacts.

To remove the duplicate emails I used a perl script found at this page (this script actually worked fantastically). For contacts I used the Zimbra CLI to bulk clear the applicable address books and used client apps to re-import cleanly.
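The general idea behind de-duplicating re-imported mail can be sketched in a few lines. To be clear, this is not the perl script mentioned above and I am not claiming it works the same way. It is a simplified illustration that assumes duplicates share a Message-ID header and that the messages are already in hand as raw text. The function name `dedupe_by_message_id` is made up for the example.

```python
from email import message_from_string


def dedupe_by_message_id(raw_messages):
    """Keep the first copy of each message, keyed on its Message-ID header."""
    seen = set()
    unique = []
    for raw in raw_messages:
        msg_id = message_from_string(raw).get("Message-ID")
        if msg_id is None:
            # No Message-ID means we cannot prove it is a duplicate, so keep it.
            unique.append(raw)
            continue
        key = msg_id.strip()
        if key not in seen:
            seen.add(key)
            unique.append(raw)
    return unique
```

Keeping the first copy and leaving messages without a Message-ID alone errs on the side of not deleting anything that might be unique.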

Administration/Usage

Zimbra started to shine after the migration ordeal. We immediately had all of our OSX users sync’ing their iCal, Apple Mail, and Address Book apps with the server, I had most of the Outlook users on the Zimbra Outlook connector without much effort, and most things worked well. There were a few issues I encountered.

  • The Outlook connector worked flawlessly in XP Pro but was very difficult to install in Vista. You need to follow the tip here and then just keep trying until it works. If it doesn’t work remove the program and try again. I really hate Vista and the fact that it makes things so hard.
  • ActiveSync with Windows Mobile is pretty flaky. It fails often for no apparent reason. I settled on using IMAP for email and just sync’ing my contacts and calendar and this seems to work consistently. It was as if it was stumbling over the greater volume of items to sync when the email was part of it.
  • I’m not real happy with the calendar sharing. Without admin intervention a user must share their calendar with each individual user and each of those individuals must log in to the web interface to accept the share and see it. These notifications cannot be accepted in Mail/Outlook/Entourage or whatever else. Once these calendars are accepted though you can use almost any app you want as your calendar and that is nice.
  • There are connector apps for almost everything, but many of them are not updated to the latest versions of their target apps and none of them are completely polished and perfect. The Outlook and OSX ones seem to be the best but those also are not without issues.

In general though Zimbra works pretty well. I have calendar and contacts sync’d with my laptop using the OSX sync services and also sync’d to my Windows Mobile phone using ActiveSync - a setup that never would have been possible with Exchange (without Entourage, but Entourage sucks in my opinion).

There are shortcomings but as I have worked through various user issues I have discovered what I believe is Zimbra’s biggest strength - its openness and open source underpinnings. It is a huge, powerful piece of code and between the CLI and the REST API you can do almost anything as an admin. Now that I am getting the hang of it I have created a set of quick scripts to interact with the CLI for doing things like auto-mounting calendars shared with distribution groups (getting around the share acceptance bummer mentioned above). The REST API is great and documented a bit here. It is completely trivial to export people’s contacts or calendars through the REST API and to constrain what is exported using different query parameters.
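As a rough illustration of how simple those REST exports are, here is a small sketch that only builds the export URL. It assumes the `/home/<user>/<folder>?fmt=...` URL pattern for the REST interface; the host name, the user, and the helper name `zimbra_export_url` are made up, and authentication is left out entirely.

```python
from urllib.parse import quote, urlencode


def zimbra_export_url(host, user, folder, fmt, **params):
    """Build a REST export URL for one folder of one user.

    Assumes the /home/<user>/<folder>?fmt=<format> pattern. Any extra
    keyword arguments are passed through as query parameters untouched.
    """
    query = urlencode({"fmt": fmt, **params})
    return "https://{}/home/{}/{}?{}".format(
        host, quote(user, safe="@"), quote(folder), query
    )


# Hypothetical host and user, for illustration only.
calendar_url = zimbra_export_url("mail.example.com", "alice", "calendar", "ics")
contacts_url = zimbra_export_url("mail.example.com", "alice", "contacts", "csv")
```

Fetching the resulting URL with an authenticated HTTP client should return the calendar as .ics or the contacts as .csv, and any parameters used to constrain the export can be passed as extra keyword arguments.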

Another big advantage in Zimbra’s favor is the community is quite strong and helpful. They have a wiki, forums, and bugzilla all very active and open.

So this is a bit of a ramble, but overall I am exceptionally happy that we made this switch. Zimbra is not perfect but it is powerful and utterly open making it possible to find workarounds for almost anything and it helps that it runs on Linux as well.

Potential Workstation Alternative

As nice as laptops can be, sometimes desktops are just nice to have for pure power. This is especially true for tasks like development where you have loads of applications up, probably have a database cranking, and are constantly compiling or deploying or profiling or debugging etc.

We set up a big ball of servers for a new client recently and being around that equipment for a few days really made me appreciate the speed of those boxes. A fresh restart of JBoss and a complete redeploy of our non-trivial .ear takes 1.5 - 2 minutes on my office workstation (respectably equipped) but only 20 seconds on these new web servers. Queries against large row counts that take minutes on the workstation take seconds or fractions of a second on the new database hardware.

Fresh restarts, redeploys, and testing against large data sets are only occasionally necessary in my local environment, but wouldn’t it be great to not care?

That said, I think (and only half-jokingly) that Dell or another vendor should start offering server-based workstations, or, if not offering them assembled, at least pushing the idea.

Here’s a diagram from the front:

Desk Front

Basically take 2 well-equipped 2U 2950s (or maybe the 3U 6950s if you really want to get crazy), strap them to a vent unit that pushes the hot air back behind your desk, attach legs to the bottom and put your monitor and peripherals right on top of the servers.

Here’s a view from the top:

Desk Top

The position of your desk would be pretty crucial as even though the air is being pushed out of the back those 2 machines being backed up against one another is going to generate some heat. You would probably want to either back the desk into a wall and cut a hole into the office of someone you don’t like or situate the desk against an exterior wall and cut a hole to the outside.

Though this is largely a joke, having 2 servers would be pretty excellent if you could figure out what to do about both the airflow and the noise. Dell’s servers, and maybe all others as well, sound like mini jets at startup and don’t quiet down too terribly much when running steady.

August 31st 1997

This has no place on a technically-oriented blog, and perhaps I am just an unenlightened, uncultured grump - but despite what the media would lead people to believe, other things happened in 1997 aside from a member of an irrelevant, though “royal”, family passing away. Here is a sampling from Wikipedia of events happening around the same time:

  • August 15 - India celebrates 50 years of independence from British rule.
  • August 20 - Souhane massacre in Algeria; over 60 people killed, 15 kidnapped.
  • August 25 - Egon Krenz, the former East German leader, is convicted of a shoot-to-kill Berlin Wall policy.
  • August 26 - Beni-Ali massacre in Algeria; 60-100 people killed.
  • August 26 - The Independent International Commission on Decommissioning is set up in Northern Ireland, as part of the peace process.
  • August 29 - Rais massacre in Algeria; over 98 (and possibly up to 400) people killed.
  • August 29 - Christopher Maier of Lexington, Kentucky is bludgeoned to death by serial killer Angel Maturino Resendiz. Angel also rapes and beats Christopher’s girlfriend, who survives. This is the first of a string of murders that Angel commits.
  • September 3 - Arizona Governor Fife Symington is convicted for various crimes tied to his real estate business, effectively forcing him out of office.
  • September 4 - In Lorain, Ohio, the last Ford Thunderbird for three years rolls off the assembly line.
  • September 5 - Beni-Messous massacre in Algeria; over 87 killed.
  • September 5 - The IOC picks Athens, Greece to be the host city for the 2004 Summer Olympics.
  • September 5 - Mother Teresa of Calcutta dies of heart failure in Kolkata, India.
  • September 6 - A Jean Michel Jarre Oxygene in Moscow concert, celebrating the city’s 850th anniversary, draws 3.5 million people.
  • September 7 - First test flight of the F-22 Raptor.

My policy over the past week, and apparently today as well, is to remove any news source that runs more than 1 story about this event from RSS and/or iGoogle (which is a pretty lame name by the way).
