Archive for Linux

Basic Postfix Queue Management

I had to wrangle with a bunch of overloaded email servers recently and wanted to share my new best friend when this sort of thing happens:

postsuper

This guy is available if your email server is running postfix (at /usr/sbin/postsuper) or if you are running Zimbra which uses postfix under the hood (at /opt/zimbra/postfix/sbin/postsuper in that case). You should be able to run it and access the man page for it as root. This is pretty basic stuff for a postfix admin but I had a lot of trouble finding even basic descriptions of how to go about doing the things that postsuper can do.

Postfix files messages into several queues. The main ones are the following:

  • Incoming/Active - These are the common queues where incoming/outgoing messages live until they finish being delivered or received.
  • Deferred - If messages cannot be delivered they go here and delivery is reattempted until the messages expire.
  • Hold - A queue that you the administrator can move messages to. Messages placed here are not processed, no attempt is made to redeliver them, and they do not expire. As far as I can tell this queue only exists to make life easy on the administrator and gives you a safe place to store things temporarily.

Sometimes an email server will get crushed with volume from an attack, a misconfiguration, a mistake, or something similar and your machine will peg itself at full load while it tries to move the messages. This is when postsuper shines. There is a lot more to postfix and postsuper but these three commands can help you save your server and the people that depend on it for email in certain situations.

  • postsuper -d (deletes messages)
  • postsuper -h (move messages to Hold queue)
  • postsuper -r (requeue messages, can requeue messages in Hold to Incoming/Active)

Each of the above commands takes a ‘queue_id’ argument. This value can be ALL (must be all caps) to tell it to apply the operation to all messages in all queues, a dash (-) to tell it to apply the operation to queue_ids provided at stdin, or a specific message’s queue_id. The stdin option is especially nice as you can create a file with a queue_id on each line and then have postsuper process all of those messages in one run. You can also do ALL [queue name in lowercase] to apply the operation to all messages in a specific queue. For example, postsuper -d ALL hold would delete all messages in the Hold queue. Do NOT leave the queue name off when you do this, or the ALL will snag everything in all queues.
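
To make those argument forms concrete, here is a small sketch. The psuper wrapper is a hypothetical helper, not part of postfix: it only prints the postsuper command line unless RUN=1 is set, so you can eyeball a destructive command before running it for real. The queue ids are made up.

```shell
# Hypothetical dry-run wrapper: prints the postsuper command instead of
# running it, unless RUN=1 is set in the environment.
psuper() {
    if [ "${RUN:-0}" = 1 ]; then
        postsuper "$@"
    else
        echo "postsuper $*"
    fi
}

psuper -h ALL            # put all queued messages on hold
psuper -d ALL hold       # delete only what sits in the hold queue
psuper -d A1B2C3D4E5     # delete a single message (made-up queue id)
psuper -r ALL            # requeue everything

# The dash form reads one queue id per line from stdin.
printf '%s\n' A1B2C3D4E5 F6A7B8C9D0 | psuper -d -
```

Once the printed commands look right, the same lines can be rerun with RUN=1 as root.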

Using the commands above, this is how you could deal with a single host, or a small set of hosts that is slamming your server with messages. Say these messages are not important and do not need to be read but they are backing up the postfix queues and preventing delivery of real email to your users.

(1) postsuper -h ALL

  • Pushes all messages (the junk and the valid ones) to the Hold queue so you can breathe and think.
  • Empties out the Incoming/Active queues so that new messages can be effectively delivered.
  • At this point you need to stop the hosts sending the volume you don’t want, either by blocking the hosts if it is an attack or by fixing the issues if it is a misconfiguration or issue with a server you have control over.

(2) Next you would want to delete all of the messages you don’t want from the Hold queue. There are several approaches here, but this is one:

  • Create a file containing the information on the bad messages:
    mailq | grep theEnemyHostName > bad_messages.txt
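  • Write a script to generate a new file from bad_messages.txt that contains just the queue id on each line, nothing else. Use perl or whatever you are comfortable with. The queue id is the first chunk of output for each message printed by mailq, so grab the first whitespace-delimited field of each line rather than a fixed number of characters (the id length can vary). mailq also tacks a * (active) or ! (hold) onto the end of the id, so strip that off too or postsuper will not recognize it.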
  • postsuper -d - < list_of_queue_ids.txt. This will delete all of the messages you identified.
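
The extraction step can be sketched in shell with awk instead of perl. This is one possible approach, not the only one: queue_ids_matching is a hypothetical function name, and it assumes the usual mailq layout of a header line followed by one blank-line-separated record per message. It matches the host name anywhere in a record (sender, recipient, or deferral reason), so look over the list before feeding it to postsuper.

```shell
# Hypothetical helper: read mailq output on stdin and print the queue id of
# every message whose record mentions the given string.
queue_ids_matching() {
    tail -n +2 |                          # drop the mailq header line
    awk -v pat="$1" '
        BEGIN { RS = "" }                 # paragraph mode: one record per message
        # Keep records that contain the string and start with a queue id.
        index($0, pat) && $1 ~ /^[0-9A-Za-z]+[*!]?$/ {
            id = $1
            sub(/[*!]$/, "", id)          # strip the active (*) or hold (!) marker
            print id
        }'
}

# Usage (on the mail server, as root):
#   mailq | queue_ids_matching theEnemyHostName > list_of_queue_ids.txt
#   postsuper -d - < list_of_queue_ids.txt
```

Working on whole records rather than single lines means a message is also caught when the host only shows up on the recipient line, which a plain grep for the queue id line would miss.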

(3) Requeue all of the remaining, valid messages with postsuper -r ALL.

All of the above applies to Zimbra as well. Even though it has that fancy web-based administrative interface and a real nice way to view and filter your mail queues, that interface always blows up and pukes errors when I try to do an operation involving any kind of heavy message volume. So if you get in a similar jam the CLI is what you have to use. The postsuper command is also very fast so even if you have tens or hundreds of thousands of messages jamming things up it should not be a problem.

OpenSuSE No More

I’ve installed OpenSuSE on a dozen or so work servers, used it as my previous development environment for about a year, and generally have been a big fan.

However, it has seemed the ‘official’ repositories get more and more out of date (I am running 10.2 mostly and my impression is it has been left to rot) and I’ve grown increasingly frustrated with how slow yast has become at updating its caches of rpms and repositories whenever I want to install or update software. I generally load yast, wait 10 - 15 minutes, then do what I was looking to do. I have the machines set to update themselves every week; do I need to toggle another setting to make them go ahead and update their software lists and repository caches?

That hasn’t been too big of a deal. The machines had been rock solid stable (300 day + uptime) so I didn’t want to fiddle with something that was working. I had a weird experience this week though when I realized what was happening when I rebooted some of the long stable machines.

The first case happened at the office for a machine that wasn’t very important. I rebooted and when the machine came back up there were dozens of errors related to runlevel 3 applications not being able to start because my /var partition wasn’t accessible, networking didn’t start correctly, and the keyboard did not work. For this machine I just blamed it on the HD and requested a replacement from Dell.

Then I went to the datacenter and rebooted a production application server to troubleshoot an amber light and the exact same thing happened. I did not expect that and could not write off that machine as it served several important roles.

I booted up with a live cd and all of the system partitions were fine. Everything could be mounted, fsck came back clean, I could chroot into the SuSE system and stuff worked, and I checked over my /boot partition, GRUB configuration and inittab file. I had no idea why it would fail so utterly at boot time.

Our basic (non-Database) server is setup like this:

/boot - primary Linux ext3
swap - primary Linux swap
/ - primary LVM
/dev/system/root as '/' on the LVM partition as ext3
/dev/system/var as '/var' on the LVM partition as ext3

These machines were updating every week but kernel updates are not applied until reboots so my gut feeling was that perhaps something changed due to the kernel upgrade related to LVM. I spent literally 6 hours troubleshooting down this path and was inclined to believe this was the issue because there is in fact a lot of chatter on google about kernel upgrades screwing with LVM. I even tried creating a non-LVM /var, copying the contents of the LVM /var there and booting. That almost worked but the network did not start and I still could not use the keyboard.

At about hour 5 I pulled up the Novell documentation for the init process of OpenSuSE and started working through it step by step chroot’ed in from the live cd.

What was the issue?

OpenSuSE deleted its own /etc/init.d/boot script.

Seems impossible right? I’ve watched it happen 3 times now and still have a lot of machines to reboot that I fully expect to have the same problem. Perhaps a penalty for long uptime? I missed it completely when I was checking over inittab initially - I guess I just assumed the core script that kicks off EVERYTHING would not have been deleted by the official update/upgrade process of a mature Linux distribution. I managed to find it by progressively stepping back the initial run level passed to the kernel by GRUB until I could see far enough up the boot process to see the ‘file not found’ message. I didn’t see anyone else having this issue so hoping if someone else hits it they don’t waste half a day chasing false causes and find this post instead.
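
For the machines still waiting on a reboot, a quick check along these lines could flag the problem ahead of time. This is only a sketch: check_boot_script is a hypothetical function name, and the optional argument is whatever directory the system root is mounted under (for example a made-up /mnt/suse when working from a live cd). It only checks that the file exists, not that its contents are sane.

```shell
# Hypothetical check: confirm /etc/init.d/boot exists under the given root.
# With no argument it checks the running system.
check_boot_script() {
    root="${1:-}"
    script="$root/etc/init.d/boot"
    if [ -f "$script" ]; then
        echo "ok: $script"
    else
        echo "MISSING: $script" >&2
        return 1
    fi
}

# Usage:
#   check_boot_script              # the running system, before a reboot
#   check_boot_script /mnt/suse    # a system mounted from a live cd
```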

So now the question is what distribution should go on our servers (a distribution that neuters itself during an upgrade cannot stay). I am pretty fond of Ubuntu server but Sean at the office has pointed me at Arch and I am really, really digging the way they do things. I hope to make a separate post about that at the company blog in the near future.

Zimbra Migration Postmortem

I posted a short while back about excitement surrounding a migration from Exchange 2003 to Zimbra for our company. The migration has had its ups and downs and now that it has happened and I have had a couple weeks to dig in as both a user and administrator I would like to share our experience.

The general takeaways are that Zimbra isn’t perfect. It does some things worse than Exchange and some things better but the balance, in my opinion, slants heavily in Zimbra’s favor. I’ll break it up into migration and then administration/usage.

The Migration

The migration was a bruiser. It involved a couple nights of failed attempts and then a brutal 6pm - 4am effort to get everything finished well enough to go to sleep. I had a sysadmin helping me that knew his stuff so the details of how to complete it aren’t here (he handled most of the work), just the headaches I saw. The issues included:

  • The bulk migration tool was not able to migrate calendars.
  • The individual .pst importing tool also was not able to migrate the calendars. It would just fail like crazy and then give up because the error count was too high. For users with 2k+ appointments the migration would fail after only a few dozen events. I eventually got these calendars over by doing .pst exports/imports with Outlook itself rather than trying to use server-side migration tools.
  • We had to run the bulk migration over 2 nights because it took a long time. This isn’t a huge surprise because we had 100’s of 1000’s of emails, events, and contacts to migrate but the issue is that the second run re-imported everything imported in the first batch despite settings to the contrary. This essentially created duplicates of all emails and contacts.

To remove the duplicates of emails I used a perl script found at this page (this script actually worked fantastic). For contacts I used the Zimbra CLI to bulk clear the applicable address books and used client apps to re-import cleanly.

Administration/Usage

Zimbra started to shine after the migration ordeal. We immediately had all of our OSX users sync’ing their iCal, Apple Mail, and Address Book apps with the server, I had most of the Outlook users on the Zimbra Outlook connector without much effort, and most things worked well. There were a few issues I encountered.

  • The Outlook connector worked flawlessly in XP Pro but was very difficult to install in Vista. You need to follow the tip here and then just keep trying until it works. If it doesn’t work remove the program and try again. I really hate Vista and the fact that it makes things so hard.
  • The activesync with Windows Mobile is pretty flaky. It fails often for no apparent reason. I settled on using IMAP for email and just sync’ing my contacts and calendar and this seems to work consistently. It was as if it was stumbling over the greater volume of items to sync when the email was part of it.
  • I’m not real happy with the calendar sharing. Without admin intervention a user must share their calendar with each individual user and each of those individuals must login to the web interface to accept the share and see it. These notifications cannot be accepted in Mail/Outlook/Entourage or whatever else. Once these calendars are accepted though you can use almost any app you want as your calendar and that is nice.
  • There are connector apps for almost everything, but many of them are not updated to the latest versions of their target apps and none of them are completely polished and perfect. The Outlook and OSX ones seem to be the best but those also are not without issues.

In general though Zimbra works pretty well. I have calendar and contacts sync’d with my laptop using the OSX sync services and also sync’d to my Windows Mobile phone using activesync - a setup that never would have been possible with Exchange (without Entourage, but Entourage sucks in my opinion).

There are shortcomings but as I have worked through various user issues I have discovered what I believe is Zimbra’s biggest strength - its openness and open source underpinnings. It is a huge, powerful piece of code and between the CLI and the REST API you can do almost anything as an admin. Now that I am getting the hang of it I have created a set of quick scripts to interact with the CLI for doing things like auto-mounting calendars shared with distribution groups (getting around the email acceptance bummer mentioned above). The REST API is great and documented a bit here. It is completely trivial to export people’s contacts or calendars and to constrain what is exported using different parameters using the REST API.
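
As a rough illustration of the REST exports, the sketch below only builds the URL. It assumes the /home/<account>/<folder>?fmt=<format> URL shape; the function name, host, and account are made up, and you should check the REST documentation for the formats and extra parameters your Zimbra version accepts.

```shell
# Hypothetical helper: build a Zimbra REST export URL.
#   $1 = server host, $2 = account, $3 = folder (calendar, contacts, ...), $4 = format
zimbra_export_url() {
    printf 'https://%s/home/%s/%s?fmt=%s\n' "$1" "$2" "$3" "$4"
}

zimbra_export_url mail.example.com jane calendar ics
zimbra_export_url mail.example.com jane contacts csv
```

The resulting URL can then be fetched with any HTTP client that can authenticate, for instance curl with -u and -o to save the export to a file.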

Another big advantage in Zimbra’s favor is the community is quite strong and helpful. They have a wiki, forums, and bugzilla all very active and open.

So this is a bit of a ramble, but overall I am exceptionally happy that we made this switch. Zimbra is not perfect but it is powerful and utterly open making it possible to find workarounds for almost anything and it helps that it runs on Linux as well.

Zimbra Anticipation and Exchange Hatred

I have mentioned it in passing before, but almost every server associated with our company is running Linux (and Mac has managed to take over the workstations surprisingly fast - only 2 windows machines being used now). The last hold out on the server side was the Exchange server we setup when we first got an office that for obvious reasons had to be running Windows Server 2003. This was the same server I had the raid fun with.

Finally, after a year+ now of me hating Exchange solo, the requests and general feeling of the office have shifted against it across development AND sales and the migration to Zimbra is scheduled to be completed next week. I couldn’t be happier about it. Among many things I am most looking forward to administering a Linux machine, having a better web client (that doesn’t change by browser), and Apple iSync support. I am also looking forward to the Blackberry support which despite my unwanted but nontrivial experience with Windows servers I absolutely could not make work with Exchange.

To properly send Exchange on its way I thought I would enumerate some of the many reasons I hate it :)

  • It runs on Windows. Windows is decent for a workstation but makes for an awful server in my opinion. Perhaps it comes down to experience, but I feel that the Windows approach to server administration (meaning hundreds of obscure windows, tabs, and buttons) requires more effort to learn, involves completely unnecessary abstractions over known technology, and makes everything you need to do take longer. They are unstable, require reboots to update (wtf?), and you have to use remote desktop to administer them. Enough about Windows as a server in general, back to Exchange.
  • There is no reasonable method for setting up a catch all. Read this page if you need to do it and prepare to be disgusted.
  • There is no reasonable method for forwarding email. How did they mess this one up so badly? It seems that the ability to setup an email forward would be a core feature of server software designed to send and receive email. To do this you have to create a dummy contact with the forward email, then create an exchange user account (with a different name and username else there is a collision), then configure that exchange user account to forward its mail to the dummy contact record, which will then cause the email to be forwarded to the final destination. Completely ridiculous as it bloats the active directory listing with loads of dead entries and takes too many steps to setup.
  • I have had a lot of trouble with Exchange’s SMTP connectors where HTML emails headed towards external email accounts (via forwarding hack mentioned in previous bullet) back up in the queues for absolutely no reason and prevent messages from being delivered for hours sometimes.
  • It doesn’t support iSync as far as I know.
  • It doesn’t have spam filtering built in (it kind of does but it does an awful job in our experience).
  • The web client is pretty terrible, and if you try to access it with anything other than IE it takes a severe dive to awful. In the non-IE mode you can’t search, can’t create folders, can’t create rules, and it is genuinely unusable.

The only big strength Exchange offered, and the reason we used it to begin with, was the calendar synchronization and thankfully Zimbra has arrived to offer an alternative to the mess that is Exchange. Zimbra is now feature rich, stable, and validated by huge installations such as the one at Georgia Tech. The feedback and reviews are glowing and the documentation makes it clear that all the little things I hate about Exchange because they take too long or are too kludgy are quick command line or file editing steps.

I’ll post again once I have put some hours of usage in with some content that doesn’t mention Exchange once and instead talks about Zimbra. I think it is safe to say though that if you are starting a company just skip Exchange from the start. You can get hosted Zimbra just as you can hosted Exchange if you don’t want to manage your own server.

Startup Technology Expenses

One aspect of a software startup that cannot be escaped is that money must be spent on technology and development of technology. Whether this is a good or bad thing depends on whether you ask the engineer or the accountant. My general rules of thumb are:

  • Purchases that help people do their jobs better or faster are worth paying for.
  • Before spending money on something look for an open source alternative that is cheap or free. Often you will find something better or only slightly inferior to the commercial item.
  • If you are going to spend money on something, the price-to-substance ratio is important.

And now a smattering of thoughts and plugs for each rule of thumb in the context of our company that is full of my personal opinions. I do realize that the earliest days of a startup largely must ignore most of this list. For example, when you don’t have an office yet (and everybody works from their homes) you don’t really worry about getting comfortable chairs, good machines, etc. for that office.

Purchases that help people work

  • Screen real estate is important. I used to think this meant 2 screens but have refined this to mean total resolution. With my macbook pro and spaces I went from using 2 computers and 3 monitors to just 1 laptop and I feel more efficient now. I like to give 2 monitors to any person that wants one - especially engineers, designers, and QA.
  • Good chairs are worth paying for. I’ve worked places in the past that gave their engineers hand me down garage sale garbage to sit on. The nature of a software company means people are going to spend a lot of time sitting and the chairs need to be good enough that people don’t notice them all day (and often longer given the nature of startups). Aerons are great if you can get a deal on them but there are solid options in the $200 - $300 range. CWC sells better quality furniture at the best price.
  • Don’t skimp on workstation hardware. I personally think the mac path is worth the premium for developers. On a per-item basis the price is virtually equivalent but given Dell’s willingness to haggle and price slash (especially if buying multiple items) a premium does remain. I think it is worth it.

Open Source

  • We use Java and I think it is better than .NET and it is free. You can build it on Windows/Linux/Mac and you can deploy it to all 3 as well. I think PostgreSQL is better than SQL Server (and MySQL). The Microsoft lock in has never made any sense to me and I feel the Java community is a great place in that the number of unqualified engineers is relatively small and it is full of extremely qualified people. Java also scales vertically or horizontally very, very well. It has the whole 10,000 frameworks/libraries to choose from “problem” that .NET does not have but that is okay in my opinion. We went with Spring/Hibernate/DWR and it has worked out great.
  • PostgreSQL is fantastic. The developers are accessible and helpful and the community is strong. We’ve run it up to a 1TB database and it handles it just fine. You obviously have to run it on a reasonable machine as load increases but it scales vertically wonderfully and there are addons for replication. Check out Slony and/or Mammoth Replicator if you need that replication, we haven’t yet. Visit this site for installing Postgres on your local mac workstation.
  • Linux is the way to go for servers. I don’t think the Linux/Dell combo can be beaten on the server side.

Price-to-Substance Ratio - Some Examples

  • IntelliJ IDEA is worth its cost. It is magical and exceeds a plugin-ridden eclipse install for features out of the box and I think the editing experience and source control interaction are superior.
  • Despite stability issues I think the Leopard incremental upgrade to OSX was worth it for productivity overall. Spotlight and Spaces have changed my workflow completely.
  • Dell provides a fantastic ratio here. I would strongly recommend them for server hardware, especially their latest models. Solid architecture, solid raid controllers, RAM, etc. If you go with Dell get in sync with a Small Business team. It will save you money and streamline the process as you get to talk to the same people every time. Their business lines of laptop (Latitude) and desktops (Optiplex) are also solid.
  • Good consultants and contractors are worth their rates for focused, time-constrained assistance. You have to be careful though because there are a large number of unqualified people posing as consultants and contractors that aren’t worth the time it takes to arrange a contract. If you find somebody you can work with and does a good job keep using them as needed.
  • Parallels is worth its very manageable price for providing IE6/IE7 testing to mac-using developers. See this post for help setting up the free VMs provided by Microsoft for doing this testing.
  • FlexBuilder isn’t worth the cost. When I used it a long while back it was $700+ with charting and had marginally more functionality than notepad2. Following that link, it looks like they are pumping Flex 3 now. The fact that Flex 2 has profound issues makes this especially troublesome.
  • Flex Data Services pricing defies all reasoning. $20k per CPU. Same for pretty much any other product that charges per-CPU. If anyone knows of ANY per-CPU product that is worth paying for let me know. I recently priced out a better WYSIWYG editor for portions of our product and they wanted pricing per CPU for a text editor.
  • And finally, I think sharp, qualified engineers that you can interact with in person in the US are superior to any offshore team. When you consider the time differences, communication barriers, and general lack of quality offshore I believe a 5 man team of people that know what they are doing and work together here could out perform a 50 man team of offshore cube farm drones. I have 3 specific experiences (admittedly not that many) working with offshore teams. 2 ended in utter failure to complete the task, and 1 was bailed out of before it got too far along because even the onshore PM/BA assigned were completely clueless and ineffective. I feel like the offshoring development companies live in an alternative universe where you just keep a neutral look on your face through meetings and shuffle out inferior product making fixes until the customer is too frustrated, tired, or so accustomed to the low quality that they start to believe the software is good and consider the project a “success.”

So there you have a smattering of my thoughts. I expect to elaborate on many of these items in separate posts in the future. You can likely tell by the tones which items I find most interesting and/or alarming.

Big ext3 partitions in openSUSE 10.2

I realize this is a pretty niche topic but I spent several hours today trying to figure out how to create a couple 3.4TB ext3 partitions in openSUSE servers and wanted to share what worked.

The biggest tip is don’t try to use the openSUSE installer to partition and/or format the big partition. In my case it screwed things up no matter how I attempted to tweak settings. One sequence that does work is this:

  • Install openSUSE, but don’t touch the big disk (I’ll call it /dev/sdb for this post).
  • Once installed, login as root and run parted /dev/sdb. This is a great tool that I only discovered today. This page provides a good overview plus documentation and the help system in the tool pretty much tells you anything you need to know.
  • From the parted prompt type mklabel gpt. A GPT label is needed here because the traditional msdos label tops out at 2TB per partition (with 512-byte sectors).
  • Type mkpart primary start end. This creates a primary partition beginning at ‘start’ and ending at ‘end’. These can be fixed MB amounts or percentages. In my case this was mkpart primary 0 100%. This creates just the partition, it does not set up a file system. In this example the new partition would be /dev/sdb1.
  • Type quit. The parted tool has great commands for making file systems as well but they don’t support ext3.
  • Now back at a regular prompt type mkfs.ext3 -b 4096 /dev/sdb1, let it crank for awhile, and you’ll have your big ext3 partition. The -b argument is specifying the block size. ext3 maximum size is determined by block size; there is more information on the wikipedia page.
  • Mount the file system to wherever you want and add the relevant entry to /etc/fstab.
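
The sequence above can be sketched as a script. A few assumptions to flag: it uses parted’s non-interactive -s mode rather than the interactive prompt, it starts the partition at 0% instead of 0 (to sidestep alignment prompts in script mode), /data is a made-up mount point, and the run helper is hypothetical. By default it only prints the commands. Setting RUN=1 executes them as root and wipes the partition table on DISK, so triple check the device name first.

```shell
# Sketch only: DISK and MOUNTPOINT are placeholders for your own values.
DISK="${DISK:-/dev/sdb}"
MOUNTPOINT="${MOUNTPOINT:-/data}"    # hypothetical mount point

# Hypothetical helper: print each command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

big_ext3_setup() {
    run parted -s "$DISK" mklabel gpt               # GPT label for a >2TB disk
    run parted -s "$DISK" mkpart primary 0% 100%    # one partition spanning the disk
    run mkfs.ext3 -b 4096 "${DISK}1"                # ext3 with 4096-byte blocks
    run mkdir -p "$MOUNTPOINT"
    run mount "${DISK}1" "$MOUNTPOINT"
}

big_ext3_setup
```

The /etc/fstab entry is left as a manual step, since a wrong line there can keep a machine from booting cleanly.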

Hopefully this will save somebody else a couple hours - the key in my case was to not let the openSUSE installer play any role in setting up the partition.

openSUSE 10.2 autoyast

I’ve become a pretty huge fan of openSUSE. The installer is excellent, it just works really well and I really like having the option of yast to manage most aspects of the system, even when working from a command line. In comparison to RHEL it has more file system options, newer/more rpms in the official repositories, and in my opinion yast is superior to RHEL’s up2date. If support is an issue you can get Suse Enterprise Linux preinstalled by hardware vendors (including Dell) as well as enterprise support from Novell.

Though the openSUSE installer is pretty solid, manually booting, configuring, installing, and updating an OS can get old really fast, especially if you are installing on machines meant to have the same or similar roles. As part of the effort to improve our ability to manage more machines at work I decided to explore two tools to make life easier:

  • Setting up our own installation server
  • Using autoyast to automate 95% of the install for new machines

I generally followed the guidance offered at this novell.com page but want to walk through the specific process I went through as well as some specific gotchas and details in the hopes of helping out anybody else trying to do the same with 64bit openSUSE 10.2 on servers. By “on servers” I mean “no x windows”.

Setting up the Installation Server

If you have an existing openSUSE box, setting up the installation server is pretty easy. Here are the steps involved in setting the server up and linking it to the official Novell yast repositories so your new installations get updated packages.

  • Run yast and go to Software -> Software Management
  • Search for and install yast2-instserver
  • Exit and restart yast and go to Miscellaneous -> Installation Server
  • From here you will be walked through the process of copying the files from your installation media to the HD and exposing the sources with FTP, HTTP, or NFS
  • For this particular example I went with FTP. openSUSE installed and attempted to configure vsftpd
  • I had to manually run /sbin/service vsftpd start to make it work.
  • By default vsftpd was configured to allow only anonymous access with read-only permissions, and /srv/ftp was set as the root of what anonymous can see on the disk, so the config was perfect by default.
  • The full path to the 64bit installation source CD contents was /srv/ftp/sources/suse-10.2-64bit/. It is a good idea to give the source directory a specific name as that allows you to add alternate sources (like 32bit) to the same installation server in the future.
  • Go to /srv/ftp/sources/suse-10.2-64bit/CD1 and create a new file named add_on_products.
  • Edit this new file and enter any number of source repositories that you want to be included in new installs - 1 on each line. In my case it looked like this:
    http://download.opensuse.org/distribution/10.2/repo/oss
    http://download.opensuse.org/distribution/10.2/repo/non-oss
    http://download.suse.com/update/10.2
  • Sources entered here will also automatically be registered as installation sources for the new machines. If you aren’t using 10.2 your source repositories will be different. Check this page for all of them.
  • That wraps up the installation server. Assuming the vsftpd service started up you are good to go.

At this point, you can set up new openSUSE machines by installing against this server. You would need to boot the machine with some sort of openSUSE installation media (the DVD, CD1, a properly set up usb key, or the minimal install CD) to get to the installation menu. From there hit F4, enter your FTP installation server and the /sources/suse-10.2-64bit/CD1 directory, press enter, and then continue with the installation. Having the installation server is really nice because you can control and manage a single, consistent set of rpms.

Setting up autoyast

Just having a central installation server is great but with autoyast you can almost completely automate installation of new openSUSE servers. This works by creating an autoyast control file at which you point new installations. The control file can include instructions for disk partitioning, installed software, services, custom config files, and directions to run extra scripts at various stages of the installation. The link at the top of this post provides a pretty good overview and the documentation here is very helpful as well. That documentation provides almost all of the information you need so where details are excluded from the following look there.

In my specific case (an autoyast file for JBoss servers) the process went like this:

  • Uploaded the latest versions of JBoss and Java (yast didn’t have 1.5), init.d scripts for JBoss, as well as our custom /etc/profile.d/environment.sh file to the installation server under a different directory accessible through FTP.
  • Wrote a script meant to run after new installs to download and configure the above. Really just a bunch of wgets, copying, linking, chmod/chown changes. This was going to be downloaded and run in the init-scripts stage of the autoyast install.
  • Setup a fresh install of openSUSE exactly as I wanted it for a JBoss server and ran yast2 autoyast from the command line.
  • Selected Tools -> Create Reference Profile
  • Selected the areas I cared about including. Note that selections here are in addition to a default set of information that includes partitioning and installed packages. In my case Firewall, Online Update Config (I enabled this on the reference server), Local Security, and User Management made sense.
  • Next was to add a custom sshd_config file. With the reference profile loaded, went to Miscellaneous -> Complete Configuration Files and then alt-E for configure.
  • Alt-w for new, entered /etc/ssh/sshd_config as the file path on the new installs, and then loaded my existing sshd_config file as the contents.
  • Lastly, I wanted to run the script I mentioned above as an init-script. These are scripts which run after installation is complete and networking is functional on a new server. init-scripts cannot be configured through the autoyast tool so I did File -> Save As and generated my baseline autoyast file.
  • If you see warnings about the format of the generated XML file (the autoyast control file) ignore them. The SUSE team has issues with their schema files.
  • Finally, I edited the autoyast file and added my init-script to the end. It looked like this:

    <scripts>
      <init-scripts config:type="list">
        <script>
          <location>ftp://myserver/myscript.sh</location>
          <interpreter>shell</interpreter>
        </script>
      </init-scripts>
    </scripts>

  • Then I just uploaded this file to the same FTP server so it was accessible during new installs.
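For a rough idea of what the post-install script from the second step could look like, here is a minimal sketch. It is not my actual script: the jboss-setup directory, the archive and file names, the /opt paths, and the jboss user are all hypothetical placeholders. Only the myserver host and the /etc/profile.d/environment.sh target come from the steps above. It defaults to a dry run that just prints the commands.

```shell
#!/bin/sh
# Sketch of a post-install script: download, copy, link, chmod/chown.
# ASSUMPTIONS: the directory, file names, /opt paths and jboss user are hypothetical.
BASE_URL="${BASE_URL:-ftp://myserver/jboss-setup}"   # hypothetical FTP directory
DRY_RUN="${DRY_RUN:-1}"                              # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_jboss() {
    # Fetch the resources that were uploaded to the installation server.
    run wget -q -P /tmp "$BASE_URL/jboss.tar.gz"
    run wget -q -P /tmp "$BASE_URL/jboss-init.sh"
    run wget -q -P /tmp "$BASE_URL/environment.sh"
    # Unpack and link JBoss into place (hypothetical directory names).
    run tar -xzf /tmp/jboss.tar.gz -C /opt
    run ln -s /opt/jboss-dist /opt/jboss
    # Install the init.d script and the shared environment file.
    run cp /tmp/jboss-init.sh /etc/init.d/jboss
    run chmod 755 /etc/init.d/jboss
    run cp /tmp/environment.sh /etc/profile.d/environment.sh
    # Hand the installation over to a dedicated (hypothetical) user.
    run chown -R jboss:jboss /opt/jboss-dist
}
```

Calling setup_jboss with DRY_RUN left at 1 lists what would happen; set DRY_RUN=0 first to actually run the commands on a new install.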

Though the number of steps I just listed seems long, these autoyast files are really very quick to make. You could create any number of them for different machine roles and make them all available for new installs.

Setting up a New Server

Now that you have an installation server (FTP-based in this specific case) and all the autoyast files and other resources a new machine could need, you can set up a new machine from scratch by doing the following:

  • Boot from the openSUSE DVD, CD1, or minimal installation CD. With some more work you can set up a bootable USB key or use the PXE boot capability of newer machines to boot from a network resource.
  • Once you see the installation menu, hit F4, enter your FTP installation server and the /sources/suse-10.2-64bit/CD1 directory, and press enter.
  • Move the cursor over the Installation option and type autoyast=ftp://[installserver]/[autoyast-file]. What you type appears in the command line options along the bottom of the screen.
  • Press enter and walk away from the machine for a while so the installation can complete.

Now, when I set this up, GRUB wouldn’t boot the newly installed machine. It turned out that the kernel version I was running on the reference server (and from which I generated the initial autoyast file) was different from the kernel provided by the installation server. This meant in my autoyast file the GRUB configuration portion was trying to reference a file (vmlinuz-2.6.18.2-34-default) that didn’t exist. So make sure your installation server is tied to the official repositories and make sure your reference machine is fully up to date before creating the baseline autoyast file.
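One way to catch this kind of mismatch before an install is to list the kernel image names the autoyast file refers to and compare them by eye against what the installation server provides. The helper below is only a sketch: the function name is made up, and it assumes nothing about the XML structure beyond the vmlinuz-<version> file name appearing somewhere in the text.

```shell
#!/bin/sh
# Print each distinct kernel image name (vmlinuz-<version>) mentioned in a file.
# ASSUMPTION: kernels_in_profile is a hypothetical helper name, not an autoyast tool.
kernels_in_profile() {
    # -o prints only the matching part of each line; sort -u removes duplicates.
    grep -o 'vmlinuz-[A-Za-z0-9._-]*' "$1" | sort -u
}
```

Run it as kernels_in_profile followed by the path to your autoyast file. If the version it prints is not the kernel your installation server ships, update the reference machine and regenerate the file.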

I used this same approach to create configurations for JBoss, e-mail, and basic openSUSE-based servers.


Zenoss Core

I recently installed Zenoss Core (available at zenoss.com) at the office for the purpose of monitoring the handful of machines we have there. My goal with this post was to be exhaustive but I decided instead to give a quick summary and some tips as the documentation is quite good already. A community site with forums, a blog, and a wiki can be found at community.zenoss.com and it is pretty lively. I’ve asked 2 questions thus far and received prompt and helpful responses and discussion in return.

In the past my tools for monitoring machines have consisted of Nagios and patchworks of shell scripts. Shell scripts aren’t particularly manageable for a large number of machines and I found Nagios to be too difficult to configure and learn. Nagios is a really powerful tool used by a lot of people though so don’t discount it based solely on my unqualified opinion. To learn more about Nagios go here.

That said, I was trying to find a new approach and slashdot recently ran a post about Zenoss. After giving it a shot and finally getting it installed I have to say I really love this thing for monitoring network devices and servers. It is fairly easy to install and learn, has a great interface and charting out of the box, is incredibly configurable and customizable, has an active and helpful community, great documentation, and if needed can run Nagios plugins.

Installing on OpenSuse 10.2 64bit

As of a few days ago 64bit rpms are not available for installing Zenoss so you will need to go with the source tarball found on the download page here. Once you have the source where you want it on the machine that will do your monitoring you can use the instructions here to perform the installation. There really isn’t anything missing from their instructions, but I didn’t follow everything to the letter, which made things take longer. If you have trouble with the install be sure to check the zenbuild.log file. Here are the issues I faced.

  • If during the install you get messages along the lines of wrong ELF class: ELFCLASS32 then you are probably trying to use the 32bit rpm to install Zenoss on a 64bit OS. Go grab the source tarball instead.
  • You need MySQL 5.0.22 or greater. This post doesn’t cover the rpm installation in depth, but if you are able to use the rpm and it complains about this dependency even when you are 100% certain a qualifying MySQL version is installed (as happened for me on a 32bit OpenSuse box), just force it with --nodeps and you should be fine.
  • You really do need Python 2.3.6 or 2.4. If you use Yast with OpenSuse you will not be able to use the Python it provides (it’s 2.5). If this is the case go to the 2.3.6 download page on python.org, grab the tarball and do a ./configure, make, make install. You should now be set. Make sure running which python points you to the 2.3.6 version. You should not need to uninstall Python 2.5 from your machine. When I tried to do this Yast complained about all sorts of dependencies.
  • If during the install you get messages related to mkzopeinstance.py then you have Python 2.5 installed and not 2.3.6 or 2.4. You need to go grab and install the correct version.
  • If during the install you get messages containing undefined symbol: Py_InitModule4 it probably means the Zenoss installer is seeing and running a Python 2.5 interpreter and trying to load Python 2.3.6 or 2.4 modules. Make sure any symlinks or references to Python 2.5 are converted.
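Since most of the issues above come down to which Python the installer picks up, a quick check before starting can save some time. This is a small sketch with made-up function names. It treats exactly 2.3.6 and any 2.4.x as acceptable, which is my reading of the requirement above rather than anything official from Zenoss.

```shell
#!/bin/sh
# Succeed if the given version string is 2.3.6 or in the 2.4 series.
# ASSUMPTION: both function names are hypothetical helpers.
python_ok_for_zenoss() {
    case "$1" in
        2.3.6|2.4|2.4.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the version of whichever python is first on the PATH.
# Python 2 writes "Python X.Y.Z" to stderr, hence the 2>&1.
current_python_version() {
    python -V 2>&1 | awk '{print $2}'
}
```

For example, python_ok_for_zenoss "$(current_python_version)" || echo "wrong python on PATH" will complain if the first python on your PATH is the 2.5 that Yast installed.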

Other Tips

  • If you get stuck during the install feel free to post here or better yet ask in the community forums for Zenoss.
  • Once everything is up and running you will want to work through the quick start guide on the Zenoss documentation page - obviously skipping the vmware related stuff if you’ve just installed your own copy.
  • If when adding hosts to monitor Zenoss gives you a no snmp found for ip = x.x.x.x message, you need to make sure an snmp agent is installed and accessible on the host. I found these instructions to be quite helpful to make it work quickly. Skip down to step 19 and remember those instructions are for Ubuntu so you’ll probably be using yast on OpenSuse instead of apt-get.
  • One thing that threw me briefly with the snmpconf tool was that the snmpd.conf file it generates is written to whatever directory you ran the tool from. Make sure you cd to /etc/snmp/ before running the tool or copy the generated .conf file to its correct place.
  • After changing your snmp configuration you will want to do a /sbin/service snmpd restart.
  • To monitor Windows machines through snmp you will need to enable it as described here.
  • The only other tip I can offer is to think about the monitoring as being done from inside of the device or machine. Through snmp Zenoss will be able to check for actual processes, disk usage, and machine information. My experience with monitoring applications like this was so limited that I had the mindset of wanting to monitor everything through IP ports. This isn’t the way to think about it - though Zenoss allows this as well. By setting up some processes and alerting rules you can have some pretty effective monitoring in place very quickly and you will have more options doing it this way.

    No matter how small your office or how limited your production environments your customers (whether internal or external) should never be able to surprise you with news of down time. Zenoss is a smart, effective tool that can be used to monitor 1 or tons of devices and I have been very pleased with it thus far.
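If you hit the no snmp found message mentioned in the tips above, it can help to confirm from the monitoring machine that the agent answers at all before digging into Zenoss. A minimal sketch, assuming the net-snmp command line tools are installed and the agent speaks SNMP v2c. The function name and the default community string of public are my assumptions, so use whatever you put in snmpd.conf.

```shell
#!/bin/sh
# Usage: snmp_reachable <host> [community]
# Succeeds if the host answers an SNMP v2c walk of the system subtree.
# ASSUMPTIONS: hypothetical helper name; "public" is only a placeholder default.
snmp_reachable() {
    host="$1"
    community="${2:-public}"
    snmpwalk -v 2c -c "$community" "$host" system >/dev/null 2>&1
}
```

Something like snmp_reachable myhost mycommunity && echo "agent is answering" tells you quickly whether the problem is on the host or in Zenoss.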
