Fun With PuTTY

Like many people in IT, I find myself working with several types of operating systems on a daily basis.  For example, I need to have a Windows workstation, but I spend nearly all of my time working on Linux servers.  Given this setup, I wanted to find the most convenient and effective way to access and automate those servers from my Windows system.  PuTTY to the rescue!

What is PuTTY?

PuTTY is a free terminal/SSH client for the Windows platform.  Along with the main client, several utilities are available which emulate various GNU/Unix programs:

  • PuTTYtel – Telnet client
  • PSCP – Secure file copy
  • PSFTP – Secure file browsing and copying
  • Plink – Command line version of PuTTY
  • Pageant – SSH authentication agent
  • PuTTYgen – SSH key management

PuTTY Connection Manager

PuTTY Connection Manager (or PuTTY CM for short) is a wrapper built to provide tabbed PuTTY windows, as well as connection databases and login macros.  It is difficult to describe how useful this program is for people who work with a large number of servers on a daily basis.  All I can say is to go try it out and see for yourself.  The connection database is stored with a DAT extension, but is actually an XML file and can easily be modified with a text editor.  For security, you also have the option of encrypting the database.

Setting Up PuTTY on Your Machine

First, download the PuTTY utilities that you would like from http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html.  You may download the utilities individually or download the ZIP file containing all the utilities.  Move or extract the files into your C:\Windows directory.  This will place all the programs in your system’s path, so that you can easily reference them on the command line.

The PuTTY CM tool may be difficult to find online, but it’s out there.  When you get the executable, run it first from one of your user directories (e.g. My Documents).  Then move the executable and libraries to C:\Windows.  You can also just leave it in the user directory, as it is not used from the command line and doesn’t actually need to be in your system’s path.

Using PuTTYgen and Pageant

Public key authentication is a very secure and convenient way to handle SSH connections, especially when scripting.  I will not get into much detail with public key authentication here, but there are many great resources online that can explain it in more detail for beginners.  My purpose with this section is to explain that PuTTY allows for simple generation and deployment of both SSH.com and OpenSSH keypairs.  Most Linux systems will use OpenSSH keys, while many other Unix-like systems use SSH.com.  If you do not already have one, use the PuTTYgen tool to create a new keypair:

  1. Open the PuTTYgen utility
  2. Set the number of bits in a generated key to 2048
  3. Click “Generate”
  4. Move your cursor around the blank area until enough entropy is generated to create the private key
  5. Enter an encryption passphrase for the newly generated key (optional, but highly recommended)
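  6. Click “Save private key” to store the key in PuTTY’s own .ppk format, which is what PuTTY and Pageant load.  If you also need the key for OpenSSH tools, use the “Conversions” menu to export it as an OpenSSH key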

The public key displayed in the window should go, as a single line, into the .ssh/authorized_keys file in your home directory on every system for which you would like to use public key authentication.
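If you have shell access to the server, the copy step could look something like the sketch below.  The function name install_pubkey and its two arguments are made up for illustration, and the sketch assumes a Linux server with bash and the usual OpenSSH layout (a .ssh directory with mode 700 and an authorized_keys file with mode 600).

```shell
#!/bin/bash
# install_pubkey KEY_LINE [HOME_DIR]
# Hypothetical helper: appends one OpenSSH-format public key line to
# authorized_keys and tightens the permissions that sshd usually insists on.
install_pubkey() {
    local key="$1"
    local home_dir="${2:-$HOME}"
    local ssh_dir="$home_dir/.ssh"
    local auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # Only append when this exact line is not already present.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

You would paste the single line that PuTTYgen shows for OpenSSH use as the first argument, for example install_pubkey "ssh-rsa AAAA... you@workstation".  Running it twice with the same key leaves only one copy in the file.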

At this point, you have two choices on how you can use public key authentication.  You can either load the key with each new PuTTY session, or you can use Pageant to store the key in memory and use it for all future PuTTY sessions.  With the first option, you have to decrypt the private key each time you log into a system.  With Pageant, you only have to decrypt the private key once, to load it into memory.

The most convenient way to use Pageant is to have it automatically start and load the private key when you log in to your system.  To do this, open the “Startup” directory in your start menu, and create a link with the target:

C:\Windows\pageant.exe C:\<path to your key>\yourkey.ppk

Each time you log in to Windows, you should be prompted to enter the passphrase for your key, after which it will be loaded into memory for all the other PuTTY tools to use.  If you log out or shut down your computer, the key will remain safely encrypted on your hard drive.

Scripting with PuTTY

PuTTY scripting can easily be done with Windows PowerShell.  To start using PowerShell for scripting, you must update your execution policy to allow PowerShell scripts to be executed.  To do this, open up a PowerShell session as an administrator, and type in “Set-ExecutionPolicy Unrestricted”.  PowerShell will ask for a confirmation, after which you will be able to execute scripts.

Once you are able to execute scripts, get to work!  I generally like to do as much logic as possible in PowerShell (taking advantage of objects and .NET libraries), then execute as few commands as I have to on the remote Linux servers.  If you prefer or need to run complex shell statements and scripts on the Linux machine, you can use HERE documents or even just upload and run the script on the server.  It’s really up to you.  Here are some things I have done:

  • GUI based JBoss deployments (backup/deploy multiple WAR files between environments or your workstation with just a few clicks)
  • JBoss/Tomcat container restarts… great for middle-of-the-night issues
  • Specialized content deployment
  • Retrieving and extracting data out of XML files

Hopefully this has helped generate some ideas on how you can automate some of your more repetitive tasks and free yourself up to tackle bigger things.  The important thing to remember is that running a Windows workstation does not mean you can’t easily work with Unix/Linux systems!

Stopping the Apache Killer

I recently became aware of a serious issue in Apache that has started to gain some notoriety over the last few days.  This vulnerability allows people to commit a very simple and effective DoS attack against most Apache servers out there.  Just to clarify how effective this attack is, I was able to take down a decently powered web server in just a few seconds with an attack that could easily be carried out on a cheap laptop.  To clarify a little further, when I say “take down”, I’m not talking just about Apache.  The attack causes Apache to take up all of the physical memory and start going right into the swap space.  At this point, the whole server will become unresponsive and may even need to be rebooted.

The vulnerability stems from the way that Apache handles Range headers.  These headers allow the client to request specific parts of the document, rather than the whole file.  This type of request is often used with download managers, media streaming, and PDF readers.  Unfortunately, Apache is not very picky about how it responds to requests for multiple ranges, and this can be exploited by requests that have many sets of overlapping ranges.  Apache will dutifully respond as best it can, consuming a lot of processing power and memory in the process.  It doesn’t take many of these abusive requests to take up all of the available resources on the server, which is why this attack can be so devastatingly easy to carry out.  This problem has been in Apache for a long time and was even discovered several years ago.  However, last week, someone decided that it was time to publicize it a little more.  The result was a Perl script, appropriately named “Apache Killer”.  This script basically sends these abusive Range headers in multiple concurrent requests.  Running the script with 50 threads is suggested, which, as I mentioned before, is more than sufficient to take down even the most powerful servers in a matter of seconds.  I have attached the text of this script for those who are interested in using it for testing their systems.

So now that I have pointed out how serious this issue is, the question is what can be done about it?  Fortunately, the ASF released a new version of Apache, which fixes this issue by essentially ignoring the Range header if the total range exceeds the size of the document requested.  This is the ideal fix, but it will probably involve downloading and compiling the server manually, as it will probably take a while for it to show up in repos.  A simpler, but more ham-handed, way of fixing the issue is configuring Apache to not accept Range headers at all.  This will cause issues with download managers and other special clients, but it will certainly stop any chance of this attack happening.  Personally, I have found that the best solution is something of a compromise between the two I already mentioned.  Basically, this solution involves adding the following lines to your main Apache configuration:

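          # Drop the Range header when more than five ranges are requested
          SetEnvIf Range (?:,.*?){5,5} bad-range=1
          RequestHeader unset Range env=bad-range
          # Always drop the legacy Request-Range header
          RequestHeader unset Request-Range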

This configuration will unset the Range header if more than five ranges are sent, which is unlikely to happen in a legitimate request.  However, it is important to remember that it is possible for a legitimate request to have more than a few ranges, so you may need to adjust that number to fit your situation.  Note that SetEnvIf and RequestHeader come from mod_setenvif and mod_headers, so both modules need to be loaded.  This should also only be used until you are able to update Apache, which has provided a more elegant solution.
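To get a feel for which requests the pattern catches, it can help to count ranges by hand.  The regular expression needs five commas in the header value, which means six or more ranges.  The sketch below mirrors that rule in bash; the function name would_strip_range is made up for illustration and is not part of Apache.

```shell
#!/bin/bash
# would_strip_range HEADER_VALUE
# Hypothetical helper: succeeds (exit status 0) when the Range value holds
# five or more commas, i.e. six or more ranges, which is the condition that
# (?:,.*?){5,5} matches.
would_strip_range() {
    local value="$1"
    local commas="${value//[^,]/}"   # delete everything except the commas
    [ "${#commas}" -ge 5 ]
}
```

For example, would_strip_range "bytes=0-1,2-3,4-5,6-7,8-9" fails because that value only has five ranges, while adding a sixth range makes it succeed.  If you raise the number in the Apache configuration, the threshold in this check would need to change with it.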

Blocking Bad Crawlers with Apache mod_evasive

Back in March, I wrote an article on a great little Apache module for blocking automated scans and attacks on your system.  Since that article was written, the module has blocked and notified me of a couple of vulnerability scans being run against my web server.  Basically, it has been doing exactly what I expected of it.  However, what I didn’t expect was a very pleasant side effect that the module would bring.

A few weeks ago, I received an email from my web server that it had blacklisted an IP address.  I logged onto the machine and checked the logs to see when and where the attack was run.  As it turns out, the “attack” was actually coming from an aggressive crawler from a company called Brandwatch.  A quick search on Google revealed dozens of complaints from webmasters about the overly aggressive crawling rate of the Brandwatch crawler (identified in user-agent strings as ‘magpie’).  The 4 or 5 entries in the access log (before mod_evasive was activated) indicated a crawling rate of over 3 pages per second!

The good news is that mod_evasive blocks all high rate scans of the system, whether they are malicious or just plain annoying.  If you’re having a problem with aggressive crawlers, check out the above article on mod_evasive and give it a shot.  I didn’t have bad crawlers in mind when I first implemented it, but it seems to work very well for them.
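If you want to check your own access log for this kind of crawling rate, a rough sketch like the one below could do it.  It assumes the common or combined log format, where the first field is the client address and the fourth field is the bracketed timestamp.  The function name busy_clients and the default threshold of 3 are made up for illustration.

```shell
#!/bin/bash
# busy_clients LOGFILE [THRESHOLD]
# Hypothetical helper: prints "IP timestamp count" for every client that made
# more than THRESHOLD requests within a single second.
busy_clients() {
    local logfile="$1"
    local threshold="${2:-3}"
    awk -v max="$threshold" '
        {
            # $4 looks like "[01/Sep/2011:10:00:00"; strip the leading bracket
            hits[$1 " " substr($4, 2)]++
        }
        END {
            for (key in hits) {
                if (hits[key] > max) {
                    print key, hits[key]
                }
            }
        }
    ' "$logfile" | sort
}
```

Calling busy_clients /var/log/apache2/access_log 3 would list every address and second where more than three requests arrived; the log path here is only an example and varies by distribution.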

To install and use mod_evasive, just follow the instructions in my previous article.  The settings given in the article work well for me, but you can tweak them to fit your own needs.  Enjoy!

Diaspora on openSUSE

A possible milestone in the world of social networking was reached a few days ago, as the developers announced the pre-alpha release of the Diaspora project.  While there is ample instruction on their wiki for installing a seed on Ubuntu, there is no documentation on getting one running on SUSE Linux, especially in Apache.  However, after some tinkering around, I have successfully installed and set up my own Diaspora seed.  Here is what I did for openSUSE 11.3:

Warning

This application is pre-alpha quality (i.e. minimal functionality).  While the Diaspora team has released the source code and some basic installation instructions, there is absolutely no support right now.  In addition, the application has a lot of security holes, as this article points out.  On top of that, you will have to install a lot of software, which always carries the risk of messing up your system.

While I am happy to answer some basic questions about what I did, I am not a Ruby expert and will probably not be able to help on most specific problems.  If you do this, it is at your own risk!

General Dependencies

Diaspora uses Ruby on Rails for its platform.  If you don’t have it installed already, expect to spend a lot of time on installation.  The following lines should get you started, and the rest of the Ruby libraries will come near the end.

zypper in apache2-devel ruby rubygems gcc-c++ libxslt-devel ImageMagick git
gem install rake
gem install rack
gem install bundler

Apache

Diaspora is designed to run under mod_passenger, so you will probably need to install this module if you want to use Apache.  The best way to do this is by downloading and compiling the source code.  Here is what I did:

wget http://rubyforge.org/frs/download.php/71376/passenger-2.2.15.tar.gz
tar -xzvf passenger-2.2.15.tar.gz
./passenger-2.2.15/bin/passenger-install-apache2-module
mv passenger-2.2.15/ext/apache2/mod_passenger.so /usr/lib64/apache2/

You will want to replace the download URL and file names with the current latest version from http://www.modrails.com/install.html.  After you have compiled and moved the module into the Apache module directory, just add “passenger” to the modules section in /etc/sysconfig/apache2.  Once you have done that, set up your Apache configuration according to the online manual.

MongoDB

Diaspora uses MongoDB for its data storage.  I don’t know of any openSUSE repositories that carry this, but the binaries seem to work just fine.  Simply download and extract the Linux binary tar file from http://www.mongodb.org/downloads.  Once you have done this, you have two choices for the database location.  One is to create the default location, /data/db.  The other is to define your own database location and to add a parameter when starting up the server.  I opted for the second, so here is what my startup command looks like:

nohup ./mongod --dbpath /opt/db/ > /dev/null 2>&1 &

Diaspora

Finally, we get to Diaspora itself.  To install and set up the application, change to the directory where you want to install it and use the following commands:

git clone http://github.com/diaspora/diaspora.git
cd diaspora
bundle install && bundle install devise.git

You might want to get up and stretch, as that last line could take a while to run.

Running the Application

What you do here depends on whether you are planning on running Diaspora with the included development server, or with Apache.  If you are running it as a development server, executing ‘bundle exec thin start’ should be sufficient (you can test on port 3000).  If you want to shut it off, just replace the “start” with “stop” in the above command.

To run Diaspora in Apache, just follow the passenger module online manual, using Diaspora’s public directory in place of the examples.  Make sure that the entire application directory is owned by ‘wwwrun’.

While the application is definitely pre-alpha quality, I see a lot of promise.  The Diaspora team is dedicated, skilled, and well funded.  I wish them the best of luck in this project.

Lucene Search Coming to openSUSE Wiki

Most openSUSE users are aware that a new version of the English wiki was released back in July, with the other wikis soon to follow.  Among many other changes, the new wiki came with a laundry list of new features.  However, users have noticed that one important feature was still missing in the new wiki… a decent search engine.

I mentioned in a previous article that I am working on a replacement for the default search engine.  Finally, after overcoming some technical hurdles and getting some servers upgraded, the new search engine is ready to go live.  If all goes well, this should happen within the next day.  An article will be written on openSUSE news when the new search goes live, but I also wanted to write about some technical details of the Lucene search engine for those who might be interested.

  • Performance – Testing shows that the new search engine can process about 20 queries a second on the staging server, with the average search running in about 0.2 seconds.  It should be noted that end users will not see searches run this fast, as the wiki takes some time to process the results and display the page to the user.  Even so, this search can handle much more load, and do it much more quickly than the default search, which uses the MySQL search capabilities.
  • Suggestions and Fuzzy Searching – Rather than relying on an external dictionary (e.g. aspell), the search engine uses internal algorithms to build and use a suggestion index based on the wiki content.  There are a number of advantages to this approach:
    • Suggestions are relevant to the content of the wiki.  For example, my last name (Ehle) would be flagged as an error by any standard dictionary.  However, a search for my last name with this engine will not generate a suggestion because that word exists in the wiki.  Also, a search for “Ehl” will generate a suggestion for “Ehle” because it is most similar to the search term.  As a bonus, the spelling index is built along with the main index, so new words are automatically added.
    • Suggestions can be performed on phrases as well as words.  For example, a search for “novel linux” would produce a suggestion for “novell linux” even though “novel” and “linux” are both valid terms on their own.
    • Suggestions work for any language.  It doesn’t matter if a good dictionary is available for the language, or if the wiki uses multiple languages (languages.opensuse.org).
    • This approach allows for fuzzy searches, which are searches that automatically include results for similar terms.  The same index that is used for suggestions is leveraged for fuzzy searches, which would not be possible with a spelling dictionary.
  • Related Searches – Two articles are considered to be related if they are both referenced in at least one other article.  Thus, if Nvidia and ATI are both referenced in an article about video drivers, those two articles are considered to be related to each other and will show in the other article’s related search.  This feature will become more useful as articles are added to the wiki.
  • Stemming and Synonyms – Stemming lets the search engine use the stems of search terms (“run” in place of “running”) and is available for the more common languages (English, German, Spanish, etc.).  Synonym searching lets the search engine use synonyms of the search term (“operating system” in place of “OS”) and is only available for English.  Synonym searching is not yet enabled, but it likely will be after some additional testing.
  • Indexing – For practical purposes, only full indexing will be performed.  For now, the indexes will be built once a day, but this will probably be adjusted as time goes on.  A full index takes a little over a minute to build, but this will increase to between 5 and 10 minutes as the old wikis are migrated to the new wiki system.

That is about all I can think of for now.  Be sure to watch for the announcement, which will contain information for end users.

openSUSE 11.3 Impressions

While openSUSE is my preferred distribution for server installations, my desktop use of it has been somewhat more sporadic.  However, while reformatting my laptop from Mint to a Windows 7-Linux dual boot, I decided to give 11.3 a try.  Here is a short post about my experience and impressions:

Network Installation

I have tried this in the past and have generally gotten poor results.  However, I really like the concept of network installations, so I thought I would see if any improvements were made with 11.3.  Alas, my experience was similar to how it was with previous releases.  The first attempt failed entirely, so I tried it again.  The second network installation finished, but I could only boot in failsafe mode after it was done.  It seems that some packages get corrupted or don’t install at all.  With this kind of success rate, it still just isn’t robust enough for real world use.  On to a different installation method…

LiveUSB

Long story short, this was a fantastic experience.  I downloaded the LiveCD image, installed the imagewriter utility, and it all worked as advertised.  I was really impressed with how simple the process turned out to be.  As an added bonus, the installation went about 1 1/2 times faster than it would have with a LiveCD.

KDE Desktop

While I am personally more of a Gnome user, I am really starting to enjoy what the latest in KDE has to offer.  In addition, the openSUSE developers have done a great job of polishing the distribution’s presentation.  The GRUB menu, splash screen, login window, and desktop look unified and generally very good.  I made a couple of desktop tweaks to fit my preferences, and the customizing experience was much better than with previous KDE versions.  The only problem is that the power manager does not recognize changes in my AC adapter status.  This is a known and filed bug, so I hope this gets fixed sometime in the near future.

Overall Impression

Even in the short couple of years that I have been using openSUSE, I have seen it come a long way.  Network installation still has issues, it took a little while to find information on installing the Broadcom wireless driver, and the power manager could use a little work.  Other than those details, I would have to say that this release is really solid and provides a really clean user interface.  Keep up the good work, openSUSE!

Fun with the openSUSE Wiki

The last couple of weeks have kept me pretty busy with the openSUSE wikis, but the results have been well worth the effort.  Here’s a quick rundown of the improvements made since the 11.3 release:

  • Set up Google Custom Search as an alternative to the default MediaWiki search
  • Submitted a site map to Google
  • Set up a cron job to regenerate the site map daily, so new articles can get indexed more quickly
  • Fixed the issue with the new login form
  • Set up protection against spambots
  • Set up a 301 (Permanent) redirect from wiki.opensuse.org to en.opensuse.org
  • Created languages.opensuse.org

Here are a couple of items on the to-do list:

  • Set up Lucene as a drop-in replacement for the default search engine
  • Upgrade to MediaWiki 1.16
  • Get the other languages off the old wiki farm and onto the new

A big thanks to the wiki team for their requests, suggestions, and development efforts!  Their work has made the openSUSE wiki system one of the best out there.

Locking Down Install and Edit Files on WordPress

The ability to add/update themes, plugins, and your core installation right from the web interface is a great feature of WordPress.  However, when you are the webmaster for nearly a dozen corporate blogs (each with several admins), this feature can become something of a liability.  What do you do if you need to provide admin rights to other users, but you want to keep installation and upgrade abilities reserved for yourself?

Since I couldn’t find any plugins that could do exactly what I wanted, I developed a couple of shell scripts to accomplish my goal.  The first one, lock.sh, looks like this:

#!/bin/bash
# Remove all permissions from the WordPress update, install, and editor pages
echo "Locking Admin Files...";
cd /srv/ || exit 1;
find . -name update-core.php -exec chmod 000 {} \;
find . -name theme-editor.php -exec chmod 000 {} \;
find . -name theme-install.php -exec chmod 000 {} \;
find . -name plugin-editor.php -exec chmod 000 {} \;
find . -name plugin-install.php -exec chmod 000 {} \;
echo "Finished";

The second one, unlock.sh, is very similar:

#!/bin/bash
# Restore normal read permissions on the same pages
echo "Unlocking Admin Files...";
cd /srv/ || exit 1;
find . -name update-core.php -exec chmod 644 {} \;
find . -name theme-editor.php -exec chmod 644 {} \;
find . -name theme-install.php -exec chmod 644 {} \;
find . -name plugin-editor.php -exec chmod 644 {} \;
find . -name plugin-install.php -exec chmod 644 {} \;
echo "Finished";

When it comes time to upgrade, install, or edit, I just run unlock.sh.  This opens up the permissions on the files and allows me to use the web interface for what I need to do on any of the blogs.  When I finish, I just run lock.sh, which removes the permissions and makes it impossible for anyone to use the web interface for installing, upgrading, or editing on all of the blogs.  They are very simple scripts, but they have been very useful to me.
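Since the two scripts differ only in the permission value, they could also be folded into one function.  The following is just a sketch of that idea, not what I actually run: the name set_admin_files and its arguments are made up for illustration, and the root directory defaults to /srv to match the scripts above.

```shell
#!/bin/bash
# set_admin_files MODE [ROOT]
# Hypothetical helper: MODE is "lock" (chmod 000) or "unlock" (chmod 644).
# ROOT is the directory tree to search and defaults to /srv.
set_admin_files() {
    local mode="$1"
    local root="${2:-/srv}"
    local perms

    case "$mode" in
        lock)   perms=000 ;;
        unlock) perms=644 ;;
        *)      echo "usage: set_admin_files lock|unlock [root]" >&2; return 2 ;;
    esac

    # One pass over the tree instead of five; only regular files are touched.
    find "$root" -type f \( \
        -name update-core.php -o \
        -name theme-editor.php -o \
        -name theme-install.php -o \
        -name plugin-editor.php -o \
        -name plugin-install.php \
    \) -exec chmod "$perms" {} +
}
```

With this in place, set_admin_files unlock opens things up before an upgrade and set_admin_files lock closes them again.  Any other first argument prints a usage message and changes nothing.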

Recursive HTTP Download with wget

There have been a few times when I needed to download an entire directory from a website.  This isn’t a big task if it only involves a few files, but once you get into lots of files or multiple directories, it can become quite a pain.  Recently, I had a need to download well over 100 files from dozens of directories.  I was definitely going to have to find a program to do this for me.

Up to this point, I had been using wget, a handy little GNU utility, to get files from web and FTP servers.  I had recently learned how to recursively FTP files, and the thought occurred to me that it may be able to recursively get documents via HTTP as well.  Sure enough, after a little research, I found out that it could do it very easily.  It will look at each file that it gets, scan it for HTML links, and use them to download the other documents.  Since some of the files pointed to other sites, I needed to find a way to make sure the retrieval stayed inside the directory that I specified.  Sure enough, wget had an option for that too!  Here is how I used it to get everything under otherdirectory, with nothing outside of it:

wget -r -np 'http://user:password@www.example.com/directory/otherdirectory/'

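The -r is for recursive retrieval and the -np (no parent) keeps wget from climbing above the specified directory.  By default, a recursive retrieval also stays on the starting host, so links to other sites are not followed.  If the site is protected by basic, digest, or NTLM authentication, you can authenticate with user:password.  That’s it!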

There are tons of other options for HTTP and FTP document retrieval, including throttling, timestamps, link conversion, and mirroring.  There are too many to list here, so here is the link to the wget manual:

http://www.gnu.org/software/wget/manual/html_node/index.html
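As a small taste of those options, here is a sketch that wraps the recursive download with some throttling.  The wget flags are real (-nH skips the host-name directory, --wait pauses between requests, --limit-rate caps bandwidth), but the function name fetch_tree, the DRY_RUN variable, and the chosen values of 1 second and 200k are made up for illustration.

```shell
#!/bin/bash
# fetch_tree URL
# Hypothetical helper: builds a throttled recursive wget command for URL.
# With DRY_RUN=1 the command is printed instead of being executed.
fetch_tree() {
    local url="$1"
    local cmd=(wget -r -np -nH --wait=1 --limit-rate=200k "$url")

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

Running DRY_RUN=1 fetch_tree 'http://www.example.com/directory/otherdirectory/' shows the command that would be used, which is a handy way to double-check the options before hitting someone’s server.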

Apache and nginx

A couple of weeks ago, I had the interesting challenge of trying to keep a fairly large and busy WordPress site (www.divinecomedy.net) running on a Slicehost 256 slice.  I had actually moved the site from the old hosting provider to Slicehost and things were great for the first few days.  However, as Divine Comedy’s final show of the semester approached, I got a report that the site was down.

One of the first things that I noticed when checking out the server was that each Apache process was taking between 30 and 40 MB of memory (this WP installation was very large).  It was pretty obvious what was going on.  If any more than a few processes were spawned, the server would run out of memory and start swapping like crazy, rendering the site unusable.  What wasn’t obvious was how to get the problem fixed.  I did all kinds of tuning on Apache and MySQL in an attempt to reduce the overall memory usage.  This improved things, but the huge amount of memory used on each Apache process combined with the number of concurrent users was still proving to be too much.  That is when I decided to try a combination of Apache and nginx.
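The arithmetic behind that is simple enough to sketch.  Take the total memory, subtract whatever the rest of the system needs, and divide by the size of one Apache process.  The function name max_apache_procs is made up for illustration, and the 100 MB reserved for MySQL and the operating system in the example is an assumed figure, not a measurement from my server.

```shell
#!/bin/bash
# max_apache_procs TOTAL_MB RESERVED_MB PER_PROC_MB
# Hypothetical helper: integer estimate of how many Apache processes fit in
# memory before the server starts swapping.
max_apache_procs() {
    local total_mb="$1" reserved_mb="$2" per_proc_mb="$3"
    echo $(( (total_mb - reserved_mb) / per_proc_mb ))
}
```

On a 256 MB slice with an assumed 100 MB set aside and 35 MB per process, max_apache_procs 256 100 35 prints 4, which shows how little headroom there was.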

In case you aren’t familiar, nginx is a lightweight web server.  It’s built around the same processing model as lighttpd, but it has fewer bugs and an even easier configuration.  Like Lighty, its real strength lies in serving static content quickly and efficiently.  Since most of the site was static content, I figured that it would make sense to front the site with nginx and only pass dynamic and rewrite requests back to Apache.  Thus, the site could handle a lot of users without needing too many memory hogging Apache processes.  I implemented it two days before the show and the results were every bit as good as I hoped.  The site went through the weekend of the show, and every day afterwards, without a single issue.  Later, I tried running an Apache JMeter test on it and saturated my own internet connection without any noticeable effect on the site.  So if you are interested in how to do something like this for yourself, here is how I did it:

nginx

I just went with the default configuration settings and set up a virtual host file that looked like this:

server {
    # This is the front end server, so listen on 80
    listen       80;
    server_name  www.divinecomedy.net;

    access_log  /var/log/nginx/localhost.access.log;

    location / {
        root   /srv/htdocs;
        index  index.php index.html index.htm;
        proxy_redirect     off;
        proxy_set_header   Host             $host;

        # If the request is not an existing file, or it is a directory,
        # send it to Apache for the rewrite
        # Works for WordPress pretty URLs
        if (!-f $request_filename) {
            proxy_pass http://localhost:81;
        }
        if (-d $request_filename) {
            proxy_pass http://localhost:81;
        }
    }

    # proxy the PHP scripts to Apache listening on localhost:81
    location ~ \.php$ {
        proxy_pass http://localhost:81;
        proxy_redirect     off;
        proxy_set_header   Host             $host;
    }
}

Apache

Configure it the way you normally would for your site, then set the Listen directive and virtual host, if applicable, to localhost port 81.  You may also want to tune or re-tune your MPM options to reflect the fact that you’re using Apache much less now.

That’s about all there is to it.  Later on, I will also talk about how I moved the backend Apache to mpm_worker and fcgi, with more mixed results.