Archive for June, 2006

introducing guardMailto

Thursday, 29 June 2006

The Problem

If you put something like this in your website:

<a href="mailto:test@example.com">email me!</a>

spammers can identify that test@example.com is an email address with automated tools that review thousands of websites (analogous to the “spiders” used by search engines like Google and Yahoo). They can add your address to their databases. And eventually, they will surely do so. In fact, it turns out that spammers are pretty good at extracting anything that looks like an email address, whether it’s in a mailto: link or not.

First Generation Solutions

There are already several libraries (and some commercial products) that use JavaScript to dynamically create mailto: links. I even wrote one myself a couple of years ago when I was providing web hosting for a number of small businesses.

The underlying theory is that while spammers could easily get your email address by going to the site, loading the page, and looking at the results, very few spammers will invest the time necessary to do so. If the automated tool doesn’t return an email address, they won’t check each site by hand. The economics of spamming are contingent on the incremental cost of sending each email being very, very low — anything that requires the spammer to invest time eats into the profit margin.

Most of these products used the principle of graceful degradation: If JavaScript was enabled in your browser, you got a functional email link. If JavaScript wasn’t available, you got alternate content provided in a <noscript> tag, if the page author provided it, or maybe nothing at all. Code for this approach, including my “munge” library, generally followed this pattern:

A JavaScript library would be included in the <head> section of the document:

<script type="text/javascript" src="emailLib.js"></script>

(Don’t worry if you’re not familiar with reading and writing JavaScript code. You don’t need to be a JavaScript expert to use guardMailto, and you can safely skip these examples.)

Then, in the body of the page, there would be some inline JavaScript that used document.write() to insert the link into the page:

<script type="text/javascript">
document.write(makeAnEmailLink("stringEncryptingEmailAddress"));
</script>

And, if you were lucky, there’d be a <noscript> tag that would provide a fallback for users without JavaScript:

<noscript>
email: test (at) example (dot) com
</noscript>
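To make the pattern concrete, here is a minimal sketch of what a first-generation helper could look like. The encoding (character codes joined by dots) and the encodeAddress and decodeAddress helpers are assumptions made up for illustration; this is not the scheme used by “munge” or by any particular library.

```javascript
// Hypothetical encoding: each character's code, joined by dots.
// Run once, ahead of time, to produce the string that goes into the page.
function encodeAddress(address) {
  var codes = [];
  for (var i = 0; i < address.length; i++) {
    codes.push(address.charCodeAt(i));
  }
  return codes.join(".");
}

// Reverse the hypothetical encoding above.
function decodeAddress(encoded) {
  var codes = encoded.split(".");
  var address = "";
  for (var i = 0; i < codes.length; i++) {
    address += String.fromCharCode(parseInt(codes[i], 10));
  }
  return address;
}

// Build the markup that document.write() would insert into the page.
function makeAnEmailLink(encoded) {
  var address = decodeAddress(encoded);
  return '<a href="mailto:' + address + '">' + address + "</a>";
}

var encoded = encodeAddress("test@example.com");
var linkHtml = makeAnEmailLink(encoded);
// In a browser page: document.write(linkHtml);
```

The page source only ever contains the encoded string, so a harvester that doesn’t run JavaScript never sees anything that looks like an address.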

Towards a Second Generation Solution

I observed several serious problems with my “munge” library:

  • It was too complicated and hard to use. Programmers seemed comfortable with it, but web designers often weren’t, and “ordinary” people struggled mightily with it.
  • It was inflexible. It worked ok for text links that were all by themselves, but it didn’t readily support linking an image, for example.
  • It didn’t play particularly well with many “what-you-see-is-what-you-get” (WYSIWYG) web page editors.
  • It put the burden of creating the fallback <noscript> support on the page author. Many page authors were confused about how to do this effectively, and many just didn’t bother.

In the intervening time, I’ve learned a lot about object-oriented JavaScript, and I’ve become an advocate of progressive enhancement as an alternative to graceful degradation. (Unobtrusive JavaScript is a related buzzword; I prefer “progressive enhancement” because it’s a design philosophy that’s applicable to more than just JavaScript.)

Progressive enhancement reverses the paradigm of graceful degradation: start with a basic HTML page that works “as-is,” and then enhance the functionality of the basic page for browsers that support it. Progressive enhancement makes it impossible to strand users of older browsers by omitting a <noscript> tag. It also better supports users who have a modern browser but may have JavaScript disabled or restricted for reasons of security and/or corporate policy. Finally, it often better addresses the accessibility needs of users with visual or other impairments.

Conceptually, the progressive enhancement model looks like this.

We still start with a library included in the <head> section:

<script type="text/javascript" src="emailLib.js"></script>

In the body of the page, we have the link information in a format that is easily human-readable, but less easily machine-readable:

email me at <span id="contactAddress">test (at) example (dot) com</span>

And then we have some JavaScript that turns the human-readable text into a clickable link:

<script type="text/javascript">
makeAnEmailLink("contactAddress","stringEncryptingEmailAddress")
</script>

That JavaScript code can be inline in the page body, but it can also be somewhere else, so we can separate the page content from the presentation logic.

There’s one significant catch: You can’t turn the text into a link until after the browser has displayed it in the page. That makes this code a good candidate for an “onload” event, which the browser will run after the page finishes loading.
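Here is a rough sketch of the whole idea, wired to the onload event. It is not guardMailto’s actual code: the function names readableToAddress and linkifyAddress are hypothetical, and unlike the makeAnEmailLink call above, this version assumes the address can be recovered directly from the visible “(at)” and “(dot)” text rather than from a separately encoded string.

```javascript
// Turn "test (at) example (dot) com" into "test@example.com".
function readableToAddress(text) {
  return text
    .replace(/\s*\(at\)\s*/gi, "@")
    .replace(/\s*\(dot\)\s*/gi, ".")
    .replace(/^\s+|\s+$/g, "");
}

// Replace the text inside the element with a mailto: link.
// `doc` is passed in so the function can be exercised outside a browser.
function linkifyAddress(doc, elementId) {
  var el = doc.getElementById(elementId);
  if (!el || !el.firstChild) {
    return; // nothing to enhance; the plain text stays as it is
  }
  var address = readableToAddress(el.firstChild.nodeValue || "");
  var link = doc.createElement("a");
  link.setAttribute("href", "mailto:" + address);
  link.appendChild(doc.createTextNode(address));
  el.replaceChild(link, el.firstChild);
}

// Wait until the page has loaded, so the <span> exists.
if (typeof window !== "undefined") {
  window.onload = function () {
    linkifyAddress(document, "contactAddress");
  };
}
```

If JavaScript never runs, the visitor still sees “test (at) example (dot) com”, which is the point of the approach. Note that assigning window.onload directly replaces any handler another script has already set.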

Enough Talk, Where’s the Code?

You can download either the raw JavaScript guardMailto.js or a gzip archive guardMailto.js.gz.

guardMailto is released under The MIT License.

Instructions for adding guardMailto to your site are covered in using guardMailto.

passing false when the default is true

Saturday, 24 June 2006

Lately, inspired by the prototype and scriptaculous libraries, I’ve frequently been defining JavaScript functions that accept an anonymous options object, so they can be called like this:

mynamespace.myFunction(requiredParam1, requiredParam2, {
overrideSomeDefault:true,
customCallback: mynamespace.myOtherFunction
});

And I’ve gotten fond of boolean short-circuits for parsing the options object and assigning defaults:

options.overrideSomeDefault = options.overrideSomeDefault || false;

Since everything except false, null, 0, “”, NaN, and undefined is “truthy” in JavaScript (see Simon Willison’s great presentation “A (Re)-Introduction to JavaScript” either as pretty slides or texty notes) this is fast, terse, and readable.

Unless for some reason you explicitly need to pass one of the “falsey” values. I hit this first trying to make an autocompleter library work the way I wanted it to, something along these lines:

autocompleter.init(myTarget,myCallback,{minCharsToMatch:0});
...
var autocompleter = {
init:function(_target,_callback,options) {
options = options || {};
options.minCharsToMatch = options.minCharsToMatch || 1;
...

Oops, I was stuck with minCharsToMatch of 1, which is not what I wanted.
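The problem is easy to reproduce on its own. In this small sketch, parseOptions is a hypothetical stand-in for the option handling inside init, not the real autocompleter code:

```javascript
// Hypothetical stand-in for the autocompleter's option parsing.
function parseOptions(options) {
  options = options || {};
  options.minCharsToMatch = options.minCharsToMatch || 1;
  return options;
}

var omitted = parseOptions({});                          // caller passed nothing
var explicitZero = parseOptions({ minCharsToMatch: 0 }); // caller asked for 0
var explicitThree = parseOptions({ minCharsToMatch: 3 });

console.log(omitted.minCharsToMatch);       // 1 (default applied, as intended)
console.log(explicitZero.minCharsToMatch);  // 1 (the 0 was silently replaced)
console.log(explicitThree.minCharsToMatch); // 3
```

Because 0 is falsey, `0 || 1` evaluates to 1, and the short-circuit can’t tell “not supplied” apart from “supplied as 0”.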

I also think there are times when it makes more sense for a property to have a default value of true than false. It’s clearer (imo) for “enabled” to default true than for “disabled” to default false.

Here’s my attempt at a solution. If it’s reasonable to expect a “falsey” option value, you can do this:

options.aBool = ("undefined" != typeof options.aBool ? options.aBool : true);

Not as pretty, and not as fast. But should get the job done.
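If several options need this treatment, the typeof check could be wrapped in a small helper. This is only a sketch; optionOrDefault and parseOptions are hypothetical names, not part of any library mentioned here.

```javascript
// Hypothetical helper: fall back to the default only when the caller
// did not supply the property at all (its type is "undefined").
function optionOrDefault(options, name, defaultValue) {
  return ("undefined" != typeof options[name]) ? options[name] : defaultValue;
}

function parseOptions(options) {
  options = options || {};
  options.enabled = optionOrDefault(options, "enabled", true);
  options.minCharsToMatch = optionOrDefault(options, "minCharsToMatch", 1);
  return options;
}

var defaults = parseOptions();
var overridden = parseOptions({ enabled: false, minCharsToMatch: 0 });
```

Here defaults.enabled is true and defaults.minCharsToMatch is 1, while overridden keeps the false and the 0 the caller asked for. One thing to watch: an explicit null also passes the typeof check, so it is kept rather than replaced by the default.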

fixing wordpress

Friday, 23 June 2006

I kid. I love WordPress. But I always have to mess with it, and I figured it was worth documenting how I always mess with it. If you’re not me, you may or may not find this useful, but if you are me, it should save you some time and help you make sure you (I) hit everything the first time through.

Basic Hardening

  • Get rid of xmlrpc.php and wp-trackback.php.
    (disallows trackbacks, but should protect against remote procedure attacks)
  • Rename wp-register.php and wp-login.php to something else.
    Change all occurrences of wp-login.php in the file to whatever you renamed it to (so that the forms invoke the right action). This may seem pretty paranoid, but otherwise it’s vulnerable to brute force attacks.
  • Protect wp-admin with .htaccess
  • Install Spam Karma
    “Kinda mean” setting seems to work well
  • Remove meta links in sidebar.php

basic functional customization

  • options:permalink
    touch .htaccess (in main blog directory) world-writable, set reasonable permalink structure, and remove world-write from .htaccess
  • options:writing
    make sure visual editor and emoticons are both unchecked
  • links
    delete all the bogus default links

style tyranny

  • grep the template directory for jS and change all “F jS, Y” to “j F Y” because I prefer day/month/year order.
  • grep the template directory, add to templates (and change all occurrences of class=”narrowcolumn” to class=”widecolumn”), because I want nav on all pages.
  • Change most occurrences of “center” in style.css to left
    (headings, mostly)
  • Change all occurrences of “justify” to left
  • Get rid of the huge header image. Need to adjust
    #headerimg .description (left margin)
    h1, h1 a, etc. — link, left-margin, alignment and padding
    #header
    #headerimg
  • get rid of the horrible bullet characters, don’t forget to set:
    text-indent:0
  • If you want to allow lists within comments (and why not?):

    .commentlist ol li {
    list-style:decimal;
    }
    .commentlist ul li {
    list-style:disc;
    }

    (apply font-weight:normal, too)

  • Distinguish external links:

    a.external:hover, a.ext:hover {
    color: #147;
    text-decoration: none;
    border-bottom:1px dashed #147;
    }

hacking

In post-editing mode, I don’t like the fact that the categories selection controls are collapsed by default when the editing page loads. Currently, I’ve got the “more meta” group of editing controls (Discussion, Password-Protect Post, Post-slug, Categories, Post Status) set to display opened. This is suboptimal: I rarely want to password-protect posts or customize the slug, and I’d be happy to have those hidden. But having the categories hidden is a pain that leads to posts published with incorrect tags.

I also don’t like my fix, which will be overwritten whenever WP is upgraded. In wp-includes/js the file dbx-key.js initializes 2 dbxGroup control sets, one called “meta” and one called “advanced”. If the default state (7th parameter) for “meta” is set to “open” vs. the default “closed” then the “more meta” control set will be initialized in the expanded state.

I suppose I could write a JavaScript library that inserts itself into the onload stack and opens just the categories box, but that also seems likely to be fragile against software upgrades.
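For reference, “inserting into the onload stack” usually means chaining onto whatever handler is already there instead of overwriting it. A minimal sketch, assuming nothing about how WordPress’s own scripts register their handlers; addLoadEvent and openCategoriesBox are hypothetical names:

```javascript
// Chain a new handler onto whatever onload handler already exists.
// `win` is normally the browser's window object.
function addLoadEvent(win, handler) {
  var previous = win.onload;
  if (typeof previous !== "function") {
    win.onload = handler;
  } else {
    win.onload = function () {
      previous.apply(win, arguments); // run the existing handler first
      handler.apply(win, arguments);  // then run ours
    };
  }
}

// Hypothetical use in the editing page. openCategoriesBox is a
// placeholder for code that would expand only the categories control.
// addLoadEvent(window, openCategoriesBox);
```

This only covers the chaining part. Actually opening the box would still depend on the markup and script internals of the editing page, which is where the fragility comes from.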

I have a very hacky little “plugin” that overrides some of WP’s defaults for marking up text and comments. It should stop text strings like www.example.com from being marked up as links automatically, it should add class=”external” to links in comments, and it should format ellipses (. . .) the way my editor tells me they should be formatted.
Here’s what’s in it right now:

function dmw_texturize($data) {
# leave my ellipses alone!
$data = str_replace(". . .",". . .",$data);
return $data;
}
function dmw_rel_nofollow( $text ) {
// dmw hack: class external applied to links that get rel nofollow.
$text = preg_replace('|<a (.+?)>|i', '<a class="external" $1 rel="nofollow">', $text);
return $text;
}
# do the regular wp_texturize, then undo some particular things
add_filter('the_content', 'dmw_texturize');
# makes www.something into a link. just don't do this at all.
remove_filter('comment_text', 'make_clickable');
# get rid of the standard wp_rel_nofollow, replace with my version.
remove_filter('pre_comment_content', 'wp_rel_nofollow');
add_filter('pre_comment_content', 'dmw_rel_nofollow');
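To see what that pattern does to a comment, here is the same regular expression ported to JavaScript purely for illustration (the plugin itself is PHP, and addExternalClass is a hypothetical name). The g flag is added because PHP’s preg_replace replaces every match by default.

```javascript
// Same pattern as dmw_rel_nofollow, applied to every <a ...> tag.
function addExternalClass(text) {
  return text.replace(/<a (.+?)>/gi, '<a class="external" $1 rel="nofollow">');
}

var plain = addExternalClass('see <a href="http://example.com">this</a>');
// plain: see <a class="external" href="http://example.com" rel="nofollow">this</a>

var already = addExternalClass('<a class="x" href="http://example.com">this</a>');
// already: <a class="external" class="x" href="http://example.com" rel="nofollow">this</a>
```

The second case shows a limit of the simple pattern: a link that already carries a class attribute ends up with two of them.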

TODO

Code in comments has proliferating slashes all over the place. Need to figure out how to have this not happen w/o compromising the extra security applied to content text, or trying to write improperly escaped strings into the db.

upgrading sendmail 8.12.8 to 8.13.7

Thursday, 22 June 2006

I’m sure I didn’t do this the way I was supposed to, but here’s what I did. Seems to be working (knock on silicon).

Tried yum, rpm, no dice.
Untarred archive from sendmail.org

service sendmail stop

cd $srcpath

sh Build

sh Build install

(had to make paths for man pages in order to get a clean build)

cd /etc/mail

edit sendmail.mc to add FEATURE(`greet_pause', `1000')

(incidentally, here’s a discussion about the order for whitelisting senders with greet_pause)

make -C /etc/mail

make complained about missing $path/sendmail-cf/feature/greet_pause.m4

backed up sendmail-cf directory to sendmail-cf.8.12.8, copied new cf from /usr/src/sendmail-8.13.7/cf

make -C /etc/mail

make complained about missing $path/sendmail-cf/hack/popauth.m4

copied that from the sendmail-cf.8.12.8 backup

make -C /etc/mail

is happy

service sendmail restart

confirmed version 8.13.7 via

/usr/sbin/sendmail -d0 < /dev/null | grep -i version

and headers on test emails look good.