This was a quick, fun exercise to remind me that I can still write Perl. It fetches the list of TLDs from IANA, does a quick bit of munging, then renders a regex which should match any valid FQDN:

#!/usr/bin/env perl
use strict;
use warnings;

use LWP::Simple;

my $fqdn_regex;

if (my $content = get('')) {
  $fqdn_regex = '(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:';
  $fqdn_regex .= join('|', grep (!/^(#|xn)/i, (split /\n/, lc($content))));
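  $fqdn_regex .= ')';
} else {
  # Without the TLD list there is nothing to build, so bail out.
  die "Could not fetch the TLD list\n";
}

my $regex = $fqdn_regex . '(?:\s|\/|$)';
print "$regex\n";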
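
To sanity-check the result without touching the network, here is a minimal sketch that builds the regex the same way from a tiny hard-coded list and tries it on a few strings. The sample list contents, the `looks_like_fqdn` helper and the test hostnames are all made up for illustration; only the regex construction mirrors the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for the downloaded list: one comment line,
# a few TLDs and one punycode entry, one item per line.
my $content = "# Version 2009081800\nCOM\nNET\nORG\nUK\nXN--P1AI\n";

# Same construction as the script: drop comment and xn-- lines,
# lowercase everything, and join the rest into one alternation.
my $fqdn_regex = '(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:';
$fqdn_regex .= join('|', grep { !/^(?:#|xn)/i } split /\n/, lc $content);
$fqdn_regex .= ')';
my $regex = $fqdn_regex . '(?:\s|\/|$)';

# Hypothetical helper: true if the (lowercased) string contains
# something the generated regex accepts.
sub looks_like_fqdn {
    my ($candidate) = @_;
    return lc($candidate) =~ /$regex/ ? 1 : 0;
}

for my $candidate ('www.example.com', 'example.co.uk/path', 'localhost', 'host.invalidtld') {
    printf "%-20s %s\n", $candidate, looks_like_fqdn($candidate) ? 'match' : 'no match';
}
```

Note that the pattern is not anchored on the left, so it finds an FQDN anywhere in a string rather than validating the whole string; input is lowercased first because the alternation is built from the lowercased list.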

Several caveats:

Maybe I’ll extend it for completeness and/or rewrite it in Ruby someday. Until then, it’ll always be ~/bin/tld_regex for me.


18 August 2009