Polymorphism Script
by xdroop
#!/usr/local/bin/perl # A script to demonstrate polymorphism. # # Written 23 May 2000 by xdroop # # This script is a demonstration only. Be careful with it. # I deny any responsibility for any variants or decendants # (including the demonstration itself after being run once). # # This script was written after the polymorphic variant of # the ILOVEYOU Outlook .vbs worm appeared. The media reported # that the polymorphism demonstrated was merely writing itself # along with about 100 random comment characters interleaved # inside it. That got me thinking -- every time it ran, it # would increase its size, and would quickly become too large # to spread effectively. # # A better strategy is to remove _all_ the existing comments, # then sprinkle random lines of comments every so often # as the script re-wrote itself. # # You see, the way the majority of these email virus detectors work is # they scan attachments looking for "signatures" -- that is, a known # sequence of characters at a known offset from the beginning or end # of a file. By varying the number and length of the comments, any # character constants tend to move around, making signaturing a hit-or-miss # proposition at best. # # The next defense isn't exactly polymorphism. It attempts # to make the script shorter and harder to read by attaching # short lines together. This will only work so often; eventually, # all the lines in the script will be of a length that the # script will refuse to attach any more together. # # By combining these techniques, you end up with a script that # hovers around a certain size when run repeatedly -- but isn't # of a predictable size. # # The third defense was an off-the-cuff idea. To make the # script even harder to read and follow, why not rename all # the variables and subroutine names as they were hit? # # Finally, the script has a couple of long string constants which are used # to select the characters permitted in random comments and mangled # variable/subroutine names, and the script re-writes these constants each # time it is run. # # Put together, this script looks remarkably like line noise # after run a couple of times... # # There are lots of ways that the code could be improved. The # script has only been tested on itself, so can't be counted on # to morph abstract perl code. The loadArray subroutine in particular # can't handle any characters which have special meaning in regexps. # I didn't do this to prove how elite I am, nor to show how hot a # coder I was. I am neither elite nor a hot coder. I am merely a # system administrator who fought both Melissa and ILOVEYOU and found # the common defenses lacking. The ideas contained herein are interesting # problems, and the idea of defending against them is a similarly # interesting problem. I like interesting problems. # # The reason why I did this in perl is because right now, the script is # completely harmless. There is no trivial way to turn it into a world # eating email virus, although it could be extended to trojan fairly # trivially. This is merely a technology demonstrator. Someone suitably # clever could write it up in .vbs -- I can't, since I don't know .vbs. # # Consider the ramifications if ILOVEYOU had the following improvements # in it's original release: # - select subject line at random from an email already in victim's inbox # or use a blank subject line # - a polymorpher similar to this # # This theoretical virus would have eaten the outlook world alive, since # none of the immediate defenses (attachment signaturing, blocking known # carrier subject lines) would have worked. As it was, the common # subject line block for Melissa could be defeated if the infected # victim computer used a different encoding for their text. We saw # numerous examples of where a brazilian system would send an email # with the literal subject line # Subject: =?iso-8859-1?Q?Important_Message_From_$BRAZILIAN_VICTIM?= # ...and there was no way to make sendmail see that as an infected message. # # ALL attachments would have to be denied, a prospect which # wouldn't be terrible to most security concious administrators. # VBS would have to be squashed in Outlook for once and for all. # The script was written for perl 5.005_03 and has been run under # linux and solaris. My guess is that it will run on any unix-like OS. # It may run on windows -- you'll definitely need to change the 'cp' # command to something Windows can do for you -- but I don't really # care, since I don't run Windows... # # Greetz to cyclone and dr dave -- two kindly gents from the old country. # Know where your $FUZZY_SQUEEKY_PINK_THING is, guys? Didn't think so. # # don't worry about all the comments in the code -- running the # script once will fix that :) # # this is our name. $BASENAME = $0; $BASENAME =~ s|\\|/|g; if ($BASENAME =~ m|(.*/)(.*)|) { $BASENAME = $2; } # this is the maximum lines since comment $mlsc=int(rand(4)); # this is the maximum line length, in characters $mll=75; # this is a list of variable names we don't want mangled. # note that this isn't an exaustive list, just enough # to make the script mangle itself. @reserved=("0","","2","_","1",); # this is a list of characters for use in fake comments. # since comments have more spaces than anything else, # there are lots in the source string. $commentSource='`1234567890 -=~!@# %^&_ qwerty uiop QWERTYUIO P{}|a sdfghj kl;ASDFGHJ KL:"zx cv bnm,. /ZXC VBNM<> '; # this is a list of characters for use in variable and subroutine # names. It is different because there are no spaces and things in subroutine # names (duh!) $secondSource='1234567890qwertyuiopasdfghjklzxcvbnmMNBVCXZASDFGHJKLPOIUYTREWQ_'; # Load the selector arrays. See the comment with the sub definition. @a=&loadArray($commentSource); @b=&loadArray($secondSource); # if we exist if (-e $BASENAME) { # make a copy of ourselves `cp $BASENAME old-$BASENAME`; # open the files, complain if we can't open(IN,"<old-$BASENAME") or &die($!); open(OUT,">$BASENAME") or &die($!); # we haven't seen any comments yet $lsc=0; # for each line in the script while(<IN>) { # remove \n chop; # remove leading whitespace s/^\s*//; # don't bother if there is nothing left next if (!$_); # if we are not looking at a comment (\043 is the char code for #) if (!/^\w*\043/) { # if we have not seen a comment in a while if ($lsc > $mlsc) { # print the line being constructed and reset variables print OUT "$output\n"; undef $output; $n=&nc(); print OUT "$n\n"; $lsc=0; $mlsc=int(rand(4)); } # it's been another line since we saw a comment $lsc++; # go change all the variable and subroutine names $r=&tokenizer($_); # a clumsy bit of code to see if our candidate line is too long $c_out="$output$r"; if (length($c_out) > $mll) { # it is too long, print the current line and then # stick the new stuff in the holding area print OUT "$output\n"; $output=$r; } else { # it isn't too long, so glue 'em together $output=$output . $r; } # go do it again next; } else { # right, this is a comment # so, if we have a she-bang (like #!/usr/local/bin/perl) if (/^\#\!/) { # If we have not printed one already, just print it if (!$SHEBANG) { print OUT "$_\n"; $SHEBANG=1; } } } # if we get here, it is a comment line that doesn't have # a she-bang, so it gets discarded by inaction. Back to the # top of the while loop! } # print the stored output line since we are done print OUT "$output\n" if ($output); # we are done close OUT; } # sub nc generates random comments # forgive the variable names, it was written before the # variable name mangler was written. sub nc { local($s,$r,$i,$c,$l); # store the length of the array with the comment characters # (from right at the top) $l=@a; # $i is the index we'll generate randomly # $c is the number of characters already placed in the comment # $s is the string we are building, its a comment so put a # in it $i=0;$c=0;$s="\043"; # $r is the actual length of the comment we'll build $r=int(rand(75))+1; # while we are not done while ($r > $c) { $c++; # pick a character $i=int(rand($l)); # glue it on $s=$s.$a[$i]; } return $s; } # sub tokenizer was originally going to be a dragon-book # tokenizer, and I even had a basic rough out going, but # then I realized that I could just use regular expressions # to check to see what I had. sub tokenizer { # the string to mangle local ($string)=shift @_; # $return is the string we will return # $char is a holding area while we loop through things local ($return,$char); # while we still have string to work with while($string=~/^(.)/) { # grab the result of the match $char=$1; # strip the held character off the front of the string $string=~s/^.//; # check for trigger states if($char eq "\$" or $char eq "\@" or $char eq "\%" or $char eq "\&" or $char eq "s" or $char eq "\047") { # right, we think we have something worth mangling. # First thing to check is whether we are dealing with one of # our two possible "signature" strings -- $x and $z. If # we are, we can re-order the strings at random so that there # is no signature. if ($char eq "\047") { # try to pick the rest of the string out if($string =~ /^(.*)\047/) { $candidate = $1; # check to see if it is one of the signature strings. if ($candidate eq $commentSource or $candidate eq $secondSource) { # scramble it $scrambled=&scramble($candidate); # ...now tack it on the output string and clean up the # source string. $return=$return.$char.$scrambled; $string=~s/.*\047//; } } # if we get here, we are in one of two cases: either we have # a string which isn't a signature, or we have a string which # isn't a string (probably a hit from the messy code, above). # In both cases, we need to slap the remaining " character # on the output string (either to close the string we just # rewrote, or to pass the beginning of our harmless string # on through) -- and then kick out of this trigger state # detector to the top of while loop. $return=$return.$char; next; } # special handling: subroutines. With every other trigger # you can just glue the trigger on the return string, but # the subroutine trigger is three characters long. So we # check for the whole trigger, then doctor both the source # and return strings so that they will work with the mangler # code written for the other trigger states. if ($char eq "s") { # if this is a 'sub' if ($string =~ /^ub /) { # put the characters s,u,b, space intothe return string $return=$return."s"."ub "; # hack it off the source string $string=~s/^ub //; } else { # ok, it isn't a subroutine, false alarm, glue it on # the return string and go back to thetop of the loop. $return=$return.$char; next; } } else { # it isn't a sub, but it is one of the other trigger # states -- we're good, glue the trigger on the return # string. $return=$return.$char; } # zap the name of the target from last time (important!) undef $varname; # clumsy loop time. Grab the next character and if itisn't # a non-name character, glue it on the variable name and # hack it off the source string. $string=~s/^(.)//; $char=$1; while ($char =~ /[a-zA-Z0-9_]/) { $varname=$varname.$char; $string=~s/^(.)//; $char=$1; } # assume that the variable name isn't a reserved name $OK=1; # check each reserved name. If it matches our name we # just built, we can't mangle it. foreach $name (@reserved) { $OK=0 if ($name eq $varname); } if ($OK) { # let's go mangle it! If we have not see this name before... if(!$lookup{$varname}) { # we go create a new name. $lookup{$varname}=&getNewVarName(); } # and now, the mangling. $varname=$lookup{$varname}; } # glue the mangled (or not) varname on the output string. $return=$return.$varname; # we are still holding a character from the last loop, # glue it back on the input string and we go again. $string=$char.$string; next; } # we don't have a trigger state. Just glue it on the output string. $return=$return.$char; } # we are out of input string, return it. return $return; } # sub getNewVarName generates the new variable/subroutine names. sub getNewVarName { # $name is the name we are building # $count is the number of characters we still have to add # $index is the index into the array of acceptable characters # $alength is a place to hold the length of the array local ($name,$count,$index,$alength); # hold the length $alength=@b; # determine how many characters to use -- between 3 and 8 $count=int(rand(6))+3; # while we are not done while ($count > 0) { # another character $count--; # ok, if this is the first character in the name, we can't # use any of the special characters (which in this context # means 0-9 and _) because they have special meaning. So # we loop through the randomizer until we get one that isn't # special. if (length($name) < 1) { while($b[$index] =~ /[0-9_]/) { $index=int(rand($alength)); } # got a character, use it $name=$b[$index]; # back to the top of the loop with ya! next; } # pick a card any card $index=int(rand($alength)); # glue it on $name=$name.$b[$index]; } # return it to the breathless masses return $name; } # # scramble the supplied string so that it is different. sub scramble { local ($string,$scrambled,$count,$char,$number); # $string is the the input string. $string = pop (@_); # $count is the number of characters in our string. $count=length($string); # $scrambled is the scrambled string # $char is the character we are currently dealing with # $number is our random number between 0 and $count. while ($count) { $number=int(rand($count)); $string=~m/^.{$number}(.)/; $char=$1; $string=~s/$char//; $scrambled=$scrambled.$char; $count--; } return $scrambled; } sub loadArray { # here's an opportunity for improvement. I use the arrays # to store single characters to make random selection # easier. Clumsy, yet effective. local ($string,$char,@array); $string=pop(@_); undef @array; while ($string) { $char = chop $string; push (@array,$char); } return @array; } # # A (braindead) undertaker. These two are from my # template that I use for all my perl scripts. sub die { local ($gripe); $gripe = pop(@_); &warn("fatal:$gripe"); exit 1; } # # A (braindead) friend for our undertaker. sub warn { local ($gripe); $gripe = pop(@_); print STDERR "$BASENAME:$gripe\n"; }Code polymorphism.pl