Tuesday, May 11, 2004


Self-Reproducing Code

I was reading the Green Hills commentary on Linux insecurity (see one of yesterday's blog entries). It referenced Ken Thompson's classic ACM article, "Reflections on Trusting Trust." In it, he discusses why compilers that are written in the language they compile can't be trusted.

As an exercise, he proposes the following programming challenge:

...the problem is to write a source program that, when compiled and executed, will produce as output an exact copy of its source. If you have never done this, I urge you to try it on your own. The discovery of how to do it is a revelation that far surpasses any benefit obtained by being told how to do it. The part about "shortest" was just an incentive to demonstrate skill and determine a winner....

Strangely enough, I don't think I'd ever attempted such an exercise before. So, after a longer programming session than I'd originally envisioned, I created the following self-reproducing PHP script, which I release to the world with all the caveats and limitations of liability specified by this agreement.

Anyhow, here it is. I'd be interested to hear about other languages in which this has been attempted (and I'd be happy to publish source here and credit the authors). Email me if you've got a candidate. My lessons learned for PHP are:

- Automatic variable expansion in double-quoted strings can be a hindrance in an exercise such as this
- Escapements, escapements, escapements!

<?php
function Esc($s) { return (str_replace(chr(0x27), chr(0x5c).chr(0x27), $s)); }
$aSrc = array(
'echo("<?php\r\n");',
'echo(\' function Esc($s) { return (str_replace(chr(0x27), chr(0x5c).chr(0x27), $s)); }\'); echo("\r\n");',
'echo(\' $aSrc = array(\'); echo("\r\n");',
'for ($i = 0; $i < sizeof($aSrc); $i++) {',
' echo(chr(0x09).chr(0x09)."\'".Esc($aSrc[$i])."\',".chr(0x0d).chr(0x0a));',
'}',
'echo("\t);\r\n");',
'for ($i = 0; $i < sizeof($aSrc); $i++) {',
' echo(chr(0x09).$aSrc[$i].chr(0x0d).chr(0x0a));',
'}',
'echo("?>\r\n");',
);
echo("<?php\r\n");
echo(' function Esc($s) { return (str_replace(chr(0x27), chr(0x5c).chr(0x27), $s)); }'); echo("\r\n");
echo(' $aSrc = array('); echo("\r\n");
for ($i = 0; $i < sizeof($aSrc); $i++) {
echo(chr(0x09).chr(0x09)."'".Esc($aSrc[$i])."',".chr(0x0d).chr(0x0a));
}
echo("\t);\r\n");
for ($i = 0; $i < sizeof($aSrc); $i++) {
echo(chr(0x09).$aSrc[$i].chr(0x0d).chr(0x0a));
}
echo("?>\r\n");
?>
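
The two lessons listed above can be seen in isolation in a short sketch. The variable `$name` and the sample strings are made up for illustration; `Esc()` is the same helper the script defines.

```php
<?php
// Hypothetical sample value, used only to show variable expansion.
$name = 'world';

// Double quotes expand variables; single quotes leave the text untouched.
// This is why the script stores its own source lines in single-quoted strings.
echo "hello $name", "\n";   // prints: hello world
echo 'hello $name', "\n";   // prints: hello $name

// Same helper as in the script: put a backslash (0x5c) in front of
// every single quote (0x27) so the text can sit inside '...' again.
function Esc($s) { return str_replace(chr(0x27), chr(0x5c).chr(0x27), $s); }

// A made-up source line that itself contains single quotes.
$line = "echo('hi');";
echo "'" . Esc($line) . "'", "\n";   // prints: 'echo(\'hi\');'
```

Writing the quote and backslash as `chr(0x27)` and `chr(0x5c)` keeps those two characters out of the helper's own text, so the helper never has to escape itself when the script prints it.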


Email me if you've got another one like this. Remember, the program's output must be an exact copy of its source, produced without using the file system or other intermediate storage. It has to generate the source on its own!
