When I last wrote about using evo-devo to compose music, I had gotten stuck on the problem of implementation. In particular, I couldn’t figure out how to write a seed organism that would develop into a simple composition that I could then use to evolve other tunes. I also wasn’t sure how to get the various genes to actually work together, at least not at a level at which I could start coding.
After some thought, it occurred to me that enzymes and proteins act sort of like functions in software: they bind to molecules (take arguments), which they can then modify, and sometimes release another molecule into the surrounding medium (return a value). So I just needed to come up with the software equivalent of an enzyme.
One way to think of the problem is as a set of boxes: each box represents a function, which takes zero or more inputs, and has an output. The output of one function might be connected to the input of another. If you play the electric guitar, you can think of it as a bunch of effects pedals and splitters joined together in all sorts of ways.
Functions are easy. Every programming language has those. But how to represent the connections between them? Again, biology suggests an answer: binding sites.
Next to every gene on a strand of DNA is a sequence of nucleotides that aren’t part of the gene, but which still play a role: there are proteins that look for that sequence and attach themselves to it. Promoter proteins activate the gene by binding to the DNA strand on one side and locking onto passing RNA polymerase molecules, so that they’re more likely to get caught in position and start transcribing that particular gene (or something like that). Repressors also bind to the identifying sequence by the gene, but they then block any polymerase that might try to transcribe that gene, and thereby prevent it from being expressed (again, or something like that). Still other proteins latch onto other molecules floating around the cell.
It all comes down to chemistry and shapes: the arrangement of atoms in the protein is such that when it runs into its target molecule, some of the protein’s atoms and some of the target’s atoms are aligned and bind more strongly than they normally would. This, in turn, can affect the molecular orbitals of both the protein and its target, and cause either one or both to change shape. This can tear the target apart into two more useful molecules, or it can uncover another surface on the protein that can then bind to another molecule, or any number of other things.
Software doesn’t have shapes in this sense. But it does have regular expressions. We can imagine the software equivalent of a protein as a tuple of the form
(ABCD, add, DCCA, BBBC, CCAB)
Each sequence of four letters is an identifier. The core is the add function, which takes two numbers and adds them together. If it is passed the numbers 2 and 3, it’ll return a tuple of the form (5, CCAB), because CCAB is the last item on the list. Now, where do the numbers 2 and 3 come from? Simply, there has to be a tuple in the environment that looks like (2, DCCA), which will bind to the first argument, and another, (3, BBBC), which will bind to the second argument. The sequence ABCD is the function’s identifier: if there’s a tuple of the form (promoter, ABCD) floating around the environment, it will bind to the function and activate it.
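To make the binding step concrete, here is a minimal Python sketch of the add protein described above. The names Protein and fire are made up for illustration, and the sketch assumes that tuples stay in the environment after being read (nothing is consumed) and that the first matching tuple wins; none of that is settled yet.

```python
import operator
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Protein:
    ident: str       # identifier a promoter must name to activate the protein
    func: Callable   # the core function, e.g. add
    inputs: tuple    # identifiers that the arguments bind to, in order
    output: str      # identifier attached to the returned value

def fire(protein, environment):
    """Run the protein once against a list of (value, identifier) tuples.

    Returns the (result, identifier) tuple to release into the environment,
    or None if the protein is not promoted or an argument is missing.
    Assumption: tuples are read, not consumed.
    """
    if ("promoter", protein.ident) not in environment:
        return None
    args = []
    for wanted in protein.inputs:
        values = [value for value, ident in environment if ident == wanted]
        if not values:
            return None  # no tuple bound to this argument, so nothing happens
        args.append(values[0])  # assumption: the first matching tuple wins
    return (protein.func(*args), protein.output)

add = Protein("ABCD", operator.add, ("DCCA", "BBBC"), "CCAB")
environment = [("promoter", "ABCD"), (2, "DCCA"), (3, "BBBC")]
result = fire(add, environment)
print(result)  # (5, 'CCAB')
```

Without the (promoter, ABCD) tuple in the environment, fire returns None and the protein stays silent.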
We can go a step further and introduce globs: a tuple of the form (promoter, ABC?) can activate any function identified by ABCA, ABCB, ABCC, or ABCD.
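Python's standard fnmatch module already implements this kind of glob, so a rough sketch of promoter matching could lean on it. The helper name activated and the list of stations are hypothetical.

```python
from fnmatch import fnmatchcase

def activated(pattern, identifiers):
    """Return the identifiers that a promoter with this glob pattern binds to."""
    return [ident for ident in identifiers if fnmatchcase(ident, pattern)]

stations = ["ABCA", "ABCB", "ABCC", "ABCD", "ABDD", "DCCA"]
print(activated("ABC?", stations))  # ['ABCA', 'ABCB', 'ABCC', 'ABCD']
```

Note that "?" matches any single character, so it only means "A, B, C, or D" as long as identifiers are restricted to those four letters.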
You can think of the identifiers as the call letters of a radio station: our add function listens to station ABCD, and when it hears a signal, it takes whatever’s being broadcast on stations DCCA and BBBC, adds them together, and starts broadcasting the result on station CCAB.
There would presumably have to be special proteins for communicating with the cell’s neighbors (i.e., special functions set up by the environment that listen on radio station DCBA and relay anything they hear to adjacent cells). Others determine which notes the cell will play (there might be an array of 128 of these, one for each note, each with an argument that determines the note’s length).
(Yes, this constitutes a sort of blackboard architecture. But that’s what I’d planned all along.)
Setting up this sort of abstract architecture also means that I don’t have to worry too much about coming up with a first cell: I can just start wiring things up at random and let evolution take over.
In the comments, Mcoletti asked how to measure fitness. Obviously, ultimately it’ll have to come down to asking a human which of a set of compositions sounds best. But before that, a lot of the crap can be culled automatically: we might favor pieces between 3 and 7 minutes long, for instance. I bet music theory can provide some insight into identifying harmonious and discordant combinations of notes. We could also restrict ourselves to pieces that humans can actually play, or impose other such constraints (e.g., no chords on the keyboard part that span more than an octave; no chords on the bass or recorder parts; no chords on the guitar part with more than six notes, or which require impossible fingering to play).
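As a rough illustration of the automatic culling step, here is a Python sketch of a few of those filters. It assumes a made-up representation in which a piece is a list of (part, start in seconds, duration in seconds, MIDI pitch) tuples, and it treats notes in the same part that start at the same instant as a chord. Overlapping notes and impossible fingerings are not checked.

```python
from collections import defaultdict

def chords(notes):
    """Group pitches by (part, start time); notes that begin together form a chord."""
    groups = defaultdict(list)
    for part, start, duration, pitch in notes:
        groups[(part, start)].append(pitch)
    return groups

def passes_cull(notes):
    """Return True if the piece survives the cheap automatic checks."""
    if not notes:
        return False
    length = max(start + duration for _, start, duration, _ in notes)
    if not 180 <= length <= 420:  # between 3 and 7 minutes
        return False
    for (part, _), pitches in chords(notes).items():
        if part == "keyboard" and max(pitches) - min(pitches) > 12:
            return False  # chord spans more than an octave (12 semitones)
        if part in ("bass", "recorder") and len(pitches) > 1:
            return False  # no chords on these parts
        if part == "guitar" and len(pitches) > 6:
            return False  # more notes than strings
    return True
```

Anything that passes these checks would still need a human (or a smarter harmony test) to judge whether it actually sounds good.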
And, of course, we could harness the power of the Internet: during the course of a day, create a generation of candidate pieces, present them in pairs to web site visitors (à la Myspace, but hopefully less obnoxious), and ask them to choose which of the two sounds better. Or we could ask them to rate a single composition, in the style of hotornot.
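One simple way to turn those pairwise choices into a ranking is to sort pieces by the fraction of comparisons they won. The sketch below assumes votes arrive as (winner, loser) pairs; the function name and the tune names are invented, and a real system might prefer something more robust, such as an Elo-style rating.

```python
from collections import Counter

def rank_by_votes(votes):
    """Rank pieces by win fraction, best first.

    `votes` is a list of (winner, loser) pairs, one per visitor choice.
    """
    wins, shown = Counter(), Counter()
    for winner, loser in votes:
        wins[winner] += 1
        shown[winner] += 1
        shown[loser] += 1
    return sorted(shown, key=lambda piece: wins[piece] / shown[piece], reverse=True)

votes = [("tune_a", "tune_b"), ("tune_a", "tune_c"), ("tune_c", "tune_b")]
print(rank_by_votes(votes))  # ['tune_a', 'tune_c', 'tune_b']
```

The top of the resulting list would then be the natural pool of parents for the next generation.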