
Simulated Aesthetics and Evolving Artworks: A Coevolutionary Approach
The application of artificial-life principles to art has its origins in the early works of Sommerer and Mignonneau, of Sims and of Latham. Most of these works are based on simulated evolution and the determination of fitness according to aesthetics. Of particular interest is the use of evolving expressions, first introduced by Sims. The author documents refinements to the method of evolving expressions by Rooke, Ibrahim, Musgrave, Unemi, himself and others. He then considers the challenge of creating autonomously evolved artworks on the basis of simulated aesthetics. The author surveys what little is known about the topic of simulated aesthetics and proceeds to describe his new coevolutionary approach, modeled after the interaction of hosts and parasites.
It would be prohibitive to attempt a comprehensive survey of all the artistic endeavors that have been influenced or inspired by artificial-life principles, both for reasons of space and because of the difficulty of documenting so many of the works that have been exhibited. Any such survey, however, should bring to light two important themes: (1) the incorporation of emergent behaviors into artistic works and (2) the exploration of simulated evolution for artistic purposes. The theme of emergent behavior forms the cornerstone for many interactive works, including the installations of Christa Sommerer and Laurent Mignonneau [1] and Rebecca Allen [2]. Such works may trace their origins to the MIT Media Lab ALIVE project [3]. Fueled by rapid advances in autonomous robotics, the growth of the gaming industry and the popularity of such toys as Tamagotchi and Furbies, so-called behavior engines and their emergent behaviors continue to make their presence felt in the world of fine art. On the other hand, the exploration of simulated evolution in the fine arts has not received such emphasis. I wish to survey its origins and development in greater detail.
Michael Tolson, co-founder of the digital-effects company Xaos Tools, won the prestigious 1993 Prix Ars Electronica award for his series of still images entitled Founder's Series. The series was computer generated with the aid of evolved neural nets. Since Tolson's software was proprietary, details of precisely how this was done are fragmentary. In print, Tolson described his method as applying the genetic algorithm to populations of neural nets in order to breed intelligent brushes [4]. This is slightly misleading. In fact, Tolson's neural nets were released onto background images, where they could sense and react to cues introduced by the artist. By responding to such cues, areas of the background image could be modified according to the brush procedures the neural nets had been bred to implement. As a SIGGRAPH panelist, Tolson showed videotape of the breeding stages of a population of neural nets that were trained to be photosensitive. The tape revealed that the neural nets were set in random motion on the surface of a background image. When patches of pure white were added to the image as cues, the photosensitive neural nets would streak toward these patches, dragging along underlying image colors [5]. Tolson's efforts seem not to have been duplicated, however [6].
A second example involving evolution of neural nets is Henrik Lund's Artificial Painter [7]. It has a much different flavor: Each neural net is coded as a bit string so that the genetic algorithm can be applied to assist in the simulated evolution. In this case, however, the computer-generated image is obtained from just one neural net by mapping the net's output response to a color at every cell (i.e. every pixel) of the background image, thus using the entire background image.
In order to gain some understanding of the principles underlying the simulated evolution of populations of neural nets, we must turn to techniques that originated with Richard Dawkins.
Fig. 1. An example of a phenotype generated from the genotype of a binary basis function. (© Gary Greenfield) This basis function (after Maeda) is defined on the unit square. Its postfix expression is: V0 V1 B9.

Fig. 2. Computer-generated coevolved image, 2000. (© Gary Greenfield) A degenerate image, presumably a local optimum in image space, that arose during coevolution by exploiting an initial flaw in the aesthetic measure of fitness. Such striped, or often cloud-like, images present a rugged host fitness landscape to the parasite filters because of their many local discontinuities.

Dawkins introduced the fundamental concept of user-guided evolution in a seminal paper given at the first Artificial Life conference (1989) [8]. The paper described his Biomorphs program, an interactive program he eventually marketed commercially, which allowed users to guide the evolution of a population of two-dimensional drawings by interactively assigning their fitness values, thus deciding which drawings would survive and reproduce. Each drawing was generated by a recursive Pascal drawing routine whose drawing parameters constituted the drawing's genotype.
Subsequently, Karl Sims combined Dawkins's user-guided evolution concept with a method for generating computer images based on evolving expressions, thereby obtaining an artificial-life-inspired art medium [9]. For Sims, an image, representing the phenotype, was generated from a LISP expression, representing the genotype. The user, viewing the population of phenotypes, interactively assigned aesthetic fitness values to the phenotypes (i.e. the user rated the images according to their aesthetic appeal) so that mating and mutation of the genotypes of the most-fit individuals could take place in accordance with the rules of the underlying artificial genetics. It should also be noted that evolving expressions play a fundamental role in John Koza's optimization method, known as Genetic Programming [10]. Concurrent with Sims, William Latham, with technical assistance from Steven Todd, began exhibiting computer-generated, synthetic, three-dimensional organic sculptures, also created using Dawkins's evolutionary paradigm requiring the user to assign fitness based on aesthetics [11]. The difference between the two approaches was that Sims's genotypes were (binary) trees implemented as LISP expressions, while Latham's genotypes were parameter sequences implemented as bit strings.
Because Sims's original design was too computationally demanding to be of general use, Sims's successors proceeded to refine his methods for designing and implementing image-generating systems based on evolving expressions and simulated evolution using fitness by aesthetics. Perhaps the best known are the works of Steven Rooke [12], though we were contemporaries [13]. Tatsuo Unemi made a very restrictive X Windows version of a Sims-style system available via the Web [14]. Michael Papka et al. designed a Sims-style system to evolve 3D polygonal isosurfaces as an application to help test the immersive environment CAVE [15]. Frank McGuire described a Sims-style 3D polygonal modeling system [16]. Aladin Ibrahim invoked Sims's methods to evolve Renderman shaders, texturing and coloring routines that are used for animation [17]. Ken Musgrave developed a Sims-style prototype for MetaCreations, a graphics software company [18]. Ted Bedwell and David Ebert investigated the possibility of using Sims's method to evolve implicit surfaces [19]. John Mount exhibits via the Web a Sims-style system based on quaternion maps [20]. Additional examples are described by Andrew Rowbottom [21]. Undoubtedly there are many more examples of which I am not aware.
Where is this leading? In the future, one might envision a virtual reality populated by artists (populations of image-producing agents) and art critics (populations of image-consuming agents), who will decide which artworks shall be on display for the humans visiting the virtual art galleries. But for this to occur, some means must be found to automate simulated evolution so that a user need not be the hands-on decision-maker directing it. Three initial forays have been made into this area. They are considered below.
Simulated Aesthetics
In 1994, three Carnegie Mellon University graduate students, Shumeet Baluja, Dean Pomerleau and Todd Jochem, published the results of their efforts to fully automate a Sims-style system [22]. After implementing a bare-bones Sims-style image-generation system, they collected images generated during a user's session in order to establish a database of such images for that user. Next, 400 randomly selected images from the user's database were numerically rated by that user for their aesthetic value. Rated images were then resolved to 48 × 48 pixels, and training and testing sets of rated images were extracted, with equal representation given to low-, medium- and high-ranked images. The researchers trained and tested a neural net using these sets and, finally, had the net guide the interactive evolution by assigning the fitnesses, without human intervention, to populations of images generated by their Sims-style system. In this way, the net substituted for the user.
The effort expended, together with the number of different experiments the authors performed, was impressive. According to the authors, however, the results were "somewhat disappointing" and "mixed and very difficult to quantify." They concluded that it was difficult for the neural nets to learn or discern any aesthetic principles. The authors also noted that the neural nets' exploration of image space was "very limited and largely uninteresting." They suggested that the greatest potential for such automated approaches might lie in pruning away uninteresting images and directing the human "assistant" to more promising ones, because their nets did a good job of aesthetically rating poor images but were erratic when aesthetically rating good images.
Shortly thereafter, Rooke undertook similar experiments, but using a much different experimental design. Rooke's objective was to evolve art critics, which he called "image commentators," to perform aesthetic evaluations of the images his Sims-style system generated, which he could then use to guide the image-evolution process [23]. Rooke planned to present his critics with visually promising, as opposed to random, starting populations of genotypes. Thus, unlike those of Baluja et al., his critics were not forced to start from scratch. Each critic, itself an expression, but one capable of numerically combining many image-processing operations on arbitrary tessellations of pixels in the image phenotype, assigned an aesthetic fitness value to each image in the population. The training set Rooke used consisted of 100 previously evolved images together with Rooke's own aesthetic rankings of those images. Since populations of critics were also populations of expressions, albeit expressions packed with location-finding primitives and statistical measurement primitives, Rooke's critics could be evolved using the artificial genetics of expressions. Rooke evolved his critics until they could duplicate the fitness rankings of his training set to within an acceptable tolerance. To put his critics to work, Rooke gave them his top-ranked images from 20 successive generations of an evolutionary run. After each subsequent generation, the oldest of these images would be removed, and the image from the current population with the best aesthetic fitness, as judged by the top-ranked critic, would be kept. Thus, after 20 generations the critics were in complete control. Evolution of the critics was sustained by matching their ranking ability against the sliding window of 20 images, while evolution of the images themselves was sustained by the rankings of the critics. Rooke let his critics guide the evolution for 300 generations.
Rooke judged his art critics to have been capable of learning his aesthetics, but, once again, they seemed incapable of using this "knowledge" to explore new areas of image space. One plausible explanation is that Rooke's critics were being trapped in eddies of image space. Rooke suggested that it might be necessary to work side by side with his critics and intervene every so often to put them back on track by reassigning aesthetic fitness rankings to the current image set and then re-evolving the critics: a human-assisted coevolution scenario. Following the 1997 Digital Burgess Conference, Rooke and Steve Grand initiated an on-line discussion [24] about the viability of using artificial-life coevolution techniques for aesthetics. One idea that emerged was the need for a "physics" of aesthetics, meaning a theoretical framework from which aesthetic principles could be derived or tested.
Fig. 3. Computer-generated coevolved images, 2000. (© Gary Greenfield) These coevolved images were culled from two different runs of my coevolution simulation. They were coevolved starting with small random host image populations, but with running times lasting up to 6,000 generations. Noteworthy is the ability of the coevolutionary system to produce diverse imagery by exploring different evolutionary trajectories in image space.
A tangential development, which I feel plays an important role in understanding the aesthetics of visual images, is the work of Tony Belpaeme [25], who attempted to evolve expressions whose internal nodes consisted entirely of classical image-processing primitives, in an attempt to discover new and useful digital filters for computer vision. The aesthetic, or fitness, criterion used for the experiment was how successful the filter was at distinguishing among the images of his test set. One intriguing outcome of Belpaeme's experiments was that the evolved filters turned out to be very small expressions, possessing very few internal nodes. One might wonder whether a hidden bias towards computational efficiency was incorporated into the fitness criterion. The explanation Belpaeme offered was that chaining image-processing functions caused a significant loss of image information content, thereby reducing the fitness of expressions with many nodes.
Such prior work helps indicate why I think the problem of automating the fitness-decision step for evolving images is a difficult one, and why I see coevolution based on the Sims method as a significant challenge. Before taking up this challenge, we must review some of the developments of artificial-life coevolution research. The first artificial-life coevolution simulation was published by Danny Hillis [26], who applied coevolutionary techniques to an optimization problem based on sorting. Subsequently, Sims gave a stunning example of learning in an artificial environment based on coevolution [27].
Fig. 4. Computer-generated coevolved images, 2000. (© Gary Greenfield) These are coevolved images from runs permitting larger genotypes but with the simulation allowed to run for only 1,500 generations. The host population size was 30, with three parasites per host. The images shown here were selected from a pool of the most-fit images obtained by culling from the host population every 200 generations. These examples demonstrate the ability of the system to produce visually complex imagery by exploring different evolutionary trajectories in image space.
Using directed graphs for genotypes, Sims constructed virtual "creatures" to compete in virtual contests of "capture the flag." Sims made mesmerizing videos of the evolved behavior of his creatures.
Following Sims's example in investigating "creature" evolution in artificial environments supporting various kinds of artificial physics, others have caused impressive behaviors to evolve, including artificial walking and swimming [28].
There are several obstacles to overcome when trying to adapt previous coevolutionary research to the problem of evolving images based on fitness by aesthetics. First, Hillis's sorters are straightforward to assign a fitness to. Clear optima are recognizable. Sorters either sort or do not. Similarly, Hillis's coevolving population of parasites is either clearly successful at preying upon the sorters by finding examples of difficult lists for them to sort or it is not. Second, the behavior generators coevolved by Sims and his successors seem to depend on competition between individuals of the same species for a resource, or for successful completion of a task, which is a far cry from selecting promising visual images.
Further complicating matters, the very nature of coevolution and its underlying principles has recently begun to be questioned. Dave Cliff and Geoffrey Miller point out the difficulty of recognizing and measuring the so-called Red Queen effect, which results when two coevolving populations change each other's fitness landscapes [29], while Sevan Ficici and Jordan Pollack question the sustaining power of the coevolutionary arms race by analyzing mediocre stable states and the prevalence of evolutionary cycling [30].
At last I am able to formulate clearly the problem I try to address: Given a Hillis coevolutionary framework, with one species consisting of host images generated using the Sims method of evolving expressions and another species consisting of image parasites whose design is predicated on the concept of image filtering, in order for image parasites to prey upon the host images based on aesthetics, how will the parasites be judged? In other words, what will the algorithmic assessment of aesthetic fitness be? Max Bense and Abraham Moles made early attempts to use the Shannon concept of information theory, which allowed them to numerically quantify the "message content" of a visual image, as the guiding principle for an analysis of aesthetic processes. Frieder Nake, commenting on those attempts, concluded: "Although some exciting insight into the nature of aesthetic processes was gained this way, the attempt failed miserably. Nothing really remains today of their theory that would arouse any interest for other than historical reasons" [31].
In a recent issue of the Journal of Consciousness Studies, Vilayanur Ramachandran and William Hirstein sparked considerable debate by putting forth a number of computational aesthetic principles, the primary one being exaggeration [32]. Interestingly, the computational formulation of exaggeration that they propose is strikingly similar to the fitness measure used in the artificial-life sexual selection experiments of Gregory Werner and Peter Todd [33]. In considering the challenge of evolving a population of images (hosts) and a population of aesthetic observers (parasites), one is heartened by the words of an artificial-life visionary, Thomas S. Ray, who wrote,
We do not know yet, if we can ever expect evolution in the digital medium to express a level of creativity comparable to what we have seen in the organic medium. However, it is likely that evolution can only reach its full creative potential, in any medium, when it is free to operate entirely by natural selection, in the context of an ecological community of co-evolving replicators [34].
Inspiring words. Below, I shall take up this coevolutionary challenge using aesthetic fitness.
Images from Expressions
I now formally describe how I generate phenotypes (images) from genotypes (expressions). A genotype is defined to be an expression tree E written in postfix form. The leaves of the tree are chosen from a set consisting of constants with values ranging from 0.000 to 0.999 in increments of 0.001, together with the variables V0 and V1. The internal nodes are chosen from sets of unary and binary primitives. We often refer to primitives as basis functions. A unary primitive is a function from the unit interval to itself, and a binary primitive is a function from the unit square to the unit interval. A left-to-right stack evaluation procedure assigns each point (V0, V1) of the unit square a value E(V0, V1) in the unit interval, which is then assigned to a color. For convenience, I resolve phenotypes at a resolution of 100 × 100 pixels. The nodes of the genotype, sometimes referred to as alleles or nucleotides, possess arity, that is, the number of arguments each basis function requires: zero for terminals, one for unary basis functions, two for binary basis functions. Three of my binary basis functions were adapted from John Maeda [35] (see Fig. 1), but most were holdovers from previous image-generation systems I have designed [36].
Parasites for Images
My motivation is as follows: A visually "interesting" image is one that causes our filtering apparatus, our eyes, to generate anomalies for our brain to process. Recent findings by Stine Vogt [37] offer evidence in favor of this hypothesis. I want my images to evolve in such a way that they affect our filtering apparatus. Thus, I attach digital filters to fixed locations on the image, convolve small areas, or patches, of the image with the filters, and then compare the convolved patches with the original patches. I seek images for which the convolved patches are significantly different from the original patches. The interpretation is that the filter is parasitic upon the host image, attempting to blend with the image at the location at which it is attached (the patch), while the host image attempts to repel the parasite by making it visible as a blemish. Apart from the goal of generating images rather than filters, the crucial difference between my model and Belpaeme's is that I use only one simple type of filter to process a small part of an image, while he uses an expression tree made up of many different complex types of filters to process entire images. I provide a brief overview of the necessary details [38].
Given a 100 × 100-pixel host image defined digitally by the values h_{i,j} lying in the interval [0,1] and obtained using the expression E(V0, V1), where V0 = i/100 and V1 = j/100, at a fixed location L, I extract a 10 × 10 patch p_{i,j} with 1 ≤ i, j ≤ 10. A parasite is represented as a 3 × 3 matrix of integers (f_{i,j}) with 0 ≤ i, j ≤ 2, whose values are restricted to lie in the interval [-P, P], where the constant P is currently chosen to be P = 8. The neighborhood of the patch is the 12 × 12 region of the image consisting of the original patch surrounded by a 1-pixel-wide border. When I pass the filter over the neighborhood, I obtain a convolved patch v_{i,j}. To make precise the comparison between the original patch and the convolved patch, I assign the fitness H to the host image by tallying those pixels for which |v_{i,j} - p_{i,j}| exceeds the host's exposure threshold T. A threshold of T = 0.05 seems to work well. Because the patch is 10 × 10, we are able to define the fitness for the parasite to be 100 - H. When several parasites are attached to a host, the host's fitness is the average of its fitness values taken over all its attached parasites.
Artificial Genetics for Hosts and Parasites
I use the standard genetic operators for the host expression genotypes. The host crossover operator exchanges subtrees between two host genotypes, and the host point mutation operator causes every node of the host genotype to have a small probability of being replaced using a different basis function selected from among the set of basis functions of the same arity. I am not aware of any artificial genetics having been previously implemented for (3 X 3) image filters. Because I felt that filters/parasites should be viewed as exceedingly primitive organisms I did not use any mating operators. Instead reproduction was accomplished by cloning the parasite and subjecting the clone, with some small probability, to a small number of transcription operators (e.g. exchange of two rows or columns, shifts of a row or column, or exchange of two entries) before passing it to the parasite point mutation operator, which, with some small probability, allows for one or more entries of the matrix to be perturbed.
Coevolution of Evolved Images
My coevolutionary scenario can now be straightforwardly described. During initialization I fix the locations that will be available for parasites to attach to. I generate a random population of hosts and attach randomly generated parasites to each host at each of the available locations. Parasite populations are specific to the locations where they are attached, analogous to the way a species of fish might have wholly different parasites for specific internal organs. This means that for each location, every host has a parasite attached at that location, and the collection of all parasites attached at that location constitutes a distinct parasite population. At each time step, fitness updates are calculated and the least-fit hosts are removed from the population. Matings between randomly chosen surviving hosts are used for replacements. Similarly, for the parasite populations, at each location the least-fit parasites are removed from their hosts and their replacements are determined by cloning the most-fit survivors from that location's population. Moreover, to further reinforce the notion that parasites are a primitive, more rapidly evolving species than hosts, all the parasites of a population are subject to some small chance of mutation at each time step. A newly conceived host inherits the parasites that were attached to the host it has replaced [39].
Since a host's parasites survive on patches representing only one percent of the total area of the host, and since the phenotype can be processed as a digital image rather than a visual image, the coevolution implementation is fast. Of course, to monitor the coevolution, I must cull the host population and examine the phenotypes. Typically I cull one or two host images with the highest fitnesses every 200 time steps.
Discussion and Examples
Since my goal is to obtain visually interesting images, I have found it necessary to impose one additional constraint on the genetics of my system. Before describing it, I note that by having individual parasite populations irritate the hosts locally while the hosts can only react globally (that is, the basis functions used as nodes in the genotype are globally defined), a tension arises between these local irritations and the global response. Unfortunately, there are two obvious and uninteresting ways for hosts to fight the local invasion. The first is to evolve thin vertical bands in the phenotype with many contrasting values, so that a parasite cannot adjust to the resulting global discontinuities; the second is to create tiny islands of such discontinuities resembling a cloud of droplets. I call such anomalies degenerate images (see Fig. 2). Close examination of their genotypes reveals that they consist of one or two binary basis functions heavily modulated by unary basis functions. As is so often the case with the genetic algorithm, the hosts quickly found and exploited this flaw in the design of my fitness calculations. I countered by restricting the percentage of unary basis functions that could appear in a genotype, as sketched below. The purpose of allowing any unary basis functions at all was to permit visual "smoothing." Since the binary basis functions offer a variety of visual contrasts, forcing the hosts to incorporate sufficiently many of them into their genotypes helped sustain the visual complexity I needed in order to evolve interesting images.
I tested my coevolutionary simulation using populations of 30 hosts, with as many as five locations for parasites per host, yielding up to five separate parasite populations of 30 parasites each (see Color Plate B No. 2). During one representative simulation run, lasting 1,000 time steps, 10,030 hosts and 79,505 parasites were examined, although only 10 hosts were culled. No human could make aesthetic fitness decisions for so many host images in any reasonable amount of time. Given that I place an upper bound on the size of the host genotype and restrict the number of unary primitives allowed in the host genotype, it is remarkable that only 24 times during this run was a mating attempt between hosts unsuccessful in the 10 tries allotted for achieving a valid host mating using crossover. Does cycling ever occur during my coevolution? Yes, at least to some extent. Hybrids and variants of the degenerate images similar to those mentioned above may appear and reappear during the course of coevolution. However, I have visual evidence that, even if I start with random populations, subsequent evolutionary trajectories escape from these uninteresting degeneracies. My explanation is that parasites in some sense "chase" their hosts over the fitness landscape; hosts under evolutionary pressure ward off parasites by fleeing to new regions of image space along different trajectories; hence, newly emerging, fitter hosts will tend to be quite different from fitter hosts from earlier epochs (see Fig. 3).
Future Work
Benchmarking of the coevolutionary simulation's capabilities would be of interest. To make reasonable comparisons between images evolved by humans using Sims-style systems and images evolved from (coevolutionary) Sims-style systems incorporating simulated aesthetics will require considerable additional effort. The task is further complicated by the practice of users seeding runs from archived gene banks and by the wide disparity in the sets of basis functions that artists and researchers use. Because I did not use seeding, images such as those shown in my figures are probably best thought of as organisms culled from the "primordial ooze." Even so, it is gratifying to see the image complexity that can arise using coevolution during simulation runs that permit hosts to use larger genotypes but restrict the number of generations the simulation is allowed to run for (see Fig. 4). I am currently working to extend my model so that hosts are organized spatially into subpopulations, or demes, so that mating takes into account spatial limitations. At intervals, hosts and parasites from one deme can share genetic information with neighboring demes. Such a model would be more faithful than my current one to Hillis's original coevolutionary design.
Gary R. Greenfield (educator), Department of Mathematics and Computer Science, University of Richmond, Richmond, VA 23173, U.S.A. E-mail: <ggreenfi@richmond.edu>.
References and Notes
Footnotes
An earlier version of this paper was presented at the Seventh International Conference on Artificial Life (Alife VII), 1-6 August 2000, Portland, OR, U.S.A. First published in M.A. Bedau, J.S. McCaskill, N.H. Packard and S. Rasmussen, eds., Artificial Life VII: Proceedings of the Seventh International Conference (Cambridge, MA: MIT Press, 2000). Reprinted by permission.