Category: biology

  • Speciation and Information Theory

    For the past two semesters, I’ve been doing some exploratory work marrying speciation with information theory in the framework of the Polyworld artificial life simulator. The simulation gives us a nice framework for mathematically “pure” evolutionary theory and exploration of neural complexity. We’ve applied clustering algorithms to the genetic information, revealing evidence of both sympatric and allopatric speciation events. The key algorithmic intuition is that genes which are highly selected for will conserve, while those which are not will descend to a random distribution (and thus high entropy), so each dimension (gene) can be weighted by its information certainty to alleviate the curse of dimensionality.

    The work was accepted as a poster and extended abstract for the Genetic and Evolutionary Computing Conference (GECCO), and was accepted as a full paper for the European Conference on Artificial Life (ECAL). The full paper is substantially revised from the initial GECCO submission, and provides an introduction to several problems of biological, computational, and information theoretic importance. The visualizations, including several videos showing the cluster data, were especially fun to create, and I’m proud of the finished product.

    There are still several more research directions from this work: the allopatric and sympatric effects have not been differentiated, only one environment was analyzed (consistent with past work on evolution of complexity), the clustering algorithm’s thresholds were not explored for hierarchical effects, alternate clustering algorithms were not explored (future open-source project for me: clusterlib), … Still, the present work is encapsuled, the source is in the Polyworld trunk, and it was accepted for publication.

    Abstract, citation, and paper follow.

    Complex artificial life simulations can yield substantially distinct populations of agents corresponding to different adaptations to a common environment or specialized adaptations to different environments. Here we show how a standard clustering algorithm applied to the artificial genomes of such agents can be used to discover and characterize these subpopulations. As gene changes propagate throughout the population, new subpopulations are produced, which show up as new clusters. Cluster centroids allow us to characterize these different subpopulations and identify their distinct adaptation mechanisms. We suggest these subpopulations may reasonably be thought of as species, even if the simulation software allows interbreeding between members of the different subpopulations, and provide evidence of both sympatric and allopatric speciation in the Polyworld artificial life system. Analyzing intra- and inter-cluster fecundity differences and offspring production rates suggests that speciation is being promoted by a combination of post-zygotic selection (lower fitness of hybrid offspring) and pre-zygotic selection (assortative mating), which may be fostered by reinforcement (the Wallace effect).

    Jaimie Murdock and Larry Yaeger. Identifying Species by Genetic Clustering. In Proceedings of the 2011 European Conference on Artificial Life. Paris, France, 2011. [paper]

  • Two New Publications

    This past week brought two publication deadlines, a conference submission deadline, and preparation for a software demo at Harvard. Needless to say, I am exhausted, but it was well worth the effort.

    The first publication is a 2-page summary of work I’ve been doing with Prof. Larry Yaeger looking at speciation mechanisms in artificial life simulations. This was a condesnation of a paper submission for the Genetic and Evolutionary Computing Conference, and I’m really pleased with how much we were able to squeeze in. Abstract, citation, and link follow:

    Artificial life simulations can yield distinct populations of agents representing different adaptations to a common environment or specialized adaptations to different environments. Here we apply a standard clustering algorithm to the genomes of such agents to discover and characterize these subpopulations. As evolution proceeds new subpopulations are produced, which show up as new clusters. Cluster centroids allow us to characterize these different subpopulations and identify their distinct adaptation mechanisms. We suggest these subpopulations may reasonably be thought of as species, even if the simulation software allows interbreeding between members of the different subpopulations. Our results indicate both sympatric and allopatric speciation are present in the Polyworld artificial life system. Our analysis suggests that intra- and inter-cluster fecundity differences may be sufficient to foster sympatric speciation in artificial and biological ecosystems.

    Jaimie Murdock and Larry Yaeger. Genetic Clustering for Species Identification. In Proceedings of the Genetic and Ecolutionary Computation Conference (GECCO) 2011. Dublin, Ireland, 2011. [paper]

    The second publication is an expansion of the work on ontology evaluation presented last year at the 2010 International Conference on Knowledge Engineering and Ontology Development (KEOD) in Valencia, Spain. We’ve completely rewritten the section on our volatility score, and tightened up the language throughout. The 20-page behemoth will be published as a chapter in an upcoming volume of Springer-Verlag’s Communications in Computer and Information Science (CCIS) series. Abstract, citation, and link follow:

    Ontology evaluation poses a number of difficult challenges requiring different evaluation methodologies, particularly for a "dynamic ontology" generated by a combination of automatic and semi-automatic methods. We review evaluation methods that focus solely on syntactic (formal) correctness, on the preservation of semantic structure, or on pragmatic utility. We propose two novel methods for dynamic ontology evaluation and describe the use of these methods for evaluating the different taxonomic representations that are generated at different times or with different amounts of expert feedback. These methods are then applied to the Indiana Philosophy Ontology (InPhO), and used to guide the ontology enrichment process.

    Jaimie Murdock, Cameron Buckner and Colin Allen. Evaluating Dynamic Ontologies. Communications in Computer and Information Science (Lecture Notes). Spencer-Verlag. 2011. [chapter]