2011 in Music

Once again it is time to do a musical year-in-review. I feel some of my scrobble counts are off this year due to the launch of the Amazon Cloud Player, which I’ve been using at work. Of course, my 2009 play counts were also off due to sporadic iPod syncing, but this is still fairly accurate.

’11 Artist ’10 Change
1 Cold War Kids 68 (+67)
2 The Avett Brothers 1 (-1)
3 Death Cab for Cutie 21 (+18)
4 Wilco 7 (+3)
5 Radiohead 2 (-3)
6 Kanye West 78 (+72)
7 Say Hi 8 (+1)
8 Daft Punk 34 (+26)
9 Kings of Leon 9 (–)
10 Counting Crows 12 (+2)

Right below this list of top artists is a significant number of new discoveries. In the folk scene, I’ve been listening to Ryan Adams, The Head and the Heart, and The Goat Rodeo Sessions. In the indie scene, I’ve been listening to Death Cab for Cutie’s newest album, Cold War Kids, and Florence + the Machine. Sonic Youth has been an awesome discovery: Goo is making weekly appearances in my playlists.

TV on the Radio is the coolest band I’ve discovered this year. Their arrangements are superb, and I really like their use of horns. The first song I heard (and subsequently fell in love with) is “Things You Can Do”. The new album, Nine Types of Light, has an accompanying movie that is essential viewing for fans of Waking Life. Also, the movie has some amazing quotes: “It’s an unspeakable name. You don’t say it, you just look at it.”

The biggest musical change this year may not be reflected in play counts, but rather in consumption practice. I’ve been going to way more concerts in the past few months, including Paul Simon, Punch Brothers, Gillian Welch & David Rawlings, Taj Mahal, Cold War Kids, They Might Be Giants, Main Squeeze, End Times Spasm Band, Joe Pug, and Say Hi. Bloomington has an astonishing number of bands come through, and because it’s a smaller town, we get to see them in smaller venues.

I’ve also continued switching to Amazon MP3, which has gotten even better with the advent of the Amazon Cloud Player, with clients for Windows, Mac OS X, Linux, and Android. It’s nice having easy, instant access to my music anywhere. My only complaint is that the Amazon MP3 Downloader doesn’t have a 64-bit Linux client.

Talks

Last week I wrote and then gave two lectures on “Categorization” and “Practical Parallelism”. It was a ton of fun to prepare them, and actually giving them made me realize how much I miss teaching. Abstracts and slides follow.

Categorization

Student Organization for Cognitive Science (SOCS)
November 15, 2011 @ 5:30pm

Abstract: Categorization is a fundamental problem in cognitive science that goes by a multitude of names: in artificial intelligence, categorization is known as clustering; in mathematics, the problem is partitioning. There are many applications in linguistics, vision, and memory research. In this talk, I will provide a brief overview of exemplar vs. prototype models in the cognitive sciences (Goldstone & Kersten 2003), followed by an introduction to three different general-purpose clustering algorithms: k-means (MacQueen 1967), qt-clust (Heyer et al. 1999), and information-theoretic clustering (Gokcay & Principe 2002). Open-source Python implementations of each algorithm will be provided.
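For readers who have not met k-means before, here is a minimal sketch of the idea in plain Python. It is not the implementation distributed with the talk; the function names, the parameters, and the choice of starting centroids (a seeded random sample of the input points) are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's iteration for k-means (hypothetical helper, not the talk's code).

    Returns (centroids, labels), where labels[i] is the index of the
    centroid nearest to points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k input points
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, labels
```

Calling `kmeans(points, 2)` on a list of coordinate tuples returns the two centroids and one cluster label per point. The result depends on the starting centroids, so in practice it is common to run it with several seeds and keep the tightest clustering.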

Slides

Practical Parallelism

CS Club Tech Talk
November 17, 2011 @ 7pm

Abstract: In this talk, I will give a brief overview of several key parallelism concepts and practical tools for several languages. After this talk, attendees should have the resources to recognize and solve “painfully parallel problems”. Topics will include: threads vs. processes, Amdahl’s Law, shared vs. distributed memory, synchronization, locks, pipes, queues, process pools, futures, OpenMP, MapReduce, Hadoop, and GPU programming.
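Two of those topics fit in a few lines of Python. The sketch below is not taken from the slides: `amdahl_speedup` states Amdahl’s Law directly, and `count_all` shows the futures pattern using the standard library’s `concurrent.futures`. The `word_count` task and both function names are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup from Amdahl's Law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def word_count(text):
    """Stand-in for an independent unit of work (hypothetical task)."""
    return len(text.split())


def count_all(documents, workers=4):
    """Submit one future per document and collect results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(word_count, doc) for doc in documents]
        return [f.result() for f in futures]
```

A job that is half serial can never run more than twice as fast, however many workers are added, which is what `amdahl_speedup(0.5, n)` approaches as `n` grows. A thread pool is used here only to keep the sketch portable; for CPU-bound work in CPython, `ProcessPoolExecutor` from the same module offers the same `submit` interface and is the more likely choice.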

Slides

A New Chapter

In July, the Indiana Philosophy Ontology (InPhO) Project was awarded a new NEH-DFG Bi-lateral Digital Humanities Program grant with the University of Mannheim for linking and populating digital humanities databases. Our current grant ends in December, so this brought tons of relief, injecting $172,215 into the project. The DFG’s contribution of €126,400 allows InPhO co-founder Mathias Niepert to return to the project, along with his team at the University of Mannheim. All in all, the project will be able to continue for another two years.

As a result of the grant, I was offered a full-time, salaried faculty position as a Visiting Research Associate with the IU Cognitive Science Program, continuing work on the InPhO Project. During this time, I will be working on new methods of knowledge representation and machine learning with applications in document classification, ontology evaluation, and taxonomy alignment, bringing the digital humanities into the Linked Open Data initiative. I’ll also be working on a new bibliography management system for the Stanford Encyclopedia of Philosophy, using a tool developed for Cognitive Science Program faculty publication records.

I started the new position on August 16th. The new full-time job, plus the move to my own 1-bedroom apartment, along with joining the band, have me falling more and more in love with Bloomington. For the first time in a long, long time, I’m satisfied with where I am. Looking forward to this new chapter of post-college life.

Summer 2011

Figured it’s been another 4 months, so it’s time for another life update. This was an incredibly productive summer with the open-sourcing of the InPhO Project, an extremely successful refactoring, and two publications hitting press. It was also fun, as I started gigging with Afro-Hoosier International and took a road trip up California 1 with my brothers. All in all, a great bookend on this past chapter of life.

Work

All of the InPhO code has been open-sourced and uploaded to GitHub in two repos. The inpho repo contains our data mining code, while the inphosite repo contains our API and website. Most of the code in the inpho repo was newly ported from Java so that we could use NLTK and integrate with the ORM. We hired a new undergraduate, Evan Boggs, to help refactor the code, and after a long summer, were able to cut 10,000 lines from the code base.

In July, I quit the Syriac Reference Portal (SRP), after several months of work deploying Semantic MediaWiki and the new COGS Bibliography Engine. I learned a lot about generalizability of the InPhO code, and what the humanities side of digital humanities needs, but ultimately the data provenance goals of the historical community are still an open question for semantic web research and standardization, and I want to focus my research efforts elsewhere. I hope the project finds success and will continue to support it through work on the COGS Bibliography Engine.

Publications-wise, the work on speciation and clustering was accepted as a full paper at the European Conference on Artificial Life (ECAL). I’m really pleased with the biological narrative we were able to weave, and am working with Larry Yaeger and Sean Dougherty on adapting the clustering tool to larger datasets. Also, the paper Colin and I wrote on the InPhO API for last year’s Chicago Colloquium on Digital Humanities and Computer Science was finally published.

Play

In May, I joined Afro-Hoosier International, a local afropop and world music dance band. Five gigs in, it’s been crazy fun to play sax with other people again. We’re an 11-piece band, with three horns, three vocalists, keyboard, guitar, bass, kit, and auxiliary, and we groove. We’ll be hitting the studio sometime soon to put together an album, and I’m really pumped. This is a recording from my second gig with the band in Bryan Park:

At the end of July, I finally got to take a little vacation from the grind. For the first time ever, both of my brothers and I headed out to California at the same time to visit my Dad. While we were there, we took a road trip up the North Coast on California 1 to the Avenue of the Giants, the Black Sands, and Arcata in Humboldt County. We managed to make no plans at all, and took things at a completely leisurely pace, stopping and going as we pleased. I kept my cell phone and e-mail turned off for a record 5 days.

InPhO for All: Why APIs Matter

This month Colin Allen and I published “InPhO for All: Why APIs Matter” in the Journal of the Chicago Colloquium on Digital Humanities and Computer Science (JDHCS). It’s a short piece setting up the API development narrative for digital humanists. Abstract, citation, and paper link follow.

The unique convergence of humanities scholars, computer scientists, librarians, and information scientists in digital humanities projects highlights the collaborative opportunities such research entails. Unfortunately, the relatively limited human resources committed to many digital humanities projects have led to unwieldy initial implementations and underutilization of semantic web technology, creating a sea of isolated projects without integratable data. Furthermore, the use of standards for one particular purpose may not suit other kinds of scholarly activities, impeding collaboration in the digital humanities. By designing and utilizing an Application Platform Interface (API), projects can reduce these barriers, while simultaneously reducing internal support costs and easing the transition to new development teams. Our experience developing an API for the Indiana Philosophy Ontology (InPhO) Project highlights these benefits.

Jaimie Murdock and Colin Allen. InPhO for All: Why APIs Matter. In Journal of the Chicago Colloquium on Digital Humanities and Computer Science (JDHCS). Evanston, Illinois, 2011. [paper]

Reflections on Privacy

For many people, the primary privacy concern is the "no parents" concept – we don’t care who sees things as long as our "parents" don’t see it (where parents can be anyone we don’t want to see things – professional contacts, straight-edge acquaintances, terrorists, Julian Assange, etc.). This is what I term the exclusive privacy model: start with the public and begin cutting people out. However, this "public minus parents" idea doesn’t make sense. Online, you just have to log out to see this information. Offline, all someone has to do is talk. Facebook was originally marketed this way: here is a place to post information where only Harvard/Ivy League/college students can see it.

This exclusive model is the most common privacy misperception. Information spreads, and once we consciously recognize this, privacy becomes synonymous with trust. For example, you send an e-mail, confide in a friend, or upload a photo. This is private information, but is capable of being shared or forwarded in any number of ways, both online and offline (e.g., gossip). Its reach is mitigated by social convention and our own discretion.

Google+ gets this inclusive privacy model right. First, it always explicitly states who an item is being shared with, not who is being excluded. When resharing an item that was shared with a limited circle, it notifies you of the original intent, highlighting the privilege and trust placed in you. Just like an e-mail program’s forward button, each piece of content has a share button and the API will allow for all data to be federated outside of Google+. However, you also can disable the reshare for each posting. Someone else can always copy-paste your content, but it won’t be computationally linked to you.

Privacy isn’t just about information, it’s about image as well. Google+ enables full control over your profile. Instead of posting to your wall or tagging you in a photo, people communicate with you directly through limited shares which do not appear on your public profile. Photo tags don’t appear in your albums until they are approved. A box in the upper right corner of your profile allows you to view it as any other user. Voyeurism is all but eliminated, as you do not see a constant stream of external interactions. Facebook has some of these settings, but they are not as pervasive in the profile.

The Next Step

Google+ seems to have figured out a better way to handle privacy – both in terms of information and image – but the next social networking revolution is targeting: I don’t care who sees what I post, but I am self-conscious about overloading people with irrelevant information. My ideal publishing model wouldn’t be about circles of people, but streams of tagged content. If there existed a service where you could follow a person, but mute certain content streams (such as local events, politics, etc.), we’d have perfection. For example, friends in Kentucky don’t care about tornadoes around Bloomington. Professional contacts may be extremely interested in my philosophy and technology content, but don’t care about what concerts I’m going to. People who aren’t in the same circles (hometown friends, college friends, professional contacts, etc.) may share interests in internet humor or politics, while others consider unfollowing me because of it. None of this information is private, but I don’t want to inundate the world with extraneous chatter. If a social network can figure this out, that’s where I’ll plant my flag.
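To make the "follow a person, mute a stream" idea concrete, here is a toy sketch in Python. No existing service’s API is being described; the `Post` and `Follower` classes, their fields, and the tag names are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    text: str
    tags: frozenset = frozenset()  # content streams this post belongs to


@dataclass
class Follower:
    # Maps each followed author to the set of tags muted for that author.
    muted: dict = field(default_factory=dict)

    def follow(self, author, mute=()):
        """Follow an author, optionally muting some of their tagged streams."""
        self.muted[author] = set(mute)

    def feed(self, posts):
        """Keep posts from followed authors that carry no muted tag."""
        return [
            p for p in posts
            if p.author in self.muted and not (p.tags & self.muted[p.author])
        ]
```

A professional contact would call `follow("jaimie", mute={"concerts"})` and still receive everything tagged philosophy or technology. The filtering decision sits with the reader rather than the publisher, which is the inverse of picking circles at posting time.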

Speciation and Information Theory

For the past two semesters, I’ve been doing some exploratory work marrying speciation with information theory in the framework of the Polyworld artificial life simulator. The simulation gives us a nice framework for mathematically “pure” evolutionary theory and exploration of neural complexity. We’ve applied clustering algorithms to the genetic information, revealing evidence of both sympatric and allopatric speciation events. The key algorithmic intuition is that genes which are highly selected for will conserve, while those which are not will descend to a random distribution (and thus high entropy), so each dimension (gene) can be weighted by its information certainty to alleviate the curse of dimensionality.
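The weighting idea can be sketched in a few lines of Python. This is not the code in the Polyworld trunk: the function names are hypothetical, and the sketch assumes each gene takes one of `num_values` discrete values (with `num_values` greater than 1) and defines certainty as one minus the gene’s entropy divided by the maximum possible entropy.

```python
import math
from collections import Counter


def gene_certainty(values, num_values):
    """1 - H/H_max for one gene: 1.0 is fully conserved, 0.0 is uniform."""
    total = len(values)
    counts = Counter(values)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / math.log2(num_values)


def certainty_weights(genomes, num_values):
    """One weight per gene (column) across a population of genomes."""
    return [gene_certainty(column, num_values) for column in zip(*genomes)]


def weighted_distance(a, b, weights):
    """Euclidean distance with each gene scaled by its certainty weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
```

Under these assumptions a gene that has drifted to a uniform distribution gets a weight near zero and contributes almost nothing to the distance between two genomes, so the clustering is effectively carried out over the conserved genes only.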

The work was accepted as a poster and extended abstract for the Genetic and Evolutionary Computation Conference (GECCO), and was accepted as a full paper for the European Conference on Artificial Life (ECAL). The full paper is substantially revised from the initial GECCO submission, and provides an introduction to several problems of biological, computational, and information theoretic importance. The visualizations, including several videos showing the cluster data, were especially fun to create, and I’m proud of the finished product.

There are still several more research directions from this work: the allopatric and sympatric effects have not been differentiated, only one environment was analyzed (consistent with past work on evolution of complexity), the clustering algorithm’s thresholds were not explored for hierarchical effects, alternate clustering algorithms were not explored (future open-source project for me: clusterlib), … Still, the present work is encapsulated, the source is in the Polyworld trunk, and it was accepted for publication.

Abstract, citation, and paper follow.

Complex artificial life simulations can yield substantially distinct populations of agents corresponding to different adaptations to a common environment or specialized adaptations to different environments. Here we show how a standard clustering algorithm applied to the artificial genomes of such agents can be used to discover and characterize these subpopulations. As gene changes propagate throughout the population, new subpopulations are produced, which show up as new clusters. Cluster centroids allow us to characterize these different subpopulations and identify their distinct adaptation mechanisms. We suggest these subpopulations may reasonably be thought of as species, even if the simulation software allows interbreeding between members of the different subpopulations, and provide evidence of both sympatric and allopatric speciation in the Polyworld artificial life system. Analyzing intra- and inter-cluster fecundity differences and offspring production rates suggests that speciation is being promoted by a combination of post-zygotic selection (lower fitness of hybrid offspring) and pre-zygotic selection (assortative mating), which may be fostered by reinforcement (the Wallace effect).

Jaimie Murdock and Larry Yaeger. Identifying Species by Genetic Clustering. In Proceedings of the 2011 European Conference on Artificial Life. Paris, France, 2011. [paper]

Spring 2011

With the passing of another semester comes another life update post. Even though I am no longer a student, being embedded in academia means progress is still measured by semesters.

Recently, I was awarded the Provost’s Award for Undergraduate Research and Creative Activity, which was a really nice capstone on my undergraduate experience. Since I did not walk at graduation, the Honors Convocation was a good opportunity to give my family closure on this chapter of my life.

Throughout these few months, I’ve been busy writing up a storm – one week in April saw 30 pages of manuscripts submitted. My previous post details the accepted poster summary on "Genetic Clustering for Species Identification" and the accepted book chapter on "Evaluating Dynamic Ontologies". There are two more papers in review and preparation right now. One is an expansion of the speciation work for a (hopeful) full-paper presentation. The other details work on taxonomy alignment carried out this semester.

I’ve still been travelling a ton. In December, I headed to Berkeley for my first California Christmas with Dad and Justin and my first non-business trip in 4 months. Three weeks later, I went back to California for a site visit at the Stanford Encyclopedia of Philosophy, Big Data Camp, and the O’Reilly Strata Conference. Strata was amazing – learned a ton, and met some really great people. Definitely planning to go again next year. I was scheduled to go to the Digital Humanities API Workshop but snow delays forced me to cancel, and last-minute logistics changes made PyCon and ThatCamp SE impossible to attend. These three were certainly disappointments, but after being in an airport every month for 8 months, it was kind of nice to stay rooted for a while. Earlier this week, I visited Princeton University and Beth Mardutho: The Syriac Institute, as part of my work with the Syriac Reference Portal.

On a more personal note, the diaspora of friends, including my roommate of 3 years, has been steadily widening since graduation. This has been offset, however, by just as many friends changing their plans to either stay in Bloomington or move back. While we will no longer have a single house to hang out in all the time, I’m excited about the social continuity next year.

Two New Publications

This past week brought two publication deadlines, a conference submission deadline, and preparation for a software demo at Harvard. Needless to say, I am exhausted, but it was well worth the effort.

The first publication is a 2-page summary of work I’ve been doing with Prof. Larry Yaeger looking at speciation mechanisms in artificial life simulations. This was a condensation of a paper submission for the Genetic and Evolutionary Computation Conference, and I’m really pleased with how much we were able to squeeze in. Abstract, citation, and link follow:

Artificial life simulations can yield distinct populations of agents representing different adaptations to a common environment or specialized adaptations to different environments. Here we apply a standard clustering algorithm to the genomes of such agents to discover and characterize these subpopulations. As evolution proceeds new subpopulations are produced, which show up as new clusters. Cluster centroids allow us to characterize these different subpopulations and identify their distinct adaptation mechanisms. We suggest these subpopulations may reasonably be thought of as species, even if the simulation software allows interbreeding between members of the different subpopulations. Our results indicate both sympatric and allopatric speciation are present in the Polyworld artificial life system. Our analysis suggests that intra- and inter-cluster fecundity differences may be sufficient to foster sympatric speciation in artificial and biological ecosystems.

Jaimie Murdock and Larry Yaeger. Genetic Clustering for Species Identification. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO) 2011. Dublin, Ireland, 2011. [paper]

The second publication is an expansion of the work on ontology evaluation presented last year at the 2010 International Conference on Knowledge Engineering and Ontology Development (KEOD) in Valencia, Spain. We’ve completely rewritten the section on our volatility score, and tightened up the language throughout. The 20-page behemoth will be published as a chapter in an upcoming volume of Springer-Verlag’s Communications in Computer and Information Science (CCIS) series. Abstract, citation, and link follow:

Ontology evaluation poses a number of difficult challenges requiring different evaluation methodologies, particularly for a "dynamic ontology" generated by a combination of automatic and semi-automatic methods. We review evaluation methods that focus solely on syntactic (formal) correctness, on the preservation of semantic structure, or on pragmatic utility. We propose two novel methods for dynamic ontology evaluation and describe the use of these methods for evaluating the different taxonomic representations that are generated at different times or with different amounts of expert feedback. These methods are then applied to the Indiana Philosophy Ontology (InPhO), and used to guide the ontology enrichment process.

Jaimie Murdock, Cameron Buckner and Colin Allen. Evaluating Dynamic Ontologies. Communications in Computer and Information Science (Lecture Notes). Springer-Verlag. 2011. [chapter]

Farewell to a Friend

Today is the memorial service for Helga Keller, a dear friend who has changed so many lives. Since the first weekend in Bloomington, Helga has been a surrogate grandmother for me. We met after church the first weekend I was here and had an instant bond: A German immigrant, her first home in America was my hometown – Murray, Kentucky – and she knew many friends from home. She also was the administrative assistant for Douglas Hofstadter, one of the major inspirations for coming to Indiana. These common bonds of faith, people, and place brought us together throughout the years.

There are so many things for which I am extremely grateful to her. One day I sent her an e-mail inquiring about the CopyCat program and where I could find articles about it. She responded with a three-page e-mail, with the article attached, links to all subsequent research, contact information for all of the authors, an offer to introduce me to them, and an invitation to the CRCC lab meetings. As if that weren’t enough, the next time I encountered her she gave me an autographed copy of the book the study appeared in, along with photocopies of the articles mentioned in the e-mail.

This is but one of many stories of her overwhelming kindness and dedication. May she rest in peace.
