We are The 123s!

This semester I’ve started playing sax with The 123s, a local blues and rock’n’roll band. I’ve really been enjoying our setlist, and we just uploaded two songs to YouTube.

Our next gig is Friday, March 29th, at The Back Door, playing at the Blues on Blues benefit for the Trained Eye Arts Center from ~5:45-7pm ($5). I’m most excited about this summer though: we’re headlining at The Bishop on Thursday, May 23rd!

2012 in Music

This year, music has again become more than a consumptive activity. Through Afro-Hoosier, Canterbury, and my own noodling, I feel like I’m actively listening for arrangements and harmonies, and it feels wonderful to make that transition as a musician.

So what have I been listening to? The top 10 are pretty indicative:

’12 Artist ’11 Change
1 The Avett Brothers 2 (+1)
2 Wilco 4 (+2)
3 Radiohead 5 (+2)
4 Paul Simon 48 (+44)
5 John Mayer 24 (+19)
6 The xx (–)
7 Kings of Leon 9 (+2)
8 The Black Keys 12 (+4)
9 Bright Eyes 11 (+2)
10 TV on the Radio 20 (+10)

The Avett Brothers are riding on the strength of The Carpenter, which is a stellar album. John Mayer also rests on the strength of Born and Raised, which is easily my family’s favorite album of 2012. My experience with Paul Simon reflects that of seeing Sufjan Stevens – concert in November, followed by “whoa, this is really interesting” for the rest of time. The diversity reflected in his songwriting is amazing. The xx were the coolest “new” sound: very minimalist and sparse, with hip-hop beats and interesting guitar interplay. Their eponymous debut album is a must-have.

New Discoveries (YouTube playlist): Alabama Shakes (blues), The Lonely Forest (alt rock), Cage the Elephant (rock), Passion Pit & Handsome Furs (80s revival synth-pop), Portugal. the Man (psychedelic/rock), Of Monsters and Men (folk), Joshua Radin (singer/songwriter), Ben Howard (contemporary), BADBADNOTGOOD (jazz fusion, heavy electronic/hip-hop influences), Morphine (bass/bari sax/drum trio), Kid Cudi (hip-hop), Nero (dubstep), Above & Beyond (trance), and Shpongle (psychedelic/trance).

Concerts: Above & Beyond, Radiohead, The Black Keys, Shpongle, Todd Snider, Outside Lands (Beck, Andrew Bird, Justice, Thee Oh Sees, Antibalas, Alabama Shakes, Portugal. the Man, Sigur Ros, Die Antwoord, Explosions in the Sky), The Avett Brothers, Victor Wooten.

Embracing Open Technologies

As a computer scientist, my software and hardware environment is the most critical part of my professional life. Furthermore, as a digital native, this landscape is the stratum upon which many of my interactions are built. Just as in our physical life, our digital life should inhabit healthy surroundings. Thus, I’ve entered a period of deep contemplation about the services I use, and have started embracing the ethos of the GNU Project: the tech we use should reflect the values we hold. To this end, there are three gradual shifts to my computing environment: adopting Linux, migrating to GitHub, and deactivating my Facebook account.

Ownership, Context, Responsibilities

The first notion is one of ownership, and there are two aspects: licensing and data. Open-source licensing solves many distribution problems, allowing system-wide update managers that upgrade all my software at once, rather than being bombarded with popup windows for each application. However, not all software works this way, and so we must confront the ambiguous reality of digital rights management (DRM). Last month, I had to replace my motherboard, which triggered Windows to inform me that I may have been a victim of software piracy. This is because the license is tied to the physical installation of the software, rather than to my right to use the software. App stores, such as the Steam Platform, solve this problem by tying the software to the user, rather than the installation. So long as DRM does not interfere with the portability of my intellectual property, I am comfortable with it.

The cloud is a double-edged sword when it comes to ownership and portability. On the one hand, by distributing data across multiple servers, we gain reliability and ubiquitous access, at the expense of security. On the other hand, many cloud storage implementations (e.g., Dropbox) do not follow file transfer standards in place since the 80s, locking you into their proprietary service and software. In contrast, services like GitHub offer remote hosting, but do not lock you into their system – your data is always portable. Amazon MP3 also offers portability through un-encrypted, unlimited download MP3s. By adhering to standards, applications guarantee openness of data, so long as the standards are published and APIs are available.

However, standards, even when published, require compliance and ubiquity, and it is here that Facebook fails. While championing the Open Graph protocol for data, Facebook follows the old Microsoft approach to standards: “Embrace, extend, and extinguish.” Messages are the clearest example of this. Every user on Facebook automatically has an e-mail address @facebook.com. This address, though, is not accessible via the standard IMAP or POP protocols, yet it can receive messages from any address, locking them into the Facebook ecosystem. We are digital sharecroppers, handing over content with false promises of ownership, constantly undermined by forced changes to benefit corporate interests.

The context of these messages has also rapidly changed. While they were once analogous to e-mail, they are now analogous to chat, a wildly different medium (with the Jabber/XMPP open standard giving a facade of openness). Wall posts have undergone similar context shifting – from the early days of wall-to-wall conversations, to status comments, to the timeline – and all the while not offering easily accessible search. Control over context is a critical right for digital interactions, a point argued best by danah boyd. With nearly one billion users, Facebook is a self-described “social utility”, which vests it with a social responsibility to its users. Given Facebook’s rejection of this responsibility, I have deactivated my Facebook account, in favor of controlling my own context at my personal web page. It is my hope that future social networks will maintain a balance between the free-for-all of MySpace pages and the rigor of Facebook profiles.

We also must have the right to be forgotten. Facebook maintains negative-space data, and based on network structure alone it is possible to infer unreported profile data and unregistered users. Klout auto-computes their metric for all Twitter users, regardless of whether they have registered for the service, driving thousands of registrations just to opt out, forcing people to hand over their personal data regardless of their participation. This is a major problem for all social applications. The power of social applications is mighty, and maintaining user control is critical, lest we unintentionally surrender our identity to others.

Dimensions of Services

While I’ve sketched out some specific considerations, there are a few general principles to extract. It’s important to note that the above arguments have little to do with the notion of privacy, highlighting that the principle of openness is very different from the principle of publicity. It is possible to have an open system which is private. For example, private GitHub repositories are inherently open: the fundamental data, the code, is all accessible to the user, even though it is kept from the public. Privacy and openness are also separate from commercial interests and cost. Gmail is a private, open, free, commercial system: it adheres to the very same IMAP protocol as all other mail servers and costs nothing to use, yet it stores private information and is monetized by the company. When it comes to privacy, we must first start with openness, because privacy is built on trust. If you are not trusted with access to your own data, how can you trust that system with it?

Contemplating services within this framework still has issues: how do I deal with Steam, which is a closed, private, commercial service? The last aspect is portability. While my software is locked to the Steam service, it is not locked to a particular computer. Richard Stallman even makes a well-tempered argument that Steam can be beneficial for the Linux ecosystem by offering certain freedoms of choice, and the company itself has made a huge commitment to open-source development – rapidly improving Linux graphics drivers.

Containing the Semantic Explosion

Yesterday afternoon, I delivered a talk to the PhiloWeb Workshop at the WWW2012 Conference titled “Containing the Semantic Explosion” with Cameron Buckner and Colin Allen. It is an overview of the InPhO Project architecture, known as dynamic ontology, and a preview of some forthcoming data mining tools. [slides]

The explosion of semantic data on the information web, and within digital philosophy, requires new techniques for organizing and linking these knowledge repositories. These must address concerns about consistency, completeness, maintenance, usability, and pragmatics, while reducing the cost of double experts trained both in ontology design and the target domain. Folksonomy approaches address concerns about usability and personnel at the expense of consistency, completeness, and maintenance. Upper-level formal ontologies address concerns about consistency and completeness, but require double experts for the initial construction and maintenance of the representation. At the Indiana Philosophy Ontology (InPhO) Project, we have developed a general methodology called dynamic ontology, which alleviates the need for double experts, while addressing concerns about consistency, completeness and change through machine learning over a domain corpus, and concerns about usability and pragmatics through human input and semantic web standards. This representation can then be used by other projects in digital philosophy, such as the Stanford Encyclopedia of Philosophy (SEP) and PhilPapers, along with resources outside of digital philosophy enabled by the LinkedHumanities project. [slides]

Grad School: The Right Place

If you like where you live, if you like what you do,
If you like what you’re seein, when you’re lookin at you,
If you like what you’re sayin, when you open your face,
Then you got the right feeling, you’re in the right place.
Monsters of Folk – “The Right Place”

In November, I delivered two lectures to student organizations on campus and realized that I really miss teaching. Despite the amazing flexibility of a career in research and development, I won’t be able to find fulfillment until I am working with students. The only way to realize that goal is to become a professor, and to do that I need a PhD, so I applied to graduate schools in December.

After visiting the available options, I’ve decided to continue my studies at Indiana University, pursuing the Joint PhD in Cognitive Science and Computer Science. All in all, IU just feels like the right place. I’m well-positioned to make a lasting impact, both in my own studies and in the community, and there’s no break for moving to a new city and building a new professional network. Plus, there is a large amount of social and financial stability in Bloomington, which helps maintain my sanity.

As for now, I’m off to Lyon, France to give a presentation titled “Containing the Semantic Explosion”, covering work with the Indiana Philosophy Ontology Project. An abstract and slides will follow later this week.

2011 in Music

Once again it is time to do a musical year-in-review. I feel some of my scrobble counts are off this year due to the launch of the Amazon Cloud Player, which I’ve been using at work. Of course, my 2009 play counts were also off due to sporadic iPod syncing, but this is still fairly accurate.

’11 Artist ’10 Change
1 Cold War Kids 68 (+67)
2 The Avett Brothers 1 (-1)
3 Death Cab for Cutie 21 (+18)
4 Wilco 7 (+3)
5 Radiohead 2 (-3)
6 Kanye West 78 (+72)
7 Say Hi 8 (+1)
8 Daft Punk 34 (+26)
9 Kings of Leon 9 (–)
10 Counting Crows 12 (+2)

Right below this list of top artists is a significant number of new discoveries. In the folk scene, I’ve been listening to Ryan Adams, The Head and the Heart, and The Goat Rodeo Sessions. In the indie scene, I’ve been listening to Death Cab for Cutie’s newest album, Cold War Kids, and Florence + the Machine. Sonic Youth has been an awesome discovery – Goo is making weekly appearances in my playlists.

TV on the Radio is the coolest band I’ve discovered this year. Their arrangements are superb, and I really like their use of horns. The first song I heard (and subsequently fell in love with) is “Things You Can Do”. The new album, Nine Types of Light, has an accompanying movie that is essential viewing for fans of Waking Life. Also, the movie has some amazing quotes: “It’s an unspeakable name. You don’t say it, you just look at it.”

The biggest musical change this year may not be reflected in play counts, but rather in consumption practice. I’ve been going to way more concerts in the past few months, including Paul Simon, Punch Brothers, Gillian Welch & David Rawlings, Taj Mahal, Cold War Kids, They Might Be Giants, Main Squeeze, End Times Spasm Band, Joe Pug, and Say Hi. Bloomington has an astonishing number of bands come through, and because it’s a smaller town, we get to see them in smaller venues.

I’ve also continued switching to Amazon MP3, which has gotten even better with the advent of the Amazon Cloud Player, with clients for Windows, Mac OS X, Linux, and Android. It’s nice having easy, instant access to my music anywhere. My only complaint is that the Amazon MP3 Downloader doesn’t have a 64-bit Linux client.


Last week I wrote and then gave two lectures on “Categorization” and “Practical Parallelism”. It was a ton of fun to prepare them, and actually giving them made me realize how much I miss teaching. Abstracts and slides follow.


Categorization

Student Organization for Cognitive Science (SOCS)
November 15, 2011 @ 5:30pm

Abstract: Categorization is a fundamental problem in cognitive science that goes by a multitude of names: in artificial intelligence, categorization is known as clustering; in mathematics, the problem is partitioning. There are many applications in linguistics, vision, and memory research. In this talk, I will provide a brief overview of exemplar vs. prototype models in the cognitive sciences (Goldstone & Kersten 2003), followed by an introduction to three different general-purpose clustering algorithms: k-means (MacQueen 1967), qt-clust (Heyer et al. 1999), and information-theoretic clustering (Gokcay & Principe 2002). Open-source Python implementations of each algorithm will be provided.
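To give a flavor of the first of those algorithms, here is a minimal k-means sketch in plain Python. It is not the implementation distributed with the talk: the names k_means, squared_distance, and mean, along with the seed and max_iter parameters, are illustrative choices made for this sketch, and it assumes points are equal-length tuples of numbers.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty collection of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points: Sequence[Point], k: int, max_iter: int = 100,
            seed: int = 0) -> Tuple[List[Point], List[int]]:
    """Cluster points into k groups; returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")

    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(list(points), k)
    labels: Optional[List[int]] = None

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stop once assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = mean(members)

    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, assignment = k_means(data, k=2)
    print(centers, assignment)
```

The loop alternates the two classic steps (assign, then update) until the labels stop changing. The result depends on the random starting centroids, so real uses typically run it several times and keep the tightest clustering.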


Practical Parallelism

CS Club Tech Talk
November 17, 2011 @ 7pm

Abstract: In this talk, I will give a brief overview of several key parallelism concepts and practical tools for several languages. After this talk, attendees should have the resources to recognize and solve “painfully parallel problems”. Topics will include: threads vs. processes, Amdahl’s Law, shared vs. distributed memory, synchronization, locks, pipes, queues, process pools, futures, OpenMP, MapReduce, Hadoop, and GPU programming.
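Two of the topics above, Amdahl’s Law and process pools, fit in a few lines of Python. This is only a sketch and not material from the talk itself: amdahl_speedup, word_count, and count_all are hypothetical names invented here, and word counting stands in for any task where each input can be handled independently.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Iterable, List


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's Law: best-case speedup is 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def word_count(text: str) -> int:
    """An independent unit of work: no shared state, no locks needed."""
    return len(text.split())


def count_all(documents: Iterable[str], workers: int = 4) -> List[int]:
    """Map word_count over documents using a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, like the built-in map.
        return list(pool.map(word_count, documents))


if __name__ == "__main__":
    # With 90% of the work parallelizable, 4 workers give about 3.08x.
    print(round(amdahl_speedup(0.9, 4), 2))
```

Calling count_all(["a b c", "d e"]) from a script’s main guard would farm each document out to a separate process. Amdahl’s Law explains why adding workers eventually stops helping: the serial fraction puts a ceiling on the speedup no matter how large the pool is.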


A New Chapter

In July, the Indiana Philosophy Ontology (InPhO) Project was awarded a new NEH-DFG Bi-lateral Digital Humanities Program grant with the University of Mannheim for linking and populating digital humanities databases. Our current grant ends in December, so this brought tons of relief, injecting $172,215 into the project. The DFG’s contribution of €126,400 allows InPhO co-founder Mathias Niepert to return to the project, along with his team at the University of Mannheim. All in all, the project will be able to continue for another two years.

As a result of the grant, I was offered a full-time, salaried faculty position as a Visiting Research Associate with the IU Cognitive Science Program, continuing work on the InPhO Project. During this time, I will be working on new methods of knowledge representation and machine learning with applications in document classification, ontology evaluation, and taxonomy alignment, bringing the digital humanities into the Linked Open Data initiative. I’ll also be working on a new bibliography management system for the Stanford Encyclopedia of Philosophy, using a tool developed for Cognitive Science Program faculty publication records.

I started the new position on August 16th. The new full-time job, plus the move to my own 1-bedroom apartment, along with joining the band, have me falling more and more in love with Bloomington. For the first time in a long, long time, I’m satisfied with where I am. Looking forward to this new chapter of post-college life.

Summer 2011

Figured it’s been another 4 months, so it’s time for another life update. This was an incredibly productive summer with the open-sourcing of the InPhO Project, an extremely successful refactoring, and two publications hitting press. It was also fun, as I started gigging with Afro-Hoosier International and took a road trip up California 1 with my brothers. All in all, a great bookend on this past chapter of life.


All of the InPhO code has been open-sourced and uploaded to GitHub in two repos. The inpho repo contains our data mining code, while the inphosite repo contains our API and website. Most of the code in the inpho repo was newly ported from Java so that we could use NLTK and integrate with the ORM. We hired a new undergraduate, Evan Boggs, to help refactor the code, and after a long summer, we were able to cut 10,000 lines from the code base.

In July, I quit the Syriac Reference Portal (SRP), after several months of work deploying Semantic MediaWiki and the new COGS Bibliography Engine. I learned a lot about generalizability of the InPhO code, and what the humanities side of digital humanities needs, but ultimately the data provenance goals of the historical community are still an open question for semantic web research and standardization, and I want to focus my research efforts elsewhere. I hope the project finds success and will continue to support it through work on the COGS Bibliography Engine.

Publications-wise, the work on speciation and clustering was accepted as a full paper at the European Conference on Artificial Life (ECAL). I’m really pleased with the biological narrative we were able to weave, and am continuing to work with Larry Yaeger and Sean Dougherty on adapting the clustering tool to larger datasets. Also, the paper Colin and I wrote on the InPhO API from last year’s Chicago Colloquium on Digital Humanities and Computer Science was finally published.


In May, I joined Afro-Hoosier International, a local afropop and world music dance band. Five gigs in, it’s been crazy fun to play sax with other people again. We’re an 11-piece band, with three horns, three vocalists, keyboard, guitar, bass, kit, and auxiliary, and we groove. We’ll be hitting the studio sometime soon to put together an album – I’m really pumped. This is a recording from my second gig with the band in Bryan Park:

At the end of July, I finally got to take a little vacation from the grind. For the first time ever, both of my brothers and I headed out to California at the same time to visit my Dad. While we were there, we took a road trip up the North Coast on California 1 to the Avenue of the Giants, the Black Sands, and Arcata in Humboldt County. We managed to make no plans at all, and took things at a completely leisurely pace, stopping and going as we pleased. I kept my cell phone and e-mail turned off for a record 5 days.

InPhO for All: Why APIs Matter

This month Colin Allen and I published “InPhO for All: Why APIs Matter” in the Journal of the Chicago Colloquium on Digital Humanities and Computer Science (JDHCS). It’s a short piece setting up the API development narrative for digital humanists. Abstract, citation, and paper link follow.

The unique convergence of humanities scholars, computer scientists, librarians, and information scientists in digital humanities projects highlights the collaborative opportunities such research entails. Unfortunately, the relatively limited human resources committed to many digital humanities projects have led to unwieldy initial implementations and underutilization of semantic web technology, creating a sea of isolated projects without integratable data. Furthermore, the use of standards for one particular purpose may not suit other kinds of scholarly activities, impeding collaboration in the digital humanities. By designing and utilizing an Application Platform Interface (API), projects can reduce these barriers, while simultaneously reducing internal support costs and easing the transition to new development teams. Our experience developing an API for the Indiana Philosophy Ontology (InPhO) Project highlights these benefits.

Jaimie Murdock and Colin Allen. InPhO for All: Why APIs Matter. In Journal of the Chicago Colloquium on Digital Humanities and Computer Science (JDHCS). Evanston, Illinois, 2011. [paper]