Author: Jaimie Murdock

  • Graduation

    Final grades are in, so I can finally announce that on December 17, 2010, I graduated from Indiana University with dual degrees and honors in Cognitive Science and Computer Science after 7 semesters.

    I’m extraordinarily excited to finally be done with coursework so I can focus entirely on research. I’ll be continuing work with Prof. Colin Allen and the Indiana Philosophy Ontology Project (InPhO), completing our integration with the Stanford Encyclopedia of Philosophy (SEP) and working on further refinements of the dynamic ontology methodology, generalizing our methods for use in other disciplines. Starting in January, I’ll be working with Prof. David Michelson at the University of Alabama to redeploy the InPhO for the Syriac Reference Portal (SRP). This will hopefully lead to extended collaborations in the digital humanities.

    Additionally, I’m planning to continue contributing to Prof. Larry Yaeger’s Polyworld Project, which is an Artificial Life simulation that provides a framework for replicable studies in evolution, genetics, and neural networks. I’ve been working on methods of species identification, using information-theoretic measures of genetic distance. This has led to a series of complexity improvements to a popular clustering algorithm used in bio-informatics. I’ve also built a data-access library in Python to facilitate analysis and visualization of experimental data.

    Sometime last year I started travelling all the time. My work was presented in Valencia, Evansville, and Chicago, and I further went to DC, Berkeley, Palo Alto, Louisville, Nashville, and Madrid. So far I’ve got three big trips planned for 2011: Santa Clara in January for the O’Reilly Strata Conference, DC and Philadelphia in February, and Atlanta in March for PyCon. Over the summer, I’ll hopefully be headed to conferences in Ireland and San Francisco, but we’ll see how that goes.

    Past that, my future plans are predicated on the results of my Fulbright proposal. If this comes through, I’ll head to Karlsruhe, Germany in July to spend a year as a research assistant, developing methods for ontology-driven machine translation and sentiment analysis in collaboratively-generated corpora. Either way, 2011 should be a great year!

  • Published!

    In June, my paper "Two Methods for Evaluating Dynamic Ontologies" was accepted to the 2nd Knowledge Engineering and Ontology Development (KEOD) Conference in Valencia, Spain on October 25-28. The paper was co-authored with Cameron Buckner, a graduate student in Philosophy, and Colin Allen, a Professor in Cognitive Science and History & Philosophy of Science, and details some of our work with the Indiana Philosophy Ontology (InPhO) Project.

    This paper is the culmination of two summers of research on knowledge representation. If you’re interested in the InPhO project, section 3 of the paper is a reasonably accessible summary. The paper as a whole deals with a subproblem in ontologies – how do you quantify the quality of a candidate knowledge representation? We hypothesize that the structure of a domain corpus should be reflected in the structure of a taxonomy of that domain, and that a better taxonomy will better match the corpus statistics.

    I’ll be headed to Valencia October 22-31, and the Hutton Honors College has generously approved a travel grant to cover expenses for the week. I’ve set up my flights to and from Madrid, and I’ll have 2 days before and 3 days after the conference to wander around Spain — I’ve never been to Europe before, so I’m extremely excited!

    The abstract is below:

    Ontology evaluation poses a number of difficult challenges requiring different evaluation methodologies, particularly for a "dynamic ontology" representing a complex set of concepts and generated by a combination of automatic and semi-automatic methods. We review evaluation methods that focus solely on syntactic (formal) correctness, on the preservation of semantic structure, or on pragmatic utility. We propose two novel methods for dynamic ontology evaluation and describe the use of these methods for evaluating the different taxonomic representations that are generated at different times or with different amounts of expert feedback. The proposed "volatility" and "violation" scores represent an attempt to merge syntactic and semantic considerations. Volatility calculates the stability of the methods for ontology generation and extension. Violation measures the degree of "ontological fit" to a text corpus representative of the domain. Combined, they support estimation of convergence towards a stable representation of the domain. No method of evaluation can avoid making substantive normative assumptions about what constitutes "correct" representation, but rendering those assumptions explicit can help with the decision about which methods are appropriate for selecting amongst a set of available ontologies or for tuning the design of methods used to generate a hierarchically organized representation of a domain.

  • virtualbox-bin in Gentoo

    Some non-Linux-dork posts are in the pipe, but today I had issues getting VirtualBox up and running on Gentoo. Here are some proper install instructions to work around Bug 283617. I’ll get to fixing the ebuild later this weekend.

    emerge virtualbox-bin
    chmod 4750 /opt/VirtualBox/VBoxNetAdpCtl
    chmod 4510 /opt/VirtualBox/VBoxSDL /opt/VirtualBox/VBoxHeadless /opt/VirtualBox/VirtualBox
    gpasswd -a youruser vboxusers
    

    After a logout/login, VirtualBox should appear in your Applications menu, and can be run from the command line with VirtualBox.

  • Installing a Brother Printer on Gentoo

    I’ve been migrating over to Gentoo from Ubuntu (more on this later) and today had the lovely experience of installing a printer. Since at least 2 other computers will be needing these instructions, here we are:

    Install CUPS

    1. emerge cups
    2. /etc/init.d/cupsd start
    3. rc-update add cupsd default

    Install Driver

    1. Download the LPD and PPD RPM drivers from Brother’s Linux driver site.
    2. emerge rpm tcsh
    3. rpm -ihv --nodeps (lpr-drivername)
    4. rpm -ihv --nodeps (cupswrapper)
    5. Verify the drivers installed correctly: rpm -qa | grep -e (lpr-drivername) -e (cupswrapper-drivername) (if this is your only rpm package, just use rpm -qa)
    6. Create a symlink to the filter: ln -s /usr/lib/cups/filter/brlpdwrapper[printer name] /usr/libexec/cups/filter/brlpdwrapper[printer name]

    Add printer

    1. In a browser, go to the CUPS server at http://localhost:631/
    2. Click Add Printer and enter a name. Location and description are optional, but user-friendly.
    3. On the next page select: Device: AppSocket/HP JetDirect
    4. On the next page enter: Device URI: socket://192.168.1.11 (substitute with the IP address of your printer)
    5. The final page has a list of printer manufacturers. Skip that and click Choose File. Select the proper PPD file at /usr/share/cups/model/(printermodel).ppd. Click next.
    6. Print a test page and enjoy!

    As an aside, I did stumble upon the Brother PPD source code; however, there were no makefiles for my printer, nor were there any LPD drivers. It is unfortunate to have rpm or dpkg as a dependency for my printer drivers, but so be it – they’re lightweight packages on their own.

  • Ahead of the Curve

    Another perfect day
    They keep pilin’ up
    I got happiness that I can maintain
    So beginner’s luck

    I had shoes to fill
    Walkin’ barefoot now
    Can’t tell north from south
    But no split hair’s gonna get me down

    Stayin’ above the flat line
    I’m ahead of the curve
    Take a piece of the sunshine with me
    On a redeye flight to another world

    It isn’t any trouble
    If you wanna come with me
    I know it’s out of the question, honey
    But I sure could use the company
    And a place to be

    Now the sky is pink
    Rooftop swimmin’ pool
    I’m not carefree, no
    I’m free to care
    I just never do

    All the bags are checked
    And the reasons why
    Yesterday lingers on
    That’s the piece you keep when you say goodbye

    You can get what you want now
    Knock it out of the park
    Bury it by the river, easy
    There’s a search party but it’s getting dark

    I won’t hold you to nothin’
    I wanna make that plain
    Probably end up a stranger and crazy
    But I’m still hopin’ there’s another way
    And a place to stay

    What the scene has got too sentimental
    When the night comes
    When the night comes loose
    All the things you put upon the mantle
    What a shame
    What a shame
    It’s old news

    I’m stayin’ above the flatline
    I’m ahead of the curve
    Take a piece of the sunshine with me
    On an all-night drive to another world

    You can get what you want now
    Knock it out of the park
    Probably end up a drifter and lonely
    But I’m still hoping for a change of heart
    And a place
    A place
    A place
    To start

    Monsters of Folk – Ahead of the Curve

  • Spring Semester

    Spring semester is over! All in all, this was an extremely exciting semester. A very brief, concise recap:

    It’s been super busy, but also incredibly stimulating. My grades are coming back, finances are under control and sleep is finally consistent. Can’t wait to keep things moving and get back on the bike this summer!

  • Dealing with ATi’s Linux Drivers

    ATi’s Linux support has gotten much better, but it still leaves much to be desired. Kernel upgrades pushed through the update manager tend to break the ATi kernel module. I’ve found the quickest, most painless fix is to simply uninstall and then reinstall the drivers:

    sudo /usr/share/ati/fglrx-uninstall.sh
    

    Download the latest drivers from the ATI site: 32-bit and 64-bit <http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx>.

    Open a terminal in the directory with the downloaded file (note: your exact file name may be different):

    sudo chmod +x ati-driver-installer-10-2-x86.x86_64.run
    sudo ./ati-driver-installer-10-2-x86.x86_64.run
    

    Install the drivers, restart the computer and type the following into a terminal:

    sudo aticonfig -f --initial
    

    Then restart X (Ctrl+Alt+Backspace) or restart the computer and all will be well!

    Update: If you experience black or grey screen artifacts in Firefox/Thunderbird using Catalyst 10.6 or higher, it may be due to the new 2D rendering system. To force use of the old XAA system run the following command after the initial aticonfig setup:

    sudo aticonfig --set-pcs-str=DDX,ForceXAA,TRUE
    

    Restart X, and all should be well!

  • More Curriculum Musings

    I’ve been making a bunch of comments on Computer Science education lately. The New York Times has an excellent article about “Making Computer Science More Enticing” which focuses on Stanford’s new curriculum. The Stanford curriculum is very similar to IU’s new specialization-based curriculum and seems to be an excellent approach to “teaching the discipline”.

    Also, I found the “definitive” document on CS education – The ACM/IEEE Computing Curriculum 2008 Update [PDF].

    Why so much focus on education? Computer Science is a (relatively) new discipline with a multitude of high-impact applications, giving us an imperative to train students quickly. Unfortunately, the speed at which our field is moving can cause us to lose sight of the philosophy behind the science.

    If someone wants to learn Biology, you would point them to Campbell & Reece. If someone wants to learn computation, where do you point them? A list of books. There are books focused on introducing algorithms and functional programming (SICP); there are tomes focused on general computation (Knuth); there are books focused on application (the entire O’Reilly library); there are definitive texts on specific languages (The C Programming Language, The Scheme Programming Language); there does not seem to be a widely-accepted, integrative introduction that emphasizes computation — algorithms and models. From what I’m observing in CS curricula across the country, the coursework is moving in this direction, but we still need this cohesive “Introduction to Computing” book.

    As a final message, this video linked in the NYT article captures the beauty, richness and excitement of our discipline right now — “It’s sort of like you’re geometers and you’re living in the time of Euclid”:

  • Computer Studies

    The latest issue of Communications of the ACM, the premier computer science journal, contains an interesting article by IU Professor Dennis Groth — Why an Informatics Degree? The article has much to say about the necessity of application and applied computing as a measure of computer science success.

    However, there are some questions left unanswered. First, I address two questions in philosophy of science: “What is Computer Science?” and “Why Informatics?” I then address the pedagogical implications of these questions in a section on “Computer Studies”.

    What is Computer Science?

    Any new discipline needs to consider its philosophy in order to establish a methodology and range of study. Prof. Groth’s definitions of Computer Science and Informatics do not quite capture these considerations:

    Computer science is focused on the design of hardware and software technology that provides computation. Informatics, in general, studies the intersection of people, information, and technology systems.

    In explicitly linking the science to its implementation, this definition of Computer Science fumbles away its essence. Yes, the technology is important and provides a crucial instrument on which to study computation, but at its core computer science studies computation — information processing. Computer science empirically examines this question by studying algorithms (or procedures) in the context of a well-defined model (or system).

    This conflation of implementation and quantum is extremely pervasive. For example, Biology is “the study of life”, but in a (typical) biology class one never addresses the basic question: “What is life?” The phenomena of life can be studied independently of the specific carbon-based implementation we have encountered. This doesn’t deny the practical utility of modern biology, but it does raise the question of how useful our study of the applied life is to our understanding of life itself. (If you’re interested in this line of questioning, I highly recommend Larry Yaeger’s course INFO-I486 Artificial Life.)

    Similarly, Computer Science can study procedures independently of the hardware and software implementations. Consider the sorting problem. (If you are unfamiliar with sorting, see the Appendix: Sorting Example.) One would not start by looking at processor architecture or software design, but would instead focus on the algorithm. Pure Computer Science has nothing to do with hardware or software; they are just an extremely practical medium on which we experiment.

    Why Informatics?

    Informatics seems to be ahead of itself here in asking “Why an Informatics degree?” before asking the more fundamental “Why Informatics?” There are two primary definitions implied in the article. The more popular answer is that “Informatics solves interdisciplinary problems through computation”. The second, emerging answer is that “Informatics studies the interaction of people and technology”.

    The first definition defines a methodology but does not define a subject. It should be obvious that we live in a collaborative, interdisciplinary world. Fields should inform one another, but there is still a distinction between fields: Biology studies life; Computer Science studies computation; Cognitive Science studies cognition; Chemistry studies chemicals; etc. One can approach any problem with any number of techniques – computing is one part of this problem-solving toolkit, along with algebra, calculus, logic and rhetoric. However, each of the particular sciences should answer some natural question – whether that be a better explanation of life, computation, mathematics or cognition. Using one field to address problems in another field does not make a new field; it makes applied [field] or [field] engineering.

    The other definition, that informatics studies the interaction of people and technology, hints at a new discipline studying a quantum of “interaction”. This area has tons of exciting research, especially in human-computer interaction (HCI) and network science. Further emphasizing this would go a long way toward creating a new discipline and set a clear distinction between the informaticist and the computer scientist. Computer scientists study computation; informaticists study interaction; both should be encouraged. As it stands, both study “computers” and both step on each other’s toes.

    Computer Studies

    This discussion of philosophies has important implications for how we structure computer-related education (formalized as Computer Studies). Despite major differences in our approaches, it does seem clear that Computer Science and Informatics should work together, especially in applications.

    However, as currently implemented at IU, the Informatics curriculum is a liberal arts degree in technology. Formal education should teach either a vocation, a discipline or (ideally) both. Informatics seems to answer to neither claim by emphasizing how informaticists “solve problems with computers” without diving into programming or modeling. If it aims to teach such a vocation, then more application is necessary to give expertise; if it aims to teach a discipline, it is fine to do that through application, but we must recognize that application is only useful insofar as it benefits theory (and vice versa). Additionally, if the field does indeed have a quantum of interaction, then interaction should be the forefront of the curriculum.

    IU’s Computer Science ex-department is a valiant effort to teach a discipline – in the span of 4 years we cover at least 3 distinct programming paradigms (functional, object-oriented and logic) spread over 4 distinct languages, bristling with an exploration of algorithms. That being said, I would be surprised if more than 25% of the graduating class could explain a Turing Machine.

    Not everyone is into theory – most people really just want to “solve problems with computers” and have a good job. Where do these programmers go? Informatics does not address this challenge, and shouldn’t attempt to. The answer is software engineering – just as applied physics finds a home in classical engineering. By establishing a third program for those clearly interested in application, IU would have a very solid “computer studies” program (as distinguished from computation or technology). [A friend has pointed out that IU cannot legally offer an engineering degree, so we’d have to get creative on the name or tell people to go to Purdue. This works as a general model of Computer Studies pedagogy.]

    As another example of how to split “computer studies”, Georgia Tech recently moved to a three-prong approach with the School of Computer Science (CS), School of Interactive Computing (IC), and Computational Science and Engineering Division (CSE). My view of Informatics roughly correlates to that of IC; the Computer Science programs are equivalent but include software engineering. The CSE division is a novel concept, presently captured by IU’s School of Informatics, and it seems this is another working group, but I feel it is best captured by adjunct faculty and interdisciplinary programs, rather than a whole new field.

    Appendix: Sorting Example

    Let’s say we have a list of numbers and want to sort them from smallest to largest. One naive way is to walk through the list comparing each term to the next one, swapping them if they are in the wrong order, and then to repeat the whole pass until you can make it to the end without swapping:

    1: *4 3* 2 1 -> 3 *4 2* 1 -> 3 2 *4 1* -> 3 2 1 4
    2: *3 2* 1 4 -> 2 *3 1* 4 -> 2 1 *3 4* -> 2 1 3 4
    3: *2 1* 3 4 -> 1 *2 3* 4 -> 1 2 *3 4* -> 1 2 3 4
    4: *1 2* 3 4 -> 1 *2 3* 4 -> 1 2 *3 4* -> 1 2 3 4
    

    This is called bubble sort, and it solves the problem of sorting. However, consider what you’d have to do to sort a bigger list: every pass that makes a swap forces you to rescan the whole list! A smarter way to sort this list would be to divide the list into two smaller lists, sort the smaller lists, and then merge them together:

    1a: *4 3* -> 3 4
    1b: *2 1* -> 1 2
    
    Now merge:
    2a: *3* 4 -> *3* 4 -> 1 2 3 4
    2b: *1* 2 -> 1 *2* -^
    

    This only takes 4 comparisons, compared to 12! We just did a classic problem in Computer Science without even once mentioning computer hardware or writing a single line of code!
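
    For anyone who does want to check the counts by running something, here is a minimal Python sketch of the two procedures above. The function names bubble_sort and merge_sort are my own labels for illustration, and the bubble sort is the unoptimized version described above: full passes over the list until one finishes without a swap. Both functions return the sorted list together with the number of comparisons made.

```python
def bubble_sort(values):
    """Return (sorted copy, comparison count) using repeated full passes."""
    items = list(values)
    comparisons = 0
    swapped = True
    while swapped:  # keep rescanning until a pass makes no swaps
        swapped = False
        for i in range(len(items) - 1):
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
    return items, comparisons


def merge_sort(values):
    """Return (sorted copy, comparison count) by splitting, sorting, merging."""
    items = list(values)
    if len(items) <= 1:
        return items, 0
    mid = len(items) // 2
    left, left_count = merge_sort(items[:mid])
    right, right_count = merge_sort(items[mid:])
    comparisons = left_count + right_count
    merged = []
    i = j = 0
    # Merge: repeatedly take the smaller front element of the two halves.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; the rest of the other is already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


print(bubble_sort([4, 3, 2, 1]))  # ([1, 2, 3, 4], 12)
print(merge_sort([4, 3, 2, 1]))   # ([1, 2, 3, 4], 4)
```

    The code adds nothing to the argument; it only mechanizes the hand trace, which is why the counts match the 12 and 4 worked out above.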

  • One Year Later

    With the State of the Union address rapidly approaching, I want to highlight some excellent articles bringing Obama’s first year into perspective. Many people have become disheartened by the lack of swift action by the administration on many topics – health care, Iraq, the economy, etc. I’d encourage you to read Andrew Sullivan’s article: Obama’s Substantive First Year:

    Obama is a liberal pragmatist in politics and a traditional conservative in his understanding of the presidency. Once you grasp this, his first year makes much more sense.

    The article highlights Obama’s strengths and shortcomings in a calm, collected manner. Further reading on the year’s accomplishments:

    All that is great – things have been getting done… so what’s our problem with Obama? The New Yorker addresses this succinctly – One Year: Storyteller-in-Chief:

    I’ve been an Obama man all the way. I voted for him in 2008 and I’ll vote for him again in 2012, with far less enthusiasm. But it would help me out so much if he could give me some kind of story to hang onto. At this stage, a scrap would suffice. A President can have all the vision in the world, be an extraordinary orator and a superb politician, have courage and foresight and a willingness to make painful choices, have a bold progressive plan for his nation—but none of these things will matter a wit if the President cannot couch his vision, his policies, his courage, his will, his plan in the idiom of story.

    People need stories to latch on to and remembering our personal narrative is vital to projecting our future. Obama would be wise to heed these words: after all, so much of his meteoric rise comes from his extraordinary storytelling (Dreams from My Father, anyone?).

    As for me, I remain optimistic about the future of the administration. A great deal of the current frustration has to do with participation. 2008 was a never-ending deluge of political news and activism. Working for the campaign and delivering Indiana was the highlight of my year. In 2009 our nation had to unwind and reconcile our own drive for action with the notion that legislating is a full-time job, requiring a ton of expertise. The government is huge and most people don’t have time to read every bill, to learn the details of every issue – that’s why we are a representative democracy. I think a large part of liberal frustration comes from a headstrong dislike of delegation. We have to let those we elected do their job; our role as citizens is to give them feedback through our communications and then our votes.

    From the inaugural address Obama knew he faced a myriad of difficult problems:

    Today I say to you that the challenges we face are real.  They are serious and they are many.  They will not be met easily or in a short span of time.  But know this America:  They will be met.

    We still face serious challenges, but fortunately we have three (hopefully seven) more years. Let’s recognize those challenges as “future work” and take solace in what’s already been accomplished in this short portion of the excruciatingly slow march of progress. My hope for the State of the Union is that it frames our current challenges in the “unlikely story that is America”, reasserting that once again we will meet and far surpass our challenges.