Category: science

Space Mirrors?!

For tens of thousands of years, humanity has looked up at the stars not for illumination, but for guidance. Before compasses and charts, the stars led the way for countless generations of explorers. From Phoenicians sailing beneath the Pleiades to Odysseus following the Great Bear home, and from Polynesian navigators crossing the Pacific to caravans trading across great deserts, we have relied on the cosmos to provide direction and order to our world.

Then, suddenly, in 1957, the cosmos changed forever. Sputnik left humanity’s mark in the night sky, collapsing the distance of the heavens from the stars’ light-years to mere light-milliseconds above us. Since then, there has always been an artificial satellite in orbit. For more than a quarter-century, there has also been a continuous human presence aboard the International Space Station, bringing humanity itself into the heavens.

Space debris as of January 1, 2019. Photo: NASA Orbital Debris Program Office (ODPO)

But humanity’s presence in orbit has developed from exploration into chaos. As of July 2026, more than 16,000 active satellites are orbiting Earth, accompanied by thousands of defunct payloads, rocket bodies, and pieces of catalogued debris. Since 2024, SpaceX alone has more than doubled its Starlink constellation from 5,000 satellites to over 10,300, and it is now applying to launch up to 100,000 more. Space debris increasingly threatens operating spacecraft and is an especially serious concern for crewed missions.

This was evident while camping a few weeks ago. Where the night sky once appeared static—an eternal reminder of our insignificance—it now appeared in constant motion, populated with our own artifacts. It’s a fundamental shift in human perspective, our impact on the universe impossible to ignore—if we remember to look.

… and then there are the space mirrors.

Photo: Reflect Orbital

On July 9, 2026, the FCC authorized the radio communications needed to deploy and operate Reflect Orbital’s Eärendil-1 demonstration satellite, along with its orbital debris mitigation plan. Upon launch, the satellite will deploy a 60-foot-by-60-foot “space mirror” to direct reflected sunlight onto a ground footprint approximately 5 kilometers in diameter. The company envisions using such technology to extend solar-energy production, support agriculture and emergency operations, and potentially replace some conventional streetlighting. While the idea sounds like science fiction, the Soviet Znamya 2 satellite was a successful test of the concept in February 1993. The FCC approved a single demonstrator satellite, but Reflect Orbital plans for a 4,000-satellite constellation by 2030.

The proposal, however, raises substantial environmental and scientific concerns. Artificial light at night is known to disrupt circadian and seasonal timing across a wide range of species. It can alter broader ecosystem processes, including migration, reproduction, pollination, and predator–prey relations. Astronomers warn that even a single reflector could interfere significantly with ground-based astronomy by saturating sensitive detectors, contaminating exposures, increasing localized sky brightness, and disrupting time-sensitive observations. A large constellation would make avoidance increasingly impractical.

The FCC concluded that its authority in this proceeding extended principally to radiofrequency operations and orbital-debris mitigation, not to the reflector’s broader environmental effects. Under the National Environmental Policy Act (NEPA), the Federal Aviation Administration (FAA) conducts environmental reviews associated with commercial launch and reentry licenses, but those reviews do not necessarily encompass the in-orbit environmental effects of a payload. Compounding the oversight challenges, Executive Order 14335 (Enabling Competition in Commercial Spaceflight) directed federal agencies to expedite environmental reviews, identify activities that are not subject to NEPA, and expand the use of categorical exclusions. Taken together, these limitations reveal a potential regulatory gap for novel activities, such as space-based illumination.

International law offers only limited recourse. The 1967 Outer Space Treaty makes states internationally responsible for their national space activities and requires the authorization and continuing supervision of private operators. It also permits states to request consultations concerning activities that may cause harmful interference with the peaceful exploration and use of outer space. The treaty, however, establishes no international regulator or direct environmental review mechanism for evaluating effects on Earth from a project such as this.

How do we respond? The available options are limited, but they are not exhausted. Congressional action offers the clearest path toward a lasting solution. Contact your representatives and senators and ask them to investigate this regulatory gap for environmental impacts of space activities and establish enforceable protections for the night sky, biodiversity, and astronomy. DarkSky International, Earthjustice, and other coalition partners are currently evaluating available legal and policy options to oppose a broader constellation of space mirrors. Supporting those efforts, sharing accurate scientific information, and building sustained public pressure are crucial to ensure this demonstration doesn’t act as a precedent for a much larger constellation.

The Milky Way, Aurora Borealis, and Andromeda Galaxy over Telluride, Colorado. A function of our dark skies. Photo: Exploring the Frontier

Our night skies seem permanent, but the ancient order of the cosmos can quickly become chaos. We must fight the space mirror project for the sake of our planet and the stars that have always guided us home.

July 15, 2026
Imaging 20 Galaxies
This report is on what I found while imaging the M96 Group. I ended up capturing 20 galaxies. The report details issues in both imaging and image processing, and is a fairly typical representation of the experimentation involved in astrophotography. I’m going to share the final product with labels now, then show how the picture evolved.

Background

I’ve been working my way through the Messier objects, a list of 110 “comet-like” entities that were cataloged by the French astronomer Charles Messier in the 1700s. They’re a great list for an amateur astronomer to knock out: all of them were visible using the optics of the 1700s, and many are visible with modern binoculars. They are “comet-like” objects because the notion of a galaxy was not formalized until 1926 by Edwin Hubble.

“20 Galaxies” – The final processed image with labels.

On February 8, I set out to image M95, M96, and M105, collectively known as the “M96 Group”. These three galaxies are really close together and of a decent size, making them easy to image at once. They are all about 37 million light-years away, which makes the use of long exposures a necessity.

Imaging

As the first 5-minute exposures came back, I ran into some issues right away: the dust on my camera sensor, which I had been delaying cleaning, was creating little distortions all across the deep field of stars. For most of the objects I had targeted so far, I was able to crop around the dirt. However, these objects were perfectly positioned to make that impossible.

Original framing, centered on M96. Notice all the circles caused by sensor dust.

Another issue was guiding. In order to get clear, pinpoint stars in a long-exposure image, the movement of the stars must be counteracted. There are two mechanisms at work. First, an electronic “tracking” mount moves in lockstep with the stars. However, accurate tracking requires a precise polar alignment to set the reference point, which is often difficult. Second, to correct for poor polar alignment, “guiding” is often used. A secondary imaging system is mounted to the primary telescope. It is attached to a computer, which calculates the drift of the star field using a guide star and sends small “pulse” signals to the tracking mount.

Performing a polar alignment in KStars/EKOS for tracking accuracy. I was 1° 17′ 46″ off in my initial calibration. In order to bring the scope into alignment, I had to move the scope so that the star at the left end of the pink line is in the crosshair.

The first time I used guiding, everything “just worked”: I stayed within the error tolerance with no effort. The second time, it was failing quite regularly with more than 8 arc-seconds of drift and constantly trying to find new guide stars. What happened? Did I forget to balance the scope? Was the guide scope going out of focus? No. It was just clouds!

The guiding module in KStars/EKOS. On the left is the configuration info. The picture shows the guide camera view and the guide star. The chart shows the corrective pulses and the RMS error. The bullseye is a nice visualization of where the scope is drifting.

That’s when I discovered a really cool feature of my imaging software, KStars: it can abort an exposure if guiding error goes above a threshold and retry once the guiding settles down. An interesting consequence is that if a cloud appears, the star will disappear, and errors will rapidly go up, aborting the exposure. Essentially, I can use this to automatically image even if there are sporadic clouds.
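The abort-on-error behavior can be pictured as a threshold check on the RMS guiding error. The sketch below is not KStars/EKOS code: the function names, the sample error values, and the 2-arcsecond threshold are all assumptions chosen for illustration.

```python
import math

def rms(errors):
    """Root-mean-square of a list of guiding errors, in arcseconds."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def should_abort(recent_errors, threshold_arcsec):
    """True when the RMS guiding error exceeds the configured threshold."""
    return rms(recent_errors) > threshold_arcsec

def simulate_session(error_windows, threshold_arcsec=2.0):
    """Label each exposure 'kept' or 'aborted' from its guiding-error samples.

    The 2.0 arcsecond default is a hypothetical setting, not a KStars default.
    """
    return ["aborted" if should_abort(window, threshold_arcsec) else "kept"
            for window in error_windows]

# Hypothetical guiding errors (arcseconds) sampled during three exposures.
clear_sky = [0.5, -0.7, 0.6, -0.4]
passing_cloud = [0.6, 3.0, 9.5, 12.0]  # guide star lost, error climbs quickly

print(simulate_session([clear_sky, passing_cloud, clear_sky]))
```

When a cloud hides the guide star, the error samples grow fast, the RMS crosses the threshold, and only that exposure is discarded.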

The analytics module of KStars/EKOS. The bottom graph shows the drift and RMS. The first imaging run was largely successful, with 14 exposures of M67. I then moved my scope, which is what the yellow and blue boxes on top show. I then slewed to M96, and ran alignment (the two teal boxes). This is when the clouds started appearing, and you can see the aborted exposures in red on the bottom row, corresponding to very high RMS in the graph below. An in-progress shot is the hashed green box.

Finally, all these errors gave me a chance to look at the framing of my shot. When I took my first shots and started processing, I noticed two more galaxies near M105: NGC3384 and NGC3389. As I looked at the shot alignment, I noticed a few more galaxies to the left of M105 on my star charts. By changing the center of the frame from M96 to M105, I was able to pick up 3 more galaxies in frame. Rather than imaging 3 galaxies, I would now be targeting 8 galaxies!

Field of view for the M96 vs. M105 framing. This mosaic shot is from an early experiment in stacking frames from different imaging sessions.

In summary, there were 3 issues that came up during the imaging process on February 8:
- Cleaning – Dust directly on the camera sensor causing distortions.
- Guiding – Clouds causing loss of guide-star.
- Framing – Better framing by moving to a different target.

Processing

After I got my camera back from cleaning at Albuquerque Photo-Tech, I got out the scope again on February 11th and properly calibrated it. I was able to capture 17 five-minute exposures to use for the image. I used DeepSkyStacker to automatically align and stack the top 12 exposures, resulting in a 1-hour total exposure.

For the first time, I experimented with a technique called “drizzling”. This method was pioneered by the Hubble Space Telescope and released as open-source software. It uses slightly offset images to create much higher resolution images than the camera sensor is able to capture by exploiting the sensor’s undersampling of the telescope resolution. By deliberately varying the telescope position slightly, a different portion of the image will be sampled from. These differences can be used to upscale the image. The position variance is called “dithering”, the upscaling is “drizzling”.
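To make the dither-then-upscale idea concrete, here is a heavily simplified shift-and-add sketch. It is not the Drizzle algorithm itself, which maps shrunken pixel "drops" onto the output grid with area-overlap weights, and it is not what DeepSkyStacker does internally. The function name, frame values, and half-pixel offsets are assumptions for illustration.

```python
def shift_and_add(frames, offsets, scale=2):
    """Combine dithered frames onto a grid `scale` times finer.

    frames  : list of 2-D lists (rows of pixel values), all the same shape
    offsets : list of (dx, dy) dither offsets, in input-pixel units
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    total = [[0.0] * (cols * scale) for _ in range(rows * scale)]
    weight = [[0] * (cols * scale) for _ in range(rows * scale)]
    for frame, (dx, dy) in zip(frames, offsets):
        for y in range(rows):
            for x in range(cols):
                # Fine-grid cell corresponding to this shifted input pixel.
                oy = int(round((y + dy) * scale))
                ox = int(round((x + dx) * scale))
                if 0 <= oy < rows * scale and 0 <= ox < cols * scale:
                    total[oy][ox] += frame[y][x]
                    weight[oy][ox] += 1
    # Average wherever at least one input pixel landed; empty cells stay 0.
    return [[t / w if w else 0.0 for t, w in zip(trow, wrow)]
            for trow, wrow in zip(total, weight)]

# Four hypothetical 2x2 frames, each dithered by half a pixel.
frames = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
    [[5, 6], [7, 8]],
]
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]

for row in shift_and_add(frames, offsets):
    print(row)
```

Each 2x2 frame contributes four samples, and the four dithered frames together fill a 4x4 grid. That is the same fourfold increase in pixel count as going from a 12MP sensor to a 48MP master.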

Undersampling occurs when the sensor resolution is lower than the telescope’s resolving power or atmospheric seeing. This allows for relaxed tolerance in guiding and is a great scenario for drizzling.

Ideal sampling occurs when the sensor approximates the telescope’s resolving power. This is also a good scenario for drizzle, as the atmospheric seeing is still undersampled.

Oversampling occurs when the sensor resolution is greater than the telescope’s resolving power. This results in “soft” images, as stars are spread across multiple pixels, rather than being a point source of light. Rather than drizzling to increase resolution, oversampling is addressed via “binning” pixels together, reducing resolution.

For example, my imaging telescope has a resolving power of 1.9 arcsec. My camera sensor can resolve 2.97 arcsec/pixel. That means that I am undersampling what the telescope is capable of resolving by 33%. By dithering and drizzling, I was able to create a 48MP master from my 12MP sensor and get signal from a lot more galaxies than anticipated.
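The comparison above can be written as a small calculation. This is a sketch using the post's own definitions of under-, ideal, and oversampling. The function names and the 10% tolerance band for "ideal" are assumptions, and stricter sampling criteria exist that this sketch does not cover.

```python
def sampling_ratio(pixel_scale_arcsec, resolving_power_arcsec):
    """Pixel scale divided by resolving power; above 1 means undersampled."""
    return pixel_scale_arcsec / resolving_power_arcsec

def classify_sampling(pixel_scale_arcsec, resolving_power_arcsec, tolerance=0.1):
    """Label the setup using the definitions in the text.

    `tolerance` is a hypothetical band around 1.0 that counts as 'ideal'.
    """
    ratio = sampling_ratio(pixel_scale_arcsec, resolving_power_arcsec)
    if ratio > 1 + tolerance:
        return "undersampled"
    if ratio < 1 - tolerance:
        return "oversampled"
    return "ideal"

def drizzled_megapixels(sensor_megapixels, scale):
    """Output size when each axis is upscaled by `scale`."""
    return sensor_megapixels * scale ** 2

# Values quoted in the text: 2.97 arcsec/pixel sensor, 1.9 arcsec telescope.
print(classify_sampling(2.97, 1.9))
print(drizzled_megapixels(12, 2))
```

Because each pixel covers more sky than the telescope can resolve, the setup counts as undersampled, and a 2x drizzle turns the 12MP frames into a 48MP master.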

Vignette removal – before and after comparison.

The major struggle for me in processing is light pollution and the uneven background it casts on my images. While I can get to dark skies pretty easily, outings require a lot of coordination for family duties, so I took this image in my backyard, almost directly under the neighbor’s flood light. No matter how I adjusted the color curves, I couldn’t remove the light pollution without also eliminating the galaxies. Finally, I found a tutorial on removing gradients by AstroBackyard. It detailed how to remove the light pollution through use of a threshold layer to create an artificial “flat” image representing the uneven light field. With the help of the tutorial, I was able to get a mostly-uniform background, although there’s some light vignetting remaining.

The telescope in its ultra-light-polluted, backyard habitat. Seriously, I think it might literally be the worst place in New Mexico to take pictures.

On the positive side, by zooming on so many parts of the image to examine the vignetting, I discovered another 12(!) galaxies in the frame, bringing the total to 20!
- Messier: M95, M96, M105
- NGC: 3338, 3357, 3367, 3377, 3377a, 3384, 3389, 3412
- PGC: 31937, 32371, 32393, 32488, 1403591
- UGC: 5832, 5869, 5897
- IC: 643
I was helped by two tools in identifying these galaxies: astrometry.net and Stellarium Web. Astrometry.net is awesome. You upload an image and it reports the celestial coordinates and objects in view. It’s completely open source, so I can run the solver locally. Astrometry is how I ensure proper alignment of the scope and accurate go-to movements while in the field. Stellarium Web is a browser-based planetarium that has a huge object database. While processing, I centered my view on M105, just like my camera, and used it to walk across the image to see what objects had resolved.

Screenshot of Stellarium Web showing the sky at 10:55 PM on February 11th, with one of the fainter galaxies in the image’s field of view highlighted. It’s interesting to note that there are even more galaxies in this shot that my imaging system couldn’t resolve.

In summary, I used three new techniques on the processing side for the exposures I captured on February 11th:
- Drizzle – Boosting resolution through slightly offset, undersampled exposures.
- Vignette removal – Photoshop threshold filters, combined with selective removal of deep space objects from the background field to produce a gradient mask.
- Locating objects – Astrometry and Stellarium Web to ensure alignment and get object names.
And here it is! 20 galaxies in a single image.

20 galaxies in a single image, unlabeled.

Conclusions

Two final notes on this field report:
1. Starting with a simple, manual scope was absolutely the right decision. I wrote about this for anyone considering a telescope purchase, especially if they want to share it with their family. The depth of understanding necessary to identify problems in the imaging process is enormous, and learning one step at a time is highly recommended. Dip in a toe first. Don’t drop $2,000 and get frustrated because you can’t get it to work.
2. Less-than-perfect equipment makes room for experimentation. I started taking images with my cell phone and a manual mount. Right now I’m using my wife’s Canon EOS Rebel T3 from 2011. I’m not going to get Hubble-quality images in my backyard, but it’s amazing to learn the limits of our consumer technology and then push those limits. I’ve been astonished at what’s possible and how quickly my knowledge has grown based on necessity to get to the next level.
Thanks for reading!
February 17, 2021
Photographing Saturn

I’ve made a lot of progress in 2 months with astrophotography. Here’s my best attempt at Saturn from June 17 and then on August 12.

What changed? Phone and telescope stayed the same. However, I learned how to use the exposure settings and focus lock on the Pixel 3a’s video mode, meaning that my photos were no longer blown out. I also increased the resolution to 4K.

At the eyepiece, I switched from a Celestron X-Cel 12mm to an Explore Scientific 14mm. The biggest difference here is the field of view: moving from the 60-degree to the 82-degree apparent field means it’s easier to keep the planet within the frame (0.30° vs. 0.48° true field of view, or TFOV), even though magnification is down (200x vs. 171x when Barlowed). This means that focus and exposure are more consistent when I split and stack the frames from the camera using Registax.
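The eyepiece numbers follow from two standard relations: magnification is telescope focal length (times any Barlow factor) divided by eyepiece focal length, and true field of view is approximately apparent field of view divided by magnification. The sketch below assumes a 1200 mm telescope focal length and a 2x Barlow. Neither is stated in the post; both were chosen because they reproduce the 200x and 171x figures.

```python
def magnification(scope_focal_mm, eyepiece_focal_mm, barlow=1.0):
    """Magnification of a telescope/eyepiece pair, with an optional Barlow."""
    return scope_focal_mm * barlow / eyepiece_focal_mm

def true_fov_deg(apparent_fov_deg, mag):
    """Approximate true field of view in degrees."""
    return apparent_fov_deg / mag

SCOPE_FOCAL_MM = 1200  # assumed, not stated in the post
BARLOW = 2.0           # assumed, not stated in the post

# (eyepiece focal length in mm, apparent field of view in degrees)
for eyepiece_mm, afov in [(12, 60), (14, 82)]:
    mag = magnification(SCOPE_FOCAL_MM, eyepiece_mm, BARLOW)
    print(f"{eyepiece_mm}mm: {mag:.0f}x, TFOV {true_fov_deg(afov, mag):.2f} deg")
```

The wider 82-degree eyepiece gives up some magnification but shows noticeably more sky, which is why the planet stays in frame longer on an untracked mount.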

I’m still not happy with my Jupiter photos, but I’m starting to pick up color details. The problem has been edge resolution, which is a focus issue.

Definitely having fun!

August 13, 2020
Astronomy
One thing about frontier life is that you can’t always struggle against the environment. New Mexico is hot, dry, and high elevation. Being outdoors in the summer is physically taxing. What about the night though? All the isolation out here makes this one of the best places to go stargazing.

At the start of quarantine, I decided to buy a telescope – something to get me outside, away from screens, and a chance to quiet my mind. I got an Apertura 8″ Dobsonian reflector from High Point Scientific, mostly on the wonder of AstroBackyard’s review video. It’s a fantastic beginner scope, and the manual mount is forcing me to really learn the skies.

The moon at 80x.

Seeing the moon, even at 40x magnification, is incredible. There are so many craters! Finding the planets has been a really neat adventure: I can’t believe that I’m able to separate Saturn’s rings and see Jupiter’s moons.

Jupiter and the 4 Galilean moons.

Saturn and its moon, Titan.

I also got a cell-phone adapter for my eyepieces. It essentially lets me use the entire telescope as a gigantic camera lens. My Google Pixel 3a has an astrophotography mode that has helped get long exposure photos of the sky.

The Milky Way, shot in astrophotography mode on a Pixel 3a XL.

Another cool thing is the discovery of Comet NEOWISE. It’s not really visible from the city, so it’s been a good excuse to get out of town into the wild.

Comet NEOWISE.

As I get deeper into this hobby, I’m realizing that something I originally started to get away from screens might get me into more screens. The basic calculations around optics have led to a gnarly spreadsheet. The notion of astrophotography as data collection is mind-blowing. Digital sensors have evolved to where we are literally measuring the number of photons hitting a 3 square-micron pixel, down to the level of a single photon in 5 minutes. This is all possible with consumer hardware too!

I wanted to share some beginner resources that have helped me.
- AstroBackyard review of Apertura AD8 – Trevor Jones has a great channel for astrophotography and conveys the wonder of it all well. This video really sealed my purchase.
- Allan’s Stuff on choosing a beginner telescope – Allan Hall reviews pretty much every kind of beginner scope and the pros and cons of each.
- A Beginner’s Guide to Solar System Photography – Particularly useful article focusing on alt-az mounts. A Dobsonian is a fancy alt-az mount and one of the big challenges is that stars do not track with that mount so your exposure times are limited.
- Astrophotography with a Dobsonian? – Video demonstrating reasonable expectations from a beginner with the same type of telescope that I have.
- The Deep-Sky Imaging Primer – Fantastic guide, university-course level of detail, far exceeded expectations and gave me a glimpse of just how engrossing this hobby can be. All of the author’s books are stunningly beautiful – his Sky Atlas is also great!
July 22, 2020
Towards Cultural-Scale Models of Full Text

For the past year, Colin and I have been on a HathiTrust Advanced Collaborative Support (ACS) Grant. This project has examined how topic models differ between library subject areas. For example, some areas may have a “canon”, meaning that a low number of topics selects the same themes, no matter what the corpus size is. In contrast, still-emerging fields may not agree on the overall thematic structure. We also looked at how sample size affects these models. We’ve uploaded the initial technical report to the arXiv:

Towards Cultural Scale Models of Full Text
Jaimie Murdock, Jiaan Zeng, Colin Allen
In this preliminary study, we examine whether random samples from within given Library of Congress Classification Outline areas yield significantly different topic models. We find that models of subsamples can equal the topic similarity of models over the whole corpus. As the sample size increases, topic distance decreases and topic overlap increases. The requisite subsample size differs by field and by number of topics. While this study focuses on only five areas, we find significant differences in the behavior of these areas that can only be investigated with large corpora like the Hathi Trust.
http://arxiv.org/abs/1512.05004

January 26, 2016
Psychonomics 2015

This weekend I was in Chicago for the Psychonomic Society and Society for Computers in Psychology meetings. Emily and I stayed Thursday through Saturday and experienced a record first snow of the season. I hope that our fellow conference-goers made it back safely as well.

Chicago is one of the best food towns we’ve ever been to: we cannot recommend Gino’s East deep-dish pizza and Santorini’s Greek restaurant enough.

Below are some conference observations and highlights.

Conference Impressions
As an abstract-only, non-proceedings conference, it is a great opportunity to showcase developing or under review work. For an idea of the breadth of the conference, please look at the abstract book. The talks were of varying quality, but the rapt attention of the audience and quality of questions were excellent. Next year it will be in Boston on November 17-20.

Distributed Cognition
One of the best talks was by Steven Sloman on “The Illusion of Explanatory Depth and the Community of Knowledge”:

Asking people to explain how something works reveals an illusion of explanatory depth: Typically, people know less about the causal mechanism they are describing than they think they do (Rozenblit & Keil, 2002). I report studies showing that explanation shatters people’s sense of understanding in politics. I also show that people’s sense of understanding increases when they are informed that someone else understands and that this effect is not attributable to task demands or understandability inferences. The evidence suggests that our sense of understanding resides in a community of knowledge: People fail to distinguish the knowledge inside their heads from the knowledge in other people’s heads.

The article detailing that explanation shatters political understanding is quite accessible. The further results about “a community of knowledge” are under review.

Prof. Sloman is the conference chair for the International Conference on Thinking on August 3-6, 2016 at Brown University. Submission deadline is March 31, 2016.

The Science of Narrative
Another excellent talk was by Mark Finlayson, who studies “the science of narrative”. He developed “Analogical Story Merging” (ASM), which can replicate Vladimir Propp’s theory of the structure of folktale plots. This process is described in his dissertation, which is an excellent synthesis of literary theory and computer science.

Prof. Finlayson is hosting the 7th International Workshop on Computational Models of Narrative at Digital Humanities 2016 in Kraków, Poland on July 11-12. The call for papers is pending.

Bilingualism

There were two talks in the Bilingualism track that were particularly interesting. Conor McLennan and Sara Incera reported that mouse tracking behavior in bilinguals doing a word discrimination task shows the same sort of reaction delay as in expert discrimination tasks. This correlates with confidence in answers – experts may take longer but move directly to their answers. The results are published in Bilingualism.

Another talk looked at how multilingualism affects vocabulary size using a massive online experiment. While the task of identifying whether a word is known or not is riddled with false positives, the results were interesting in and of themselves. Multilinguals tended to have higher vocabularies across languages, and L2 learners tended to actually have a higher vocabulary than L1 native speakers within a language. The results are published in The Quarterly Journal of Experimental Psychology.

November 23, 2015
Darwin’s Semantic Voyage

The preprint of my project “Exploration and Exploitation of Victorian Science in Darwin’s Reading Notebooks” was released on arXiv on Friday. The paper is joint work with my advisors Colin Allen and Simon DeDeo.

This has consumed my life for the past year and I’m incredibly proud of the results. It’s an entertaining read — printing pages “1-11,24-28” gives the main body and references. 12-23 are the “supporting information” explaining some of the archival work, mathematics, and model verification, but absolutely not central to the key points of the paper.

The key point for digital humanities is that we’ve come up with a way to characterize an individual’s reading behaviors and identify key biographical periods from their life. Darwin is incredibly well-studied, so our results largely confirm existing history of science work. However, by adjusting the granularity we can also suggest hypotheses for further investigation – in this case, the period of Darwin’s life from 1851-1853 after his daughter’s death. For less well-studied individuals, this may help humanists gain traction on narrative organization when interacting with large historical archives.

The key point for cognitive scientists is that we can now characterize information foraging behaviors on multiple timescales using an information theoretic measure of cognitive surprise. While many people have studied foraging behavior in individuals on the order of minutes, or in cultures on the order of decades – this is the first study that looks at how an individual interacts with the products of their culture over the course of a lifetime.

It’s important to note that we don’t say anything about how his reading affected his writing – that’s for paper #2!

Also, I’ll be presenting this work at the 2015 Conference on Complex Systems this Friday at Arizona State University, with slides available on Google Slides.

Exploration and Exploitation of Victorian Science in Darwin’s Reading Notebooks
Jaimie Murdock, Colin Allen, Simon DeDeo
Abstract: Search in an environment with an uncertain distribution of resources involves a trade-off between local exploitation and distant exploration. This extends to the problem of information foraging, where a knowledge-seeker shifts between reading in depth and studying new domains. To study this, we examine the reading choices made by one of the most celebrated scientists of the modern era: Charles Darwin. Darwin built his theory of natural selection in part by synthesizing disparate parts of Victorian science. When we analyze his extensively self-documented reading we find shifts, on multiple timescales, between choosing to remain with familiar topics and seeking cognitive surprise in novel fields. On the longest timescales, these shifts correlate with major intellectual epochs of his career, as detected by Bayesian epoch estimation. When we compare Darwin’s reading path with publication order of the same texts, we find Darwin more adventurous than the culture as a whole.

September 30, 2015
Topic Modeling Tutorial at JCDL2015

Join the HathiTrust Research Center (HTRC) and InPhO Project for a half-day tutorial on HathiTrust data access and topic modeling at JCDL 2015 in Knoxville, TN on Sunday, June 21, 2015, 9am-12pm!

Topic Exploration with the HTRC Data Capsule for Non-Consumptive Research
Organizers: Jaimie Murdock, Jiaan Zeng and Robert McDonald
Abstract: In this half-day tutorial, we will show 1) how the HathiTrust Research Center (HTRC) Data Capsule can be used for non-consumptive research over collection of texts and 2) how integrated tools for LDA topic modeling and visualization can be used to drive formulation of new research questions. Participants will be given an account in the HTRC Data Capsule and taught how to use the workset manager to create a corpus, and then use the VM’s secure mode to download texts and analyze their contents. [tutorial paper]

We draw your attention to the astonishingly low half-day tutorial fees:

Half-Day Tutorial/Workshop Early Registration (by May 22!)
ACM/IEEE/SIG/ASIS&T Members – $70
Non-ACM/IEEE/SIG/ASIS&T Members – $95
ACM/IEEE/SIG/ASIS&T Student – $20
Non-member Student – $40

Half-Day Tutorial/Workshop Late/Onsite Registration
ACM/IEEE/SIG/ASIS&T Members – $95
Non-ACM/IEEE/SIG/ASIS&T Members – $120
ACM/IEEE/SIG/ASIS&T Student – $40
Non-member Student – $60

Hope to see you there!

http://www.jcdl2015.org/registration

May 14, 2015
Six Upcoming Talks

For the past 6 months, I’ve been very busy working on a number of collaborations with Simon DeDeo and Colin Allen. Now, I’m taking to the road to show the fruit of my labors. Below are 6 upcoming talks, tutorials, and workshops about this work on topic modeling, Charles Darwin, information foraging, and the HathiTrust. I hope to see you there!

Topics over Time: Into Darwin’s Mind (Local)
Network Science @ IU Talks
Monday, March 9 – 12:30-1pm
Social Science Research Commons
Slides: http://jamr.am/DarwinIUNetSci
Video coming soon!

Topic Modeling with the HathiTrust Data Capsule
HathiTrust UnCamp 2015
Monday, March 30
Ann Arbor, MI
Presenters: Jaimie Murdock, Colin Allen

Topic-driven Foraging (Local)
Goldstone, Todd, Landy Lab
Friday, April 10 – 9-10am
MSB II Gill Conference Room

Visualization Techniques for LDA (Local)
Cognitive Science 25th Anniversary
Interactive Systems Open House
Friday, April 17 – 3:30-5:15pm
Location TBD

Topic Modeling & Network Analysis (Local)
Catapult Center Workshops
Friday, April 24 – 1-4pm
Wells Library E159
Presenter: Colin Allen

HT Data Capsule & Topic Modeling for Non-consumptive Research
JCDL 2015 Tutorial
Sunday, June 21 – 9am-noon
Knoxville, TN
Presenters: Jaimie Murdock, Jiaan Zeng, Robert McDonald

March 16, 2015
Wisdom of the Few?

Wisdom of the Few? “Supertaggers” in Collaborative Tagging Systems

Jared Lorince, Sam Zorowitz, Jaimie Murdock, Peter M. Todd

A folksonomy is ostensibly an information structure built up by the “wisdom of the crowd”, but is the “crowd” really doing the work? Tagging is in fact a sharply skewed process in which a small minority of “supertagger” users generate an overwhelming majority of the annotations. Using data from three large-scale social tagging platforms, we explore (a) how to best quantify the imbalance in tagging behavior and formally define a supertagger, (b) how supertaggers differ from other users in their tagging patterns, and (c) if effects of motivation and expertise inform our understanding of what makes a supertagger. Our results indicate that such prolific users not only tag more than their counterparts, but in quantifiably different ways. These findings suggest that we should question the extent to which folksonomies achieve crowdsourced classification via the “wisdom of the crowd”, especially for broad folksonomies like Last.fm as opposed to narrow folksonomies like Flickr.

Preprint of article in review available at arXiv:1502.02777 [cs.SI]

February 11, 2015