Author: Jaimie Murdock

  • The New Social Media Landscape

    Social media has shaped my life in many ways. Instagram in particular has enriched it: friendships, relationships, and my growth from amateur photographer to running a photography business. However, many recent changes concern me. While the moderation changes are alarming, it's honestly the increasing presence of AI agents on these platforms that has me reconsidering my engagement.

    Outside of photography, I am an artificial intelligence researcher, with a PhD in Cognitive Science and Complex Systems from Indiana University. While at Indiana, I took courses with several professors who, through the Observatory on Social Media (OSoMe), monitored the spread of disinformation by coordinated bot networks on the web.

    This article is aimed at the general public, explaining how ad-funded business models brought us to our present difficulties on the web. Here, I focus on Google and Meta and their respective challenges in measuring engagement and controlling spam. I then offer some thoughts on the affordances of various social media platforms with respect to three aspects of behavior on the web: content, connection, and commerce. Finally, I implore people to consider the business models of the tools they use, with a few suggestions on landing spots.

    Algorithms

    Algorithms and data structures are the two core elements of computer science. An algorithm is a way of doing something – a formal, step-by-step procedure for solving a problem. This contrasts with popular usage, which focuses on a specific class of ranking algorithms that determine how information is presented in a “feed”.

    This popular usage has its origins in the PageRank algorithm that allowed Google to quickly overtake legacy search engines such as Yahoo and AltaVista. The magic ingredient was looking at the structure of the web and hypothesizing that more important pages will have more incoming links. However, this algorithm can be quickly exploited: by creating a farm of pages that all link to a particular page, the target page’s importance can be artificially inflated. Thus, to maintain search result quality, Google augments PageRank with other signals when ranking results – a measure of trust for each domain, keyword relevance, and so on. Each change to the algorithm prompts publishers to find new exploits, an arms race known as “Search Engine Optimization” (SEO).
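
    To make the idea concrete, here is a minimal sketch of PageRank as a “random surfer” computation (simplified; Google’s production system layers many more signals on top). The toy web below includes a small link farm, so you can see the exploit in action:

    ```python
    # Minimal PageRank sketch: the web as a dict of page -> outgoing links.
    # A link farm ("spam1".."spam3" all pointing at "target") inflates the
    # target's score, illustrating why PageRank alone is exploitable.

    def pagerank(links, damping=0.85, iterations=50):
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1 - damping) / len(pages) for p in pages}
            for page, outgoing in links.items():
                if not outgoing:  # dangling page: spread its score evenly
                    for p in pages:
                        new_rank[p] += damping * rank[page] / len(pages)
                else:
                    for out in outgoing:
                        new_rank[out] += damping * rank[page] / len(outgoing)
            rank = new_rank
        return rank

    web = {
        "home": ["news", "about"],
        "news": ["home"],
        "about": ["home"],
        "target": ["home"],
        "spam1": ["target"], "spam2": ["target"], "spam3": ["target"],
    }
    print(sorted(pagerank(web).items(), key=lambda kv: -kv[1]))
    ```

    Even this tiny farm inflates “target” to several times the score of the isolated spam pages that created it, despite having no organic endorsement – exactly the manipulation Google must counteract.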

    When the search company’s goal is to provide high-quality search results and the publishers’ goal is to promote high-quality content, this game is productive – an ecosystem in which “content is king”. However, the motives of both publisher and search engine are easily compromised. For publishers, the pursuit of high rankings can overtake content quality, resulting in a web designed for Google instead of for humans. For search engines, an ad-based business model conflicts with users’ desire for high-quality results by allowing paid intrusions into the ranking. Eventually, the scales tip to the extreme articulated by Google’s VP of Finance: “[W]e can mostly ignore the demand side…(users and queries) and only focus on the supply side of advertisers, ad formats and sales.”

    This process of degrading product quality for users while extracting as much value as possible from business customers is known as “enshittification”, a term coined by the tech critic, writer, and copyright reform advocate Cory Doctorow:

    First, [companies] are good to their users; then they abuse their users to make things better for their business customers; finally, they abuse those business customers to claw back all the value for themselves. Then, they die. 

    Engagement

    In the social media space, enshittification has happened gradually as Meta, TikTok, and X changed their default algorithms from a chronological timeline of the accounts a user follows to ranked feeds with paid placement, just as Google did. Each time the app is opened, thousands of candidate posts are evaluated: paid advertisements, recommended posts, and, with ever slighter odds, something from someone you actually follow. All these changes are carefully monitored to drive engagement metrics that can be marketed to advertisers: clicks, time-on-site, comments, likes, etc.
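
    None of the platforms publish their ranking functions, but the structure is roughly: score every candidate post on predicted engagement and advertiser value, then sort. A toy sketch, with all weights and field names invented for illustration:

    ```python
    # Toy feed ranker. Weights and fields are invented for illustration;
    # real platforms score thousands of candidates with learned models.
    from dataclasses import dataclass

    @dataclass
    class Post:
        author_followed: bool        # does the user actually follow this account?
        predicted_engagement: float  # model's guess at clicks/likes/dwell time
        ad_value: float              # revenue if shown (0 for organic posts)

    def score(post: Post) -> float:
        followed_bonus = 0.1 if post.author_followed else 0.0
        return 1.0 * post.ad_value + 0.5 * post.predicted_engagement + followed_bonus

    candidates = [
        Post(author_followed=True,  predicted_engagement=0.3, ad_value=0.0),
        Post(author_followed=False, predicted_engagement=0.9, ad_value=0.0),  # viral recommendation
        Post(author_followed=False, predicted_engagement=0.2, ad_value=1.2),  # paid ad
    ]
    feed = sorted(candidates, key=score, reverse=True)
    ```

    Note how small the weight on “accounts you follow” can become once ad value dominates the objective.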

    However, these metrics can be gamed by the social media companies themselves. This happened in the “pivot to video”: in 2015, Facebook encouraged news and media companies to post videos on its platform, rather than linking to written content hosted elsewhere. Facebook cited increased engagement, and many companies reduced their web footprint – some going as far as removing their homepages altogether (Mashable, CollegeHumor, Funny or Die) and cutting their writing teams (Fox Sports, MTV News, Vice News). These companies all missed that they had lost control of their relationship with their audience. Once the transition to Facebook was complete, the company implemented a pay-to-play scheme for exposure. This ultimately led to the demise of sites such as CollegeHumor and Funny or Die.

    In 2018, a lawsuit alleged that Facebook had exaggerated the success of the “pivot to video” by inflating average video viewing time by as much as 900 percent. The suit was settled in 2019 for $40 million, with Facebook admitting no wrongdoing, but the damage to legacy media was done. Ironically, the “pivot to video” had also damaged Facebook’s long-term metrics for organic posts from individuals. As users began to see Facebook as a platform for advertisers, rather than a way to maintain connections, they disengaged from the platform – the final step of “enshittification”.
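
    The mechanics of the inflation, as described in the lawsuit, were an averaging error: total watch time was divided only by views lasting more than a few seconds, rather than by all views. A quick illustration with made-up numbers:

    ```python
    # Made-up numbers illustrating the averaging error alleged in the lawsuit:
    # dividing total watch time only by "qualifying" views (>3 seconds)
    # instead of all views inflates the average dramatically.
    watch_seconds = [1, 1, 2, 2, 1, 45, 60]   # most people scroll past

    honest_avg = sum(watch_seconds) / len(watch_seconds)
    qualifying = [t for t in watch_seconds if t > 3]
    inflated_avg = sum(watch_seconds) / len(qualifying)

    print(f"honest average:   {honest_avg:.1f}s")    # 16.0s
    print(f"inflated average: {inflated_avg:.1f}s")  # 56.0s
    ```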

    Spam, Bots, and Slop

    Spam is the repeated sending of unsolicited messages to a large audience for advertising, propaganda, or other ends. The line between SEO-optimized content and spam has long been blurred. On social media platforms, spam is typically carried out by “bots” – automated social agents that create content to influence the algorithm so it promotes their message. These bots are widely seen as a problem, as they create a negative user experience. Furthermore, nation-state actors have often used “bot farms” to spread misinformation on social media.

    However, not all bots are malicious, making it hard to remove all automated activities on social media. For example, the National Weather Service has many “bot” accounts to communicate weather statements, watches, and warnings to the general public. Other areas are more gray – engagement bots can automate liking of comments or posts and appear indistinguishable from human usage. Any platform with an open API is subject to both malicious and benign use, as defined by the particular terms of service.
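
    For a sense of how low the barrier is, here is a sketch of a benign weather-alert bot using a Mastodon server’s open API via the Mastodon.py client library. The token, server URL, and alert feed are placeholders; a real NWS-style bot would poll an actual alert API on a schedule:

    ```python
    # Sketch of a benign alert bot on an open-API platform, using the
    # Mastodon.py client library. The token, server, and get_active_alerts()
    # feed are placeholders.
    from mastodon import Mastodon

    mastodon = Mastodon(
        access_token="YOUR_BOT_TOKEN",          # placeholder credential
        api_base_url="https://example.social",  # placeholder Mastodon server
    )

    def get_active_alerts():
        # Placeholder: a real bot would fetch alerts from a weather service API.
        return ["Winter Storm Warning for Bernalillo County until 6 PM MST"]

    for alert in get_active_alerts():
        mastodon.status_post(alert)  # the identical API call a spam bot would use
    ```

    The same dozen lines, pointed at an LLM instead of a weather feed, become a “bot farm” – the platform can only distinguish the two by policy and enforcement.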

    To improve user experience and maintain an audience for ad-driven business models, social media companies have traditionally waged war on malicious bots. Ostensibly, this was part of Elon Musk’s rationale for purchasing then-Twitter. Unfortunately, when platforms closed or rate-limited their APIs, independent tools for evaluating whether an account was a bot were forced to shut down. Internal teams working on misinformation and on removing malicious bots had significant overlap, so as social media companies abandon their in-house misinformation teams, the prevalence of bots has increased. These next-generation bots are also increasingly capable, as AI tools have given them multi-modal functionality.

    However, the pressures of enshittification operate here as well: scrolling through spam content increases certain engagement metrics, such as time-on-site and posts served. While user experience suffers and other metrics, such as likes and comments, decrease, bots have their place in driving ad sales by creating fake engagement for the ad market to promote. New AI production pipelines have accelerated the accumulation of spam content, exemplified by AI “travel influencers”. Metrics that do not differentiate between bot and human engagement are useless, but they are the new norm for end users.

    Meta has recently decided to go all-in on this strategy, telling the Financial Times: “We expect these AIs to actually, over time, exist on our platforms, kind of in the same way that accounts do. They’ll have bios and profile pictures and be able to generate and share content powered by AI on the platform.” As implemented, these AI users were undifferentiated from normal users in their posts. In fact, they bore the blue “verified” checkmark, indicating they should be trusted more than other users. They were also unblockable, meaning a human user could not opt out of the experience.

    Meta backpedaled by removing the AI profiles, but reporting on the removal emphasized the politicized aspects of these AI users rather than the fundamental transgression: by creating artificial users undifferentiated from human ones, the “social” aspects of social media are compromised. This erodes any trust that users are real people – a long-standing conspiracy theory known as Dead Internet Theory is now Meta’s business plan. While some articles were concerned with the “digital blackface” of profiles such as “Brian – Everybody’s grandpa” and “Liv – Proud Black Queer momma of 2 & truth-teller”, these caricatures only intensify the core offense of pitching these agents as equals for genuine human connection. The problem is not the aspects of humanity they attempt to mimic, but the mimicry itself.

    This embrace of AI has also been seen at Google: instead of fighting the arms race against purveyors of AI-generated content to maintain high-quality search results, the company now places its own AI summary before any results – paid or organic. The summary uses retrieval-augmented generation (RAG) to link to citations that allegedly support it. Recently, the search engine placed AI-generated content above the original article in its ranked results, an alarming loss for both searchers and publishers.
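
    RAG, in outline: retrieve documents relevant to the query, then have a language model generate an answer conditioned on them, citing what was retrieved. A minimal sketch, with plain TF-IDF standing in for retrieval and a stub generate() standing in for whatever LLM the search engine uses:

    ```python
    # Minimal retrieval-augmented generation (RAG) outline. Retrieval here is
    # plain TF-IDF; generate() is a stand-in for an LLM call. The failure mode
    # discussed above: the model may produce a summary its "citations" don't support.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    documents = [
        "M96 is a spiral galaxy about 37 million light-years away.",
        "PageRank scores pages by the structure of incoming links.",
        "Drizzling recovers resolution from dithered, undersampled exposures.",
    ]

    def retrieve(query: str, k: int = 2) -> list[str]:
        vectorizer = TfidfVectorizer()
        doc_vectors = vectorizer.fit_transform(documents)
        query_vector = vectorizer.transform([query])
        scores = cosine_similarity(query_vector, doc_vectors)[0]
        ranked = sorted(zip(scores, documents), reverse=True)
        return [doc for _, doc in ranked[:k]]

    def generate(prompt: str) -> str:
        return "..."  # stand-in for the search engine's LLM

    context = retrieve("how far away is M96?")
    summary = generate(f"Answer using only these sources: {context}")
    ```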

    Trust

    The degradation of search and social media products under the ad-based revenue model is apparent to anyone who uses Google, Meta, or X. As consumers, we ostensibly have a choice of platforms (or can disengage altogether). As businesses, disengagement is tempered by where the eyeballs are, which means compromising ourselves to untrustworthy partners like Meta, who inflate metrics and invent artificial users.

    As you evaluate your relationship with these platforms, it is helpful to think of three aspects: content, connection, and commerce. Underlying each is trust – that information has been vetted, that users are “real”, and that business will flow. At present, many platforms fail on these aspects of trust – in the case of Meta and X, deliberately so, as they remove internal misinformation teams. At Google, unreliable AI Summaries remove trust in content. At Meta, company-owned AI profiles remove the trust that users are real, eroding connection. At X and across all Meta platforms, algorithmic deprioritization of links to outside sites reduces commerce.

    Fighting Enshittification

    Ultimately, the highest trust comes from owning your own distribution channels. However, “surfing the web” has been replaced with “scrolling” for most discovery activities, and even e-mail newsletters are rarely seen by end users as spam filters have grown more aggressive. This makes maintaining your own website or newsletter feel like shouting into the void.

    My recommendation is to examine each platform’s monetization strategy. All ad-driven platforms will be compromised in some way. Finding platforms with other funding mechanisms, such as subscription-driven models, is a high priority. Otherwise, we merely participate in large-scale advertising systems, rather than investing in platforms that suit our needs for trusted content, connection, and commerce.

    For social media, there are non-ad-supported options: Mastodon and Bluesky. These are largely Threads or X replacements, and of the two, I have had more traction on Bluesky. The medium gives a public forum for events and discussion, but neither replicates key aspects of Instagram – the Grid, which offers artists a gallery, and Stories, which offer ephemeral content that leads to connection. I struggle to see how to drive print sales or portrait bookings through either platform, so I mostly have fun with it.

    Both Mastodon and Bluesky offer a revolutionary service to their users: a default timeline that acts as a “no algorithm” feed – simply showing the posts of the users you follow, ordered by recency. Once you follow a critical mass of accounts, though, the need for an algorithm becomes apparent. Bluesky allows users to select alternate feeds and create their own algorithms for ordering information. For example, I subscribe to a feed called “The ‘Gram” that shows only posts with media from people I follow. Mastodon can generate feeds from lists of users or hashtags, but does not allow algorithmic filtering the way Bluesky does.
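
    The “no algorithm” feed is trivially simple, which is the point – and a custom feed like “The ‘Gram” is just a filter on top of it. A sketch of the logic in generic terms (this is not Bluesky’s actual feed-generator API, just the shape of what such a feed computes):

    ```python
    # A "no algorithm" timeline and a media-only custom feed, in generic terms.
    # Bluesky lets users publish exactly this kind of filter as a custom feed.
    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class Post:
        author: str
        created_at: datetime
        has_media: bool

    def chronological(posts, following):
        """The default 'no algorithm' feed: follows only, newest first."""
        return sorted(
            (p for p in posts if p.author in following),
            key=lambda p: p.created_at,
            reverse=True,
        )

    def the_gram(posts, following):
        """A custom feed: same follows, but only posts with media."""
        return [p for p in chronological(posts, following) if p.has_media]
    ```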

    Another advantage of Bluesky is domain verification, providing consistent branding that points back to your own website. For my academic musings, follow me @jamram.net. For my landscape photography, follow me @exploringthefrontier.com. This connection of identity to the web at large is excellent and pulls social media back into the ecosystem of the open web. Finally, I prefer the granularity of Bluesky’s moderation tools to Mastodon’s per-instance moderation: they give a user excellent control over their personal experience.
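
    Under the hood, Bluesky’s domain verification is just a DNS TXT record: you publish your account’s DID at _atproto.<your domain>, and anyone can check it. A quick sketch using the dnspython library:

    ```python
    # Checking a Bluesky domain handle the way the network does: look up the
    # TXT record at _atproto.<domain>, which should contain "did=did:plc:...".
    # Requires the dnspython package.
    import dns.resolver

    def atproto_did(domain: str) -> str | None:
        answers = dns.resolver.resolve(f"_atproto.{domain}", "TXT")
        for record in answers:
            text = record.to_text().strip('"')
            if text.startswith("did="):
                return text.removeprefix("did=")
        return None

    print(atproto_did("exploringthefrontier.com"))
    ```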

    For search engines, I have not identified a non-ad-driven option. While DuckDuckGo heralds its privacy model as a reason to use the platform, it still relies on ad placements for revenue, compromising search quality. Given my academic interests, I’ve started thinking about alternatives in the context of enterprise search. A lot of frustration with disinformation on the web at large could be addressed by increasing search index quality, which means controlling for SEO, social bots, and AI-generated content. It’s a steep hill that would rely on trust models being integrated with the indexing efforts. I’m always available for consultation on that topic.

    Conclusion

    In this article, I covered the birth of modern content ranking algorithms through Google’s PageRank and the subsequent “enshittification” of these services through ad-based business models. I identified engagement and spam control as two challenges for high-quality content that are compromised by the metrics of ad-based business models. Then I analyzed social media platforms with respect to three aspects of behavior on the web: content, connection, and commerce. Finally, I identified current tools that may give hope for a non-ad-based business model and identified a gap in the search engine space.

  • Imaging 20 Galaxies

    This report covers what I found while imaging the M96 Group – I ended up capturing 20 galaxies. It details issues in both image capture and image processing, and is a fairly typical representation of the experimentation in astrophotography. I’ll share the final product with labels first, then show how the picture evolved.

    Background

    I’ve been working my way through the Messier objects, a list of 110 “comet-like” entities cataloged by the French astronomer Charles Messier in the 1700s. They’re a great list for an amateur astronomer to knock out: since they were visible with the optics of the 1700s, many are visible with modern binoculars. They were cataloged as “comet-like” objects because the notion of a galaxy was not formalized until Edwin Hubble’s work in 1926.

    “20 Galaxies” – The final processed image with labels.

    On February 8, I set out to image M95, M96, and M105, collectively known as the “M96 Group”. These three galaxies are close together in the sky and of a decent size, making them easy to image at once. They are all about 37 million light-years away, which makes long exposures a necessity.

    Imaging

    As the first 5-minute exposures came back, I ran into an issue right away: dust on my camera sensor, which I had been putting off cleaning, was creating little distortions all across the deep field of stars. For most of the objects I had targeted so far, I was able to crop around the dirt. However, these objects were perfectly positioned to make that impossible.

    Original framing, centered on M96. Notice all the circles caused by sensor dust.

    Another issue was guiding. To get clear, pinpoint stars in a long-exposure image, the movement of the stars must be counteracted. Two mechanisms are at work. First, an electronic “tracking” mount moves in lockstep with the stars; however, accurate tracking requires precise polar alignment to set the reference point, which is often difficult. Second, “guiding” corrects for imperfect polar alignment: a secondary imaging system mounted on the primary telescope feeds a computer, which calculates the drift of the star field using a guide star and sends small “pulse” corrections to the tracking mount.
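
    In pseudocode terms, the guiding loop is a simple feedback controller. A sketch of the idea – the gain, timing, and interface functions below are invented stand-ins; real guiders like PHD2 or EKOS’s internal guider add calibration, filtering, and separate RA/Dec handling:

    ```python
    # Sketch of a guiding feedback loop. measure_drift() and pulse() stand in
    # for the guide camera and mount interfaces; the proportional gain is an
    # invented value.
    import time

    GAIN = 0.7  # fraction of measured drift to correct each cycle (invented)

    def guide_loop(measure_drift, pulse, exposure_s=2.0):
        while True:
            drift_ra, drift_dec = measure_drift()   # arcsec offset of guide star
            # Send short corrective pulses proportional to the measured error.
            pulse(ra=-GAIN * drift_ra, dec=-GAIN * drift_dec)
            time.sleep(exposure_s)                  # wait for next guide exposure
    ```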

    Performing a polar alignment in KStars/EKOS for tracking accuracy. I was 1° 17′ 46″ off in my initial calibration. In order to bring the scope into alignment, I had to move the scope so that the star at the left end of the pink line is in the crosshair.

    The first time I used guiding, everything “just worked” – I stayed within the error tolerance with no effort. The second time, it failed regularly, with more than 8 arcseconds of drift and constant searching for new guide stars. What happened? Did I forget to balance the scope? Was the guide scope going out of focus? No. It was just clouds!

    The guiding module in KStars/EKOS. On the left is the configuration info. The picture shows the guide camera view and the guide star. The chart shows the corrective pulses and the RMS error. The bullseye is a nice visualization of where the scope is drifting.

    That’s when I discovered a really cool feature of my imaging software, KStars: it can abort an exposure if the guiding error goes above a threshold and retry once guiding settles down. A useful consequence: if a cloud appears, the guide star disappears, the error spikes, and the exposure aborts. Essentially, I can image automatically even with sporadic clouds.
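
    The abort-and-retry behavior amounts to a guard around each exposure. A sketch of the idea, with camera and guider as stand-ins for KStars’ device interfaces and an invented threshold (set per setup in EKOS):

    ```python
    # Sketch of KStars' abort-on-guiding-error behavior, with invented
    # interfaces. If RMS guiding error exceeds the threshold mid-exposure
    # (e.g., a cloud swallows the guide star), abort and retry after settling.
    def capture_with_guard(camera, guider, n_frames, threshold_arcsec=2.0):
        """camera and guider are stand-ins for KStars' device interfaces."""
        completed = 0
        while completed < n_frames:
            camera.start_exposure()
            while camera.exposure_in_progress():
                if guider.rms_error_arcsec() > threshold_arcsec:
                    camera.abort_exposure()   # a cloud ate the guide star
                    guider.wait_for_settle()  # retry once guiding recovers
                    break
            else:
                completed += 1                # exposure finished cleanly
    ```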

    The analytics module of KStars/EKOS. The bottom graph shows the drift and RMS. The first imaging run was largely successful, with 14 exposures of M67. I then moved my scope, which is what the yellow and blue boxes on top show. I then slewed to M96, and ran alignment (the two teal boxes). This is when the clouds started appearing, and you can see the aborted exposures in red on the bottom row, corresponding to very high RMS in the graph below. An in-progress shot is the hashed green box.

    Finally, all these errors gave me a chance to reexamine the framing of my shot. When I took my first shots and started processing, I noticed two more galaxies near M105: NGC3384 and NGC3389. Looking at the shot alignment, I noticed a few more galaxies to the left of M105 on my star charts. By changing the center of the frame from M96 to M105, I was able to pick up 3 more galaxies in the frame. Rather than imaging 3 galaxies, I would now be targeting 8!

    Field of view for the M96 vs. M105 framing. This mosaic shot is from an early experiment in stacking frames from different imaging sessions.

    In summary, there were 3 issues that came up during the imaging process on February 8:

    • Cleaning – Dust directly on the camera sensor causing distortions.
    • Guiding – Clouds causing loss of guide-star.
    • Framing – Better framing by moving to a different target.

    Processing

    After I got my camera back from cleaning at Albuquerque Photo-Tech, I got out the scope again on February 11th and properly calibrated it. I was able to capture 17 5-minute exposures to use for the image. I used DeepSkyStacker to automatically align and stack the top 12 exposures, resulting in a 1-hour total exposure.

    For the first time, I experimented with a technique called “drizzling”. The method was pioneered for the Hubble Space Telescope and released as open-source software. It uses slightly offset images to create a higher-resolution result than the camera sensor can capture directly, by exploiting the sensor’s undersampling of the telescope’s resolution. By deliberately varying the telescope position slightly between frames, each pixel samples a slightly different portion of the sky, and these differences can be used to upscale the image. The position variance is called “dithering”; the upscaling is “drizzling”.

    • Undersampling – The sensor resolution is lower than the telescope’s resolving power or the atmospheric seeing. This allows relaxed guiding tolerances and is a great scenario for drizzling.
    • Ideal sampling – The sensor approximates the telescope’s resolving power. This is also a good scenario for drizzling, as the atmospheric seeing is still undersampled.
    • Oversampling – The sensor resolution exceeds the telescope’s resolving power. This results in “soft” images, as stars spread across multiple pixels rather than appearing as point sources. Rather than drizzling to increase resolution, oversampling is addressed by “binning” pixels together, reducing resolution.

    For example, my imaging telescope has a resolving power of 1.9 arcsec, while my camera sensor resolves 2.97 arcsec/pixel. That means I am undersampling what the telescope can resolve by about a third. By dithering and drizzling, I was able to create a 48MP master from my 12MP sensor and pull signal from far more galaxies than anticipated.
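
    For reference, those sampling numbers come from the standard pixel-scale formula. A quick check – the pixel size and focal length below are approximate stand-ins chosen to reproduce the 2.97″/px figure, not measured specs:

    ```python
    # Pixel scale check. 206.265 converts radians to arcsec in the standard
    # formula: scale ["/px] = 206.265 * pixel_size [um] / focal_length [mm].
    # Pixel size and focal length are approximate stand-ins that reproduce
    # the 2.97 "/px figure from the text.
    pixel_size_um = 5.2      # typical APS-C DSLR pixel pitch
    focal_length_mm = 360    # stand-in focal length

    scale = 206.265 * pixel_size_um / focal_length_mm
    print(f"{scale:.2f} arcsec/px")                       # ~2.98, matching the text

    # 2x drizzle halves the pixel scale and quadruples the pixel count:
    print(f"{scale / 2:.2f} arcsec/px after 2x drizzle")  # finer sampling
    print(f"{12 * 2**2} MP from a 12 MP sensor")          # 48 MP master
    ```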

    Vignette removal – before and after comparison.

    The major struggle for me in processing is light pollution and the uneven background it casts on my images. While I can get to dark skies pretty easily, outings require a lot of coordination around family duties, so I took this image in my backyard, almost directly under the neighbor’s flood light. No matter how I adjusted the color curves, I couldn’t remove the light pollution without also eliminating the galaxies. Finally, I found a tutorial on removing gradients by AstroBackyard. It showed how to remove the light pollution using a threshold layer to create an artificial “flat” image representing the uneven light field. With the tutorial’s help, I got a mostly uniform background, though some light vignetting remains.
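
    The technique in that tutorial amounts to building a synthetic flat: remove the stars and galaxies, keep the smooth gradient, and subtract it. A numpy/scipy sketch of the same idea (the filter sizes are invented and image-dependent):

    ```python
    # Synthetic-flat gradient removal, the same idea as the Photoshop
    # threshold-layer tutorial. Filter sizes are invented and image-dependent.
    import numpy as np
    from scipy.ndimage import median_filter, gaussian_filter

    def remove_gradient(image: np.ndarray) -> np.ndarray:
        # A large median filter wipes out stars and galaxies but keeps the
        # slowly varying light-pollution gradient.
        background = median_filter(image, size=64)
        # Smooth the estimate so no hard edges are subtracted back in.
        background = gaussian_filter(background, sigma=32)
        # Subtract the gradient, re-centering on the original median level.
        return image - background + np.median(background)
    ```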

    The telescope in its ultra-light-polluted, backyard habitat. Seriously, I think it might literally be the worst place in New Mexico to take pictures.

    On the positive side, by zooming in on so many parts of the image to examine the vignetting, I discovered another 12(!) galaxies in the frame, bringing the total to 20!

    • Messier: M95, M96, M105
    • NGC: 3338, 3357, 3367, 3377, 3377a, 3384, 3389, 3412
    • PGC: 31937, 32371, 32393, 32488, 1403591
    • UGC: 5832, 5869, 5897
    • IC: 643

    Two tools helped me identify these galaxies: astrometry.net and Stellarium Web. Astrometry.net is awesome: you upload an image and it reports the celestial coordinates and objects in view. It’s completely open source, so I can run the solver locally. Plate solving with astrometry.net is how I ensure proper alignment of the scope and accurate go-to movements in the field. Stellarium Web is a browser-based planetarium with a huge object database. While processing, I centered my view on M105, just like my camera, and used it to walk across the image to see which objects had resolved.
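
    For anyone wanting to script it, astrometry.net is also reachable from Python through astroquery’s wrapper. This sketch hits their web service and needs a free API key from nova.astrometry.net; the same solve can be run offline with the open-source solve-field tool:

    ```python
    # Plate-solving an image against astrometry.net's web service via
    # astroquery. The API key and filename are placeholders.
    from astroquery.astrometry_net import AstrometryNet

    ast = AstrometryNet()
    ast.api_key = "YOUR_API_KEY"  # placeholder

    # Returns a FITS WCS header mapping pixels to celestial coordinates.
    wcs_header = ast.solve_from_image("m105_stack.fits")
    print(wcs_header["CRVAL1"], wcs_header["CRVAL2"])  # RA/Dec of reference pixel
    ```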

    Screenshot of Stellarium Web showing the sky at 10:55 PM on February 11th, with one of the fainter galaxies in the image’s field of view highlighted. It’s interesting to note that there are even more galaxies in this shot that my imaging system couldn’t resolve.

    In summary, I used three new techniques on the processing side for the exposures I captured on February 11th:

    • Drizzle – Boosting resolution through slightly offset, undersampled exposures.
    • Vignette removal – Photoshop threshold filters, combined with selective removal of deep space objects from the background field to produce a gradient mask.
    • Locating objects – Astrometry and Stellarium Web to ensure alignment and get object names.

    And here it is! 20 galaxies in a single image.

    20 galaxies in a single image, unlabeled.

    Conclusions

    Two final notes on this field report:

    1. Starting with a simple, manual scope was absolutely the right decision. I wrote about this for anyone considering a telescope purchase, especially for sharing with family. The depth of understanding needed to diagnose problems in the imaging process is enormous, and learning one step at a time is highly recommended. Dip in a toe first. Don’t drop $2,000 and get frustrated because you can’t get it to work.
    2. Less-than-perfect equipment makes room for experimentation. I started taking images with my cell phone and a manual mount. Right now I’m using my wife’s Canon EOS Rebel T3 from 2011. I’m not going to get Hubble-quality images in my backyard, but it’s amazing to learn the limits of consumer technology and then push them. I’ve been astonished at what’s possible and how quickly my knowledge has grown out of the necessity of getting to the next level.

    Thanks for reading!

  • The Raspberry Pi is the Future of Open Computing

    About a month ago, I got my first Raspberry Pi 4.

    What is a Raspberry Pi? It’s a small single-board computer intended for education and tinkering. For $35, you get a 1.5GHz quad-core ARM processor with 2GB RAM, 2x USB 3.0 ports, 2x USB 2.0 ports, and 2x MicroHDMI ports driving two 4K@60Hz displays, plus gigabit ethernet, 802.11ac WiFi, and Bluetooth. It uses a standard USB-C PD charger as a power supply. All that’s missing is a case, a display cable, and a MicroSD card – from official sources, another $28. It runs Raspberry Pi OS, a Linux distribution based on Debian.

    Raspberry Pi 4 Tech Specs

    When my laptop died, I started thinking about replacements and the opportunity to get something to tinker with. I wanted to see just how far low-power computers had come and to try a fanless computer as my daily machine. If I needed to do “serious work”, I could always shell into my larger desktop or a cloud compute node, but I could enjoy pure silence as the default state.

    The little machine has exceeded all expectations. It draws at most 15W under load; my usage has been more like 6W. It’s so small that it disappears under my desk, mounted with two Command strips.

    Having a small, low-cost, silent computer inspired thoughts of where else a little computer could go. One of the first targets was my astronomy hobby. A modified OS called Astroberry came preconfigured with all the drivers I needed to operate my new astrophotography telescope. I simply bought a second microSD card, flashed it with Astroberry, and within an hour had a full guiding and tracking setup with connections to both my home network and an automatic hotspot for remote control when I’m at a dark sky site.

    Raspberry Pi 4 (silver) mounted to the front of my new astrophotography telescope.

    Working with a new-to-me compute architecture (the RPi4’s ARMv8) and interacting with embedded controllers (the telescope’s PMC-8) has me reflecting on the last time I really got to play with hardware. In undergrad, we learned about embedded controllers with a TI MSP430 retrofitted onto a “Goofy Giggles” toy. Many of our exercises involved cross-compiling C code on an x86 machine for the MSP430. It was my first experience with compilation, assembly, and anything resembling electrical engineering. While my machine learning research is thoroughly abstract, having a foundation in the physical constraints of computation has been incredibly useful.

    The Goofy Giggles with MSP430 controller soldered on. (Geoffrey Brown and Bryce Himebaugh, 2004)

    More important than tinkering or embedded computing, the Raspberry Pi is a fundamentally democratic platform, and that is the source of my excitement. For many, the Pi could be a first encounter with “free” computing – not in the price sense, but in the sense of freedoms. We live in a world of forced updates and subscription-based software licensing. Our smartphones and tablets drive a consumption-based model of computing. The “smarthome” is just paying money to give Alexa, Google, and Siri eyes and ears in our most private spaces. Modern software and hardware force us to accept this compromise of privacy and ownership. The notion that we could actually own our software – and, moreover, pull back the curtains to figure out how it works – is largely lost.

    The Raspberry Pi is a different path. With open hardware running open software, I can now control a telescope plus two cameras from anywhere in the world, without ever worrying about a subscription expiring or an API changing. This is absolutely the future we want, but it’s not the one the smarthome has offered us. The Pi can change this by making computing accessible again, just as hobbyists found in the 70s and 80s, and as we found as students with “Goofy Giggles”. I’m excited to try it as my primary compute platform.

  • Getting a Telescope: A Cautionary Tale

    It’s Christmas or your kiddo’s birthday. They open a telescope and want to take it outside right away. You’ve never used a telescope before, so:

    1. You get the automated GoTo system with an equatorial mount because you might want to take pictures some day. You take it outside and it tells you to line up the scope with Polaris. You find the star and get it centered in your polar scope by adjusting the latitude knob. Then you look in your telescope and don’t see anything. The lens caps are off, but you’ve never focused it before and Polaris is literally the only star in that area under your neighbor’s floodlight. No matter how much you turn the knob, you can’t see anything. You try to connect the mount to your phone, but then realize the motors need 8 C batteries, so you run to the store. When you get back, it’s gotten cold, so the kids are inside and this has become your “project” for the night. Eventually you get it pointed at the moon but the kids are mostly bored. It spends the rest of its life in the garage.

    2. You get a manual mount. You drop the telescope in the holes and swing it towards the moon. IT’S SO BRIGHT, but not focused at all. You twist the knobs and figure out where it is:

    The moon, taken with a cell phone camera at the eyepiece of an 8″ Dobsonian telescope, with a polarizing filter.

    You can see individual craters! Amazing! You start to move the scope around to find Jupiter and just see so many stars! Even without finding the planet, you’ve figured out the focus and really just had no idea there were SO MANY stars.

    Also, despite Dobsonians “not being for astrophotography”, I took that picture of the moon with my 8″ Dobsonian scope and a cell phone held to the eyepiece. If you take more time with it, you’ll find the planets and maybe even catch the Orion Nebula:

    The Orion Nebula: an 8-second exposure taken with my phone on a cell phone adapter attached to my 8″ Dobsonian telescope.

    Astronomy attracts exactly the kind of person who tinkers, values quality tools, and obsesses over specs. This post is a reminder that the “best” system isn’t always the best for learning. Dip a toe in; if you like it, you can spend more and get fancy later. Don’t create barriers for yourself. There are a lot of skills to learn, and with a solid foundation you’ll know what direction to take.

    My first scope was an Apertura AD8, about the same as any 8″ Dobsonian reflector: a common recommendation for beginning astronomers, and entirely manual. I am glad I listened to the overwhelming community advice to start with a manual mount and get as much aperture as you can afford – even more so after fighting my astrophotography setup for the past two weeks. I can diagnose issues on the new setup only because I spent six months learning things by hand.

    When you’re starting a hobby, you want to make it easy. Point, look, wonder. If you have any questions, please reach out!

  • Creativity, Wanderlust, and the Frontier

    Context: I started a new Instagram photo blog project @exploringthefrontier. This is my intro…

    Art and identity are fundamentally linked. Turning 30 has been a constant reevaluation of my identity between finishing school, having a kid, and finding my home. It’s also a chance to be deliberate in what I nurture in myself, my son, and the world.

    Creativity is first: whether making music, taking photos, designing a program, or fixing the house, creativity is a way to find life. In 1944, the physicist Erwin Schrödinger answered the question “What is life?” with the notion of “negative entropy” – life is literally that which fights the universe’s natural inclination to decay. Creativity makes sense of the world, making order, fighting destruction.

    Wanderlust is second. It flows through my veins. My roots are in Kentucky, but at the same time both sides of my family have deep connections to living as expats. I was born half a world away, so wanderlust is not just a feeling, but a way of life – the eternal tension between roots and adventure.

    With wanderlust comes the pioneer search for home. I’m drawn to the frontier. Two years ago, I found my home in New Mexico. My adopted city of Albuquerque predates the Declaration of Independence by 70 years. The state capital predates the Mayflower by a decade. People have been living in the Rio Grande Valley for thousands of years. Before humans even existed, dinosaurs roamed what are now the badlands. Millions of years are recorded in our exposed canyon walls. And yet, we’ve barely been a state for 100 years. There are only 17 people per square mile. I delight in the contrast: people have always been here and yet no one is here. That’s the frontier – timeless and new at the same time.


    Artistically, there are three components I’m exploring: subjects, equipment, and editing. I prefer landscape and astrophotography over human subjects. Taking shots of nature is just more … natural … to me, and it doesn’t seem as disruptive to “pause” the moment as asking people for a picture.

    As for equipment, I’ve been doing everything with a cell phone camera – even astrophotography! Phones are always at hand, and while “the revolution will not be televised”, it sure is being streamed. Rapid advances in image processing and sensor manufacturing have democratized the medium further – a camera in every pocket. Pushing that medium to its limits is really exciting.

    For editing, I’m learning Lightroom. From minimally invasive color rebalancing to editing out footprints with the healing brush, finding my personal style is going to be fun. Watching some NM photographers play with editing has been really inspiring and I’ve been thinking a lot about what is “real” – how many layers are there between collecting light on a sensor to what we finally see on the screen? How many of those are an “artistic” decision? Astro makes this even more poignant as we reveal what is hidden among the stars.

    Finally, the name “Exploring the Frontier” captures what I’ll be doing as I learn and create.

    I’d be delighted for comments and to meet new people! Thanks for reading!

  • Photographing Saturn

    I’ve made a lot of progress in two months of astrophotography. Here are my best attempts at Saturn, first from June 17 and then from August 12.

    What changed? The phone and telescope stayed the same. However, I learned how to use the exposure settings and focus lock in the Pixel 3a’s video mode, so my photos were no longer blown out. I also increased the resolution to 4K.

    At the eyepiece, I switched from a Celestron X-Cel 12mm to an Explore Scientific 14mm. The biggest difference is the field of view: moving from 60 degrees apparent to 82 degrees makes it easier to keep the planet within the frame (0.3° vs. 0.47° true field of view), even though magnification is down (200x vs. 171x when Barlowed). This means focus and exposure are more consistent when I split and stack the frames from the camera using RegiStax.
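
    Those numbers come from two back-of-the-envelope formulas: magnification = telescope focal length / eyepiece focal length (doubled by a 2x Barlow), and true field of view = apparent field of view / magnification. A quick check, using the 1200 mm focal length implied by the magnifications above:

    ```python
    # Checking the eyepiece numbers: magnification = focal_length / eyepiece
    # focal length (x2 with the Barlow); TFOV = apparent FOV / magnification.
    # The 1200 mm focal length is implied by the magnifications in the text.
    FOCAL_LENGTH_MM = 1200
    BARLOW = 2

    for name, eyepiece_mm, afov_deg in [
        ("Celestron X-Cel 12mm", 12, 60),
        ("Explore Scientific 14mm", 14, 82),
    ]:
        mag = BARLOW * FOCAL_LENGTH_MM / eyepiece_mm
        tfov = afov_deg / mag
        print(f"{name}: {mag:.0f}x, {tfov:.2f} deg TFOV")
    # -> 200x / 0.30 deg, and 171x / 0.48 deg (~the 0.47 quoted above)
    ```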

    I’m still not happy with my Jupiter photos, but I’m starting to pick up color details. The problem has been edge resolution, which is a focus issue.

    Definitely having fun!

  • Astronomy

    One thing about frontier life is that you can’t always struggle against the environment. New Mexico is hot, dry, and high-elevation; being outdoors in the summer is physically taxing. What about the night, though? All the isolation out here makes this one of the best places to go stargazing.

    At the start of quarantine, I decided to buy a telescope – something to get me outside, away from screens, and a chance to quiet my mind. I got an Apertura 8″ Dobsonian reflector from High Point Scientific, mostly on the wonder of AstroBackyard’s review video. It’s a fantastic beginner scope, and the manual mount is forcing me to really learn the skies.

    The moon at 80x.

    Seeing the moon, even at 40x magnification, is incredible. There are so many craters! Finding the planets has been a really neat adventure: I can’t believe that I’m able to separate Saturn’s rings and see Jupiter’s moons.

    Jupiter and the 4 Galilean moons.
    Saturn and its moon, Titan.

    I also got a cell-phone adapter for my eyepieces. It essentially lets me use the entire telescope as a gigantic camera lens. My Google Pixel 3a has an astrophotography mode that has helped get long exposure photos of the sky.

    The Milky Way, shot in astrophotography mode on a Pixel 3a XL.

    Another cool development was Comet NEOWISE. It’s not really visible from the city, so it’s been a good excuse to get out of town into the wild.

    Comet NEOWISE.

    As I get deeper into this hobby, I’m realizing that something I originally started to get away from screens might get me into more screens. The basic calculations around optics have led to a gnarly spreadsheet. The notion of astrophotography as data collection is mind-blowing. Digital sensors have evolved to where we are literally measuring the number of photons hitting a 3 square-micron pixel, down to the level of a single photon in 5 minutes. This is all possible with consumer hardware too!

    I wanted to share some beginner resources that have helped me.

    • AstroBackyard review of Apertura AD8 — Trevor Jones has a great channel for astrophotography and conveys the wonder of it all well. This video really sealed my purchase.
    • Allen’s Stuff on choosing a beginner telescope — Allan Hall reviews pretty much every kind of beginner scope and the pros and cons of each.
    • A Beginner’s Guide to Solar System Photography — Particularly useful article focusing on alt-az mounts. A Dobsonian is a fancy alt-az mount, and one of the big challenges is that the mount does not track the stars, so exposure times are limited.
    • Astrophotography with a Dobsonian? — Video demonstrating reasonable expectations for a beginner with the same type of telescope that I have.
    • The Deep-Sky Imaging Primer — Fantastic guide with university-course level of detail; it far exceeded expectations and gave me a glimpse of just how engrossing this hobby can be. All of the author’s books are stunningly beautiful – his Sky Atlas is also great!

  • HathiTrust Research Center

    For the 2016–18 academic years, I was on fellowship with or affiliated with the HathiTrust Research Center (HTRC). The work from that time was finally released in late 2018; a brief summary is below.

    Data Capsules

    HTRC Data Capsules are virtual machines provisioned for researchers at HathiTrust member institutions that give access to the fulltext and OCR scans of both public domain and in-copyright texts. I launched two new features with much help from Yu Ma, Samitha Liyanage, Leena Unnikrishnan, Charitha Madurangi, and Eleanor Dickson Koehl.

    The first feature was the HTRC Workset Toolkit, which provides a command line interface (CLI) for interacting with and downloading volumes in the HathiTrust digital library. It also has tools for metadata and collection management. The collection management tools are really great because a user can go from a collection URL to a list of volume IDs or record IDs for later download or metadata retrieval.

    The second feature was the addition of the InPhO Topic Explorer to the Data Capsule’s default software stack. This allows the Topic Explorer to train models on the raw fulltext of public domain and in-copyright texts, as opposed to the word counts exposed by the Extracted Features dataset.

    One notion critical to the use of Data Capsules is that of non-consumptive research: research products cannot allow reconstruction of the original text for human reading. Algorithmic analysis is considered a “transformative use” covered by fair use. These products can then be exported from a Data Capsule after review.

    Algorithms

    Some analysis pipelines are guaranteed to produce valid non-consumptive products. These have been added to an HTRC Algorithms portal for batch processing. I added the InPhO Topic Explorer to this tool.

    Extracted Features

    Finally, the coolest non-consumptive dataset is the HTRC Extracted Features Dataset, which consists of word counts, part-of-speech tags, and other page-level details for 15.7 million public domain and in-copyright texts. The genius of Extracted Features is that bag-of-words models (like topic models!) require nothing more than word counts, so analyses can be performed on local computers rather than in a Data Capsule or other sandboxed environment.

    I did not create the Extracted Features dataset, but I created a way to integrate it with the Topic Explorer. Now, using the command topicexplorer init --htrc htrcids.txt, where htrcids.txt is a file with one HathiTrust volume ID per line, models can be built from the extracted features of any set of volumes.

  • What have I been up to?

    Hello from the Land of Enchantment! In October, our family moved to Albuquerque, New Mexico – our fourth state in two years. Understandably, blogging has been slower, but we’re finally getting settled, so I’ll start with some basics before doing research updates and expanding on some of these things.

    The day after the election, my then-fiancée Emily was offered a fellowship at the National Academies of Sciences. We packed up and left for DC for what we thought was just six months. To keep us afloat, I took full-time work at the Internet Archive. We got married in May, and when she was offered a (semi-)permanent position, we moved to suburban Maryland.

    The Archive does heroic work for preservation and access, but it was not a good fit for me, so in January 2018 I started a new position at Cornell University, working on the arXiv Next Generation (arXiv-NG) publishing platform. I’ve gotten a kick out of moving to an organization with the exact same URL pronunciation (archive.org → arxiv.org), and the work we do has tremendous impact on scientific communication, with over 22 million article downloads per month.

    Skipping ahead to June 2018, our son Javier was born. With expenses already at their limit before adding a child, and the challenges of employment in Trump’s Washington, we had to relocate in October. After triangulating what was important to us in a home – a bilingual blue state with sunny weather, a low cost of living, and a lack of anti-vaxxers – New Mexico was our choice. It has definitely delivered on the promise: not a day goes by that I wish we were back East. There’s something exciting about being on the frontier, somewhere no one in my family history has ever lived.

    tl;dr: moved 3 times, got married, had a baby, archive.org → arxiv.org

    Sunset at Volcanoes Day Use Area, Petroglyph National Monument, Albuquerque, NM

  • Towards Cultural-Scale Models of Full Text

    For the past year, Colin and I have been on a HathiTrust Advanced Collaborative Support (ACS) Grant. This project has examined how topic models differ between library subject areas. For example, some areas may have a “canon”, meaning that even a small number of topics selects the same themes no matter the corpus size. In contrast, still-emerging fields may not agree on an overall thematic structure. We also looked at how sample size affects these models. We’ve uploaded the initial technical report to the arXiv:

    Towards Cultural Scale Models of Full Text
    Jaimie Murdock, Jiaan Zeng, Colin Allen
    In this preliminary study, we examine whether random samples from within given Library of Congress Classification Outline areas yield significantly different topic models. We find that models of subsamples can equal the topic similarity of models over the whole corpus. As the sample size increases, topic distance decreases and topic overlap increases. The requisite subsample size differs by field and by number of topics. While this study focuses on only five areas, we find significant differences in the behavior of these areas that can only be investigated with large corpora like the Hathi Trust.
    http://arxiv.org/abs/1512.05004
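
    For the curious, the comparison is between topic-word distributions of separately trained models. The sketch below shows one standard way to do such a comparison – Jensen-Shannon distance with optimal topic matching – not the paper’s exact pipeline:

    ```python
    # One standard way to compare two topic models, as in the sampling study:
    # Jensen-Shannon distance between topic-word distributions, with topics
    # paired by optimal matching. A sketch, not the paper's exact pipeline.
    import numpy as np
    from scipy.spatial.distance import jensenshannon
    from scipy.optimize import linear_sum_assignment

    def model_distance(topics_a: np.ndarray, topics_b: np.ndarray) -> float:
        """topics_*: (n_topics, n_words) arrays, each row summing to 1."""
        # Pairwise JS distance between every topic in A and every topic in B.
        dist = np.array([[jensenshannon(a, b) for b in topics_b] for a in topics_a])
        # Match topics one-to-one to minimize total distance, then average.
        rows, cols = linear_sum_assignment(dist)
        return dist[rows, cols].mean()
    ```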