Chosen links

Links - 12th February 2023

Aw yea guns in the backrooms (Or: A rant about The Backrooms and how a lot of the content is lacking or just makes no goddamn sense)

So I know this guy is super cancelled now, but I have to bring it back to H.P. Lovecraft, since he’s the writer most responsible for popularizing the idea that the fear of the unknown is the greatest fear of all (and thus the only fear that is truly scary). The stories that keep coming back in service of this point are the ones that eventually comprised the Cthulhu Mythos, tales of ancient gods like Nyarlathotep and Cthulhu and Azathoth that make you go insane just by being around them. (Who, if you overanalyze them enough, reduce to a cosmic mummy, squid, and octopus respectively, defanging them of all their terror — affirming this belief that nothing is really scary if you throw details at it.)

That analysis really misses the point of what makes the Elder Gods so terrifying in Lovecraft’s original work, precisely because they have ceased to be his characters and are now a sort of spooky Halloween monster mash fandom that lives on beyond his weird xenophobic mindset. A canon of rules has been imposed on the madness they induce, reducing it to a quantifiable superpower they possess instead of an inherent quality of how far beyond human understanding they are, and that does downgrade them from unknowable terror to ordinary danger. But if you really want to understand the fear of the unknown in horror fiction, the story you should be looking at is “The Colour Out of Space”.

The thing modern horror fans tend to forget about “The Colour Out of Space”, despite it still being the canonical “fear of the unknown” story, is that unlike unknowable terrors in modern horror, it is not just a sequence of weird creepy shit happening, and an unseen narrator loudly shushing you if you ask too many questions. The scary thing is not a secret; there is no tiny cabal of protagonists who no one believes. The thing makes the news. There are scientists. They successfully gather samples and look at the thing under a microscope. There are people cataloguing the thing, experiments being run, cultural and political repercussions, symposiums held to understand the thing. The thing is not mysteriously immune to order and categorization and analysis; humanity throws science at the problem like any other mystery of nature. The reaction to this cosmic horror is rational, organized, and predictable, and there are many productive findings.

And you know what? They still can’t figure out what the damned thing is. Science’s search for answers (as it does for any real-world natural phenomenon) just makes the thing ever weirder, more irrational, more impossible, as each new bit of data just raises more baffling questions. Humanity seeks to control their fear of how little they know about this weird, impossible-to-describe thing by understanding it, only to see the scale of what they don’t understand grow ever larger, and challenge what they already thought they knew. The quest for knowledge makes the thing more creepy, not less. It almost makes sense! Almost! It’s consistent enough that if you only knew a little more, you could figure it out… but you can’t.

There is an entire genre of Silver Age science fiction written around this premise (weird but harmless-seeming object mysteriously appears and the entirety of human civilization can’t figure it out). A substantial subset of the post-/x/ collaborative horror fandom is completely unfamiliar with it, and it irritates me, because that kind of story is fascinating. If you want something to be scary because it defies understanding, it should be scary because it actually defies understanding, not because the author nerfed the collective human capacity for rational understanding beyond the suspension of disbelief, and sicced black helicopters or the time assassins or slendermen or whatever on any characters who try to puzzle it out.

You know what’s not scary? Cthulhu being just a mind-altering cosmic squid, with a detailed description and catalog of where it lives, what and how it eats, what the limits of its abilities are, and so on. You know what’s really scary? Cthulhu being just a mind-altering cosmic squid, just like our terrestrial squids, and therefore we don’t understand what squids even are anymore.

The market for lemons

Customers that can’t assess the quality of products pay the wrong amount for them, creating a disincentive for high-quality products to emerge and working against their success when they do. For many years, this effect has dominated the frontend technology market. Partisans for slow, complex frameworks have successfully marketed lemons as the hot new thing, despite the pervasive failures in their wake, crowding out higher-quality options in the process.

These technologies were initially pitched on the back of "better user experiences", but have utterly failed to deliver on that promise outside of the high-management-maturity organisations in which they were born. Transplanted into the wider web, these new stacks have proven to be expensive duds.

The complexity merchants knew their environments weren’t typical, but they sold highly specialised tools as though they were generally appropriate. They understood that most websites lack tight latency budgeting, dedicated performance teams, hawkish management reviews, ship gates to prevent regressions, and end-to-end measurements of critical user journeys. They understood that the only way to scale JS-driven frontends is massive investment in controlling complexity, but warned none of their customers.

They also knew that their choices were hard to replicate. Few can afford to build and maintain 3+ versions of the same web app (“desktop”, “mobile”, and “lite”), and vanishingly few scaled sites feature long sessions and login-gated content.

Armed with all of this background and knowledge, they kept all the caveats to themselves.

This information asymmetry persists; the worst actors still haven’t levelled with their communities about what it takes to operate complex JS stacks at scale. They did not signpost the delicate balance of engineering constraints that allowed their products to adopt this new, slow, and complicated tech. Why? For the same reason used car dealers don’t talk up average monthly repair costs.

My city, my rules

Is it a surprise our technocrats think this way? They adore the simulation hypothesis: that our entire universe is just numbers running on some machine, as if the ineffable cosmos maps 1:1 onto our shitty computers. The consumer-nerd is not capable of understanding metaphor in any context. They cannot hold the contradiction of the literal-metaphoric. So, the world becomes a video game and video games become the world. We’re all just numbers on a page they need us to fit so we can keep being sold apps and subscriptions. Maybe you run a diner on your phone, or a farm, or a kingdom: they think they run the world on theirs. The world wasn’t originally a simulation, it was turned into one. When I close the game and walk outside, what do I find? Wrongness penetrates everywhere and it is no longer subtle. A veil lifting to reveal a sea of grotesqueries. More and more AI tools keep emerging to mimic our culture. Generate your own art, stories, music. Talk to a real robot! But behind the robot is sweatshop labor, in some country devastated by the United States, assigned to impersonate a machine that is imitating human beings.

The window of machine curation is closing

For the last decade, the search engine user experience has gotten progressively worse. This is partly due to search engines optimizing for profitable search results rather than helpful ones, but I would posit that the bigger cause is that the majority of new, useful information on the internet is being created behind closed doors, not on the searchable web.

More and more, the information that is available to search engines is created for search engines to find: companies paying writers pennies to churn out multi-thousand-word essays that will rank high in Google results, but not paying them enough to do the research that would make the information accurate and useful. This phenomenon is getting worse fast thanks to generative language models like ChatGPT (which aren’t even capable of producing accurate information except by accident) and I expect that soon the vast majority of text on the internet will be created by bots like that, writing more and more convincing essays full of useless information.

Maybe search engines will figure out how to discard this deluge, but my sense is that it’s an arms race that they’re going to inevitably lose. For a long time machine curation felt like the future, but now I think the window of pure machine curation is closing. To the extent that search will still be useful, it’ll be useful for searching human-whitelisted content.

The state of developer conferences

So, I have a theory on what happened to the audience. I’ve shared it with a lot of organizers, who all seem to think there’s some truth in it.

Pre-pandemic there were two noticeably different segments of the audience: people who came for networking first and content second, and people who came almost exclusively for content. As an organizer you could see these groups. The former group would be the folks who attended the social events and parties. In the past, this was typically only 50-60% of the audience. The latter group would be those people who were often the first ones sitting in the session, well before the presentation started. They might sit by themselves or with a colleague or two that they came with, but, once the content was done, they’d leave, without participating in the social events.

Behavior seems to have changed at in-person events though. The audience, while smaller, seems more social. There’s less need as an organizer to encourage people to socialize because they seem to do so much more naturally and participation rates in social portions seem higher. However, that second segment, the ones in their seat waiting for the session content, appears to be absent.

My hypothesis is that we’ve bifurcated the audience somewhat. The folks that were there almost exclusively for the content have decided that they can do so more cheaply and efficiently online via virtual conferences or recordings. The folks that went for the networking as a primary driver, on the other hand, are largely eschewing online events as not fulfilling their needs. This may also explain a behavior change I’ve noticed for online events where the audience that consumes the recordings has increased while the live audience (the ones that participate in the limited social aspects like chat or Q&A) has decreased.

So ultimately what we are left with is a smaller in-person audience and a smaller virtual audience. I’ve been giving a lot of thought to how we can adjust (while also personally avoiding the huge financial risks of running in-person events right now). In my opinion, it’s clear that both in-person and online developer conferences need to adjust to new realities that no longer seem like transitory effects of the pandemic, but what isn’t clear is how they can do that.

Big data is dead

For more than a decade now, the fact that people have a hard time gaining actionable insights from their data has been blamed on its size. “Your data is too big for your puny systems” was the diagnosis, and the cure was to buy some new fancy technology that can handle massive scale. Of course, after the Big Data task force purchased all new tooling and migrated from legacy systems, people found that they still were having trouble making sense of their data. They also may have noticed, if they were really paying attention, that data size wasn’t really the problem at all.

The world in 2023 looks different from when the Big Data alarm bells started going off. The data cataclysm that had been predicted hasn’t come to pass. Data sizes may have gotten marginally larger, but hardware has gotten bigger at an even faster rate. Vendors are still pushing their ability to scale, but practitioners are starting to wonder how any of that relates to their real world problems.

This post will make the case that the era of Big Data is over. It had a good run, but now we can stop worrying about data size and focus on how we’re going to use it to make better decisions.

Customers with moderate data sizes often did fairly large queries, but customers with giant data sizes almost never queried huge amounts of data. When they did, it was generally because they were generating a report, and performance wasn’t really a priority. A large social media company would run reports over the weekend to prepare for executives on Monday morning; those queries were pretty huge, but they were only a tiny fraction of the hundreds of thousands of queries they ran the rest of the week.

An alternate definition of Big Data is “when the cost of keeping data around is less than the cost of figuring out what to throw away”. I like this definition because it encapsulates why people end up with Big Data. It isn’t because they need it; they just haven’t bothered to delete it. If you think about many data lakes that organizations collect, they fit this bill entirely: giant, messy swamps where no one really knows what they hold or whether it is safe to clean them up.