Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
A second LLM shitshow has hit vim - this time, it's the lead maintainer losing his marbles:

If you wanna see the disaster for yourself, the entire thread's still on GitHub as of this writing.
That is pretty sad to see, but also, who refers to Claude as “he”? This is the second time I’m seeing this and it makes my skin crawl
I mean, Claude is a male name. Blame Anthropomorphic for it.
I'm calling it Anthropomorphic accidentally on-purpose forever now.
That can’t be entirely it. Do you call Alexa the voice assistant tool “she”?

Class action lawsuits by employees against employers that mandate AI tools when?
I fucking hope it’s soon
Wow, that's a sobering article.
Claude Code claims another victim.
I thought it'd never happen to me but here we are
If you found this post helpful, follow me for more content like this.
So they expect more cases of data being lost like this?
add another tally to “AI completely wipes out an important database” sighhhh
do people just not feel shame anymore?
Dario Amodei Issues Groveling Apology for Daring to Criticize Trump
Knowing this administration, there’s no way Anthropic can strike an agreement again (unless they compromise their stance), since the US government is dead set on murderbots, so prepare for the whole industry to be domino’d
It's so fucking pathetic. He can't even hold onto the very narrow and weak stand he took (he already left a lot open with Anthropic's "two red lines") without trying to backpedal and grovel.
So the water usage of data centers/AI has long been controversial (a huge issue, a non-issue, or a distraction, depending on who you ask), and the lack of real numbers around it made it hard to know more (though data center owners keeping it a secret made it sus). But now the stats of one Google data center have been released due to legal pressure: 2-8 million gallons a day
Genuine questions borne of ignorance:
When they say "using" water, is this water that has to be actively removed from the supply each day, or does this number just say how much water is circulating in the center? I'm assuming it doesn't all disappear, or does a lot of it end up released as steam or piped away as contaminated water or something?
The data center nearest to me uses sea-water, but I have no idea how much. And it doesn't seem to put out steam or dump bad water back into the sea (not that I could tell if they were doing that).
I totally understand the electricity resource issue for data centers but the water usage thing confuses me, because I assumed it would be for cooling and therefore mostly contained and recirculated. With the exception of predictable maintenance issues like leaks and waste from mineral scale or algae, I don't understand why this water would need to disappear, or why they would need to use potable water from the outset.
Admittedly my mental model is based on consumer CPU water-cooling setups at an imagined industrial scale. What am I missing?
I don't know the answers to a lot of these questions. I assume they heat up the water and dump it back into the rivers, which causes some disruption to the local ecosystem. Which is fine if you do it in small amounts, but it will disrupt things: power plants, for example, have the problem that some flora/fauna gets attracted to the warmer water, risking clogs and more. (So a datacenter does this twice: first via the power plant generating the electricity, and then again to cool the datacenter itself.)
There is also the issue of contamination. While I assume they don't put extra dirty things in the water, that's not a guarantee, nor will every municipality/government just go along with the assumption that it is clean, so I assume that in some places this cooling water will need to be treated as industrial waste, especially where some odd laws interact. (I know some of those laws about waste, and what counts as waste, interact weirdly in .nl, causing weird busywork during roadwork so crews don't run into extra costs by accidentally letting material count as a different class of waste.)
But yes, I think they do not recirculate, and just pump it through and dump it back into the river directly (so no evaporative cooling where the water goes into the air, which some power plants do with the big towers), and I assume they don't use lead pipes, so the water isn't very contaminated. But these sorts of processes do put a strain on water quality. (In .nl we have some problems with river water quality because our big rivers come from industrial areas of other countries, Germany mainly.)
I mostly posted it so that we now at least have some indication of the amounts we are talking about, as tech companies are very tight-lipped about this. But as somebody who knows nothing, I do not know all the implications of it. I am however suspicious, due to a combination of natural paranoia, them being very mum about it, and me not trusting the big tech places.
But yeah, if they use up 90% of the daily flow of a river and heat it up, that will absolutely not be good for the local ecosystem. And any industrial site downstream that also wanted to use the water for cooling is now also in trouble.
Bit like the same reason I posted about protonmail, more an FYI than a sneer (not a huge shock that eventually protonmail would reveal the data if forced by their gov, they always said they would do this, but it is an important thing to take into account if you worry about privacy).
This is comparable to the amount of water used by cherry farmers near Google's site in The Dalles, who (according to my napkin math) use somewhere between 2-8x what Google uses. This isn't that much water for the Columbia River, though; on an average day it has enough flow in less than a minute to supply both the cherry farmers and Google for the whole day. However, it would be a big problem for a smaller river. (Interestingly, while fresh water is essential for datacenters, Google originally chose that site because it was cheap land next to cheap hydroelectric power.)
I was trying to put this into perspective by comparing this to the flow rate of the Des Plaines river. At the high end, 8 million gallons a day is something like 12.4 cubic feet per second constantly. Currently the Des Plaines has a discharge rate of 86-87 cubic feet per second, so a bit above 14% of the Des Plaines per data center.
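The napkin math above is easy to reproduce; here's a quick sketch (0.133681 is the standard cubic-feet-per-US-gallon conversion factor, and 86.5 cfs is just the midpoint of the quoted 86-87 range):

```python
# Convert a data center's daily water draw into a river flow rate.
GAL_TO_FT3 = 0.133681   # cubic feet per US gallon
SECONDS_PER_DAY = 86_400

def gallons_per_day_to_cfs(gallons: float) -> float:
    """Daily gallons -> average cubic feet per second."""
    return gallons * GAL_TO_FT3 / SECONDS_PER_DAY

high_end = gallons_per_day_to_cfs(8_000_000)  # the 8M gal/day high end
des_plaines_cfs = 86.5                        # midpoint of the quoted 86-87 cfs
print(f"{high_end:.1f} cfs, {high_end / des_plaines_cfs:.0%} of the Des Plaines")
# -> 12.4 cfs, 14% of the Des Plaines
```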
Fuuuuck, that's a lot. Thanks for doing the maths
The AI people are still infatuated with math. The Epoch AI staff, after being thoroughly embarrassed last year by the FrontierMath scandal, have now decided to make a new FrontierMath Open Problems benchmark, this time with problems that people might give a shit about!
I decided to look at one of the easiest "moderately interesting" problems and noticed that GPT-5.2 Pro managed to solve a warm up version of the problem, i.e. a version that had been previously solved. Wow, these reasoning models sure are capable of math! So I was curious and looked at the reasoning trace and it turns out that ... the model just found an obscure website with the right answer and downloaded it. Well, I guess you could say it has some impressive reasoning as it figures out how to download and parse the data, maybe.
We really need to work harder at poisoning the training data for math problems.
I can't see the "reasoning" trace in the link. Do you need to be logged in?
I can see them by clicking on the "Reasoned for 6m 34s" box on my desktop browser (I know, great UI design) without needing to log in. This did not work on my phone. In any case, I can copy it here:
Warning: very long slop
Exploring Hadamard matrix construction for order 428
For a Hadamard matrix of order 428, we need to confirm that it’s a multiple of 4, which it is (428 = 4 * 107). However, the challenge is that order 107 doesn’t work since it’s not a multiple of 4. A possible approach could involve using constructions like Williamson or Goethals-Seidel. Alternatively, I could try a Kronecker product of smaller matrices.
Examining constructions for Hadamard matrix of order 428
Since 107 ≡ 3 mod 4, Paley type I gives a Hadamard of order 108, but I need 428, not 432. That’s a little tricky. I wonder if there’s a known construction for order 4*107 using "Hadamard matrix of order 4q" or conference matrices. Could I adapt the generalized Paley approach for this?
Exploring conference matrix construction for Hadamard order 428
For q = 107, Paley construction doesn’t work since q ≡ 3 mod 4. But if I can create a conference matrix of order 214 (since 214 ≡ 2 mod 4), then a Hadamard matrix of order 428 would follow. Conference matrices require specific conditions, but this seems promising. Let's focus on 214.
Searching for explicit Hadamard matrix of order 428
To generate a Hadamard matrix of order 428, it might be feasible since many such matrices exist up to large orders, but I need one explicitly with ±1 entries. A direct construction could involve methods like Goethals-Seidel, which uses circulant matrices to satisfy specific conditions. I'll need to implement this method.
Exploring construction methods and searching for Hadamard matrix 428
For n=107, there might be a way to use Legendre sequences for Goethals-Seidel difference sets. I know there are constructions for order 4q where q is an odd prime power, but using approaches like Scarpis or Turyn may help too. I’ll search for an explicit Hadamard matrix of order 428 for a possible example or construction.
Exploring accessible sources and construction options for Hadamard matrix
It seems like there may be server timeouts, so I'll need to try other sources like GitHub, arXiv, or design theory papers. We could construct the Hadamard matrix ourselves using known algorithms, but it's heavy. If accessible, sequences from a Kharaghani paper could help, and SageMath might have relevant functions too.
Identifying base sequences to build Hadamard matrix
To create a Hadamard matrix of order 428, I think I can use base sequences of lengths 71 and 36, which combine to 107. I'll look for Turyn-type sequences or T-sequences in accessible PDFs, potentially from open sources like arXiv. A useful file might be "H_428," so I'll search for it.
Checking GitHub for Hadamard matrix at 428 order
I found a GitHub project, "anymatrix," which claims to have Hadamard matrices up to order 428. It might include the 428 matrix in MATLAB format. I’ll check the repository to see if it has the actual file with the matrix data. Let's open it and explore!
Opening GitHub repository for 428 Hadamard matrix
The page at math.ntua.gr timed out before, but I might try downloading the file directly instead of opening it. Alternatively, I could explore the GitHub "anymatrix" repository, which likely has Hadamard matrices up to order 428. I’ll open the repository’s root to check its structure and find the matrix file.
Opening Hadamard folder
I’ll go ahead and open the Hadamard folder at link ID 81. Time to check what’s inside and see if the matrix or relevant files are there!
Trying GitHub raw content for Hadamard matrix
It looks like extracting the files might have failed because of JS requirements, so I’m considering an alternative: accessing raw content directly from GitHub. I’ll search the repository for relevant files containing the 428 matrix — once I find out the exact file names, I can access them with no issue!
Trying direct access to Hadamard file
It seems the math.ntua listing might still be promising, with both text and zip formats for the Hadamard matrix. I’ll try opening the direct file URL instead of the page, hoping it’ll avoid the timeout issues. It’s worth retrying, as the timeout could have been temporary. Let’s try again!
Trying alternative ways to access Hadamard matrix
Since opening the math.ntua.gr server is
Checking old Hadamard links
The page seems to be from 2001, and it has a link to Neil Sloane's library list, which might include the H428 matrix. However, the link for item 8 points to "www.research.att.com ," which could have moved. Let’s click and see if it’s still accessible.
Trying alternative download for Hadamard matrix
The ATT site removed the old page, and Sloane’s library now points to a page that times out. Another option might be using Anymatrix, but GitHub listings aren’t working. I'll try accessing the IPM website directly to download the matrix — maybe the file's URL works better with HTTP instead of HTTPS. Let’s give it a try.
Searching for accessible Hadamard matrix
The page redirects to HTTPS and times out, which might indicate blocking. Maybe the matrix is hosted elsewhere, like in a plain text or PDF format. I’ll search for terms like "h428.txt" or "hadamard428.txt" on cs.uleth.ca — there’s a mention of a PDF that could include it. Let's give that a try.
Opening ADTHM page for possible matrix
I’ve found a listing on cs.uleth, and there’s a link on the ADTHM page that might lead to the matrix in text format. It could be what I’m looking for, so I’ll go ahead and open the ADTHM page to check it out. Let’s see if it includes the matrix!
Accessing matrix text through link
Awesome! I found a link labeled "Entries as text" (ID: 26). This could be the matrix in the text format I’m looking for. I’ll go ahead and click the link to download it. Let’s see if this works!
Parsing the XPM image to a matrix
The H428.html contains XPM image data, not matrix entries. It seems to represent values with two colors, p and m, which likely stand for +1 and -1. I can map p to 1 and m to -1 (just to simplify things) and validate that the Hadamard property holds by checking H*H^T = 428I. I’ll fetch and parse the full content to confirm.
Downloading H428.html
Alright, I’ll use the container.download method to fetch the H428.html file. I previously opened the URL, so that should meet the requirement. Now, I’ll download the file directly to /mnt/data/H428.html and get the content I need. This should work smoothly and efficiently!
Understanding limitations for container.download
It seems the container.download method restricts us to file types other than text/html — likely just images or data. Since I can’t use click or fetch via Python, I'll need to rely on web.run for the content. However, the 200-word limit means we can’t fully retrieve the 428 lines. I may need to process it in parts.
Handling 200-line limits in tool response
Since the tool response gets cut off around 200 words, I can't present more than that in the final output. However, I can strategically open various line ranges and parse those internal portions to reconstruct the full matrix. This should allow me to gather all the necessary data without breaking the word limits.
Handling file retrieval and parsing
To retrieve the full content, I can call the tool multiple times from different lines, reconstructing the file. However, retrieving 428 lines in total means a heavy operation, spread across several calls. Given the content-type restrictions, I can process the data using Python, but I need to work with a local file.
Figuring out how to retrieve and process data
To get more lines, I can use a tool to fetch a longer response, but there’s still a limit to the tool’s output. Since the H428.html file is too large, I could check if there's a plain text version like "H428.txt" in the directory that can be downloaded. I'll also look for any relevant comments in the HTML.
Looking for matrix in different format
The file starts with “/* XPM */,” indicating it’s actually plain text despite being labeled as HTML. I could try adding a download parameter or using a raw link, but the content type restriction remains. Next, I'll check if there's a plain text version of the matrix, like "H428.txt" in the directory.
Considering ways to fetch and process the file
There’s no plain text, but the XPM format might be parseable. I could gradually access the file using the web tool, then reconstruct it with Python. One option is fetching parts with web tool, accumulating in Python, and then processing. I could work around content type restrictions with query parameters for text/plain.
Exploring options to access content
The H428.html doesn’t seem like real HTML. Maybe I can access it through a ?download=1 query or H428.xpm? I might also try the enlarge image link (id 25). If it's a GIF/PNG, I could download it and extract pixel values, though this might involve some image processing. I'll check it out.
Downloading and inspecting image
The image is a PNG, so I can download it using the container. I'll pull down H428_big.png and check the image’s pixel structure. It’s likely 856x856, possibly scaled at 2x, though the matrix should be 428x428. I’ll inspect the image further after downloading it.
Opening image for inspection
The file is small, just 5kb, which likely means it’s compressed and may be a 428x428 image with a palette. I’ll go ahead and open the image to inspect it using the container, and then I can analyze the pixels to understand how it's structured further.
Processing image for matrix conversion
We have the image, so it's time to use Python to read it and convert the pixel colors into matrix entries. First, I need to check the image's pixel dimensions—more likely it will be 428x428 or a multiple of that. I'll use Python with PIL to open the image and confirm
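For anyone curious, the check the trace keeps circling (a ±1 matrix H of order n is Hadamard iff H·Hᵀ = nI) is trivial to verify once you have the matrix. A minimal sketch, using the Sylvester construction for a small example since order 428 needs the fancier constructions the model was hunting for:

```python
import numpy as np

def sylvester(k: int) -> np.ndarray:
    """Sylvester construction: a Hadamard matrix of order 2**k."""
    H = np.array([[1]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

def is_hadamard(H: np.ndarray) -> bool:
    """A +/-1 matrix H of order n is Hadamard iff H @ H.T == n * I."""
    n = H.shape[0]
    return np.array_equal(H @ H.T, n * np.eye(n, dtype=H.dtype))

print(is_hadamard(sylvester(3)))  # True: order-8 Hadamard matrix
```

The same `is_hadamard` check would apply to the 428×428 matrix after mapping the XPM's `p`/`m` pixels to +1/-1, as the trace describes.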
one of the brain geniuses at bluesky

What a fool. A proper scientist would test for a bigger N. Drop your phone in the pool again? Why?
There's only one thing that's advertised as not-waterproof that I'll risk using underwater, and that's Casio wristwatches. "Water resist" is a huge understatement for them; the things are indestructible.
(This comment sponsored by Casio)
Ned Kelly, but his armor is made of Casio watches.
why, the unlucky stiff
how… what… how… why… why would you think…
The HarfBuzz maintainer has drunk the slop-aid - Baldur has commented on it, warning of the potentially catastrophic consequences:
Fonts are a lucrative target. They require a complex parser, usually written in a language that isn't memory safe, and are often directly exposed to outside data (websites, PDFs, etc. that contain fonts). This means a flaw could lead to the worst-case attack scenario: arbitrary code execution. HarfBuzz is pretty much the only full-featured library that takes font files, parses them, and returns glyphs ready to render. It is ubiquitous. A security flaw in HarfBuzz could make a good portion of the world's user-facing software (i.e. anything that renders text) unsafe.
I used to think that I could avoid using open source projects which embrace the slop machines, but now it keeps getting more and more adoption, including in good and beloved projects... At this point I think I'll just have to accept and ignore it, or otherwise I'd be playing endless whack-a-mole with stuff all over my operating systems :(