Favorites: Canadian Irony and CIA Opsec
February 10th, 2025. Clinical trials frozen, congestion pricing thriving—what a week
Their systems couldn’t multiply? Wut? There’s more comparison with other countries (Australia mostly) in the thread.
I really like his simple table of tests of AI models. His whole writeup is good, with the upshot being that he doesn't think o3 is that much of an improvement over o1. Ethan Mollick has a test around writing poetry; it's passed only by Claude 3.5 Sonnet, o1-pro and now Gemini 2.0 Flash Thinking. Interesting that Claude performs as well as the top reasoning models on many tasks.

It’s so dumb, and it’s an old video, but even as only a casual FPS player I found this amusing.
Longer Reads
• Old thread I found with details about measuring US wealth inequality when you take Social Security into account. If you include its net present value, suddenly the US wealth inequality over time is largely unchanged. Here's a more recent thread (cached) that linked back to the old thread. (src)(cached)
Flotsam and Jetsam
– Amusing joke video about Canadians avoiding US goods during the trade war (src)(cached)
– Have you ever wondered how effective US spending on HIV prevention in Africa is? The answer appears to be that it’s made a big difference! (src)(cached)
– Polling on congestion pricing in NYC indicates that it’s getting more popular now that it’s been active for a while. Especially popular with people who use it. (src)(cached)
– Spending freeze on a bunch of clinical trials continues. I sorta suspect the trials are ruined now? (src)(cached)
– LLMs got worse at chess (src)(cached)
– Alex Tabarrok was not impressed by Khan's op-ed in the NYTimes: “She argues that we must break up US firms like Google, Apple, and Meta to compete with China’s more open and more competitive system. Unreal” (src)(cached)
– Subway ridership in NYC up, pushing crime down (src)
– Rubio wanted to increase USAID funding just two years ago (src)(cached)
– Tyler Cowen’s take on how to best understand Trumpian policy (src)(cached)
– Seems really bad for the CIA to be declassifying the names of their analysts (src)(cached)
– Marco Rubio reportedly has a minder from the Trump camp. A staffer not of his choosing who follows him to all meetings (src)(cached)
– Not a surprise: one of the young guys working for DOGE posted a bunch of racist stuff. Surprise: he resigned when the old posts came to light. (src)(cached)
– I said the other day that Trump’s time in office has already destroyed the Canadian Conservative Party’s chances. Here’s that claim in graph form. (src)(cached)
– Apparently Apple got pretty far along on training AI for car driving in simulation. They actually appear to outperform Waymo on the test benchmark (src)(cached)
– The new AG has disbanded the foreign influence task force. Seems bad. Apparently the anti-Russian-oligarch group (cached) as well, not sure if that's the same thing? (src)(cached)
– “New” AI math benchmark questions were readily available on the internet, making the high scores of small models quite dubious (src)(cached)