Welcome to ned Productions (non-commercial personal website, for commercial company see ned Productions Limited). Please choose an item you are interested in on the left hand side, or continue down for Niall’s virtual diary.
Niall’s virtual diary:
Started all the way back in 1998 when there was no word “blog” yet, hence “virtual diary”.
Original content has undergone multiple conversions: Microsoft FrontPage => Microsoft Expression Web, legacy HTML tag soup => XHTML, XHTML => Markdown, with a ‘various codepages’ => UTF-8 conversion for good measure. Some content, especially the older stuff, may not have entirely survived intact, particularly in terms of broken links or images.
- A biography of me is here if you want to get a quick overview of who I am
- An archive of prior virtual diary entries is available here
- For a deep, meaningful moment, watch this dialogue (needs a video player), or for something which plays with your perception, check out this picture. Try moving your eyes around - are those circles rotating???
Latest entries: 
Word count: 4599. Estimated reading time: 22 minutes.
- Summary:
- The author’s personal history with LLMs is a long and winding road, marked by experimentation and exploration of various models and tools. They recall playing with llama3.1 8b in Autumn 2024, being impressed by its ability to call tools and search the internet, and recognizing its potential to aid productivity. The author’s use of LLMs has evolved over time, from generating summaries for their website to more complex tasks like image editing and coding assistance. They have also experimented with various models, including Qwen, Gemma3, and Claude, and have been impressed by the rapid progress in AI capabilities.
Thursday 12 March 2026: 17:40.
Obviously lots of programmers like myself have been laid off these past two years, ostensibly replaced by AI that will happily churn out code of a quality similar to perhaps the bottom fifth of programmers. But as management is notoriously terrible at figuring out who is a good or bad programmer, they’ve mainly been performing rounds of blind headcount decimation as usual. I’ve been without income since last June – part of that is due to changes in US tax treatment of foreign workers, but probably more of it is due to widespread headcount reduction using AI as the excuse. To date, employers have only invested in AI to the extent of substituting X human devs for Y dollars of subscription fees paid to OpenAI or Anthropic – they haven’t gone for any deeper integration than that. There are good reasons for that: every six months AI gets quite a lot better, and with such shifting foundations there is no point investing in deep structural change to rebase your business on this new tooling until AI improvements slow to a few percent per year.
As an example of exactly that fast progress, last month Alibaba released its newest set of Qwen models, all of which can be downloaded and run locally – unlike most of the recent western AI models nowadays. That release was expected to very substantially improve capabilities over their previous models. I, along with lots of others, had been eagerly awaiting it because the Qwen Mixture of Experts (MoE) models are the only feasible way to get large models running on hardware an individual could reasonably afford. As I have zero wish to invest my time training into AI which can be rug pulled from me later, an ironclad requirement for me personally (and I suspect for lots of others like me) is that my time is only worth investing in AI models I can have 100% personal control over. So, whatever such models can – or cannot – do is the sweet spot at which I shall aim my practice and training.
I don’t mind the moving target as those models keep leaping forwards – it’s the price of being on the leading edge. And I don’t mind if a future employer pays for some super smart AI to assist me for some piece of work they want me to do. However, for my own personal work, I will be absolutely refusing to get locked in to super smart AI I’ll never be able to fully run on hardware I 100% control.
My personal history with LLMs
The first time I played with a locally run LLM was, I think, around Autumn 2024, about four months after llama3.1 8b had been released. I was relatively late to that game; to be honest, until then I had mostly dismissed LLMs as little more than improved chat bots. I remember being especially impressed by its ability to call tools you had personally taught it about, and to search the internet when forming an answer to your question or instruction. Plus, it ran well on my Macbook, and even not terribly on my ancient Haswell house server, which is well over a decade old. Unlike the pure chat bots preceding it, which were mere curiosities, I felt at the time that this new generation of chat bot had genuine potential to aid my productivity. But the tooling, and indeed the LLM models, weren’t there yet – though I am still using llama3.1 8b to generate summaries for the diary entries on this website to this day.
In July 2025 I wrote up how I converted ancient computer parts I had lying around, plus an ancient datacenter AI accelerator board I had bought from Aliexpress, into an AI video inferencing solution for the site. That used a nearly decade old nVidia Tesla P4 with 8 Gb of VRAM, which was and still remains one of the best bang-per-watt AI accelerators you can get. I came away very impressed with its capabilities, and indeed I expect to reinstall it into the site later this month once enough sunshine falls from the sky to power it.
In October 2025 I upgraded this website’s generator scripts to invoke llama3.1 8b to summarise each entry, and last January I evaluated the then recently released Qwen models for image editing and whether the 30b MoE Qwen model could replace llama3.1 8b (it could not, at least within the limited 18 Gb RAM on my Macbook). And of course last month’s entry was all about getting Gemma3 4b to describe and categorise all 25,000 photos in our collection.
Around this time last year, while I was still working at Category Labs, Anthropic’s Claude coding assistant was beginning to get mentioned due to Anthropic having released ‘Claude code’, their command line agentic AI programming assistant tool, in February 2025. I think they bought a subscription for anybody who wanted one around April, just before they told me they’d be ending my contract early. So I only very briefly played with it during my final month working there in May, and given it cost US$20 per month and I was now unemployed, I wasn’t hugely keen on spending more money subscribing to it, especially as I was 99.9% certain that six months later I wouldn’t need to. One thing I did notice during my playtime was that I was already running into the daily usage limits of the US$20 per month plan after maybe an hour of use. Obviously they want you to pay them a LOT more money, which I suppose is fair – the US$20 per month plan is just their taste tester plan.
Qwen3 Coder Next (Q3CN)
It’s rare that I predict the future so accurately! Last month Alibaba released Qwen3 Coder Next, an 80b parameter MoE model specifically tuned to help you work with code. I waited a few weeks for https://github.com/ggml-org/llama.cpp to catch up with optimising support for this latest LLM, and then I gave it a proper tyre kicking last week and this week. I have come away once again impressed!
Qwen3 Coder Next is about as capable as Claude Sonnet 3.5, so about where Claude was in Summer 2024 – which in practical terms is exactly where today’s US$20 per month Claude subscription is, because on that plan any model newer than 3.5 runs out of usage limits so quickly it’s useless. I therefore have exactly what I predicted: the same capability of AI assistant as costs $20 per month from Anthropic, except runnable on my own local hardware free of charge – if you have sufficient hardware.
My development workstation is a little old now: I last upgraded it while still working for MayStreet in 2022, and I had intended to upgrade it last summer, but without income that no longer made sense. It is an AMD Threadripper 5975WX based machine with thirty-two Zen 3 cores and eight memory channels, so it should have ~180 Gb/sec of memory bandwidth, but only ~4 TFLOPs of FP16 compute. That is far too little to run LLMs well, as I found during the image analysis diary entry, where even the small 4b Gemma3 model took a minute per image. But what it does have is 128 Gb of RAM and a PCIe 4.0 interface, so you can theoretically run ~100 Gb footprint LLMs if you can offload the compute to hardware with far more TFLOPs and memory bandwidth.
To run an 80b MoE model well – which is what Qwen3 Coder Next (Q3CN) is – you need a GPU with enough VRAM and compute power to run the dense layers quickly. Those dense layers then select which experts will be used, and those experts are usually run on the CPU using all available cores. So long as the experts don’t touch much memory and are computationally lightweight, you absolutely can run an 80b model like Q3CN well on local hardware.
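As a concrete sketch of that split, this is roughly how it can be expressed with llama.cpp’s `llama-server`. The model filename here is hypothetical, and the MoE-offload flag names have changed between llama.cpp versions, so check `llama-server --help` on your build before relying on them:

```shell
# Hedged sketch -- model filename is hypothetical, and the MoE-offload
# flags should be verified against your llama.cpp build's --help output.
./llama-server \
  --model Qwen3-Coder-Next-80B-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --cpu-moe \
  --ctx-size 65536 \
  --threads 32
```

Here `--n-gpu-layers 99` offloads every layer (attention plus dense tensors) to the GPU, while `--cpu-moe` forces the per-expert feed-forward tensors back into system RAM – exactly the ‘dense on GPU, experts on CPU’ split described above.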
Running Q3CN locally
As you are surely inferring by now, much hangs on what a ‘GPU with enough VRAM and compute power to run the dense layers quickly’ is, and more importantly, how much it might cost. I currently have these two GPUs in my workstation:
| GPU | parse toks/sec/euro | Price (EUR) | Launch year | RAM Gb | Bandwidth Gb/sec | Full power watts | FP16 TFLOPS | llama2 7b parse | llama2 7b gen | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| AMD RX 6600 XT | 2.87 | 200 | 2021 | 8 | 256 | 160 | 16.1 | 574 | 54 | assigned to linux |
| AMD RX 6700 XT | 3.28 | 320 | 2021 | 12 | 384 | 230 | 23.8 | 1051 | 84 | assigned to windows |
These were purchased principally with gaming-on-a-budget in mind – I had wanted to play the Mass Effect Legendary Edition trilogy in 4k with the updated graphics and bug fixes which was released in 2021 (and I didn’t get round to it until Autumn 2024). Hence the GPU allocated to Windows was a bit beefier (also, I bought it a year after the first one, and what you could get for €500 had improved by then).
The above table shows their price in euro today on eBay, and the llama2 7b performance numbers come from this list of llama.cpp benchmarks, which are for the Vulkan backend. As you can see, despite the RX 6700 XT being only a bit faster than the RX 6600 XT for games (about 12%), it’s 50-100% faster for running an LLM which entirely fits inside VRAM. Had I known there would be such a performance differential, I’d have used the 6700 XT to run Gemma3 in the last diary entry and saved myself days of processing time. Oh well!
Unfortunately, running Q3CN with the Q4_K quantisation on the RX 6700 XT is not good:
- Parse is 60 toks/sec.
- Generation is 14.5 toks/sec.
The parse speed is the particular problem here: in any LLM, you need to feed the entire context through parsing on every turn of interaction. The context gets big quickly because it has to include all the source code relevant to what you’re working on, plus all the accumulated steps so far. Modern models are able to cache and not reprocess the context from previous calls, so large contexts aren’t the problem per se; rather, it’s whenever the model receives a lot of new content it hasn’t seen before, e.g. you just fed it the contents of a new source file. You could actually live just fine with slow generation speeds – it’s the parse speed of new content that is the problem.
To explain: if you examine a few hours of me doing work, you’ll find about 99.2% of tokens used are input tokens (parsing context), and just 0.8% are output tokens (emitting changes). Therefore, for speedy turnarounds, you don’t really care much about token generation speed at all. Of the input tokens, about 4% will be novel, and the other 96% will be cached due to having been seen before. Therefore the ratio is:
- 3.9% novel input tokens
- 95.3% cached input tokens
- 0.8% output tokens
This is of course an average over many interactions, and so long as novel input tokens are few and cached input tokens are many, running Q3CN locally on the developer workstation is just fine. However, when it comes to large new input content, that 60 toks/sec parse speed becomes a problem: particularly at the beginning of each task, expect minutes for it to parse the context for the first time. After that is parsed, it trundles along at a fair clip and is nicely interactive, until it next reads a new file, and then it needs more minutes. All that is fair enough: it’s got 12 Gb of VRAM running a ~40 Gb model, so it’s falling back onto main, slow, RAM a lot.
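The arithmetic above works out as follows – note the 30,000-token file size is a made-up illustration of a plausible first read, not a measurement:

```python
# Token-mix arithmetic from the text: 99.2% of all tokens are input and
# 0.8% are output; of the input tokens, roughly 4% are novel (uncached)
# and 96% are cached from previous turns.
input_share, output_share = 0.992, 0.008
novel_of_input, cached_of_input = 0.04, 0.96

novel_share = input_share * novel_of_input    # ~0.0397, i.e. the ~3.9% above
cached_share = input_share * cached_of_input  # ~0.952, i.e. the ~95% above

# Why the 60 toks/sec uncached parse speed hurts: a freshly-read source
# file of, say, 30,000 tokens must all be parsed before the model can
# respond at all.
parse_speed_toks_per_sec = 60
new_file_tokens = 30_000
wait_seconds = new_file_tokens / parse_speed_toks_per_sec  # 500 s, over 8 min
```

So even though generation speed is nearly irrelevant at this token mix, a single large novel read stalls the session for minutes – which matches the experience described above.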
So what’s the best LLM running hardware bang for the buck in March 2026?
The budget LLM executing hardware market in March 2026
I assembled a list of all GPUs and data centre AI accelerator boards with 16 Gb or more VRAM currently available new or second hand to Ireland costing no more than €1,000 inc VAT. For the purposes of comparison, I threw in my existing AMD GPUs and the nVidia Tesla P4 I bought last year for the site video inferencing – these are the only 8 Gb VRAM boards below:
| GPU | parse toks/sec/euro | Price (EUR) | Launch year | RAM Gb | Bandwidth Gb/sec | Full power watts | FP16 TFLOPS | llama2 7b parse | llama2 7b gen | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| Intel Arc Pro B50 | 0.55 | 350 | 2025 | 16 | 224 | 70 | 21.3 | 194 | 40 | |
| Intel Arc Pro B60 | 0.87 | 600 | 2025 | 24 | 456 | 200 | 24.5 | 522 | 69 | |
| nVidia P40 | 1.13 | 430 | 2016 | 24 | 345 | 250 | 10 | 488 | 59 | needs additional fan, high idle power consumption, low compute perf |
| nVidia Tesla P4 | 2.09 | 127 | 2016 | 8 | 192 | 75 | 5.7 | 266 | 28 | needs additional fan |
| AMD V340L | 2.40 | 100 | 2018 | 2x 8 | 410 | 300 | 21 | 240 | 48 | old, ensure vulkan shaders work |
| AMD RX 6600 XT | 2.87 | 200 | 2021 | 8 | 256 | 160 | 16.1 | 574 | 54 | assigned to linux |
| AMD RX 6700 XT | 3.28 | 320 | 2021 | 12 | 384 | 230 | 23.8 | 1051 | 84 | assigned to windows |
| AMD RX 7700 XT | 3.49 | 430 | 2023 | 16 | 624 | 263 | 26 | 1500 | 70 | |
| AMD RX 6900 XT | 4.41 | 400 | 2020 | 16 | 512 | 300 | 37 | 1762 | 106 | |
| Intel Arc A770 | 4.44 | 280 | 2022 | 16 | 560 | 225 | 34.4 | 1242 | 55 | |
| AMD RX 7900 XTX | 4.74 | 750 | 2022 | 24 | 960 | 355 | 46.7 | 3552 | 167 | |
| AMD Radeon VII | 5.07 | 180 | 2019 | 16 | 1024 | 300 | 26.9 | 912 | 106 | needs additional fan, old, ensure vulkan shaders work |
| AMD RX 9070 | 5.14 | 615 | 2025 | 16 | 640 | 220 | 72.3 | 3164 | 120 | |
| AMD Mi50 | 7.47 | 150 | 2018 | 16 | 1024 | 300 | 26.5 | 1120 | 108 | needs additional fan, old, ensure vulkan shaders work |
| nVidia RTX 5070 Ti | 8.86 | 950 | 2025 | 16 | 896 | 300 | 43.9 | 8420 | 182 | |
| nVidia RTX 5060 Ti | 9.54 | 440 | 2025 | 16 | 448 | 180 | 23.7 | 4196 | 94 | |
| nVidia RTX 3090 | 11.12 | 500 | 2021 | 24 | 936 | 350 | 29.3 | 5560 | 162 | |
Those prices, especially for the nVidia cards even old ones, are grim.
Despite how very depressing this table is, it’s actually much improved over this time last year when I last updated my spreadsheet for AI accelerators. Back then the only games in town were the expensive nVidia GPUs above, and the Intel GPUs which suck at parsing. AMD GPUs one year ago just weren’t viable because AMD ROCm only supported the very newest GPUs, all of which cost over a grand at the time if you wanted 16 Gb of VRAM (and even today, only the RX 9070 comes in under a grand).
Twelve months later, as I noted last diary entry, AMD ROCm now ‘just works’ even on technically unsupported GPUs from the previous generation like mine. It no longer crashes and blows up like the dumpster fire it was even six months ago. However, the llama2 7b benchmarks listed above aren’t actually from the ROCm backend for llama.cpp, they’re from the Vulkan backend, because that’s now usually faster than the ROCm backend if you’re using trunk llama.cpp. The Vulkan backend was started last summer and has made enormous strides in just the last three months, such that it’s now almost always the fastest backend on AMD GPUs, and it’s as fast as CUDA for token generation on nVidia GPUs. Parsing performance is still a third to a half slower than CUDA on nVidia GPUs, but that gap is closing quickly.
The reason the Vulkan backend is such a game changer is that GPUs have supported Vulkan shaders (which are for high performance games) for over a decade, and that in turn means all the ancient AMD datacenter AI accelerator boards suddenly come into play, because they can all run Vulkan shaders no problem even if they’ll never run ROCm. That expands the table above with some promising new options compared to twelve months ago. It also proves that AMD GPUs never actually sucked at LLMs as much as people thought until recently – the real problem was lousy software support, not incapable hardware. This much improved story for running LLMs locally is 100% the result of recent runtime software improvements, and I’m very glad for the increased menu of choice.
The table above is ordered by parse speed per euro, so the bottom of the table is where the standout bang-for-your-buck boards are listed. Unsurprisingly all of those are nVidia GPUs: nothing else can parse tokens as well for your euro. But given that all of those are expensive even used on eBay, the next standout board is the 2018 era AMD Mi50 datacenter board, which is up there with the RTX 5xxx nVidia GPUs in terms of bang for the buck. Unlike those boards, the Mi50 can be sourced from China delivered for €150 inc VAT. So, naturally, I’ve ordered one and am looking forward to its delivery. It has a similar token parse speed to my existing 6700 XT, however it has one third more VRAM and that VRAM is three times faster, so I would expect maybe a +20% performance improvement at running Q3CN. I guess I’ll find out.
To get radically faster performance such that there is no waiting at all, one would need at least 48 Gb of VRAM I think, seeing as the model is ~40 Gb. That probably means two cards with 24 Gb VRAM each, and to keep under a €1,000 budget:
- €860 2x nVidia P40: probably not that much faster than my 6700 XT at parsing.
- €1000 2x nVidia RTX 3090: many times faster.
… of which clearly the RTX 3090 is by far the better option, plus it can be used for gaming. Still, that’s a cool €1,000. That’s a lot of money.
I am mindful that after this AI investment bubble bursts, there is going to be a flood of used AI accelerator hardware on the market which will depress prices. So now is a lousy time to buy, especially in my current financially straitened circumstances. Which then makes one ask: how much would it cost to rent the hardware instead?
Renting Q3CN
The idea of an ‘LLM marketplace’ is of course an obvious one, and as far as I am aware the first of these, and still the biggest, is OpenRouter, which got started in 2023. They provide an OpenAI compatible REST API endpoint which proxies a marketplace of LLM providers for a +5.5% fee over whatever the underlying provider charges. You can set rules for which providers to choose, when, and in what order you’d prefer – note that ‘cheapest first’ is NOT the default after account creation. You also don’t necessarily want the absolute cheapest, as I found they hang frequently, so with some trial and error you’ll figure out what to ban and what to allow.
You can of course open an account directly with the providers on OpenRouter and save yourself the +5.5% fee, and there are further providers who don’t list on OpenRouter. However one enormous advantage of OpenRouter is automatic failover, because when a provider gets overloaded – and in my experience, they do during peak times – OpenRouter reroutes you to the next cheapest provider with zero outage experienced by you. Maybe down the road when these providers get much better uptime this will change, but for now I think I’ll be happy to pay OpenRouter their fee for not suddenly being put on pause mid-flow.
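As a sketch of what that looks like in practice: the model slug and the exact shape of the `provider` routing object below are my recollection of OpenRouter’s API, so treat them as assumptions and verify against their current API reference before use.

```python
import json
import urllib.request


def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat request with OpenRouter provider routing.

    ASSUMPTIONS: the model slug and the 'provider' field names ('sort',
    'allow_fallbacks', 'ignore') are from memory of OpenRouter's docs --
    check the current API reference before relying on them.
    """
    return {
        "model": "qwen/qwen3-coder-next",  # hypothetical slug
        "messages": [{"role": "user", "content": prompt}],
        "provider": {
            "sort": "price",          # cheapest first (NOT the default)
            "allow_fallbacks": True,  # auto-reroute when a provider hangs
            "ignore": ["chutes"],     # ban providers you found unreliable
        },
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the request to OpenRouter's OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The `ignore` list is how you ban the providers that hang, while `allow_fallbacks` gives the automatic failover that keeps the session alive when the chosen provider drops out.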
To rent Q3CN today, these are the four cheapest providers I could find online:
| Model | Supplier | Max context | Input US$ per 1M tokens | Cached input US$ per 1M tokens | Output US$ per 1M tokens | Estimated US$ price per day (10M tokens) |
|---|---|---|---|---|---|---|
| qwen3-coder-next | Openrouter (parasail) | 262k | 0.15 | N/A | 0.8 | 1.552 |
| qwen3-coder-next | Openrouter (ionstream) | 262k | 0.15 | N/A | 0.8 | 1.552 |
| qwen3-coder-next | NanoGPT | 262k | 0.15 | N/A | 1.5 | 1.608 |
The fourth provider, Chutes, is absent from the table above because they kept hanging the session or corrupting the context – they’re basically useless, so I mention them for information only. I’ve had good experiences with Parasail and Ionstream; each has dropped out on occasion, but OpenRouter routed to the other so my work was uninterrupted. NanoGPT is actually a standalone provider not listed on OpenRouter; there are other standalone providers, but they were so much more expensive for Q3CN rental that they’re not worth listing. In any case, the estimate is about US$1.50 per day assuming use of ten million tokens – which, given that I easily chewed through seven million tokens in five hours, may well be an underestimate.
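Those per-day estimates follow mechanically from the token mix measured earlier (3.9% novel input, 95.3% cached input, 0.8% output, out of ten million tokens per day). A sketch that reproduces the table’s figures – where a provider lists no cached-input price, cached tokens are simply billed at the full input rate:

```python
def daily_cost(in_price, cached_price, out_price, total_tokens=10_000_000):
    """Estimated US$ per day from the token mix measured in the text:
    3.9% novel input, 95.3% cached input, 0.8% output tokens."""
    novel = 0.039 * total_tokens
    cached = 0.953 * total_tokens
    out = 0.008 * total_tokens
    if cached_price is None:  # provider has no cached-input discount
        cached_price = in_price
    return (novel * in_price + cached * cached_price + out * out_price) / 1e6


# Reproduce the table rows (prices are US$ per 1M tokens):
parasail = daily_cost(0.15, None, 0.8)  # -> 1.552
nanogpt = daily_cost(0.15, None, 1.5)   # -> 1.608
```

The same function reproduces the ‘big boy’ table further down, e.g. `daily_cost(3, 0.3, 15)` gives the 5.229 Claude Sonnet figure.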
OpenRouter can supply detailed logs on request, so from those I calculated that whatever hardware Parasail is running has this performance:
- Parse (uncached): 17,450 toks/sec
- Generation (uncached input): 56 toks/sec
- Parse (cached): 31,204 toks/sec
- Generation (cached input): 81 toks/sec
… which smells to me like nVidia A100 cards, which I suppose makes sense as they’re older and therefore cost depreciated. In any case, they are more than plenty fast enough: the agentic coding AI snaps along faster than I can read its log of actions – half the speed would also be more than plenty. I should remember that for later when buying GPU hardware.
Speaking of expensive … would you like to know how much the ‘big boy’ AI agentic coding services cost for comparison?
| Model | Supplier | Max context | Input US$ per 1M tokens | Cached input US$ per 1M tokens | Output US$ per 1M tokens | Estimated US$ price per day (10M tokens) | Multiple of rented Q3CN cost above |
|---|---|---|---|---|---|---|---|
| qwen3-coder-plus | Openrouter (alibaba) | 1000k (though gets noticeably forgetful after 300k) | 1.17 | 0.13 | 5.85 | 2.1632 | 1.4x |
| gemini3.1-pro | Google | 200k | 2 | 0.2 | 12 | 3.646 | 2.3x |
| gpt-5.3-codex | OpenAI | 400k | 1.75 | 0.25 | 14 | 4.185 | 2.7x |
| claude-sonnet-4.6 | Anthropic | 200k | 3 | 0.3 | 15 | 5.229 | 3.4x |
| claude-opus-4.6 | Anthropic | 200k | 5 | 0.5 | 25 | 8.715 | 5.6x |
The cheapest frontier coding model is Alibaba’s Qwen via OpenRouter (where it is heavily discounted for some reason), followed by Google’s Gemini3.1 Pro and OpenAI’s GPT5.3 Codex, with a slightly larger price gap to Anthropic Claude Sonnet and Opus. Qwen Coder Plus is 1.4x the cost of rented Q3CN, which is a useful data point; Claude’s most capable model is 5.6x the cost for my usage patterns, which if I’m honest, is less than I had expected.
Few devs pay for frontier models by the token; most have monthly subscriptions. I seem to consume 100 - 120 requests per hour, so that’s 500 - 600 requests and seven million tokens per five hours. That certainly needs the highest US$200 per month subscription: it buys you 800 requests per five hours, but there is also a weekly usage limit of 15 - 30 hours for their Opus model. If you want more, Anthropic want you to pay per-token billing instead – and to be honest, at an estimated US$8.72 per day for my usage pattern, paying by the token for their highest end model over an average 22 day working month would be US$191.84, which is cheaper than their US$200 monthly subscription and has no usage limits. That’s another useful data point. I read a lot online of people complaining about the usage limits built into their US$200 per month Claude subscriptions, yet for my AI use patterns per-token billing would always be cheaper than the subscription. I guess a lot of people have Claude write a lot more output than I would?
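The subscription-versus-per-token comparison is a quick back-of-envelope using the per-day figures from the tables above:

```python
# Sanity-check the subscription vs per-token claim using figures from
# the tables above (US$ per day, 22 average working days per month).
opus_per_day = 8.72    # Claude Opus per-token estimate (rounded)
q3cn_per_day = 1.552   # rented Q3CN via OpenRouter (Parasail/Ionstream)
working_days = 22

opus_month = round(opus_per_day * working_days, 2)  # 191.84, under $200
q3cn_month = round(q3cn_per_day * working_days, 2)  # ~34, the 'US$34/month'
```

So even at full-time use of Anthropic’s top model, per-token billing comes in under the US$200 subscription for this particular token mix, while rented Q3CN costs roughly a sixth of that.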
I’ll come back to that in the next section, as I’m digressing and this is a Q3CN focused diary entry. The point is that renting Q3CN would probably cost US$34 per month, which is under US$400 per year – and that’s if you’re using it full time. Use it only sporadically like I do as I’m unemployed, and spending €1,000 on your own hardware to run it looks like lunacy (unless that same hardware can also play the upcoming Grand Theft Auto VI at max graphics settings, in which case it becomes more of a bird-in-the-hand-is-worth-two-in-the-bush type of cost-benefit analysis).
Conclusions
I think I’m pretty much decided: I shall use OpenRouter to run Q3CN for implementation work until the AI investment bubble pops and I can affordably pick up either, ideally, a new powerful GPU also able to run GTA VI well, or firesaled legacy AI hardware on eBay for cheap. But the well-under-five-hundred-euro sort of cheap – there’s no rational point in dropping more than €500 on new hardware given the rental costs, as I’m better off renting until used component prices get under €500.
OpenRouter makes it super simple to flip over to Claude Opus for analysis, architecture and plan writing, and then flip back to Q3CN for implementation. I’m not opposed to paying tens of euro cents for analysis and planning if it reveals things I would likely have missed – especially as it’ll write all that out into documentation for me, which I can then manually review and strip out the wrong parts. I simply view that as good engineering: I welcome all good quality feedback, from any source.
I think this reveals what kind of AI using coder I will be; there appear to be two main categories:
- Devs who don’t like writing code much, so get AI to write as much of the code as possible, so they can focus on solving problems ASAP. The AI will therefore output lots of tokens, as it writes all the code.
- Devs who feel the whole point of coding is to emit high quality code, and AI isn’t good at that especially starting from a blank sheet, so they’ll always write the bulk of the initial implementation by hand, and then only use AI when appropriate to adjust and refine that codebase. In this category, the AI will mostly read tokens, and output very few as it never edits more than a few lines of existing code at a time.
The first category tends to use an AI focused IDE like Cursor which is a fork of vscode, whilst the second category tends to use AI extensions like Roo Code installed into vscode – and to be specific about the difference here, Roo Code only appears when you open its tab. Otherwise it’s as if it’s not installed, which is exactly what you want when you’re doing work you don’t want the AI to do. Whereas, in Cursor, by choosing that IDE you’re basically saying ‘there is no work I don’t want AI to do’. In other words, it’s outside-in vs inside-out.
I am probably in category two for most of my open source work which is on reference implementation libraries. These set the standard for everybody else, and they have to be very carefully written and designed. So I like the AI to help, but I’m always going to be writing most of the code by hand most of the time.
However I’m not opposed to category one for some tasks: there are a number of Python scripts I’ve written to implement some part of a processing pipeline where I would be more than happy if the AI did as much of the work as possible, as I just want a solution ASAP and I don’t especially care how we get there. For example, if I needed some Android app to solve some itch or something, chances are very high I’d just vibe code that and call it a day.
I guess this is pretty much what Linus Torvalds said about this stuff: ‘use AI to write the code you don’t care about’. That’s pretty much where I’ve arrived at too, though I do find its analysis of what I’ve written quite insightful sometimes, as it sees with eyes which are not my own.
Anyway, that’s my analysis of agentic AI coding assistants written up! I do apologise for the wall of text, but I wanted to be comprehensive, as I’ll almost certainly refer back to this in the future: I wanted all my scattered notes, built up over many months, condensed into a single, albeit very long and dense, diary entry which will turn up in search should I need to refer to it.
Next post will almost certainly be about making the rented house internet go faster, but that’ll be at the earliest next week. Be happy everybody!
Word count: 8891. Estimated reading time: 42 minutes.
- Summary:
- The photo collection was finally sorted out after weeks of work using an open-source photo manager called Damselfly and a locally run AI model called Gemma 3 4b. The script used to generate textual descriptions of each image, along with ten most likely keywords for that photo, took about 17 hours to process the entire collection on a Radeon GPU.
Wednesday 25 February 2026: 23:48.
These past two weeks I have been working on another long overdue chore item from my unemployment backlog: doing something about the tens of thousands of photos we have accumulated over the years. More on that shortly – firstly, last weekend I think I solved another problem which had been growing these past six months or so: I had been increasingly finding water in my boot where the spare tyre lives, so I resolved to do something about it, which led me to doing this:
Yup, that’s the rear bumper of my car removed. You might notice the four air ventilation exhaust vents, two on each side: these have been overflowing with water into the boot. Why? The seals around the exhaust vents lose their plasticity with age (> 10 years), and 98% of the time this is the cause of water getting into the boot of Mk 2 and Mk 3 Ford Focus cars. As rain drips down the outside it hits those vents, and if the seal isn’t watertight the rain enters the rear ventilation compartments of the car, where it pools up until it overflows into the boot. This was particularly a problem in the Mk 2; interestingly, they tried to ‘solve’ it in the Mk 3 by doubling the number of vents to improve evaporation and introducing an ‘overflow’ escape for the pooled water. That probably did the job for most of the world, but of course here in Ireland it rains a lot AND we have lots of tight country road corners, so as you go round those the centrifugal forces transfer water from the ventilation compartments into the boot. And particularly this year – so far we’ve had sixty days of continuous rain (yes, since before 2026 began!) – those rear ventilation compartments have been pouring water into the boots of both my car (Mk 3) and especially Megan’s car (Mk 2), far more than in years previous.
This design defect in the Ford Focus is very well known, and the solution is to remove the ventilation exhaust vents, seal them with lots of silicone, and reinstall them. To implement this requires removing the rear bumper, which is more than I’d attempted alone on a car before. I got it done, and here is my car with its vents sealed with silicone:
Of course, to do this work I needed it to stop raining for enough hours for the silicone to cure, and waiting for that window took a week after I diagnosed the cause of the boot leak. Every day I’d wake up and study the weather forecast, and last Sunday I decided that a forecast of possibly no rain for six hours towards the end of the day was worth the risk. It rained while I was disassembling the car, but I think I got twelve hours after applying the silicone until it rained again, by which point the silicone should have been mostly cured, so we should be good.
I intended to perform the same operation to Megan’s car, however hers being older and with far more mileage on it the bolts which hold on the bumper turned to rusty dust when I tried to apply a spanner to them. Given that I expect her car to have between one and two years of lifespan left on it (it is fifteen), and that to remove the rear bumper would now require replacing it, it’s not worth the cost of fixing this leak especially as it’ll mostly remedy itself when the summer comes – which will be helped by me drilling two holes into the undercarriage through which the pooling water can exit the car instead of spilling into the boot. This kinda sucks for Megan whose car will smell of mould all the time, but as our first priority is house building, this is unfortunately the price to be paid.
Climbing Claragh Mountain
You may have noticed the very tall image floating to the right: that is the five hour hike up and down Claragh Mountain in Millstreet which me and Clara and Henry did last Thursday during their midterm break. The weather was not kind to us, the aforementioned rain burst upon us with aggression several times which required stopping and sheltering under our large umbrella until the rain squall passed. So the hike involved lots of stopping and starting, as is very obvious from the data shown in the image (though we did also additionally stop to eat and rest, sometimes to have staring contests with sheep who were attracted possibly by the food we were eating, or maybe we were just novel and interesting to them).
This was the first moderate difficulty hike which my children have done (I think ‘moderate’ is defined as a > 500 metre ascent in Ireland?): I am immensely proud of them for pushing through despite the substantial increase in difficulty from what they are familiar with, and had the weather been better we would have been rewarded with some of the best views anywhere in Ireland. Even with the poor views we got, the hike did seem to make an impression: the kids weren’t too happy about doing this hike initially, but as the views of the four counties opened up and as we walked the ridge high above the valley, seeing a hundred kilometres to the north west and north east, despite the lousy weather the kids did seem to appreciate how amazing Ireland is for that stuff. I look forward to taking them in blazing sunshine one day, as that hike really does have the ‘wow’ factor, the kind of tear inducing awesome natural beauty you only get in a few places worldwide. It’s worth the cost for me personally to do that hike: the first time I ever did it, just after unemployment last year, I was laid up for four days to recover from it, and I genuinely ran out of power in the final two kilometres and had to keep sitting down as my legs didn’t work anymore.
This time round I am fitter and stronger thanks to my daily 5 km walks, so no loss of power this time round and the whole body muscle pain in reaction to the unusual exercise has been more one of ‘you upset me’ rather than ‘wtf?’. As much as that still hurts over multiple days to recover, I feel far better afterwards this time round – like I’m not dead yet, whereas first time round there were moments afterwards where I was also going ‘wtf happened to my youth?’ and getting a bit into a depression about it. Of course, Clara and Henry barely registered the unusual exercise, there were a few complaints at the time but zero a few hours after completion. Youth is wasted on the young!
Doing something about our photo collection
Apart from drawing lots more wires into the 3D model of the house, my principal long term todo item cleared these past weeks is to fix up our photo collection. We have accumulated about 25,000 photos from our various devices over the years, and while we made an attempt to organise them before Henry was born, we have completely given up since. Photos are added to the collection when a phone upgrade occurs, and are generally dumped into a directory named for the date and whose phone was backed up. Beyond that, they were completely unstructured in recent years, and semi structured in the years before Henry was born.
Most people would push all those into a cloud photo management solution, let the cloud AI organise them, and call it a day. However we don’t put family photos anywhere where they can be used for training data, so that was a non-starter. It would have to be a local solution only.
The lack of structure meant finding an image you were looking for was quite painful. You had to remember the year and then search both my and Megan’s photo collections by visual inspection. All those photos in a single folder loaded slowly. It took forever, and as a result we very rarely did it.
My first step was to import the collection into Damselfly Photo Manager, which is an open source photo manager. I chose Damselfly mainly because it stores its generated thumbnails under the same relative path as the original image in the collection, which I reckoned would be very handy for the next step. Browsing through the collection, I realised we had a big problem with metadata consistency, depending on when the image was taken and by what device:
- Images could be in JPEG, PNG or HEIC. HEIC famously does not support legacy metadata formats, whereas JPEG supports at least three generations of metadata format. PNG can have the metadata formats of JPEG, plus it has its own additional metadata format. It’s all a mess to be honest.
- Images record in their metadata when they were taken in multiple different ways, some of which aren’t recognised by photo managers. This then breaks the timeline ordering.
- Image use cases are all jumbled together – everything from pictures of rental cars taken to prove no damage, to birthday parties, to landscapes – and there was no way to disambiguate.
- Of those images containing family members, we had no idea which people were in which photo.
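To make the timestamp point concrete, here is a minimal sketch (the three formats are illustrative examples I picked, not an exhaustive survey of what our devices actually produced) of how one and the same capture moment can be written three incompatible ways – a photo manager that only understands one of them will mis-order the rest:

```python
from datetime import datetime, timezone

# Three ways the same capture time can be recorded (illustrative examples)
exif_style = "2012:06:30 14:05:22"   # EXIF DateTimeOriginal style
iso_style  = "2012-06-30T14:05:22"   # XMP / ISO 8601 style
epoch_ms   = "1341065122000"         # Unix epoch milliseconds, often in filenames

a = datetime.strptime(exif_style, "%Y:%m:%d %H:%M:%S")
b = datetime.fromisoformat(iso_style)
c = datetime.fromtimestamp(int(epoch_ms) / 1000, tz=timezone.utc).replace(tzinfo=None)

# All three encode the same wall clock moment (time zones ignored for brevity),
# but a manager that only parses the EXIF form will mis-order the other two.
```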
Step 1: Get AI to describe all the photos
So what we really needed as a first step was to generate a textual description of each image, along with the ten most likely keywords for that photo. Obviously, as I had been playing a lot with AI recently, and as I definitely did not want to send personal images outside my local network, this seemed a perfect task for locally run AI. I tested a few models, and eventually landed on Gemma 3 4b as a reasonable tradeoff between size (it plus the uncompressed image to be parsed needed to fit inside 8 GB of VRAM), accuracy (not awful), and following instructions accurately so we could reliably parse its output. Here is the final script used, for which you will need Ollama running on localhost with gemma3:4b-it-qat installed into it:
```python
#!/usr/bin/python3
from PIL import Image, ImageOps
from io import BytesIO
import glob, os, urllib.request, urllib.parse, json, multiprocessing, base64, subprocess, re, time, random

to_damselfly = (r'(.*)/IcyPictures1/Pictures/(.*)\..*?', r'\1/.damselfly/thumbs/\2_l.JPG')
max_ai_image_dimension = 1024

def invoke_AI(image, instructions):
    body = json.dumps({
        'model' : 'gemma3:4b-it-qat',
        'prompt' : instructions,
        'images' : [ base64.b64encode(image).decode('utf-8') ],
        'stream' : False,
        'options' : {
            'seed' : 78,
            'temperature' : 0.3
        }
    }).encode('utf-8')
    req = urllib.request.Request('http://localhost:11434/api/generate', body)
    with urllib.request.urlopen(req) as f:
        outs = f.read().decode('utf-8')
    outs = json.loads(outs)
    outs = outs['response']
    #print(" Raw output:", outs)
    outs = outs.strip()
    if 'Keywords:' not in outs:
        raise Exception(f'Keywords not present in output! {outs}')
    idx = outs.find('Keywords:')
    keywords = outs[idx+10:].strip().split(', ')
    outs = outs[:idx].strip()
    return (outs, keywords)

def make_summary(path):
    try:
        args = ['exiftool', '-json', path]
        res = subprocess.run(args, capture_output = True, encoding = 'utf-8')
        if res.returncode != 0:
            with open('MakeSummaries_error.csv', 'a') as oh:
                oh.write(f'"{path}",{repr(res.stderr)},"\\"{'\\" \\"'.join(args)}\\""\n')
            raise Exception(f'Calling exiftool failed! Invocation was "{' '.join(args)}". stdout was "{res.stdout}". stderr was "{res.stderr}"')
        info = json.loads(res.stdout)
        if isinstance(info, list):
            info = info[0]
        if 'Description' in info:
            #print(f"{path} already has a description set")
            return
        begin = time.monotonic()
        descriptiontag = '-Description='
        keywordtag = '-Keywords='
        if path.endswith('.heic'):
            path2 = re.sub(to_damselfly[0], to_damselfly[1], os.path.abspath(path))
            path2 = os.path.splitext(path2)[0] + '.JPG'
            if not os.path.exists(path2):
                path2 = os.path.splitext(path2)[0] + '.jpg'
                if not os.path.exists(path2):
                    raise Exception(f'Failed to find damselfly thumbnail for {path}')
            with Image.open(path2) as image:
                image = ImageOps.exif_transpose(image)
            # HEIC doesn't support IPTC, so we need to use XMP. Unfortunately that doesn't support Keywords,
            # so we use Subject
            keywordtag = '-Subject='
        else:
            with Image.open(path) as image:
                print(f"Resizing {path} to submit to AI ...")
                image = ImageOps.exif_transpose(image)
        if image.width > max_ai_image_dimension or image.height > max_ai_image_dimension:
            if image.width > image.height:
                image = ImageOps.contain(image, (max_ai_image_dimension, image.height))
            else:
                image = ImageOps.contain(image, (image.width, max_ai_image_dimension))
        content = BytesIO()
        image.save(content, format='JPEG')
        print(f"   That took {time.monotonic() - begin} seconds. Invoking AI to describe {image.width} x {image.height} image ...")
        begin = time.monotonic()
        description, keywords = invoke_AI(content.getvalue(), r"Describe this image and its style in a very detailed manner, follow the format of describing: what, who, where, when, how. You don't need to fill in all if they are irrelevant. Please remove What, Who, Where, When, How prefixes and make it one sentence. Follow that with a comma separated list of no more than ten relevant IPTC keywords, prefixing the list with 'Keywords:'.")
        print(f"   Processing {path} with AI took {time.monotonic() - begin} seconds.")
        print("   Description:", description)
        print("   Keywords:", keywords)
        args = ['exiftool', '-json', '-overwrite_original', '-ignoreMinorErrors',
                f'{descriptiontag}{description}']
        args += [keywordtag + keyword for keyword in keywords]
        args.append(path)
        res = subprocess.run(args, capture_output = True, encoding = 'utf-8')
        if res.returncode != 0:
            with open('MakeSummaries_error.csv', 'a') as oh:
                oh.write(f'"{path}",{repr(res.stderr)},"\\"{'\\" \\"'.join(args)}\\""\n')
            raise Exception(f'Calling exiftool failed! Invocation was "{' '.join(args)}". stdout was "{res.stdout}". stderr was "{res.stderr}"')
    except Exception as e:
        print(f"Failed to process {path} due to {e}")

if __name__ == '__main__':
    files = [path for path in glob.glob('**/*.jpg', recursive=True)]
    files += [path for path in glob.glob('**/*.jpeg', recursive=True)]
    files += [path for path in glob.glob('**/*.JPG', recursive=True)]
    files += [path for path in glob.glob('**/*.png', recursive=True)]
    files += [path for path in glob.glob('**/*.heic', recursive=True)]
    random.shuffle(files)
    with multiprocessing.Pool(4) as p:
        p.map(make_summary, files)
```

The script uses the exiftool utility to inspect each image to see if it already
has a description. If it doesn’t, the image is loaded in (apart from HEIC, which Python doesn’t support, so for those we yank the JPEG thumbnail from the Damselfly database), resized so its maximum dimension is 1024, and fed to the Gemma AI asking for a single sentence detailed description and up to ten keywords. exiftool is then invoked again to poke the description and keywords into the image (though for HEIC the keywords go into its subject field instead, as it doesn’t support legacy IPTC metadata).
This script has been carefully written so multiple copies can be run on multiple computers concurrently, all working on the same Samba shared network drive – the random.shuffle() is to ensure that they usually won’t collide with one another. The reason one needs multiple computers is that Gemma 3 4b is not fast at this task:
- On the main house server with a four core Haswell CPU, 264 seconds per 512 max dimension image (which is one quarter the resolution of any other below). This system has ~50 GB/sec main memory bandwidth.
- On my thirty-two core AMD Threadripper Pro 5975WX, 61 seconds per 1024 max dimension image. This system has ~180 GB/sec main memory bandwidth, and perhaps can do ~4 FP16 TFLOPs.
- On my M3 Macbook Pro with eighteen GPU cores, 24 seconds per 1024 max dimension image. This system has ~150 GB/sec main memory bandwidth, and can do ~13 FP16 TFLOPs.
- On my Radeon 6600 XT graphics card, 9 seconds per 1024 max dimension image. This card has ~256 GB/sec main memory bandwidth, and can do ~21 FP16 TFLOPs.
As we can see from these benchmarks, unlike text parsing AI models, which are mostly dependent on memory bandwidth and don’t need so much compute, image parsing AI models are more dependent on compute power. Hence the Radeon GPU, despite being quite old, easily outperforms the Threadripper, and even outperforms the Macbook, which is normally the fastest hardware I have for running local AI. I might also mention that this was the first time I found running AI on a legacy Radeon GPU seamless: it didn’t crash once during multiple nights of running, and performance was both good and sustained over time. They’re clearly finally making progress on debugging the AMD ROCm backend, which is great news.
I should point out that the script sends four images for analysis at a time to Ollama, which can do a little work concurrently but is mainly single tasking. So 4x concurrency is maybe 10% faster overall, which makes it worth doing, but to be clear you can divide all those times above by approximately four to get the actual per image processing time. 25,000 images at 2.5 seconds each takes 17 hours, so leaving the Radeon GPU at it for one night is enough. If you had just the Macbook, it’s more like 42 hours, so if you can fire multiple devices at the problem concurrently, you can make that more reasonable.
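The throughput arithmetic above can be sketched as a back-of-envelope calculation (the per-image figures are simply the batch times from the list above divided by the 4x concurrency, so these are estimates, not measurements):

```python
images = 25_000

# Approximate effective per-image times: the listed times divided by ~4,
# since the script keeps four requests in flight at once
radeon_secs_per_image  = 9 / 4    # ~2.25 s
macbook_secs_per_image = 24 / 4   # ~6 s

radeon_hours  = images * radeon_secs_per_image / 3600   # roughly one night
macbook_hours = images * macbook_secs_per_image / 3600  # roughly 42 hours
```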
Step 2: Fix up the metadata
There are two strands of thinking when it comes to image archival:
- Keep the original image file untouched so the hash of its content is completely fixed over time. Use sidecar XMP files to store metadata.
- Keep the image content in the image file untouched, but other metadata within that file can be modified. This does cause the hash of the file content to change over time.
I’m in that latter category: for me, a photo collection is a set of living files. We don’t reencode image content as that is lossy, but modifying metadata is not lossy unless you delete or replace metadata fields (so don’t do that).
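A toy illustration of that point, using a hypothetical PPM file whose header comment stands in for metadata (I chose PPM because it is trivial to construct; real JPEG/XMP layouts are messier): changing the ‘metadata’ changes the whole-file hash, while a hash over the pixel payload alone stays fixed:

```python
import hashlib

def make_ppm(comment, pixels):
    # Tiny 2x1 binary PPM; the header comment plays the role of image metadata
    header = f"P6\n# {comment}\n2 1\n255\n".encode()
    return header + pixels

def pixel_hash(ppm_bytes):
    # Hash only the pixel payload, which follows the '255\n' maxval line
    start = ppm_bytes.index(b"255\n") + 4
    return hashlib.sha256(ppm_bytes[start:]).hexdigest()

pixels = bytes([255, 0, 0, 0, 255, 0])  # one red pixel, one green pixel
before = make_ppm("no description yet", pixels)
after  = make_ppm("described by the AI pass", pixels)

# The whole-file hashes differ, but the pixel-data hashes are identical:
# the image content survives the metadata edit untouched.
```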
As the photo management software made very clear, it was failing to correctly parse the date of a lot of the photos which led to seriously out of whack timelines in the display. So something had to be done, and via trial and error and an awful lot of rinse and repeat testing I came up with this script to fix everything up:
```python
#!/usr/bin/python3
import glob, os, json, multiprocessing, subprocess, re
from datetime import datetime

datetime_tags = [
    'SubSecDateTimeOriginal',
    'SubSecCreateDate',
    'DateTimeOriginal',
    'CreationDate',
    'CreateDate',
    'MediaCreateDate',
    'DateTimeCreated',
    'GPSDateTime',
    'DateTimeUTC',
    'SonyDateTime2',
]
nokia_timestamp = re.compile(r'.*\\n([0-9]{2}\/[0-9]{2}\/[0-9]{4})\\n([0-9]{2}:[0-9]{2}:[0-9]{2})\\n.*')
unix_timestamp = re.compile(r'(14[0-9]{8})[0-9]{3}\..*')
iso_timestamp = re.compile(r'.*[^0-9]*([0-9]{8}[^0-9][0-9]{6})[^0-9].*')

def fix_no_datestamp(path):
    try:
        args = ['exiftool', '-json', path]
        res = subprocess.run(args, capture_output = True, encoding = 'utf-8')
        if res.returncode != 0:
            with open('fix_no_datestamp_error.csv', 'a') as oh:
                oh.write(f'"{path}",{repr(res.stderr)},"\\"{'\\" \\"'.join(args)}\\""\n')
            raise Exception(f'Calling exiftool failed! Invocation was "{' '.join(args)}". stdout was "{res.stdout}". stderr was "{res.stderr}"')
        info = json.loads(res.stdout)
        if isinstance(info, list):
            info = info[0]
        timestamp = None
        for tag in datetime_tags:
            if tag in info and info[tag].strip() != '':
                break
        else:
            # No well known timestamp tag present, so deduce one
            if 'ModifyDate' in info and info['ModifyDate'].strip() != '':
                # Some of the older photos have ModifyDate and nothing else.
                timestamp = datetime.strptime(info['ModifyDate'],
                                              '%Y:%m:%d %H:%M:%S')
                print(f"{path} has legacy timestamp {info['ModifyDate']} = {timestamp}")
            else:
                pathleaf = os.path.basename(path)
                if 'Comment' in info and 'Nokia7650' in info['Comment']:
                    # "Nokia Mobile Phones Ltd.\nNokia7650\n17/12/2004\n15:20:27\nMode=1\n 3.16\n1.2"
                    res = nokia_timestamp.match(repr(info['Comment']))
                    if not res:
                        raise Exception(f'Comment {repr(info["Comment"])} did not parse')
                    timestamp = datetime.strptime(f'{res.group(1)} {res.group(2)}',
                                                  '%d/%m/%Y %H:%M:%S')
                    print(f"{path} has Nokia7650 timestamp {res.group(1)} {res.group(2)} = {timestamp}")
                elif res := unix_timestamp.match(pathleaf):
                    timestamp = datetime.fromtimestamp(int(res.group(1)))
                    print(f"{path} matches Unix timestamp {res.group(1)} = {timestamp}")
                elif res := iso_timestamp.match(pathleaf):
                    # 1. YYYYMMDD_HHMMSS
                    timestamp = datetime.strptime(res.group(1),
                                                  '%Y%m%d_%H%M%S')
                    print(f"{path} matches ISO timestamp {res.group(1)} = {timestamp}")
                else:
                    if 'Comment' in info:
                        print(f"{path} has comment '{info['Comment']}'")
            if timestamp is None:
                path2 = os.path.abspath(path).replace('/mnt/IcyBoxZ/IcyPictures1/', '/mnt/upool/IcyBoxZ/IcyPictures1/')
                if os.path.exists(path2):
                    s = os.stat(path2)
                    timestamp = datetime.fromtimestamp(s.st_mtime)
                    print(f"{path} taking last modified from upool = {timestamp}")
            if timestamp is not None:
                timestampstr = timestamp.strftime('%Y:%m:%d %H:%M:%S')
                args = ['exiftool', '-json', '-overwrite_original', '-ignoreMinorErrors',
                        f'-DateTimeOriginal={timestampstr}']
                args.append(path)
                res = subprocess.run(args, capture_output = True, encoding = 'utf-8')
                if res.returncode != 0:
                    with open('fix_no_datestamp_error.csv', 'a') as oh:
                        oh.write(f'"{path}",{repr(res.stderr)},"\\"{'\\" \\"'.join(args)}\\""\n')
                    raise Exception(f'Calling exiftool failed! Invocation was "{' '.join(args)}". stdout was "{res.stdout}". stderr was "{res.stderr}"')
            else:
                print(f"*** {path} cannot deduce a datetime!")
    except Exception as e:
        print(f"Failed to process {path} due to {e}")

if __name__ == '__main__':
    print("Generating file list ...")
    files = [path for path in glob.glob('**/*.jpg', recursive=True)]
    files += [path for path in glob.glob('**/*.jpeg', recursive=True)]
    files += [path for path in glob.glob('**/*.JPG', recursive=True)]
    files += [path for path in glob.glob('**/*.png', recursive=True)]
    files += [path for path in glob.glob('**/*.heic', recursive=True)]
    print("Processing file list ...")
    with multiprocessing.Pool(32) as p:
        p.map(fix_no_datestamp, files)
```

This script looks for a ‘well known’ photo timestamp field in the output of
exiftool, and if it doesn’t find one then it does in this order:
- Use ModifyDate if present, which is usually there in the somewhat older but not oldest photos.
- Extract the timestamp from the custom Nokia7650 metadata field for photos taken with that device (which are amongst the oldest; back then EXIF metadata had only just been introduced).
- Extract the timestamp from the filename, as many devices, even very old ones, reliably used some form of timestamp as the file name.
- If no other choice, use the last modified file timestamp from the original file before we fiddled with it. This tends to be accurate to within a month or so of when the photo was taken, which is good enough.
Finally we poke in DateTimeOriginal using exiftool – and remember, we are only doing this for photos without any well known photo timestamp in their metadata, which was about 3% of our photos.
This solves the timeline ordering problem. I did have another issue with exiftool refusing to work with files with invalid metadata, which turned out to be because the Samsung camera app generates truncated image files if you shoot a panorama. I found a script online called fix_eoi.py – plenty of other people have had the exact same problem – and it worked a treat by poking in the missing end-of-image marker. I also decided it would be better to preserve last modified file timestamps just in case they’re useful in the future, so this little script force restamped all the recently modified files back to their original last modified timestamps, which usually were written when the device last modified the file:
```python
#!/usr/bin/python3
import glob, os, multiprocessing

def restore_last_modified(path):
    try:
        path2 = os.path.abspath(path).replace('/mnt/IcyBoxZ/IcyPictures1/', '/mnt/upool/IcyBoxZ/IcyPictures1/')
        if os.path.exists(path2):
            s = os.stat(path2)
            os.utime(path, ns=(s.st_atime_ns, s.st_mtime_ns))
    except Exception as e:
        print(f"Failed to process {path} due to {e}")

if __name__ == '__main__':
    print("Generating file list ...")
    files = [path for path in glob.glob('**/*.jpg', recursive=True)]
    files += [path for path in glob.glob('**/*.jpeg', recursive=True)]
    files += [path for path in glob.glob('**/*.JPG', recursive=True)]
    files += [path for path in glob.glob('**/*.png', recursive=True)]
    files += [path for path in glob.glob('**/*.heic', recursive=True)]
    print("Processing file list ...")
    with multiprocessing.Pool(64) as p:
        p.map(restore_last_modified, files)
```

Step 3: Face recognition
Damselfly has face recognition, but after a few trial and error runs with it I did not come away impressed: it just doesn’t work well – it’s far inferior to Google Photos, or even to Picasa from more than a decade ago. So I tried other open source photo management software.
The one I ended up choosing is Immich, which has a very similar UI to Picasa – I’d call it a decent clone of that software, which Google abandoned in 2016 as they really wanted everybody to put their photos onto Google’s cloud so your photos could be used as training data. Immich is still a little rough around the edges in places as it’s quite new software at the time of writing, but its face recognition does work well, albeit very slowly on my ancient four core Haswell main server: it needs to run for about a day and a half to complete the face recognition. It will group together who it thinks are the same person: you then need to manually review those groups, tell it who those people are, and merge groups when it decides the same person is two different people (which is very common with the kids, as their faces change as they grow).
Apart from it consistently mistaking younger Julia for Clara, Immich’s face recognition does a great job even with faces at angles, or with beards or glasses. I reckon it’s as good as Picasa ever was.
Reviewing our photo collection
Now that we have this whizzy photo collection which can be searched by description and/or keyword, by person, by date, by camera model etc., it becomes trivial to list out all the devices we’ve ever used to take photos, along with the first seen and last seen dates for each device. Here they are:
2004 - 2006: DSC Image DV camcorder
This was a cheap digital camcorder I used to record the lectures I organised at St. Andrews university. It had a 4 MP sensor, but it was a lousy sensor with lots of noise and it overexposed easily. For the price the device was good at the time, and it was better than anything else I had, which is the only thing I can say in its favour:

A 4 MP image, but lousy quality
My phone at that time was the venerable Nokia 7650, which had a far worse 0.3 MP camera. I have photos from that era, but they all contain people so I can’t show them here. Trust me when I say that they were worse again than the above, and because that camera was so bad I didn’t use it much, preferring to take the digital camcorder with me if I expected to take photos.
2007 - 2008: HTC Wizard phone
This was my phone at the time, it ran Windows Mobile, had a resistive touchscreen, could run apps, had sdcard based storage, and a 1.3 MP camera. It was an absolute brick of a thing as it needed a large hump for its extended battery if it was to make a full day without recharging.

The lens on this was crap!
The sensor on this phone wasn’t terrible, but the lens optics were: that soft blur on the image was on every image, and it was unfortunate as otherwise it was a well behaved camera with low noise and good exposure control – it even did okay in low light, for its time.
The HTC Wizard was very popular in its day with tech geeks as you had actual real internet (albeit dog slow, these were 2G days before EDGE so you got less than dialup modem speeds), you could install custom apps and the expandable storage meant you could carry your music collection with you. It also slid out a proper keyboard, and using that you could bang out text messages and emails quickly. Apart from its touchscreen being annoying to use due to being resistive rather than capacitive, and the device being slow to use, it was great and of course it presaged the modern smartphone experience which would become ubiquitous only a few years later.
2007 - 2010: Kodak Z812 camera
Because all our devices sucked at taking photos, I invested in a reasonably decent digital camera in 2007, and it became our mainstay for any time we knew we would be taking pictures:
The Kodak Z812 was a stabilised 8 MP digital camera with 12x optical zoom, and as you can see above it was capable of taking some really nice photos. You may read into my choice of phrasing that there were gotchas: firstly, it sucked down enormous amounts of current when it took a photo, enough that you might get ten photos on a cold day before the batteries gave up. You thus took four sets of freshly charged batteries with you, and you had to faff around swapping them over. Secondly, it was temperamental: it might decide to auto focus out of focus just as you took the shot, and its image processing algorithms were a bit buggy, so you often got colour fringing at transitions between very bright and dark – one time it famously rendered a red stop sign in yellow for absolutely no reason. I suppose in this sense it was quite like a film based camera – you took lots of extra shots knowing a good chunk of them would turn out to be useless when you got home, EXCEPT of course that burned through your precious batteries.
All those issues aside, the 12x optical zoom was very well stabilised, and you could get some amazing long distance shots in a way not possible on any camera phone. So in that sense I miss this camera, though not how frustrating it was to use.
2011 - 2012: Samsung Galaxy S phone
The original Samsung Galaxy S had an unstabilised 5 MP camera, and it was the first camera on a phone I ever used that was actually good. It was so good that the Kodak camera fell out of use almost immediately. It didn’t take as good pictures as the Kodak, but they were very much ‘good enough’ and unlike the Kodak, this camera phone was very much point and shoot and most of the time it came out perfectly. No hassle with battery swapping either!
This was Megan’s phone – at the time I had a Meizu M9 which was a great phone in every way apart from its camera, which took 5 MP images but the quality was more like 2 MP. I replaced that phone in 2012 with a Samsung Galaxy Nexus which was based on the original Samsung Galaxy S, but it didn’t take as good photos as Megan’s phone despite being almost the same hardware. Don’t get me wrong: it took way better photos than the Meizu, but it wasn’t quite as good as her Samsung. I assumed at the time that the difference was the Samsung Camera app vs the vanilla stock camera app Google shipped in those days, and I have no reason to believe I was wrong. Complex software processing of images was now possible as phone horsepower improved. Google would of course eventually catch up with Samsung and then some, but back in 2012 they weren’t even attempting to do so yet. As a result, if we wanted good photos we used Megan’s phone, and that worked well.
2013 - 2014: LG Nexus 4 phone
Back in those days the battery in a phone only lasted two years, and that plus the rate of hardware improvement meant Megan replaced her Samsung with an LG Nexus 4, which had an 8 MP camera. Early firmwares gave all photos a heavy blue tint, however later on this phone could capture photos as good as the Samsung, just with more resolution:
Back then phone cameras didn’t do HDR processing, which captures multiple images at multiple exposures in quick succession and then combines them so detail in dark areas is retained. You can see the lack of this in that middle picture, where the dark area at the bottom is overwhelmed by the sunshine – your eyes, if you were there, would see a lot more detail in that dark area than this generation of camera and its software was able to capture.
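For illustration, the multi-exposure idea can be sketched with a generic exposure-fusion weighting (my own toy formulation, not the actual algorithm any of these phones used): each exposure’s pixel value is weighted by how well exposed it is, so shadow detail from the long exposure and highlight detail from the short exposure both survive the merge:

```python
def fuse_pixel(values):
    # Weight each exposure by closeness to mid-grey, i.e. how well exposed
    # that pixel is; clipped highlights and crushed shadows get tiny weight.
    weights = [max(1.0 - abs(v - 127.5) / 127.5, 1e-3) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# The same scene pixel captured at a long and a short exposure (0-255 scale):
shadow = fuse_pixel([8, 96])        # result leans on the well exposed 96
highlight = fuse_pixel([120, 252])  # result leans on the well exposed 120
```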
My phone during these years was the Nexus 5 which had a perfectly okay 8 MP camera on it. But it was exactly that: okay. Side by side with the Nexus 4 it took clearly inferior photos, they obviously fitted a cheaper sensor to the Nexus 5 and it showed. To be fair, they weren’t that inferior, they were acceptable and much like everything else about that Nexus 5, that whole phone was blandly fine in every way. Nothing wrong with any of it: a perfectly middle of the road design with no one thing poorly chosen. We still took the photos we valued with her phone though.
2015 - 2018: Huawei Nexus 6P phone
This was the last affordably priced Google branded phone; after this they launched the Pixel line, which were and still are hideously priced. Knowing this, for the first time ever both me and Megan got the same phone at the same time, and we eked this one out for three years instead of two as there were no good replacement options at the time, i.e. well specced phones at a reasonable price.
The Nexus 6P had an outstanding display and big battery, but they all had a manufacturing defect which meant they began to go unstable after two years, which got a lot of people angry at the time. They had a 12 MP camera using a Sony Exmor R IMX377 sensor with software HDR for capturing detail in poorly lit regions, which produced less harshly illuminated photos (you can see in the photos above a problematic lack of detail in dark regions; this no longer occurred in phones from this point onwards). However, it was prone to blur from hand shake as it had no mechanical stabilisation, and since the software HDR worked by taking multiple pictures at multiple exposures, any hand shake at all gave bad results. It could make either bland or wow pictures, depending on what Google Camera’s algorithms decided on the day:
The above, like those preceding for other phones, are amongst the best taken which don’t include people. One thing you notice when reviewing all the photos taken by this phone is that exposure was sometimes quite off, or there was an unpleasant colour saturation effect which the preceding phones didn’t have. That was almost certainly down to quirks in the camera software processing algorithms, as by now Google was investing heavily in their proprietary camera app.
All that said, when it took a good photo, the results were a step up from preceding phones. If only it were more reliable …
2018 - 2020: HTC 10 phone
Myself and Megan went our separate ways with our phones once again: she got a Samsung Galaxy S7, and I got a HTC 10 which, apart from its display, is one of the best phones I’ve ever owned. One reason why is that it took reliably good photos using its stabilised Sony Exmor R IMX377 12 MP camera sensor, which is better known for use in high end drones and sports action cameras than in phones (the Nexus 6P used the exact same sensor, yet produced inferior photos!). This phone also sounded amazing with headphones on. Alas, the HTC 10 was one of the last phones HTC released before exiting the smartphone business:
In all the photos I ever took with this camera phone, there is not one single bad photo. I took photos in snow, in sun, in the dark, even within a dark aquarium, and every single one came out beautifully with the right focus, exposure, and colour balance; no blur, no misfocus. This was the perfect ‘take a photo only ever once’ camera. The photos themselves were always top notch, with plenty of detail zoomed right in, almost never any compression artefacts nor evidence of sharpening nor softening, nor any loss of that maximum 12 MP detail. I ran LineageOS on this phone, so all this came out of the standard Camera2 Android API. I believe I used Open Camera, and it all ‘just worked’ on the standard setting with no special configuration at all. I suppose if I could complain about something, it was that the sensor output 12 bit HDR and from its RAW files you could generate some lovely wide gamut HDR photos, but this was long before Android supported Ultra HDR, so all that extra detail was wasted by being rendered down to SDR.
The only downsides to this phone were (a) the display was LCD, and by this point most premium phones had moved to OLED and the difference was very noticeable and (b) its loudspeaker was slightly too quiet for me to listen to the radio in the shower. Other than those, this was basically the perfect phone, and I was sorry to move on from it as the typical short battery life for phones caused it to start running out of power within a day after two years.
In case you were wondering about Megan’s Galaxy S7, it did not take good pictures. Its camera was famously lousy, a big step down from preceding models, with weird artefacts appearing in daylight photos in a way not seen in a premium phone in many years by that point. Here are some examples:

When the Galaxy S7 camera was behaving itself, this is as good as it got; note the flat image and the noise in the trees. This phone took worse pictures than the original Galaxy S!

An example of just how bad the Galaxy S7 camera could be
I have no idea why Samsung shipped such an awful camera for the S7 phone, but they fixed it quickly in subsequent models. Which brings me to …
2020 - 2025: Samsung Galaxy S10 phone
Myself and Megan returned to using the same phone, and we both used this phone for five years, which had never happened before. In part this was because no good replacement turned up on the market until 2025, which I’ve written about on here before; in part it was because they’d fixed the battery chemistry, and I’m still using my S10 daily as a radio because its battery life still well exceeds a day in its sixth year of constant use; but in part it is also because this is probably the best bang for the buck phone Samsung ever made, and they’ve not made anything like as good value for money since: a very well balanced all round phone for (by today’s standards) a ludicrously low price of under €500 all in.
We picked up these phones new and unused in clearance for under a grand combined. They were our first multi-camera phones, so they came with a 16 MP ultra wide, a 12 MP main camera, and a 12 MP telephoto camera. They have an absolutely stellar OLED display which even my current Pixel 9 Pro only matches, but does not beat. They are one of the best phones I’ve ever owned.
The Samsung Camera app has oodles of features, and it prefers to save HEIC files rather than JPEG. The image processing pipeline for this website is written in Python, and therefore can’t cope with HEIC, so I converted these in GIMP to JPEG with the highest fidelity possible to show here:
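If you’d rather not round trip through GIMP, an ImageMagick build with the HEIC delegate can do the same conversion from the command line (a sketch; the filename is illustrative, and your magick build needs libheif support compiled in):

```shell
# Convert a HEIC capture to a maximum fidelity JPEG; requires an
# ImageMagick build with the HEIC (libheif) delegate
magick photo.heic -quality 98 photo.jpg
```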
These five pictures show the S10’s camera accurately: 40% of them were excellent, 40% of them had issues, and the remaining 20% were in between. There appeared to be no obvious pattern to when photos would have issues or not e.g. being outside, or being in dark places. It was almost as if the Samsung Camera software tossed a coin and randomly chose whether it was going to make a great image or not – much like the S7, except that the S10’s worst picture beats the S7’s best picture.
When photos had issues, they were generally oversharpened, oversmoothed and overcompressed, which could either cause, or be the result of, less than the full 12 MP of detail being captured. The first photo above is a good example: the mountain in the background looks blotchy at 100% size, the waves in the lake look oversharpened, and the whole thing looks overcompressed. The last photo is another good example: there is significant noise throughout the image, especially in the bookshelves, and the brightness looks wrong in the bookshelves near the window due to the HDR processing making a mistake combining exposures. Yet the photo of the holy well is pretty much perfect, and the one of the bike is very close to perfect, showing only some overcompression artefacts. The exact same camera and software took all those photos!
The 12 MP main camera sensor is a stabilised Samsung ISOCELL S5K2L4 with 10 bit output and a Bayer pixel layout. It was probably one of the last Bayer pixel layout camera sensors fitted to a premium phone – indeed, the ultra wide camera is an unstabilised 16 MP Samsung ISOCELL S5K3P9 with 8 bit output and a Quad Bayer pixel layout, which is now the standard pixel layout in premium phones. The Quad Bayer sensor improves on Bayer sensors by allowing long and short exposure shots to be captured simultaneously, which lets you boost the detail in dark parts of your image without the problems of compensating for motion between the multiple exposure shots earlier cameras required. This advantage comes with tradeoffs, principally substantially increased processing complexity, but also that bright parts of the image get a lot more resolution than dark parts. Generally, Quad Bayer sensors quarter their resolution, so a 16 MP sensor would output a 4 MP image; however, I can confirm that this definitely isn’t the case with this camera. So maybe it was 64 MP underneath, but only output 16 MP? We’ll come back to this below.
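To make the ‘quartering’ concrete, here is a toy sketch (pure Python, entirely illustrative; real sensors bin in analogue hardware and demosaic afterwards) of why binning each 2×2 group of same-coloured photosites turns a 16 MP Quad Bayer readout into a 4 MP image:

```python
def bin_2x2(sensor):
    """Average each 2x2 block of photosites into one output pixel,
    halving each dimension, i.e. quartering the pixel count."""
    h, w = len(sensor), len(sensor[0])
    return [
        [
            (sensor[y][x] + sensor[y][x + 1]
             + sensor[y + 1][x] + sensor[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

# A 16-photosite 'sensor' bins down to a 4-pixel image
raw = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
print(bin_2x2(raw))  # [[1.0, 3.0], [5.0, 7.0]]
```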
Like the HTC 10, the 10 bit output of the main sensor meant that this camera could capture in wide gamut HDR, and indeed the S10 could record video in Display P3 HDR just fine. But I suspect that the ten bits of resolution by the sensor made for a not particularly colour accurate HDR image (you really need 12 bits or more), so Samsung never supported taking photos in HDR despite that the HEIC image format natively supports it.
2026 - 2031?: Google Pixel 9 (Pro) phone
Obviously we’ve not had these phones long enough to have a collection of photos taken with them to survey, so I’ve taken a few test photos below and I’ll discuss those instead. This means this section will be speculative rather than a review of many photos taken over many months in many conditions i.e. I have no idea yet how reliable these camera phones will be, what gotchas they might have etc. when actually used in the field.
It was with some reluctance that we had to bite the bullet last year and spend more than we have ever done on a phone: I got the 9 Pro, and Megan got the 9 standard, and between them we spent about €1,750. I’ve talked about that too on here before, so I won’t repeat myself now: the era of cheap premium phones is behind us.
The standard and Pro editions share the main camera sensor which is a stabilised 50 MP Samsung ISOCELL S5KGNK sensor with up to 14 bit output, and a dual quad Bayer pixel array. They also share the ultrawide camera sensor, which is an unstabilised 48 MP Sony IMX858 with 12 bit output and a Quad Bayer pixel array. On the Pro only, there is an additional telephoto camera with fixed 5x zoom, it is also a Sony IMX858, but stabilised.
As mentioned above, the quad Bayer sensor usually outputs one quarter the claimed resolution, so your 50 MP or 48 MP sensor will output 12.5 MP or 12 MP. Megan’s standard 9 does exactly that, however my 9 Pro has the option to record a 50 MP image. On a bright day you will get somewhere between 12.5 and 50 MP of resolution, though the image output will be 50 MP. We shall look at examples of that below.
We are running GrapheneOS on these devices, so I have no idea if the same thing occurs on stock Android, however the image presented to the Camera2 Android API is much inferior to what the official Pixel Camera app appears to obtain: you are always capped at 12.5 MP, and the image is way too contrasty. This means that Open Camera really can’t compete with the official Pixel Camera app:
The image on the left is a 12.5 MP capture by the official Pixel Camera app, the middle is Open Camera with DRO enabled (i.e. have the Quad Bayer array capture high and low exposures and combine them), and the right is Open Camera on its standard setting, which is supposed to let the system ‘do the right default thing’. I can tell you from using my eyes that the left image matches reality the best, the middle the next best, and the right has so much contrast that you can’t see what’s under the table as well as you can with your eyes in reality. This, effectively, means Camera2 captured images look more like those of the LG Nexus 4 in terms of dark areas being washed out by bright areas.
Open Camera has its own HDR mode based on taking multiple pictures which does reveal detail under the table nicely. Unfortunately this cannot output Ultra HDR images, and so looks flat and inferior because it’s SDR.
To see whether the Sony sensor might fare better than the Samsung sensor, here are the ultra wides compared:
So exact same problem, and we are thus undoubtedly stuck with the official Pixel Camera app for better or worse.
The reason I looked into alternative camera apps is because the official Pixel Camera app uses a low JPEG quality setting which causes overcompression and introduces artefacts into the stored images. I estimate it uses a quality setting between 88 and 90, and you can tell when you zoom in. In comparison, the HTC 10 saved its images at 98 quality, and that made a big difference when zoomed in.
To show you this artefacting here, I had a bit of a problem: how does one render Ultra HDR JPEGs losslessly into an editable HDR format so I can crop out and zoom the part I want to show you? I never solved that back when I implemented an Ultra HDR JPEG processing pipeline for this website as I didn’t need to: ImageMagick’s libultrahdr integration is pretty dumb, so if you extract out TIFFs for the SDR and HDR images, resize those, then recombine them, it ‘just works’. Anyway, writing this diary entry forced me to investigate how to turn an Ultra HDR JPEG into something GIMP can edit which preserves the HDR, and after many hours of trial and error I came up with this:
magick -define quantum:format=floating-point \
-define uhdr:output-color-transfer=pq \
uhdr:PXL_20260224_182126059.jpg \
-profile Display-P3-D65-PQ-Display-Full.icc \
-profile Display-P3-D65-PQ-Display-Full.icc \
-define avif:lossless=true \
-quality 100 \
PXL_20260224_182126059.avif
What we’re doing here is having libultrahdr emit the HDR image using a PQ curve, then telling magick that the input colour profile is Display P3 in PQ, as is the output colour profile. We then tell magick to produce a nearly lossless AVIF file, and a very close reproduction of the original Ultra HDR JPEG will be emitted. It’s not quite right, it’s a touch too bright, because the Ultra HDR JPEGs emitted by the Pixel Camera app use the HLG curve, not the PQ curve. However I couldn’t find an ICC profile online for Display P3 in HLG, and it really is close enough that it’ll do.
Having done all those hours of work, I then found that the ‘lossless’ AVIF file emitted actually smooths away the exact JPEG artefacts I wanted to show you! Somebody clearly implemented a JPEG artefact removal filter somewhere in the AVIF encoder. Sigh. So here’s a screenshot of the 12.5 MP Ultra HDR JPEG side by side with the 50 MP Ultra HDR JPEG taken using the 50 MP Samsung sensor:
You can clearly see the artefacts of overcompression, with chroma artefacts and blockiness appearing in the white text on the black box. This is a shame, as it would be absolutely avoidable if Google let you set what JPEG quality to use in their camera app. You could tell their app to emit RAW files, but as you’ll find on the internet, the RAW files generated by the Pixel Camera app are heavily processed and not much better than the JPEGs it generates. So we are back to square one, unfortunately.
Apparently the Camera2 API RAW mode does emit genuine unprocessed RAW files for the 12.5 MP images returned from the quad Bayer sensor (I assume with HDR processing already applied), but it has all the issues I listed above about the Open Camera captures. You could have Open Camera capture lots of RAW files in a quick burst, then later merge them into a superior quality image. But now we’re getting into ‘I’m not that bothered about this’ territory.
Here is the Sony ultrawide sensor:
Unlike the Samsung main sensor, where the 50 MP image clearly has a bit more detail than the 12.5 MP image, for the Sony ultrawide sensor it looks to me like the 48 MP image is a cubic upscale of the 12 MP image i.e. no added detail. The JPEG was clearly saved with a higher quality setting, so the zoomed in picture looks noticeably better. Therefore, enabling 50 MP mode on the Pixel camera makes no sense except for the main sensor, and I wonder if the dual quad Bayer layout means that the Samsung sensor somehow captures more detail than a normal quad Bayer layout? Apparently the ‘dual’ here means you can simultaneously apply two gain levels to each pixel read, for example 10x and 1000x. Then, if the former says that the latter will be within range, the latter can be used for more low light precision. That, in turn, means the quad Bayer array can be configured more for resolution than dynamic range, resulting in it capturing more resolution than a standard quad Bayer layout. Looking at the shots above, I’d estimate a ~50% resolution uplift; it’s not nothing.
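My understanding of that dual gain readout, sketched as a toy model (entirely illustrative; real sensors do this in analogue hardware during readout, and the gain values and 10-bit clip point here are invented for the example):

```python
FULL_SCALE = 1023  # 10-bit ADC clip point (illustrative)

def dual_gain_read(photosite_signal, low_gain=1.0, high_gain=16.0):
    """Read the same photosite at two gains simultaneously; prefer the
    high-gain (finer shadow precision) reading unless it would clip."""
    low = min(photosite_signal * low_gain, FULL_SCALE)
    high = min(photosite_signal * high_gain, FULL_SCALE)
    if high < FULL_SCALE:
        # High gain did not clip: use it for better low light precision
        return high / high_gain
    return low / low_gain  # bright pixel: fall back to the low gain read

print(dual_gain_read(10))   # dark pixel takes the high gain path
print(dual_gain_read(900))  # bright pixel takes the low gain path
```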
I suspect that years from now, when I look over the collection of photos taken by our Google Pixel 9 phones, I shall like the HDR capture very much, but lament what could have been with the Pixel Camera app. Yes, the 12 MP pictures it takes ARE better than those the Samsung Galaxy S10 sometimes took, but NO they are not better than those the HTC 10 always took. Its 50 MP pictures are a touch more detailed, so I guess that’ll be a win at least for my phone for the main camera.
For the ultrawide, when comparing the photos, I’m not actually sure if the Pixel 9 has a better ultrawide than the S10 did. Neither has image stabilisation, and the real world resolution from the Pixel 9’s ultrawide is about 12 MP, whereas the S10’s ultrawide really did deliver 16 MP. Oh well, I guess we move forwards and then we move backwards with camera phones.
What is annoying is that the Pixel 9 has the hardware, but the Pixel Camera app kinda sucks. Despite having a 14 bit capable sensor which should produce some lovely wide gamut images, the Pixel Camera app saves a quarter resolution, luminance only gain map rather than a full colour, full resolution gain map. Furthermore, it saves the SDR image at about 90% quality, and the luminance only gain map at about 80% quality. Unsurprisingly, you get quite a lot of JPEG overcompression artefacts, because an artefact in either image appears in the combined image.
Ultra HDR JPEG can store a full colour, full resolution gain map. If you do so, you get a sixteen bit full HDR image of professional editing workflow quality. If we had the option to have Pixel Camera write those out, with knobs to twiddle for the quality of both the SDR image and the gain map image, it would be a far better camera app.
Here’s hoping one of Google’s AIs reads this and recommends to its developers to go ahead and implement that already into their Pixel Camera app.
What’s next?
There are about ten days remaining until the next WG14 standards meeting in early-mid March. I think I’m sick and tired of drawing wiring into my 3d house model, so although I’ll almost certainly regret it later when I’m wiring my house, I just don’t want to work on that any more. What I’ve done is reasonably complete; yes, it could be better, but I have drawn something for every part which needed wiring. It’s only a lack of detail remaining, and I’ve run out of juice to do more.
My house builder has supplied a set of mark out points for the surveyor, so we need to merge those with the service popup mark out points and get that over to the surveyor so we can get into his work queue. We then wait however long he will take to get to us, then we need to wait for however long the groundworks people will take to get to us, and only then can the builder turn up and install some foundations. I reckon therefore that months remain before my house gets built.
In terms of more chores to complete, I should finish clearing the issue backlog for pcpp, my pure Python C99 preprocessor. It’s a bit tedious to work upon, but it gets me practice with agentic AI tooling, so it’s good for me. And I could do with shipping a new release, it’s been too long since the last one in 2021. Another chore I must do is to cut the service duct holes into one of the services boxes at the site; it’s just circular saw work, nothing hard, and all I need is a day without too much rain. I have some other minor bits of maintenance to do on my open source libraries, and around the house and site. I’m sure I’ll be kept busy.
End of March the kids go on Easter break, so I’ll need to childcare them until we head off to England for the second week of their Easter break. We shall be visiting some of their godparents, and one of my cousins. Should be fun! Mid-April Megan goes to Spain for a long weekend, I may speak at a technical conference in June if my talk proposal is accepted, but otherwise the next six months look pretty empty. Megan has her final accountancy exams throughout the summer, so even if I weren’t unemployed without income this was always going to be a quiet year.
Word count: 2534. Estimated reading time: 12 minutes.
- Summary:
- The site’s fibre broadband connection was switched from Eir to Digiweb, resulting in a change in ping times and throughput. Packet loss and latency spikes were observed during peak residential traffic hours in the evening with Digiweb, but were steady and flat outside of this period. The new connection is shared between multiple customers, with a contention ratio of 32:1 for residential connections.
Friday 13 February 2026: 11:46.
Firstly, can you believe it’s been two years since fibre broadband was installed into the site? It was in January 2024, a few months after fibre to the home (FTTH) was installed into the site’s village and became available. Before that, I had for the preceding six months run Starlink satellite broadband, which was not terrible, but it was a power hog (130 W!) and it had a constant, though very low, rate of packet loss due to satellites moving around. Starlink is very definitely better than 4g based internet, probably better than vDSL, but not as good as a fibre connection, which is very hard to beat if the underlying backhaul is up to snuff.
Because FTTH had only just been made available, at the time only Eir could see that the property had the possibility of fibre installation. Also, at that time, fibre broadband was expensive across the board, and a once off installation fee of €150-200 was common. If I went with an expensive Eir 1 Gbps business connection with a 24 month contract, they would do the first time installation for free, and at that time the monthly fee of €78.62 inc VAT was not terrible (Starlink’s was €61.50 inc VAT excluding the dish, and I remember most fibre broadband back then was around the €60-65 per month mark).
Obviously, where they catch you is that (a) 24 months of paying €10-15 more per month easily pays for the ‘free’ installation and (b) prices for fibre broadband were surely going to drop over those two years, but you’d be locked into the 24 month contract. And that’s exactly how things played out: ten months ago I had fibre broadband installed into my rented home with free first time installation and a monthly fee of just €35 inc VAT on a twelve month contract. That was a 500 Mbps connection, and the new contract at the site is for a 1 Gbps connection for a mere €30 inc VAT monthly on a 12 month contract. Going from €78.62 to €30 per month for 1 Gbps is a better than 2.5x improvement in bandwidth per euro in just two years!
That of course makes you wonder about the quality of the network, as that would be the obvious place to economise. The Layer 3 backhaul for both the site and my rented home is OpenEir, however each ISP runs its own layer on top of that. Some providers (Eir, Sky) use straight DHCP like a LAN, but most appear to use PPPoE, which is unfortunate as it is inferior, and as far as I can tell there is no good reason to continue using it in modern systems, especially as the username and password are identical for all customers in an ISP. I assume it’s a legacy systems thing, a left over from ADSL days, perhaps kept so their billing and management systems don’t need upgrading.
As noted when I installed the fibre broadband into my rented house, there are random bursts of packet loss and ping time spikes for the rented home fibre broadband connection. I don’t know if that’s the ISP (Pure Telecom) which uses BTIreland for Layer 3 backhaul, or the G.hn powerline network between the Fibre ONT and my outermost router, but in any case it persisted over most of this year only suddenly getting better from December onwards, and it now looks like this:

The past month of packet loss and latency spikes with the rented home fibre broadband
This is actually much better than it was for the majority of last year – there was far more ping time noise and it meant constant spikes in standard deviation while the connection was idle. Since December, that noise is so reduced it doesn’t show up in standard deviation even when the connection is downloading something at maximum speed, and I haven’t changed anything in the house so I assume Pure Telecom/BTIreland fixed something.
Obviously I only have a few hours of Digiweb ping times to look at, however so far I’d say they look a little noisier than Pure Telecom’s during the peak residential traffic hours in the evening, but outside that they’re pretty much a steady, flat 8-9 ms. There hasn’t been enough time to see if any ping requests get dropped.

The past day of packet latency with the new site fibre broadband
Eir also had flat as a pancake ping times (8.8 to 9.2 ms with a very occasional spike to … wait for it … 10.1 ms! every two months or so), but unlike Digiweb that was the case all day long, every day, with no sensitivity to evenings. However, the Eir package was a business connection whose traffic gets priority over residential traffic, so it’s not surprising that ping times would be so consistent when you’re not competing with much other traffic at all.
Anyway, to the benchmarking! Here are the round trip times for each of those ISPs to various locations around the world, and remember lower is better for this graph:
As empirically tested in the article about the G.hn powerline adapters, they have a configuration option which lets you choose between power conserving and performance. I have mine on power conserving, so they go to sleep in between ping packets and thus add ~18 milliseconds to ping times. In fact, if you can get the traffic rate up a bit, they won’t go to sleep and ping times drop dramatically, so the above graph looks worse than it would if you were maxing out a download. Where the G.hn powerline adapters particularly impact things is throughput, which is basically capped to ~100 Mbps per connection, so you’ll need to use multiple connections to max out the speed. As all the locations would see 85 - 100 Mbps in this benchmark no matter where in the world, I left off the Pure Telecom results for this graph comparing single connection throughput to the same locations around the world:
This is with the default Linux TCP receive window of 3 MB, which I used as most people don’t think of fiddling with that setting on their edge routers. As you can see, Eir and Digiweb are very similar at distance: Digiweb is a good bit worse to London and Czechia, better to Paris, and about equal to Amsterdam. This exactly matches the RTT ping time difference above, so these are exactly the results you would expect given those ping times.
So why are the ping times so different? Eir peers with Twelve99 in Dublin, it routes via AS1273 Vodafone/Cable & Wireless straight into Central Europe, and it is therefore close to Czechia and a bit further away from London and Paris. Both Pure Telecom via BTIreland and Digiweb via their connectivity provider Zayo route to London, and then via Paris to the final destination. Eir routes US traffic using Hurricane Electric via Amsterdam, whereas both BTIreland and Zayo route to the US via London. Interestingly, Eir is slightly faster to reach Los Angeles despite Amsterdam being further away spatially.
In short, routing data the cheapest way is not the fastest way, and packets can take longer than optimum journeys over space to get to their destinations. We can thus conclude:
- As all fibre broadband in Ireland apart from Eir always goes to INEX Dublin, it is always min 10 milliseconds to get anywhere.
- As all traffic apart from Eir leaving Ireland always goes to London first, it is always min 18 milliseconds to get anywhere outside Ireland.
- As all traffic reaching continental Europe takes at least 25 milliseconds to get there thanks to all the switching and distance, you’re already on a relatively high latency connection by definition (in case you were interested, internet traffic runs at 55-65% the speed of light between Ireland and Europe/US, with the maximum possible speed in fibre optics being 68% the speed of light). Continental Europe, in terms of internet cables, is a minimum 1,200 km away in the best case. Light within glass takes what it does to traverse that distance (about 17 ms).
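Those propagation figures are easy to sanity check; a sketch using the text’s own numbers (a 1,200 km best-case route, with 68% of c being the maximum possible speed in fibre):

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s

def fibre_rtt_ms(route_km, fraction_of_c=0.68):
    """Round trip time over a fibre route at a given fraction of c,
    ignoring all switching and queueing delays."""
    one_way_s = route_km / (C_KM_PER_S * fraction_of_c)
    return 2 * one_way_s * 1000

# 1,200 km each way at 68% of c is ~11.8 ms of pure glass round trip
# time in the very best case; the slower 55-65% real-world routes plus
# switching account for the remainder of the observed latencies
print(round(fibre_rtt_ms(1200), 1))
```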
The reason I’m raising minimum latencies to get anywhere is because the default maximum TCP receive window of 3 MB in Linux creates the following theoretical relationship of throughput to latency:
In other words, to achieve 1 Gbps in a single connection with a 3 MB TCP receive window, your RTT ping latency cannot exceed about 17 ms. Or, put another way, the only way you’ll see your full 1 Gbps per single connection is if you exclusively connect to servers in Ireland or Britain.
As the graph above suggests, increasing your TCP receive window to 8 MB increases the maximum RTT ping latency for 1 Gbps per single connection to 45 ms, which is enough to cover most of continental Europe. In case you’re thinking ‘why not increase it still further?’, you’ll find that the server will also have its own maximum send window, and a very common maximum is 8 MB at the time of writing. Increasing your receive window past the sender’s window does not result in a performance gain, and the larger your receive window, the more latency spikes you’ll see because the Linux kernel has to copy more memory around during its garbage collection cycles. So you can actually start to lose performance with even larger windows, especially on the relatively slow ARM Cortex A53 in-order CPUs typical of router hardware.
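This relationship is just the bandwidth-delay product: a TCP sender can have at most one receive window of unacknowledged data in flight per round trip, so throughput is bounded by window / RTT. A sketch of the ideal bound (illustrative; the 17 ms and 45 ms figures quoted above are tighter than this ideal because not all of the configured buffer is usable as window):

```python
def max_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on single-connection TCP throughput: at most one
    full receive window can be in flight per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000

# A 3 MB window caps out around 1 Gbps at ~24 ms RTT in the ideal
# case; an 8 MB window sustains 1 Gbps to much longer round trips
print(round(max_throughput_mbps(3_000_000, 24)))  # 1000
print(round(max_throughput_mbps(8_000_000, 45)))  # 1422
```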
Thankfully Linux makes increasing the TCP receive window to 8 MB ludicrously easy. Just add this to /etc/sysctl.conf:
net.ipv4.tcp_rmem = 4096 131072 16777216
This will work on any recent Linux including OpenWRT, and you almost certainly should configure your edge router this way if you have sufficient RAM for it to make sense. Linux will dynamically allocate up to 16 MB of RAM per connection for the TCP receive buffer, of which up to 50% forms the TCP receive window. Recent Linuxes automatically scale the window size and the memory consumed per individual connection, so you don’t have to do more to see a 2x to 3x throughput gain from a single line change. In case you’re wondering what happens if there are thousands of connections all consuming 16 MB of RAM each on a device with no swap, you can relax, as Linux will clip the maximum RAM per connection automatically if free RAM gets tight. Equally, this means that changing this parameter will only have an effect on router hardware with plenty of free RAM. Still, you can set this and nothing is going to blow up; it’ll just enter a slow path under load on RAM constrained devices.
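To apply the change without a reboot and confirm it took effect (run as root):

```shell
# Reload /etc/sysctl.conf, then print the live value
sysctl -p
sysctl net.ipv4.tcp_rmem
```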
1 Gbps broadband appears to be the price floor as of this year in Ireland – the 500 Mbps service is barely cheaper, if it is cheaper at all (for the site, when I ordered the Digiweb package their 500 Mbps and 1 Gbps packages were identically priced under a ‘New Year special offer’), and from my testing above it would seem that at least both Eir and Digiweb are providing a genuine true 1 Gbps downstream from the public internet, albeit obviously shared between however many residential customers at a time. The next obvious step for next year’s competitive landscape is a new price floor of 2 Gbps for residential fibre broadband where it doesn’t cost much more than 1 Gbps. OpenEir was built with up to 5 Gbps per residential user in mind; after that, things would get a bit tricky technically speaking. But, to be honest, I find 2.5 Gbps ethernet LAN more than plenty, and my planned fibre backhaul for my house is all 2.5 Gbps based, principally because (a) it’s cheap, (b) it’s low power and (c) again, genuinely, do you really ever need more than 2.5 Gbps except in the very occasional case of copying a whole laptop drive to backup?
The Bluray specification maxes out at 144 Mbps, though little content ever reaches that – a typical 4k Ultra HD video with eight audio channels runs at about 100 Mbps. High end 4k video off the internet uses more modern compression codecs, and typically peaks at 50 Mbps. You could handily run twenty maximum quality Bluray video streams, or forty maximum quality Netflix video streams, on a 2 Gbps broadband connection. As most households would probably never run more than four or five of those concurrently (and usually far fewer), I suspect the residential market will mainly care about a guaranteed minimum 100 Mbps during peak evening hours rather than maximum performance in off peak hours.
That brings us back to contention, and how densely backhaul is shared across residential homes. Back in vDSL days, I paid extra for a business connection into my rented house because vDSL broadband became noticeably sucky each evening, so by paying extra for my traffic to be prioritised over everybody else’s I had good quality internet all day long. Fibre to the cabinet (FTTC), which is what vDSL typically was, had 48:1 contention ratios for residential connections, but 20:1 plus a priority traffic queue for business connections. I had assumed that fibre broadband would have a similarly sucky experience in the evenings, but so far it’s been fine with Pure Telecom in my rented house. Time will tell for Digiweb at the site.
OpenEir uses a contention ratio of 32:1 for residential connections, but that’s of a 10 Gbps link, so you always get a guaranteed minimum of 312 Mbps per connection. As noted above, due to the G.hn powerline link after the ONT in my rented house we are capped to about that in any case, so it’s unsurprising I haven’t noticed any performance loss in the evenings. 312 Mbps is of course plenty for several concurrent 4k Netflix video streams, so I suspect that so long as streaming video never stutters, 99.9% of fibre broadband users will be happy.
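The contention floor works out like this (a sketch, ignoring backhaul framing overheads):

```python
# Guaranteed minimum per subscriber on a 32:1 contended 10 Gbps link.
link_mbps = 10_000
contention = 32

floor_mbps = link_mbps / contention
print(floor_mbps)  # prints: 312.5 -- enough for ~6 concurrent 50 Mbps 4k streams
```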
In fairness to governments, though it took them twenty years, they do appear to have finally solved ‘quality residential internet’ without any major caveats. I remember paying through the nose for cable based internet in Madrid back around the year 2000. It was the fastest package they had at 1 Mbps, and you usually got about 75 KB/sec downloads off it. Back then hard drives were small, so you basically had it downloading 24/7 and you wrote content out to DVDs – I remember hauling a very heavy backpack stuffed with DVDs through the airport when I emigrated back to Ireland. A different era!
Word count: 1621. Estimated reading time: 8 minutes.
- Summary:
- The solar panel mounting kit was purchased from VEVOR for €55 inc VAT delivered each, and it consisted of two aluminium brackets made from 6005-T5 aluminium alloy, with a length of 1.27 metres, depth of 5 cm, and width of 3 cm. These brackets were stronger than expected and could withstand a load of up to 6885 Newtons before buckling.
Tuesday 10 February 2026: 10:37.
- Summary:
- The solar panel mounting kit was purchased from VEVOR for €55 inc VAT delivered each, and it consisted of two aluminium brackets made from 6005-T5 aluminium alloy, with a length of 1.27 metres, depth of 5 cm, and width of 3 cm. These brackets were stronger than expected and could withstand a load of up to 6885 Newtons before buckling.
In the meantime, I’ve been trying to coax my architect into completing the Passive House certification work, which had been left to languish these past two years: until the builder and engineer had signed off on a fully finalised design, there was no point doing the individual thermal bridge calculations, as some detail might change. So all that had gone on hiatus until just before this Christmas just passed. My architect feels about thermal bridge calculations the same way as I feel about routing wires around my 3D house design, i.e. we’d rather do almost anything else, but we all have our crosses to bear, and when you’re this close to the finish line you just need to keep up the endurance and get yourself over that line. It undoubtedly sucks though.
I completed a small but important todo item this week, which was to complete the roof tile lifting arm + electric hoist solution shown in the last post by creating a suitable lifting surface. This is simply a mini pallet with the wood from an old garden bench (whose metal sides had rusted through) screwed into it. The wood is a low end hardwood and must easily be a decade old now, but as it had no rot in it when I cut up the garden bench, I kept it, and it has now been recycled into usefulness – though I suspect that this use will be its last hurrah, as all those concrete tiles are going to batter the crap out of it:
As that’s hardwood, it’ll take more abuse than soft pallet wood, and I even used the rounded-edged lengths at the sides to reduce splintering when loading and unloading. I have a lifting hoist I’ll thread around and through it, and it should do very well if we keep the weight under 125 kg, which is the limit for the electric hoist in any case.
Another small but important todo item was to solve how to mount solar panels onto the wall. We have six solar panels mounted on the south wall which act as a brise soleil for the upstairs southern windows:
In the past three years I had not found an affordable and acceptable solution for mounting those panels, because I specifically did NOT want to use steel brackets, as those would produce rust stains running down the wall after a few years. I had consigned that problem to being one where I’d probably have to fabricate my own brackets by hand from something like aluminium tube, so I was delighted to stumble across an aluminium solar panel mounting kit on VEVOR a few weeks ago. For €55 inc VAT delivered, each kit gets you two of these:
Actually, the bottom cross bar is an addition I manufactured from 20x20x1.5 square aluminium tube, but I’ll get onto that in a minute. These VEVOR brackets are made from 6005-T5 aluminium alloy, are 1.27 metres long, 5 cm deep and 3 cm wide. They come with 304 stainless steel M8 bolt fasteners. I very much doubt that I could have made them for less than €28 each, and it would have taken me days to make all of them, given I spent six hours making just the bottom cross bars alone. So I have saved both time and money here, which is always delightful.
Which brings me to cross bars. The outer brackets are very strong, and being braced at at least two points along their length, I have zero concerns about them. That leaves the cross bar: it is a 2 mm thick, 570 mm long right-angle profile, 30 mm on one side and 20 mm on the other. Using Euler’s buckling load formula:
… using the appropriate values for 6005-T5 aluminium alloy, and where I, the minimum second moment of area, is the hardest part to calculate – for a right-angle section I reckon that is:
… you’d expect a maximum load before buckling of: 6885 Newtons.
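For the curious, Euler’s critical load is P_cr = π²EI/(KL)². A quick back-calculation (my sketch, assuming pinned ends K = 1 and E ≈ 69 GPa for 6005-T5, neither of which is stated above) shows what minimum second moment of area the quoted 6885 Newton figure implies:

```python
import math

E = 69e9       # Young's modulus of 6005-T5 aluminium, Pa (assumed)
L = 0.570      # crossbar length, m
P_cr = 6885.0  # quoted buckling load, N

# Euler: P_cr = pi^2 * E * I / (K*L)^2 with K = 1 (pinned-pinned, assumed),
# rearranged for I:
I_implied = P_cr * L**2 / (math.pi**2 * E)
print(f"{I_implied * 1e12:.0f} mm^4")  # prints: 3285 mm^4
```

That ~3.3 × 10³ mm⁴ is at least the right order of magnitude for a small angle profile, so the quoted figure looks plausible.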
(I checked my minimum second moment of area calculation using the much less simplified https://calcs.com/freetools/free-moment-of-inertia-calculator, and it is about right)
6885 Newtons looks plenty strong enough. Let’s check it: the solar panels have an area of about 1.8 m² and can take a wind load of up to 4000 Pa before disintegrating. We would have five brackets for four panels, so our design load needs to be 4000 Pa × 1.8 m² × 4 / 5 = 5760 Newtons. If the crossbar were at the far end, you would halve that load between the top M8 bolt and the bottom crossbar, but because we’re mounting these on a wall and not on the ground, and because we need the panels to be at a 35 degree angle, the crossbar HAS to be most of the way up the two side arms. Indeed, if you look again at the photo above, where the angle is correctly set to 55 degrees so the panels are at 35 degrees, the crossbar is about one third from the top. This is effectively a lever, and I reckon it would amplify the force on the crossbar by about double, which would buckle it if the panel ever experienced a 1673 Pa wind gust.
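The design-load arithmetic above, spelled out as a sketch:

```python
# Per-bracket design load for the solar panel array.
wind_pa = 4000.0   # wind load at which a panel disintegrates, Pa
area_m2 = 1.8      # area of one panel, m^2
panels = 4
brackets = 5

design_load_n = wind_pa * area_m2 * panels / brackets
print(design_load_n)  # prints: 5760.0 (Newtons per bracket)

# The lever effect roughly doubles the force seen by the crossbar, so the
# effective crossbar load at the full 4000 Pa rating is about 2x this figure,
# to be compared against the 6885 N buckling limit.
```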
Here are the worst recorded wind gusts ever (with pressure calculated by P = 0.613 × V²):
- Worst in Ireland: 184 km/hr, 51 m/s = 1594 Pa
- Worst on land: 408 km/hr, 113 m/s = 7827 Pa
- Worst hurricane at sea: 406 km/hr, 113 m/s = 7827 Pa
- Worst tornado: 516 km/hr, 143 m/s = 12535 Pa
However, at an angle of 35 degrees, only about 0.57 of the horizontal wind pressure would apply to a panel, so not even 1000 Pa would ever land on a panel in the worst wind gust ever recorded in Ireland. So on that basis, that little cross bar should be more than plenty in real world conditions.
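Those pressures can be reproduced from the wind speeds (a sketch using the P = 0.613 × V² dynamic-pressure formula above):

```python
import math

def gust_pressure_pa(v_ms: float) -> float:
    """Dynamic pressure of a wind gust: P = 0.613 * V^2, with V in m/s."""
    return 0.613 * v_ms**2

# The record gusts listed above, in m/s:
for label, v in [("Ireland", 51), ("land", 113), ("sea", 113), ("tornado", 143)]:
    print(label, round(gust_pressure_pa(v)))
# prints: Ireland 1594 / land 7827 / sea 7827 / tornado 12535

# At a 35 degree tilt only ~sin(35 deg) = 0.57 of the horizontal pressure
# loads the panel face, so Ireland's worst gust lands well under 1000 Pa:
print(round(gust_pressure_pa(51) * math.sin(math.radians(35))))  # prints: 915
```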
If these were steel brackets, we’d be done, but steel is unusual in the world of materials: it has a weird fatigue endurance curve. I’m going to borrow this graph from https://en.wikipedia.org/wiki/Fatigue_limit as it’s hard to explain in words:

Fatigue endurance of steel compared to aluminium over stress cycles
Most materials are like aluminium in that as you repeatedly flex them, their strength decreases as the number of flex cycles increases. This makes sense intuitively: imagine the little tiny fibres in a rope breaking with each flex, so that over time the rope loses strength. Steel, however, greatly slows down its strength loss after a million flexes, which is one of the big reasons so much structural stuff in modern society is made from steel: almost nothing as cheap and easy to mass produce as steel has this property. Hence your cars, houses, bridges, screws, bolts, nails etc – anything which sees lots of repeated flex tends to be made from steel. Why is steel like this? It comes down to the orientation of its crystalline structure, but I’m getting well off the reservation at this point, so go look it up if you’re interested.
In any case, my brackets will be up in the wind getting repeatedly flexed, and that little cross bar would be getting flexed a lot. So while it might last five or even ten years, I had my doubts it would last until my death, and those brackets will be an absolute pain to get to once the greenhouse is up. So I decided to add a second, longer crossbar, which you saw in the photo above.
For that I purchased eight one metre lengths of raw 20x20x1.5 square tube made from 6060-T6 aluminium alloy. 6060-T6 is about half as strong as 6005-T5, but that didn’t matter for this use case, and it was cheap at €4 inc VAT per metre. I drilled out holes for M8 stainless steel bolts, and voilà: there is the bracket above, which is so strong that throwing all my body weight onto it doesn’t make it flex even slightly. No flexing at all in any way is the ideal here as it maximises lifespan, so only the very slow corrosion of the aluminium will eventually cause failure.
Out of curiosity I calculated this second crossbar’s buckling load:
A 304 stainless M8 bolt will shear at 15,079 Newtons, so the top bolt of the bracket is fine. Assuming an even distribution of the 5760 Newtons, that is 2880 Newtons on each end, a safety factor of 50% if the middle crossbar were not fitted. If I were not fitting the middle crossbar, I probably would have used 25 mm tube: because buckling strength scales with the cube of the dimension, it would be a very great deal stronger.
I will be fitting the middle cross bar, however, more to prevent any side flex of the side brackets than anything else. The far bigger long term risk here is loss of strength over time due to flex, and that middle cross bar does a fabulous job of preventing any flex anywhere at all. In any case, this bracket is now far stronger than necessary: it would take a 12 kN load, which is far above the point where the panels would fall apart. I think they’ll do just fine.
Word count: 5240. Estimated reading time: 25 minutes.
- Summary:
- The 400b? full fat Claude Sonnet 4.5 model does by far the best summary, while the Qwen3 30b model is obviously a lot more detailed than the Llama3.1 8b model but misbalances what detail to report upon and what to leave out. The Llama3.1 8b produces a short summary consisting of only what it thinks are the bare essentials, and I would personally say it’s a fairly balanced summary choosing a fair set of things to report in detail and what to omit.
Friday 30 January 2026: 19:39.
- Summary:
- The 400b? full fat Claude Sonnet 4.5 model does by far the best summary, while the Qwen3 30b model is obviously a lot more detailed than the Llama3.1 8b model but misbalances what detail to report upon and what to leave out. The Llama3.1 8b produces a short summary consisting of only what it thinks are the bare essentials, and I would personally say it’s a fairly balanced summary choosing a fair set of things to report in detail and what to omit.
Still, that’s expensive for a shell: you’d expect about €1,000-1,200 inc VAT per m² for a NZEB build. Obviously we have one third better insulation, which accounts for most of the cost difference, but some of the rest is the considerable steel employed to create that large vaulted open space. We should use 4.3 metric tonnes of the stuff, and thanks to EU carbon taxes steel is very much not cheap in Ireland.
To get to weathertight, I expect glazing will cost €80k inc VAT or so. I have no control over that cost, so it is what it is. The outer concrete block leaf I also have no control over: blocks cost at least €2 inc VAT each, and the man to lay them about the same again. The QS thinks that the exterior leaf will cost €25k, and the render going onto it €49k – I have no reason to disbelieve him, and again I have no control over those costs, so they are what they are.
Where I do have some control is the roof. The QS budgeted €46k for that. I think if we drop the spec from fibre-cement tile to the cheapest possible concrete tile and I fit the roof myself, I can get that down by €30k or so. I put together this swinging arm and electric winch for lifting up to 100 kg of tiles at a time to the roof; it clamps onto scaffolding as you can see, and should save us a lot of time and pain:

Put together for under €150 inc VAT delivered, the arm can extend to 1.2m and can lift up to 300 kg. The winch can only manage 250 kg at half speed, 125 kg at full speed.
Saving €30k on the roof doesn’t close the funding gap to reach weathertight, but it does make a big difference. It would be super great if a well paid six month contract turned up soon, but market conditions are not positive: there is a very good chance I’ll have zero income between now and when the builder leaves the site.
I have no idea where I’ll find the shortfall currently. What I do know is that next year planning permission expires, as it’ll have been five years since we got it. We need the building to be raised and present ASAP. We’ll just have to hope that the tech economy improves, for which I suspect we need the AI bubble to pop so the tech industry can deflate, reinflate, and the good work contracts reappear. I’ll be blunt and say I find it highly unlikely it can pop and recover within the time period we need, so I just don’t know. We’ll cross that bridge when we get to it.
Anyway, turning to more positive topics: as there has been forward progress on the house build, I’ve forced myself to work further on the services layout. I really hate doing services layouts, so making myself get them done is a very good use of my unemployment time. Witness the latest 3D house model with services laid out:
I use a free program for this called Sweet Home 3D, which has a ‘Wiring’ plugin that lets you route all your services. The above picture overlays all the service layers at once, which is overwhelming detail; however, each individual service, e.g. ventilation or AC phase 1, is on its own individual layer. You can thus flip on or off whatever you are currently interested in. This 3D model is made in addition to the schematic diagrams drawn using QElectroTech which I previously covered here, and both are kept in sync. The schematic diagrams are location based, so if you’re in, say, Clara’s bedroom, you can see on a single page all the services in there. You can get the same thing from the 3D model by leaving all the layers turned on and looking at a single room – and sometimes that is useful, especially to see planned routing of things – but in the end both models do a different thing and both will save a lot of time onsite when the day comes.
I am maybe 80% complete on the 3D model, whereas I am 99% complete on the schematic diagrams. I still have to route the 9v and 24v DC lines, and I suppose there will be some 12v DC in there too for the ventilation boost fans and the pumps in the showers for the wastewater heat recovery. I hope to get those done this coming week, and then the 3D model will more or less match the schematics and that’ll be another chore crossed off.
At some point the builder will produce a diagram of set out points for my surveyor, and then we can get service popups installed and the site ready for the builder to install the insulated foundations. That’s a while away yet I suspect. Watch this space!
What’s coming next?
I continue to execute my remaining A4 page of long standing chores of course, and I likely have a few more months left in those. I have an ISO WG14 standards meeting next week – none of my papers are up for discussion so it shouldn’t be too stressful, and if I’m honest I find the WG14 papers better to read than WG21 papers, so it’s easier to prep for that meeting. I have wondered why this is the case. I think it’s because I know almost all the good-idea papers won’t make it at WG21, but they’ll take years and endless revisions before they make that clear, whereas at WG14 you usually get one, no more than two, revisions of papers which won’t make it. Also, I personally think more of the WG14 papers are better written, but I suppose that’s a personal thing. In any case, C++ seems to be in real trouble lately; this tech bubble burst looks like it’s going to be especially hard on C++ relative to other programming languages, i.e. I think it’ll bounce back less than other languages in the next bubble reinflation. That’s 90% the fault of the language’s leadership – as I’ve said until I’m blue in the face, why aren’t they doubling down on the language’s strengths rather than poorly trying to compete with the strengths of other languages? – but nobody was listening, which is why I quit that standards committee last summer.
More AI
I’ve started spending a lot more time training myself in AI tooling, so much of my recent maintenance work on my open source libraries has had Qwen3 Coder helping me in places. Qwen3 Coder is a 480 billion parameter Mixture of Experts model, and Alibaba give a generous free daily allowance of their top tier model which is good for about four hours of work constantly using it per day (obviously if you are parsimonious with using it, you could eke out a day of work with the free credits). As far as I am aware, they’re the only game in town for a free-of-cost highest end model, as OpenAI, Anthropic, Microsoft, Google et al charge significant monthly sums for access to their highest end agentic AI assistants (they let you use much less powerful models free of cost, but those aren’t really worth using in my experience). Also, unlike anybody else, you can download the full fat Qwen3 Coder and run it on your own hardware – and yes, that is the full 480b model, weighing in at a hefty 960 Gb of data. As the whole model needs to be in RAM, to run that model well with a decent input context size you would probably need 1.5 Tb of RAM, which isn’t cheap: I reckon about €10k just for the RAM alone right now. So Alibaba’s free credit allowance is especially generous considering, and because you can always fall back to self-hosting the weights, they can’t rug pull you down the line once you’re locked into their ecosystem – which is a big worry for me with pretty much all the other alternatives.
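The RAM arithmetic for the full fat model is straightforward (my sketch; the extra headroom up to 1.5 Tb is for the context KV-cache and OS overhead, which the weights figure excludes):

```python
# Weight storage for the full 480b parameter model at 16 bit floating point.
params = 480e9          # Qwen3 Coder parameter count
bytes_per_weight = 2    # fp16 / bf16

weights_gb = params * bytes_per_weight / 1e9
print(weights_gb)  # prints: 960.0 -- gigabytes of weights alone, before any context
```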
I’d only used Claude Sonnet for coding before Qwen3 Coder, and that was over a year ago. Claude back then was okay, but I wasn’t sure at the time if it was worth the time spent coaxing it. Qwen3 Coder, which was only released six months ago, is much better, and at times it is genuinely useful – mainly saving me having to look something up to get syntax or config file contents right. It is less good at diagnosing bugs, apart from segfaults, where it’ll rinse and repeat fixes on its own until it finds one where the segfault disappears; and depending on the parseability of the relevant source code, it can write some pretty decent tests for that portion of code. Obviously it’s useless for niche problems it wasn’t trained upon, or bugs with no obvious solution to any human, or choosing the right strategic direction for a codebase (something many otherwise very skilled devs are also lousy at), so I don’t think agentic AI will take as many tech dev roles as some people think. But I do think the next tech bubble reinflation shall pretty much mandate the use of these tools, as without them you’ll be market uncompetitive – at least within the high end contracting world.
Having spent a lot of time with text-producing AI and only a little with video and music producing AI, I have been shoring up my skills with those too. Retail consumer hardware has been able to run image-generating AI e.g. Stable Diffusion for some time, but until very recently image-manipulation AI required enterprise level hardware if it was going to be any good. However, six months ago Alibaba released Qwen3 Image Edit, which dramatically improved what can be done on, say, an 18Gb RAM Macbook Pro like my own. This is a 20 billion parameter model, and with a 6 bit quantisation it runs slowly but gets there on my Mac, taking about twenty minutes per image edited. First, one feeds it an input image; I chose the Unreal Engine 5 screenshot from three years ago:

The original from Unreal Engine 5 (but scaled down to 1k from 4k resolution)
I then asked for various renditions, of which these were the best three:

What Qwen3 image edit AI rendered as a charcoal and pencil drawing

What Qwen3 image edit AI rendered as a watercolour

What Qwen3 image edit AI rendered as a finely hatched pencil drawing
This is simple stuff for Qwen3 image edit. It can do a lot more like infer rotation, removal of obstacles in the view (including clothes!), insertion of items, posing of characters, replacing faces or clothing, and it can add and remove text, banners, signs or indeed anything else which you might use in a marketing campaign. All that has much potential for misuse of course – if you want to edit a politician into an embarrassing scene, or a celebrity into your porn scenario of choice, there is absolutely nothing stopping you bar some easy to bypass default filters in Qwen3 image edit.
I thought I’d have it edit the original picture into one of a scene of devastation like after a nearby nuclear strike to see how good it was at being creative. This model is hard on my Macbook, each twenty minute run consumes half the battery as all the GPUs and CPUs burn away at full belt, so this isn’t a wise thing to run when you’re putting the kids to bed. Still, here’s what it came up with for this prompt:
Transform this image into a scene of devastation, with the houses mostly destroyed and partially on fire, the sky dark with smoke and burned out cars and scattered children’s toys on the grass and road.
It looks rather AI generated, but that was genuinely its very first attempt and I didn’t bother refining it to reduce the unrealistically excessive number of burned out cars, the small children it decided to add on its own, or the unphotorealistic colour palette it chose. I don’t doubt that I could have iterated all that away with some time and effort, but ultimately all I was really determining was what it could be capable of, and that’s not half bad for a first attempt for an AI which can run locally on my laptop.
Finally, up until now the best general purpose LLM I’ve found to work well on my 18Gb RAM Macbook has been the 8b Llama 3.1. It uses little enough memory that 32k token context windows don’t exhaust memory and reduce performance to a crawl; however, a potential new contender has appeared: a per-layer quantised 30b Qwen3 instruct model packed to fit inside 10 Gb of RAM. This model went viral over the tech news last month because it’ll run okay on a 16 Gb RAM Raspberry Pi 5, albeit with a max 2k context window, which isn’t as useful as it could be. Thankfully, my Macbook can do rather more, and after some trial and error I got it up to a 20k token context window, which is definitely the limit of my RAM (the 18 Gb Macbook Pro has a max 12 Gb of RAM for the GPU).
I should explain quantisation for the uninitiated: models are generally made using sixteen bit floating point weights, and for use, reducing those to eight bits halves the RAM consumption and doubles the performance for only a little loss in quality and capability. Below eight bits things get a bit dicier: four bit quantisation is a common one for retail consumer hardware, and there is some loss in the model, but it’s usually acceptable. The Mixture of Experts sparse models like those of the Qwen series offer a further option: they work by an initial model choosing which sub-models to use, so you can quantise each of those sub-models individually to pack the overall model more tightly. So your 30b model, which would normally consume 18 Gb of RAM or so at four bit quantisation, can be packed into less if some of the sub-models are quantised down to two bits, others kept at four bits, and so on. The aforementioned viral model came in a range of quantisations, and I chose the 2.75 bit average quantised model. This fits into 10 Gb of RAM, and that is how it is possible to run a 30b model on an 18 Gb RAM total computer.
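The quantisation arithmetic can be sketched as follows (assuming an average bits-per-weight figure and ignoring per-tensor metadata and runtime overheads, which is why real figures run a little higher):

```python
def model_ram_gb(params_billions: float, avg_bits_per_weight: float) -> float:
    """Approximate weight storage for a model at a given average quantisation."""
    # params * bits / 8 gives bytes; with params in billions this is directly GB.
    return params_billions * avg_bits_per_weight / 8

print(model_ram_gb(30, 16))    # prints: 60.0   -- full fp16
print(model_ram_gb(30, 4))     # prints: 15.0   -- plain 4 bit ("18 Gb or so" with overheads)
print(model_ram_gb(30, 2.75))  # prints: 10.3125 -- the per-layer quantised build, fits 10 Gb
```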
I fed both the Qwen3 30b and Llama3.1 8b models the same raw transcribed will of my ancestor Walter Irvine as I fed cloud full fat Claude Sonnet 4.5 a few posts ago, and here are their summaries of that document side by side:
| Cloud full fat Claude Sonnet 4.5 400b? 8 bit? | Local Qwen3 30b 2.75 bit | Local Llama3.1 8b 4 bit |
|---|---|---|
| (nearly instant) | 16k context: 12.54 toks/sec generation 20k context: 6.65 toks/sec (it was swapping to disc) | 16k context: 20.24 toks/sec generation 32k context: 19.47 toks/sec generation |
**Cloud full fat Claude Sonnet 4.5:**

This is a comprehensive probate document from Edinburgh dated November 17, 1825, concerning the estate of Walter Irvine, Esquire, who died on January 7, 1824, at Luddington House, Egham, Surrey. He held properties in the Island of Tobago (West Indies), Luddington House in Surrey, and Denino in Fife, Scotland.

Scottish Estate Inventory (Total: £5,258 6s 2d)

I. Rents and Arrears Due (£4,569 13s 2½d):
II. Bond from John M. Harlaw of Kirkton: £245 5s 1d plus interest to January 1824 (Total: £262 15s 1½d)
III. Bond by Road Trustees (Anstruther-Crail to St. Andrews): £150 principal plus interest (Total: £162 7s 4½d)
IV. Bond by Lundenfoot Road Trustees: £50 with compounded interest from 1821-1824 (Total: £55 1s 2d)
V. Proceeds from sale of wood: £33 10s
VI. Balance due from Walter Dickson WS (Writer to the Signet): £174 16s 11d plus interest (Total: £174 18s 11d)

Will Provisions (dated October 18, 1823)

Primary Beneficiaries:

Wife - Catherine Irvine (née Gordon):
Three Legitimate Daughters (each receiving £35,000):
Natural Daughter - Betsey (of London):
Other Significant Bequests:

Ann Stewart (sister-in-law and widow of Stewart):
Brother's Estate:
Nephews:
Nieces (Clara and Marian Irvine - daughters of Christopher William):
Jane Armstrong (formerly Jane Currie, wife of Francis Armstrong):
Mary (reputed daughter of a negro woman named Linda in Tobago):
Other Named Beneficiaries:
Continuing Annuities:
Trust Structure and Powers:

Executors and Trustees:
Powers Granted to Trustees:
Specific Provisions for Children's Trusts:
West Indies Property:

John Leith Dispute:
Plantation Management:
Scottish Property Requirements:

Election Doctrine:
Residuary Estate:

After all debts, legacies, and annuities paid, the residue divided equally among three legitimate daughters as tenants in common:
Elizabeth's share for her separate use with power to dispose by deed or will despite coverture. Probate Details:Proved: January 28, 1824, in Prerogative Court of Canterbury Initial Grant of Administration to:
Reserved Power for: John Hamilton, Christopher Irvine, and Thomas Bruce to apply later Witnesses to Will:
Codicil: Dated same day (October 18, 1823), adding £100 bequest to Susannah Ann Hamilton

Scottish Confirmation: Presented November 17, 1825, before George Carshin Jr., Commissioner of Edinburgh, with deposition by W.R.K. Douglas confirming the inventory and will details.

The document demonstrates extensive wealth with significant properties in Scotland, England, and the West Indies, careful provision for family members including an illegitimate daughter, and sophisticated trust arrangements typical of wealthy estates in the early 19th century.
**Local Qwen3 30b:**

The document presents a detailed legal inventory and probate record of the personal estate of Walter Irvine, an Esquire who died on January 7, 1824, at Luddington House in Surrey. The inventory was compiled by William Robert Keith Douglas (a Lord of the Admiralty and Executor), following Walter Irvine’s will dated October 18, 1823, which was proved in the Prerogative Court of Canterbury in 1824.

Key Details:
Major Assets & Liabilities (Personal Estate):

I. Rents Due to Deceased:
II. Debt to John Harlan:
III. Bond on Road from Auchter and Crail to St. Andrews:
IV. Lundenfoot Road Trusts:
V. Wood Rents Due:
VI. Balance due to Walter Irvine from Walter Dickson (Account Current):
Summary of Estate Value:
Testamentary Will – Key Provisions:

Walter Irvine’s will outlines complex trusts and legacies:

1. Annually to Wife Catherine Irvine:
2. Legacies to Children & Relatives:
3. Trusts & Management of Legacies:
4. Additional Bequests:
5. Power of Trustee:
Legal Verification & Probate:
Final Notes:

This document confirms Walter Irvine’s death, the validity of his will, and provides a full inventory of his personal estate in Scotland (£5,258.6.2) and English holdings. The complex trust system ensures lasting financial protection for heirs, particularly daughters, with provisions for education, marriage, and inheritance.
**Local Llama3.1 8b:**

The text appears to be a combination of two main documents: the last will and testament of Walter Irvine, dated October 18, 1823, and related probate documentation.

Last Will and Testament

The will is written in a formal style, detailing Walter Irvine's wishes for the distribution of his estate after his death. He appoints five executors to manage his estate: Catherine Irvine (his wife), William Robert Keith Douglas, Alexander Gordon, John Hamilton, and Thomas Bruce. Walter bequeaths various legacies to friends, family members, and charities. These include:
He also establishes a trust for his daughters Elizabeth Douglas, Christina Charles, Catherine Grace, and Betsy. Each daughter will receive one-third of the estate upon their marriage or death. The surviving husband will have a life interest in the legacy after his wife's death. Walter also directs that his real and personal property be held in trust for his three daughters (or any of them who survive him) to be divided equally between them as tenants in common with benefit of survivorship. He also provides for the education, maintenance, and advancement of each daughter's children.

Probate Documentation

The will was proved on January 28, 1824, in the Prerogative Court of Canterbury, where Catherine Irvine and William Robert Keith Douglas were sworn as executors. On November 12, 1825, Lord Wm K K Douglas (presumably a judge) granted probate to the executors. This grant refers back to Walter's deposition on the inventory of his personal estate on October 25, 1825. The probate documentation includes various affidavits from witnesses who attest that Walter Irvine was of sound mind and disposing memory when he made his will. The will is described as being written in eighteen sheets of paper with a seal attached to the final sheet.

Inventory of Personal Estate

An inventory of Walter's personal estate was prepared by Mr. Walter Dickson, Winter (a Signet). It lists various debts owed to the deceased, including:
The total value of Walter's estate in Scotland is estimated to be £5,258.6.2. Overall, the text provides a detailed picture of Walter Irvine's last will and testament, as well as related probate documentation and an inventory of his personal estate.
Unsurprisingly, the 400b? full fat Claude Sonnet 4.5 model does by far the best summary. The Qwen3 30b model is obviously a lot more detailed than the Llama3.1 8b model, but it also misbalances what detail to report upon and what to leave out e.g. it gets what was left to Betsey completely wrong. Llama3.1 8b produces a short summary consisting of only what it thinks are the bare essentials, and I would personally say it’s a fairly balanced summary choosing a fair set of things to report in detail and what to omit.
After a fair bit of testing, I think I’ll be sticking to Llama3.1 8b for my local LLM use. It has reasonable output, it follows your prompt instructions more exactly, and it’s a lot faster than the Qwen3 model on my limited RAM hardware. But Qwen3 did better in my testing than anything I’ve tested since Llama3.1 came out – I was not impressed by Gemma3 12b, for example, which I found obviously worse for the tasks I was giving it. Qwen3 isn’t obviously worse; it looked better initially, but only after a fair bit of pounding did its lack of balance relative to Llama3.1 become apparent.
All that said, AI technology is clearly marching forwards; give it a year and I would not be surprised to see Llama3.1 (which is getting quite old now) superseded.
Returning to ‘what’s coming next?’, I shall be taking my children to England to visit their godparents in April, which will give Megan uninterrupted free time to study. I expect that will be my only foreign trip this year. During February and March I mainly expect to clear all open bugs remaining in my open source libraries, practise more with AI tooling, and keep clearing items off the long term todo list. I am very sure that I shall be busy!