Welcome to ned Productions


Welcome to ned Productions (non-commercial personal website, for commercial company see ned Productions Limited). Please choose an item you are interested in on the left hand side, or continue down for Niall’s virtual diary.

Niall’s virtual diary:

Started all the way back in 1998 when there was no word “blog” yet, hence “virtual diary”.

Original content has undergone multiple conversions: Microsoft FrontPage => Microsoft Expression Web, legacy HTML tag soup => XHTML, XHTML => Markdown, with a ‘various codepages’ => UTF-8 conversion for good measure. Some content, especially the older stuff, may not have entirely survived intact, especially in terms of broken links or images.

Latest entries:

Wednesday 15 April 2026:
14:07.
Word count: 2545. Estimated reading time: 12 minutes.
Summary:
The author’s current recovery from illness is described. Detailed plans regarding the house construction are presented, including groundworks validation and roofing installation strategies. Cost savings achieved through material substitutions are detailed. Furthermore, prototype roof sections are being constructed for practical experience to be gained regarding the project’s execution.
Woke up early this morning with plans for the day, as the groundworks got cancelled, but I began to feel increasingly awful so I went back to bed. After four hours of additional sleep I still don’t feel well, so I won’t be leaving the house today. Instead I’ll be writing this entry, as I was going to write it later this week anyway.

I remember getting something a bit like this after my wedding when I assumed it was some sort of covid precursor infection as I was laid up for days with a pounding headache and my stomach all nauseous. I am beginning to wonder if this is in fact a reaction to being overly stressed for too long whereby I run at 120% capacity for many days, and then kinda collapse afterwards. Certainly the ten days in England looking after my kids alone would qualify as a wedding-like over-exertion. In any case, here’s hoping that I’m fully recovered tomorrow, as I really do need to be getting on with things.

Before I get on with the main content of this entry, I have finally replaced llama 3.1 8b as the AI summarising these entries with Gemma 4 12b! This was forced by recent Ollama versions hanging, and after fiddling with it for a while I gave up and moved over to LMStudio which now also has a REST API. LMStudio runs models faster on Mac OS than Ollama did, so I can now afford to run a 12b model and it takes about seventy seconds per entry which is a little more than Ollama did with llama 3.1 8b, but the quality of summary and keywords generated is noticeably better. I’m half tempted to let it re-summarise all the previous posts … maybe I shall in the future.

Nearly a month ago now my surveyor marked out the exact locations for the service popups for the house using paint onto the T2 stone (a very expensive structural gravel). As paint washes away and gravel moves, I made those more permanent by chopping up a 4.8m length of steel rebar into foot-long lengths using an angle grinder, and hammering them into the gravel. For safety’s sake, and also because it’s very easy to miss the rebar points in the gravel and trip over them, I put yellow plastic mushroom safety caps on top of each one, and thus you get this sight from above:

If you overlay on top of that the groundworks plan you get this:

If you look closely, you will see that the yellow points map onto the plan perfectly except in the bottom left. That, at the time, caused a fair panic because it was days before I took the kids to England, and groundworks were supposed to start yesterday morning, the day after I got back. Obviously my surveyor was also rather panicked, so he supplied me a list of X-Y values for each point on the site as marked. I then came up with a mechanism for validating the position of the errant points using known good points:

The pencilled values are what was measured using a measuring tape onsite, and as you can see they match exactly what they should have been. Crisis averted!

So what caused this? It turns out that my drone, the DJI Mini 3 Pro, takes a very distorted picture. Its onboard software undistorts that picture into a JPEG, and because the onboard CPU is not powerful, it uses a lightweight but not particularly accurate algorithm. In the future I need to capture the RAW output and run the undistortion on a proper PC; according to the internet, the output should then be much less distorted. Which is good to learn!

The roof

My attention has now shifted away from the groundworks towards getting the roof on. After the builder departs I need to get the roof on ASAP, and given my severely limited cash flow from having no income since June 2025, we shall be fitting the roof ourselves.

Originally, we had been thinking real slate or fibre-cement tile for the sloped roof with EPDM for the flat roof, and probably contracting in somebody to do that part as it would be quicker. We had budgeted €35k inc VAT for the roof:

  • There is 232 sqm of sloped roof and 40 sqm of flat roof with 29 metres of ridge, 79 metres of fascia, 44 metres of gutter, 35 metres of verge, and 10 metres of valley.

  • Assuming use of two 30 x 2.65 mm copper nails and one copper crampion per fibre-cement slate, I reckon the materials cost for the slates is around €31 inc VAT per sqm, so €7,152 inc VAT.

  • EPDM would cost around €25 inc VAT per sqm, so €2000 inc VAT including oversizing.

  • Black uPVC fascia would cost around €800 inc VAT for materials.

  • Gutters would cost €700 inc VAT for materials (we are using extra large industrial six inch gutters to harvest rainwater).

  • Lead valleys would cost around €550 inc VAT for materials.

  • Metal trimmed verges would cost €385 inc VAT for materials.

Already you’re looking at €12k of materials and labour is €400-500 per day, so there goes your €35k of cost pretty quickly.

The gutters are interesting: they are from Irish Rollforming, who are just up the road from me:

These have 0.02 m3 of volume per metre of run, which is twenty litres per metre of length. Is this big enough? (There’s a quick sanity check in code after the list below.)

  • Worst case recorded rainfall in Ireland was 36.2 litres per sqm per hour, and worldwide it was 107 litres per sqm per hour. Taking the Irish worst case, that is 0.01 litres per sqm per second.

  • 86 sqm of roof to the south, probably 100 sqm of roof to the north for the main gutter. So that would be 0.86-1 litres of rain per second entering each gutter.

  • A 100 mm diameter downpipe should be able to sink 65 litres per second BUT the water needs to make it the horizontal length of the gutter first, and it’s almost level (fall is 1:600, which would be 30 mm across the length of the house). I think the calculation is to divide the volume by ten to get an approximate flow capacity, so twenty litres per metre length would become two litres per second capacity.

  • Therefore, a six inch gutter is oversized twofold for the roof and worst case hourly Irish rainfall. That seems about right: gutters will fill with debris and bursts of rainfall can be more intense than the per hour recorded maximum. Also, almost certainly it’s going to rain more intensely in the near future.

The plan is to fit the six inch width gutters where we harvest the rainwater, and four inch width gutters elsewhere. Both will be naked galvanised steel colour to save money, but it should also create an interesting look against the black uPVC fascia.

Anyway, the above was the original design from two/three years ago, and given my new cash-poor, time-rich circumstances I’ve made the following changes:

  1. Instead of slates fixed with copper nails and crampions, I’ll be fitting concrete tiles fixed with aluminium nails which is the cheapest available roofing solution in Ireland right now. I reckon it’ll cost around €13 inc VAT per sqm for a total cost of around €3,100. This saves €4k, but the concrete tiles are also harder for me to mess up installing, albeit much heavier to have to lift up to the roof (for which I have created an electric winch).

  2. Metal trimmed verges replaced with cheap polypropylene plastic ones, saving €100. This is also because the €385 price above was for slate verges; equivalent metal ones for tiles are quite a bit more expensive.

  3. Lead valleys and flashings replaced with spare bits of EPDM, seeing as I’ll be buying a load of EPDM for the flat roofs anyway. This saves a few hundred more euro.

All in all, I think I can get the whole roof done for under €10k all in now, so that’s €25k saved all of which will be desperately needed to pay for the outer block leaf.

Flat roof construction

Here is my current hallway:

Those three large boxes contain €1,800 worth of EPDM membrane! Each box contains 40 sqm, so that is 120 sqm in total, and the thickness is 1.14 mm which is a grade above the 1.0 mm stuff sometimes used for residential applications. It weighs a lot, 57 kg per box:

… which corresponds to about 1.267 kg/m2 of areal density, which is 1,111 kg/m3 for the material – less dense than I would expect. Within each box is a roll of the material with plywood protectors for the ends:

EPDM is not a popular flat roof solution in Ireland for some reason, so finding a supplier of large sheets who will deliver to Ireland for a reasonable price is hard. I went with VEVOR for the above, who are not the cheapest, but once delivery to Ireland is factored in they are competitive. Before ordering I did make sure all the sections I’d need were covered, and I added a 750 mm oversize to all dimensions just to be safe:

The oversize should let me avoid flashing and give plenty of extra material to run underneath tiles etc. The tanks get two sheets because you need one onto the OSB deck below the outer PIR board insulation, and another above the insulation below the water tanks. On the other flat roofs I intend to fit insulated metal roof panels above the EPDM glued to the OSB decks; those will also come from Irish Rollforming, but they can be left off until I get more money. My architect had specified torch-on felt for the OSB deck waterproofing, and at €12.35 inc VAT per sqm it is undoubtedly cheaper than the €14.20 inc VAT per sqm which my EPDM cost. However, by the time you factor in the rental of the torch and the multiple bottles of gas it would go through, plus my zero experience with torching on felt and the fire and burn risk, I decided I’d just pay the roughly two hundred euro more for EPDM, which is a far superior flat roof than torch-on felt anyway.

Prototyping a roof

Seeing as I’ve never roofed a house before, I am understandably a little anxious about it so I’ve decided to build a prototype roof eave matching what the builder will build to get some experience. You can see in the upper photo above the black uPVC fascia, the grey uPVC eaves tray, two eaves ventilators (25 mm and 10 mm, I bought one of each as I don’t know the correct one for my roof yet), and at the far end is a 1 kg box of aluminium clout nails. Here is my office currently:

The big sheet is a sample of the ‘roof tile effect’ metal roof panel from Irish Rollforming; at its bottom is a concrete roof tile ventilator used for ventilating the roof near the ridge; to the right of that is the cheapest available plastic verge which is nasty but looks sufficient; bottom of picture is a stack of concrete roof tiles. All I am currently missing is the wood from which to construct an example eave, which will be to this design:

The rafters are at 400 mm centres whereas the walls will be at 600 mm centres, so there is a ‘seat’ running on top of the wall studs which is called ‘head binder’ in the diagram above. The 295 mm thick rafters extend out as far as the fascia and soffit backing boards, which I think will be 35 mm thick wood, and it appears to be 150 mm tall. They show a 225 mm tall fascia being affixed externally, but theirs misses the lip at the bottom:

I think that means that the fascia will be 10-12 mm lower at the top, and one would therefore extend the fascia upwards with a 10 mm ventilator to support the eaves tray. The alternative is a 150 mm tall fascia, but that would stop short of the subfascia board which then would get in the way of the 25 mm ventilator – also the resulting total would be 175 mm tall, which from their diagram looks to be not enough. In any case, the eaves tray then runs from under the roof membrane over the tilting fillet then over the over-fascia ventilator with it tipping into the gutter. This is much easier to show than describe in words, so here is a picture of somebody else’s eave:

The bottom row of roof tile ends land on top of the uPVC eaves tray such that any condensation which forms underneath the roof tiles will drip down the roof felt and collect where the tile meets the eaves tray, and I assume would drain into the gutter if enough of it accumulated. The bottom of each roof tile has drains to one side – I assume that the thinking here is that any water which pools at the bottom of a tile would overflow to the side and therefore escape down the small vertical gap between tiles.

Their current design has a 210 mm eaves overhang which comprises a 50 mm ventilation cavity, 100 mm outer block leaf, 10 mm render, 22 mm LED strip mount, and finally a 28 mm fascia lip. The LED strip clips are 16 mm wide, however we want them slightly away from the wall to project the light further.

My intention is to build two approx 1 sqm roof samples, one using concrete tiles, the other the metal box sheet above, and to fit each with fascia and gutters and see how it all looks and lines up. Hopefully I’ll also come away with enough experience that I can fit the roof to the house correctly my very first time.

What’s next?

In terms of priority queue, something rapidly moving up the list is that I need to do a full itemisation of what has been bought from Aliexpress for the house in the past, and make absolutely sure everything has been purchased already. The reason why is that on 1st June the EU will bring in levies on small packages from China, and that will make everything purchased thereafter much more expensive. So if I’m missing anything e.g. light switches, relays, whatever – those need to be ordered before end of April to ensure they’ll pass EU customs before 1st June.

I have another diary entry coming at some point on biasing LLMs, and if the groundworks get done there will be surely an entry on the results of that. Megan’s car will fail its NCT tomorrow due to at least a worn ball joint and a cracked windscreen, so I’d imagine some of my time will be going on that. There are more WG14 papers to write which keep getting pushed back as other higher priority stuff takes precedence. And there is the continuing backlog of maintenance work on my open source libraries which seems to constantly be pushed back too.

As always, nowhere near enough time to get things done! I may not be earning, but I always keep busy.

#house




Thursday 2 April 2026:
14:43.
Word count: 1912. Estimated reading time: 9 minutes.
Summary:
A stressful journey to England is to be undertaken. The purchase and review of a new TCL television were documented. While the unit’s affordability was noted, quality control issues and poor default picture settings were encountered. Detailed comparisons to other displays were made, suggesting that optimal performance must be achieved through software tuning.
I am feeling run off my feet as I shall be taking all my children alone to England for ten days tomorrow to give Megan uninterrupted space to study for her exams. It’s stressing me out quite a bit – I was up until 3.30am last night clearing backlog items before I depart, then I barely slept from the anxiety, and I was up bright and early this morning to ring people to clear more backlog. In case you’re thinking ‘why didn’t he get all that done much earlier in the week?’, it’s because I didn’t have the money to pay for those backlog clearances until 1st April i.e. yesterday. So I had done my research and written up my notes ready to close everything out in these two days before we depart, but a few things had changed since, which required re-verifying everything before I pulled triggers which were going to cost me many thousands of euro – and as you know from here, I do like to really do my research before spending thousands of euro!

Anyway all that is for later diary entries here. This one will be a quick one: my father has had me purchase a new TV for him, because apparently I am an expert on TV purchasing. I went, as last time, to the Reddit Home Theatre enthusiasts list of recommended models, and the current cheapest TV on there is the TCL C6K, which comes in 50 inch, 55 inch and larger sizes. The 55 inch has better speakers and a faster CPU than the 50 inch, so my father opted for the 55 inch, mainly due to those better speakers as he is going deaf.

TCL is a Chinese brand well known for supplying high specifications at low prices and not the best quality control, so you need to be extra careful checking the TV on receipt to make sure its screen isn’t full of dead or stuck pixels, that the legs do screw on, and that the case isn’t so warped it won’t lie flat on the wall (these are all commonly reported problems, and thanks to EU consumer law you will get a free of cost replacement – effectively TCL outsources the final quality control check to consumers). If you do get one of the good ones, you get a nearly vanilla Google TV OS installation and a good price for the features. Unsurprisingly, TCL TVs are a mainstay of the Reddit Home Theatre enthusiast buying guides as usually the cheapest models on that list. But that affordability does come with some work for the buyer, and some hidden caveats too …

Everybody online, including the professional reviewers, agrees that TCL TVs come with lousy picture setting defaults. Having seen my first of these TVs, I absolutely agree: I don’t know what they are thinking with the default out of the box settings. So, the very first thing you do is apply a standard set of picture settings changes which can be found on Reddit or rtings or many other places. You then rinse and repeat with various types of content, trying to match the picture to your MacBook Pro’s display as your reference display.

After a few hours of twiddling, you DO get a good looking picture on that TV for Antenna/Satellite content and SD content. Most kinds of HDR content also look pretty good, though somehow the image is a bit ‘flat’ compared to what I think it ought to be. Unfortunately, for anything with Dolby Vision the TV won’t let you adjust most of the settings as they’re locked, and no, you can’t override ‘Dolby Vision mode’ for Dolby Vision HDR content. So, perplexingly, all Dolby Vision HDR content looks inferior on this TV compared to other TVs, despite that I know for a fact – having seen it in other modes – that the TV is perfectly capable of displaying a much better rendition of Dolby Vision HDR content if it would only allow you to change the locked settings.

The last time I reviewed TVs was for my then new Panasonic TV almost exactly this time last year. I compared the Panasonic to my previous ancient Samsung TV, which wasn’t very fair, but I didn’t at the time compare it to my current BenQ workstation computer monitor bought in 2021, so let’s fix that now:

| | BenQ EW3280U | Philips 65OLED937 | Panasonic TV43W90AEB | TCL 55C6K |
|---|---|---|---|---|
| Year released | 2019 | 2022 | 2024 | 2025 |
| Screen size | 32" | 65" | 43" | 55" |
| Panel technology | IPS with dimmable LED backlight | WOLED-EX | VA with full array backlight dimming (FALD) | HVA with quantum dot Mini-LED backlight dimming |
| Backlight dimming zones | 1 | 8.3M | 40 | 180 |
| Panel bit depth | 8 bit + 2 bit FRC | 10 bit | 8 bit + 2 bit FRC | 8 bit + 2 bit FRC |
| Panel resolution | 3840 x 2160 | 3840 x 2160 | 3840 x 2160 | 3840 x 2160 |
| Panel max refresh rate | 60 Hz | 120 Hz | 144 Hz | 144 Hz |
| HDR | HDR10 | HDR10, HDR10+, Dolby Vision | HDR10, HDR10+, Dolby Vision, Dolby Vision IQ | HDR10, HDR10+, Dolby Vision, Dolby Vision IQ |
| DCI-P3 benchmarked | 95% | 99% | 96% | 89% |
| Rec.2020 benchmarked | ? | 75% | 73% | 66% |
| Max benchmarked brightness | 350 nits | 1300 nits | 600 nits | 400 nits |
| Contrast | 1000:1 | Infinite | 5400:1 | 6000:1 |
| Viewing angle without distortion (both sides) | 120 degrees | 140 degrees | 50 degrees | 60 degrees |
| Max power consumption | 148 watts | 220 watts | 160 watts | 150 watts |
| CPU | N/A | 4 core Mediatek MT9970B | 4 core Mediatek MT9653 | 4 core Mediatek MT9653 |
| Speaker power | 9 watts | 100 watts | 20 watts | 40 watts |

I feel that the 9 watts of speaker power on the BenQ monitor does it a disservice: its speakers easily beat both the Panasonic and TCL speakers in terms of rendition quality; in fact they’re so good I relatively regularly play music with them, and all my Windows gaming and personal movie watching is done via those speakers. I agree they’re not especially powerful, and I usually have the volume well towards the top, but the sound out of them is really very nice, with plenty of bass. Unlike the TCL’s speakers, which are merely acceptable, and especially the Panasonic’s speakers, which are so bad you’d really need that TV up against a wall to make anything out.

I of course took a comparison shot of the exact same scene from Starship Troopers to show you what I mean about the Dolby Vision HDR content looking wrong – unlike the photo above which was taken using UltraHDR, this one was taken as SDR:

The TCL 55C6K Mini-LED to the left, the Panasonic TV43W90AEB FALD-LED in the middle, and the Philips 65OLED937 W-OLED to the right. Note the sizes of each vary considerably: 55 inch, 43 inch, 65 inch.

The Panasonic makes a picture not dissimilar to that of the Philips OLED, which has a spectacularly good picture, despite its maximum brightness being half that of the Philips and it using a much inferior LED technology. The key is, it makes a very honourable attempt: the pictures are comparable to the one on the Apple MacBook Pro. Unlike the TCL, where the picture is just plain not as good: it lacks punch, any wow factor, and the image kinda looks flat. I actually bumped up the saturation a little to try to liven up the picture a bit, which does work, but it makes the picture even further off what I think it should be, which is the picture my Apple MacBook Pro makes for the exact same scene.

Looking at the benchmarked (not claimed) specs above, you can see why: the TCL TV can output 600 nits for test screens, but on real world HDR content it seems to back right off on the brightness, so it’s no brighter than my now relatively old BenQ monitor. My BenQ monitor’s HDR rendering is less accurate – at least when driven by either Windows or Mac OS, both of which don’t seem to me to be using the right colour profile – but in terms of brightness with real world content they give a similar impression. Absolutely yes, in terms of peak spot brightness the TCL is way brighter, but its software doesn’t appear to actuate its Mini-LED backlight to those peaks except for test images. On top of this, somehow despite the Mini-LED display the colour gamut is lacking relative to any of the other TVs, and you can see that in the measured benchmarks above where DCI-P3 coverage is markedly lower – despite the claims of TCL – which is probably why bumping the saturation a little works so well.

All of which is rather unfortunate: the TCL TVs have the hardware capability, their relatively vanilla Google TV OS is one of the less awful choices for a smart TV, but whoever they have tuning the drivers for these TVs either doesn’t put much effort into it, or they have eyes very different to most other people. In any case, the Panasonic which has less capable hardware than the TCL noticeably makes a superior picture. And I suspect it’s 100% all software as to why.

The TCL 55C6K cost €588 inc VAT and delivery which is way less in inflation adjusted terms than the much smaller BenQ monitor which cost nearly €1,000 in today’s money. The Panasonic cost just €429, but its built in speakers are so bad you couldn’t really avoid the additional soundbar taking the total to €617 inc VAT. So, the TCL despite being 55 inches is cheaper than all the others! And I guess from that perspective this is good value for money, but as with a lot of things you get what you pay for.

As an example of that, my Panasonic and my Philips TVs do some sort of AI upscaling of SD content, so when you’re watching SD content from the satellite, yes it looks a bit blurry in places sometimes due to the low resolution but it’s pretty good. The TCL with the exact same content does not look as good. It does have a picture setting to improve the upscaling, but it looks like it’s basically a smoothing filter and the resulting image looks even more blurry. Another issue I noticed is around mismatched refresh rates, so if you play 24 fps content you will get a really juddery picture on the TCL unless you enable a bit of motion smoothing, but even then the result isn’t as good as telling your playback device to output 24 fps and let the TV sort it out – and now it looks excellent. The Panasonic and Philips TVs do much better with mismatched refresh rates, though playing back at the native frame rate also yields better results for them too.

So all in all I think this is very much a story about the quality of the background driver software which renders your frames, and not having your UI prevent the user from undoing your bad default out of the box settings. All theoretically very easy to fix, yet from what I read online TCL TVs have had these problems for years, so apparently not that easy to fix.

Still, I think Dad will be very happy with it. He’s stepping up from an old 32 inch TV so dim he keeps it in a wooden surround frame to keep surrounding light off it. It also takes about five minutes to get into Netflix, so you start Netflix, go make a cup of tea, and maybe it’s ready when you get back. All that will be fixed with this TCL TV, plus the speakers are loud enough he’ll hear it okay.

Immediately the day after I get back from England the popups installation works will begin, so I’d expect no further entries here for several weeks, until after the popups are installed. Everybody have a great Easter!

#tv




Friday 27 March 2026:
20:58.
Word count: 2050. Estimated reading time: 10 minutes.
Summary:
The fibre optic cable was extended by 15 metres with a €15 extension cable from Amazon, and the ONT was moved next to the router. This upgrade did not significantly improve internet speeds, as Pure Telecom’s traffic shaping limits bandwidth to 100 Mbps per connection within ten seconds. However, latency was improved from ~22 to ~8 ms.
I’ve been rather run off my feet these past couple of weeks, such that I kept not getting round to writing up my rented home internet upgrade. Here it is now, and just in time given that the current broadband contract runs out in a few days’ time.

Yes, almost exactly one year ago I was writing here about our shiny new fibre broadband connection which I had just had installed into our rented house. In preparation for its installation I had upgraded the Homeplug AV2 1200 powerline network between the ONT and the main router to G.hn Wave 2, and my testing back then was promising in that ~345 Mbit/sec downloads were found. I also noticed at the time some stability problems.

Those stability problems turned out to be the AV2 and G.hn powerline adapters fighting each other – annoyingly, 98% of the time they got on just fine and both performed at full speed. But when they fought, videos stopped playing, internet disappeared, and it cost me some time to figure out it was them that were causing the problems. The obvious solution was to move entirely to G.hn, so I did that and then discovered an unpleasant surprise:

Different networks on G.hn split total bandwidth between them i.e. if total bandwidth is 300 Mbps, each network gets a fixed allocation of 150 Mbps. No, they do not dynamically share the bandwidth like Homeplug AV2.

… which is not documented ANYWHERE, including anywhere I was able to find on the internet. Indeed, if I were a paranoid person, I would say it is striking how little is documented on the internet about this G.hn powerline technology. Had I known this, obviously enough I’d have reverted to AV2 everywhere, because AV2 is about the same speed as a halved G.hn allocation, and it is far more predictable as a networking technology (which we’ll get into shortly).

For example, here is the connection between two of my Wifi routers connected by Powerline:

| Powerline config | iperf TX | iperf RX | Ping range |
|---|---|---|---|
| Homeplug AV2 1200 | 75.7 Mbps | 93.6 Mbps | 5-6 ms |
| G.hn Wave 2 2400 (new), single network | 230 Mbps | 296 Mbps | 3-25 ms |
| G.hn Wave 2 2400 (year old), single network | 179 Mbps | 181 Mbps | 3-25 ms |
| G.hn Wave 2 2400 (year old), double network | 87.9 Mbps | 90.8 Mbps | 3-25 ms |

The top two lines are from the benchmarks I took a year ago, and the bottom two are from benchmarks now. As you can see, the G.hn devices definitely have become slower than when they were new, about one quarter slower. I hadn’t changed their firmware nor done anything else to them, and it is not unknown for electronics which pulse rapidly to lose a fair chunk of their birthday performance – thankfully, the loss of performance with age usually slows greatly after this.

That ping range of G.hn devices is just weird: they appear to increasingly lag due to some internal clock getting out of sync with some other clock, then they snap back into sync and the ping times drop again. Smells like bad electronics or bad firmware to me, but in any case they are what they are: as good as you’re going to get without installing physical wires or investing in expensive Wifi backhaul.

As we’re in rented accommodation, we can’t drill through walls. But two months ago I noticed that the door to the front room has a large gap at the top – one big enough to fit a fibre optic cable through without drilling. So, the obvious solution to speeding up my internet was to extend the fibre optic cable which enters the ONT, move the ONT right next to the router, and then we don’t need to run the house broadband over Powerline anymore. A fifteen metre ONT fibre optic extension cable cost €15 inc VAT from Amazon, so with a pack of easily removable sticky cable holders, I had a solution:

After this upgrade, however, the internet did not feel much quicker, and this is why:

iperf3 -R -c speedtest.serverius.net -p 5002
Connecting to host speedtest.serverius.net, port 5002
Reverse mode, remote host speedtest.serverius.net is sending
[  5] local 194.125.92.186 port 33664 connected to 5.178.66.18 port 5002
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  50.6 MBytes   424 Mbits/sec                   
[  5]   1.00-2.00   sec  60.0 MBytes   503 Mbits/sec                   
[  5]   2.00-3.00   sec  60.0 MBytes   503 Mbits/sec                   
[  5]   3.00-4.00   sec  48.5 MBytes   407 Mbits/sec                   
[  5]   4.00-5.00   sec  37.8 MBytes   317 Mbits/sec                   
[  5]   5.00-6.00   sec  25.4 MBytes   213 Mbits/sec                   
[  5]   6.00-7.00   sec  17.9 MBytes   150 Mbits/sec                   
[  5]   7.00-8.00   sec  14.6 MBytes   123 Mbits/sec                   
[  5]   8.00-9.00   sec  13.6 MBytes   114 Mbits/sec                   
[  5]   9.00-10.00  sec  12.4 MBytes   104 Mbits/sec                   
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   343 MBytes   287 Mbits/sec   44             sender
[  5]   0.00-10.00  sec   341 MBytes   286 Mbits/sec                  receiver

As you can see, my current broadband provider Pure Telecom appears to traffic shape each connection so it only gets full speed (500 Mbps) for the first few seconds, then it ramps down to 100 Mbps max per connection within ten seconds. Obviously, they whitelist the main broadband speed testing sites so the ‘connection speed’ appears to be the advertised 500 Mbps. But that’s not what you actually get for real world connections.

Before this upgrade, we got about 85 Mbps in the exact same iperf3 test due to the Powerline in between. But replacing it with fibre didn’t meaningfully change the bandwidth available, which is unfortunate.

One does, at least, get a large improvement in apparent latency, going from ~22 to ~8 ms:

The above is the immediate before and after of swapping over to the direct fibre connection. This looks like it should result in a large improvement to internet snappiness, but in fact the G.hn Powerline adapters appear to have much better latency if there is constant load; they appear to desync themselves only when load is light e.g. a ping every thirty seconds. So as a result you don’t really feel the internet going faster, unlike at the site, where the fibre broadband is noticeably faster with both your phone and laptop.

Looking at the past month:

The past two weeks look better than they are, here is the past week:

Pure Telecom broadband gets a little congested each evening. As a comparison to the site, which is using Digiweb as its fibre broadband provider:

Digiweb has worse congestion problems each evening than Pure Telecom, however they don’t traffic shape:

iperf3 -R -c speedtest.serverius.net -p 5002
Connecting to host speedtest.serverius.net, port 5002
Reverse mode, remote host speedtest.serverius.net is sending
[  5] local 84.203.23.237 port 33658 connected to 194.107.78.3 port 5002
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  75.2 MBytes   631 Mbits/sec                  
[  5]   1.00-2.00   sec   108 MBytes   904 Mbits/sec                  
[  5]   2.00-3.00   sec   106 MBytes   893 Mbits/sec                  
[  5]   3.00-4.00   sec   108 MBytes   906 Mbits/sec                  
[  5]   4.00-5.00   sec   107 MBytes   895 Mbits/sec                  
[  5]   5.00-6.00   sec   107 MBytes   895 Mbits/sec                  
[  5]   6.00-7.00   sec   107 MBytes   901 Mbits/sec                  
[  5]   7.00-8.00   sec   104 MBytes   868 Mbits/sec                  
[  5]   8.00-9.00   sec   103 MBytes   864 Mbits/sec                  
[  5]   9.00-10.00  sec   101 MBytes   845 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.03  sec  1.01 GBytes   862 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.00 GBytes   860 Mbits/sec                  receiver

That test was performed during a mildly congested evening; I have seen 930 Mbps to the same test server, which is Amsterdam based. That’s pretty good for a 1 Gbps connection.

My Pure Telecom twelve month contract is up at the end of March. I don’t regret my choice of them: last year they installed my physical fibre connection for free, and they are cheap enough at €35 inc VAT per month. And for the vast majority of their contract, I never could have used more than 100 Mbps anyway, so it was moot that they traffic shape so aggressively.

The chances are I’ll be switching to 500 Mbps Eir Residential in a few days’ time for €26.67 inc VAT per month for another twelve month contract (so we won’t be moving out now until April 2027! Ha!). I’m curious how much worse it is than Eir Business which, as past entries here showed, had an uncannily flat latency history – there was ultra mild congestion of an added millisecond or two every two months! But that was probably my traffic getting priority from the expensive business package. I guess we’ll find out.

To the best of my current knowledge, the big Irish fibre broadband providers are ranked as follows in terms of network quality:

  1. Eir (their residential service is untested by me yet, but their business service was just about as perfect as you could get for the year I had it)
  2. Digiweb (~50 ms ping latencies most evenings especially between 9.30pm and 10.30pm, otherwise very good)
  3. Sky (never used them personally, but internet says they have less aggressive traffic shaping than Pure, otherwise good)
  4. Pure Telecom (aggressive per connection traffic shaping, otherwise not bad)
  5. Virgin Media (I’ve never personally used them, but people I trust say that their network routing is shockingly bad, they appear to not implement peering with other regional networks so your traffic hops over half the planet)

(to be clear, there are also lots of small specialist Irish fibre broadband providers, but they’re all expensive because they don’t compete on price like the main five providers above. Also, to be even more clear, the above ranking is for network quality only and ignores customer service – Eir’s residential customer service is widely considered to be the worst of them all)

You might wonder why I am only ordering a 500 Mbps connection for the rented home when I got a 1 Gbps connection for the site. The reason is that internal Powerline network – at the site, backhaul is 2.5 Gbps fibre, whereas in the rented home the backhaul is G.hn Powerline, and at best you’ll only get less than half of the fibre broadband speed:

| Point | iperf TX | iperf RX | Ping range |
|---|---|---|---|
| Wifi router 1 to main router | 181 Mbps | 173 Mbps | 1-8 ms |
| Wifi router 2 to main router | 202 Mbps | 166 Mbps | 1-8 ms |

I should firstly mention that these results are after the latest firmware (2025) was installed, with the previous firmware being the 2022 version. As you can see, maximum ping times have been greatly reined in from before, and the clock skew desync thing which I reckon was the problem before has been fixed. Secondly, because I no longer needed a separate network for the ONT, I repurposed one of those G.hn adapters into adding a third Wifi point upstairs, as Megan keeps complaining about the Wifi up there, and I am pleased to confirm that three G.hn adapters in the same network do share bandwidth between them instead of halving it.

Thus, on a good day, you’ll get ~200 Mbps if you’re on one of the two satellite Wifi points, and ~500 Mbps only if you’re on the main router Wifi point. Therefore there seemed no point in paying more for 1 Gbps broadband.

Even with this new firmware, I’m still finding the G.hn powerline somewhat temperamental in a way the Homeplug AV2 never was. Homeplug AV2 was stable and predictable, whereas G.hn Wave 2 has good periods and bad periods. During the bad periods, all ping times across the Powerline network go to ~50-100 ms for sustained periods of time. That might last minutes, or a few hours, then it’ll go back to 1-8 ms. There appears to be no pattern to this which is obvious e.g. microwave use, so I’m going to assume it’s some sort of interference which comes and goes from time to time.

This can make using the internet a bit frustrating sometimes, as it’s so spurious. I still have the Homeplug AV2s, and I am occasionally tempted to swap back to those. After all, they were very predictable.

Next post will probably be about setting out popups for the site, y’all be happy until then!

#broadband #internet #powerline




Tuesday 17 March 2026:
21:28.
Word count: 1151. Estimated reading time: 6 minutes.
Summary:
The current house server is being used to run various tasks, but its age and limitations are becoming apparent. It has been running continuously for almost twelve years, with an SSD that still has decades of life remaining. However, the mainboard and CPU are outdated, and a replacement is being considered.
I had thought that this St. Patrick’s Day entry would be about making my rented home internet faster, but before I do that I ought to fix something I completely forgot to mention last entry, which discussed how best to run Qwen3 Coder Next: I entirely forgot to discuss options for upgrading the house server with LLM-capable hardware. And no, I don’t mean just fitting a graphics card to it, as its idle power consumption would be too high: I mean server hardware with ultra low idle power consumption which is also able to run large LLMs as needed on its CPU i.e. no discrete additional graphics card.

My current house server is very old: I wrote up an entry on it here on the 10th April 2014, so it is almost exactly twelve years old. It has been powered on for almost all of that time, and its current SSD (which was not its original) says it has been powered on for 98,000 hours. That SSD, a 128 Gb Samsung 830, has written about 105 Tb in its life, and its SMART data thinks decades more of life remain for it – that SSD model was massively overengineered, and > 2 Pb of write endurance would be expected for that specific drive, so we’re only about 5% through its lifetime. The mainboard is the very popular at the time Supermicro X10SL7-F, and the CPU is a quad core Intel Xeon E3-1230 v3 (Haswell, the last really good new CPU architecture from Intel). It is fitted with 32 Gb of ECC RAM, which is the maximum possible.

As much as that server was expensive at the time of purchase, nobody can now say it wasn’t value for money: it has been utterly reliable and trouble free in the past twelve years, and fast enough for what I’ve wanted it for up until LLMs appeared. It also isn’t bad for idle power consumption: recent Linux kernels have it down to 41 watts or so, even with the constantly spinning ZFS array which is currently two 26 Tb drives.

But, it is getting a bit long in the tooth, and I am intending to upgrade it sometime after we move into the new house. My main use case for its replacement is that I want a ‘Star Trek’ like house computer which is always listening and able to respond to you at any time, on any topic, to do any thing. For that, you’re going to need a frontier-approaching MoE model, so at least 200 billion parameters, which means ideally 256 Gb of RAM with enough bandwidth and enough compute. Additionally, the MoE model needs to be specifically designed to not suck on consumer grade CPUs, and while there aren’t many of these, there are some. One of those which I’ve therefore been watching closely is Step 3.5 Flash, which has amongst the least worst performance for a 200b model running on CPUs only.

Right now, the hardware list for house servers with a CPU powerful enough to run LLMs is very short: exactly four options in 2026, and following almost the same table format from the last entry:

| Item | parse toks/sec/euro | Price (EUR) | Launch year | RAM Gb | Bandwidth Gb/sec | Idle power watts | Full power watts | FP16 TFLOPS | llama2 7b parse | llama2 7b gen | Other notes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mac Studio M3 Ultra | 0.160 | 9,274 | 2025 | 256 | 800 | 9 | 200 | 57 | 1488 | 64 | Must store hard drives externally connected via Thunderbolt |
| Mac Studio M4 Max | 0.211 | 4,224 | 2025 | 128 | 546 | 10 | 200 | 28 | 892 | 54 | surprisingly worse parse performance compared to the AMD - not enough GPU cores |
| AMD AI Max+ 395 | 0.416 | 3,102 | 2025 | 128 | 215 | 12 | 180 | 59 | 1289 | 54 | Standard Mini-ITX! Due to low bandwidth only suits MoE models, replacement model with max 192 Gb RAM expected in 2027 |
| nVidia DGX Spark | 0.622 | 4,919 | 2025 | 128 | 273 | 25 | 200 | 127 | 3062 | 57 | unsure about kernel support longevity |

These are not the best hardware for running LLMs except in two areas:

  1. RAM capacity for your euro: you can fit the entire model into RAM, so even the relatively low compute available in a CPU relative to a GPU can get you there. Buying this much VRAM costs over ten grand right now, whereas none of the above is that expensive, plus they come with a free general purpose computer, power supply and case.

  2. Idle power consumption: no GPU capable of running LLMs idles below sixteen watts, and usually it’s more, so after you add in the idle power consumption of the rest of the server that does add up. In my future house almost all the electricity will be free of cost from the solar panels, however emitting heat does mess with the thermal balance of the house and could contribute towards overheating in summer. In comparison, all the machines above bar the nVidia idle below twelve watts – which includes their main SSD boot drive.

I managed to find performance benchmarks for some of the hardware above for Step 3.5 Flash and Qwen3 Coder Next:

| | AMD AI Max+ 395 | Apple M3 Ultra | nVidia DGX Spark |
|---|---|---|---|
| Compute Units | 16 CPU + 40 GPU | 24 CPU + 80 GPU | 20 CPU + 384 GPU |
| Parse Step 3.5 Flash Q4_K (toks/sec) | 131 | 377 | 530 |
| Gen Step 3.5 Flash Q4_K (toks/sec) | 23 | 33 | 20 |
| Parse Qwen3 Coder Next Q8_0 (toks/sec) | 275 | 1624 | 2162 |
| Gen Qwen3 Coder Next Q8_0 (toks/sec) | 25 | 45 | 37 |

The AMD Strix Halo was originally designed for gaming laptops, and they only repurposed it into an AI solution quite late in the product cycle. Had they known, they would have given it twice the RAM, RAM bandwidth and GPU cores, and they would have swept the market even if they charged twice the price.

The reason why is good old fashioned PC compatibility: none of the above apart from the AMD solution comes in a standard PC motherboard taking standard PC connections and peripherals. If I therefore want to keep my tower case which has lots of very convenient hard drive bays all of which use SATA/SAS, I am severely limited with anything other than a 100% PC compatible form factor.

However, as is obvious above, the Strix Halo is underwhelming compared to the other two for 80 billion, never mind 200 billion, parameter models. Even for Qwen3 Coder Next, the parse speed is problematic, which is because Strix Halo’s GPU uses a fairly ancient Radeon architecture underneath which lacks the much improved FP8 opcodes for token parsing found in newer Radeons.

It is currently expected that the next major successor to the Strix Halo, codenamed ‘Medusa Halo’, will be released in 2027. It should have twice the memory bandwidth, +50% RAM and +20% GPU cores, which will be the latest Radeon architecture, so parse speed should take a mighty leap upwards: at least 2x over Strix Halo, and way more again for the Q8 quantisation.

Assuming Apple don’t release a price slashed M5 Ultra – which they might, if they feel this is a market share they can easily grab – and that Intel will remain asleep at the wheel, I guess I’ll be aiming to upgrade the house server to the AMD Medusa Halo architecture in 2027 to 2028.

Here’s hoping that there will be a house for me to put it into by then!

#AI #LLM #agentic




Thursday 12 March 2026:
17:40.
Word count: 4599. Estimated reading time: 22 minutes.
Summary:
The author’s personal history with LLMs is a long and winding road, marked by experimentation and exploration of various models and tools. They recall playing with llama3.1 8b in Autumn 2024, being impressed by its ability to call tools and search the internet, and recognizing its potential to aid productivity. The author’s use of LLMs has evolved over time, from generating summaries for their website to more complex tasks like image editing and coding assistance. They have also experimented with various models, including Qwen, Gemma3, and Claude, and have been impressed by the rapid progress in AI capabilities.
I just finished making my rented house internet go much faster – it took me several hours of work this morning – and then this afternoon I was in a WG14 standards meeting, which has only just finished. And the internet is indeed now much faster! But that’ll be discussed in detail in a later virtual diary entry, because this one will be on AI coding assistants, and I apologise in advance for the wall of text about to appear.

Obviously lots of programmers like myself have been laid off these past two years, ostensibly due to being replaced by AI which will happily churn out code with quality similar to perhaps the bottom fifth of programmers, but as management is well known to be absolutely terrible at figuring out who is a good or bad programmer, they’ve been mainly performing rounds of blind headcount decimation as usual. I’ve been without income now since last June – part of that is due to changes in US tax treatment of foreign workers, but probably more of it is due to widespread headcount reduction using AI as the excuse. To date, employers have only been investing in AI to the extent of substituting X number of human devs for Y dollars of subscription fees paid to OpenAI or Anthropic – they haven’t gone for any deeper integrations than that. There are good reasons for that: every six months AI gets quite a lot better, and with such shifting foundations there is no point investing in deep structural change to rebase your business on this new tooling until AI improvements slow down to a few percent per year.

As an example of exactly that fast progress, last month Alibaba released its newest set of Qwen models, all of which can be downloaded and run locally – unlike most of the recent western AI models nowadays. That release was expected to very substantially improve capabilities over their previous models. I, along with lots of others, had been eagerly awaiting that release because the Qwen Mixture of Experts (MoE) models are the only feasible way to get large models running on hardware an individual could reasonably afford. As I have zero wish to invest my time training into AI which can be rug pulled from me later, an iron clad requirement for me personally (and I suspect for lots of others like me) is that my time is only worth investing in AI models I could have 100% personal control over. So, whatever such models can – or cannot – do is the sweet spot at which I shall aim my practice and training.

I don’t mind the moving target as those models keep leaping forwards – it’s the price of being on the leading edge. And I don’t mind if a future employer pays for some super smart AI to assist me for some piece of work they want me to do. However, for my own personal work, I will be absolutely refusing to get locked in to super smart AI I’ll never be able to fully run on hardware I 100% control.

My personal history with LLMs

I think the first time I played with a locally run LLM was about Autumn 2024, about four months after llama3.1 8b had been released. I was relatively late to that game; to be honest, I had until then mostly dismissed LLMs as being little more than improved chat bots. I remember being especially impressed by its ability to call tools you had personally taught it about, and that it could search the internet when forming an answer to your question or instruction, plus it ran well on my Macbook and even not terribly on my ancient Haswell house server which is well over a decade old. Unlike the pure chat bots preceding it, which were mere curiosities, I felt at the time that this new generation of chat bot had genuine potential to aid my productivity. But the tooling, and indeed the LLM models, weren’t there yet – though I am still using llama3.1 8b to generate summaries for the diary entries on this website to this day.

In July 2025 I wrote up how I converted ancient computer parts I had lying around, plus an ancient datacenter AI accelerator board I had bought from Aliexpress, into an AI video inferencing solution for the site. That used a decade old nVidia Tesla P4 with 8 Gb of VRAM, which was, and still remains, one of the best bang-per-watt AI accelerators you can get. I came away very impressed with its capabilities, and indeed I expect to reinstall it into the site later this month once enough sunshine falls from the sky to power it.

In October 2025 I upgraded this website’s generator scripts to invoke llama3.1 8b to summarise each entry, and last January I evaluated the then recently released Qwen models for image editing and whether the 30b MoE Qwen model could replace llama3.1 8b (it could not, at least for the limited 18 Gb RAM on my Macbook). Obviously last entry last month was all about getting Gemma3 4b to describe and categorise all 25,000 photos in our collection.

Around this time last year while I was still working at Category Labs, Anthropic’s Claude coding assistant AI was beginning to get mentioned due to Anthropic having released ‘Claude code’, their command line agentic AI programming assistant tool, in February 2025. I think they bought a subscription for anybody who wanted one around April just before they told me they’d be ending my contract early. So I only very briefly played with it during my final month working there in May, and given it cost US$20 per month and I was now unemployed, I wasn’t hugely keen to spend more money subscribing to it especially as I was 99.9% certain that six months later I wouldn’t need to. One thing that I did notice during my playtime was that I already was running into the daily usage limits of the US$20 per month plan after maybe an hour of use. Obviously they wanted you to pay them a LOT more money, which I suppose is fair – the US$20 per month plan is just their taste tester plan.

Qwen3 Coder Next (Q3CN)

It’s rare that I predict the future so accurately! Last month Alibaba released Qwen3 Coder Next, an 80b parameter MoE model specifically tuned to help you work with code. I waited for a few weeks for https://github.com/ggml-org/llama.cpp to catch up with optimising support for this latest LLM, and then I gave it a proper tyre kicking last week and this week. I have come away once again impressed!

Qwen3 Coder Next is about as capable as Claude Sonnet 3.5, so about where Claude was at in Summer 2024, which in practical terms is exactly where today’s US$20 per month Claude subscription is at, because with that plan any model newer than 3.5 runs out of usage limits so quickly it’s useless. I therefore have exactly what I predicted: the same capability of AI assistant as what costs $20 per month from Anthropic, except runnable free of cost on my own local hardware – if you have sufficient hardware.

My development workstation is a little old now: I last upgraded it when still working for MayStreet in 2022, and I had intended to upgrade it in the summer of last year, but without income it no longer made sense. It is an AMD Threadripper 5975WX based machine which has thirty-two Zen 3 cores and eight channels of memory, so it should have ~180 Gb/sec of memory bandwidth, but only ~4 TFLOPs of FP16 compute. This is far too little to run LLMs well, as I found during the image analysis diary entry where even the small 4b Gemma3 model took a minute per image. But what it does have is 128 Gb of RAM and a PCIe 4.0 interface, so you can theoretically run ~100 Gb footprint LLMs, if you can offload the compute to hardware with far more TFLOPs and memory bandwidth.

To run a MoE 80b model well – which is what Qwen3 Coder Next (Q3CN) is – you need a GPU with enough VRAM and compute power to run the dense layers quickly. Those dense layers then select which experts will be used, and those experts usually run on the CPU using all available CPU cores. So long as the experts don’t touch much memory and are computationally lightweight, you absolutely can run an 80b model like Q3CN well on local hardware.

Running Q3CN locally

As you are surely inferring by now, much hangs on what a ‘GPU with enough VRAM and compute power to run the dense layers quickly’ is, and more importantly, how much it might cost. I currently have these two GPUs in my workstation:

| GPU | Parse toks/sec per euro | Price (EUR) | Launch year | RAM Gb | Bandwidth Gb/sec | Full power watts | FP16 TFLOPS | llama2 7b parse | llama2 7b gen | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| AMD RX 6600 XT | 2.87 | 200 | 2021 | 8 | 256 | 160 | 16.1 | 574 | 54 | assigned to linux |
| AMD RX 6700 XT | 3.28 | 320 | 2021 | 12 | 384 | 230 | 23.8 | 1051 | 84 | assigned to windows |

These were purchased principally with gaming-on-a-budget in mind – I had wanted to play the Mass Effect Legendary Edition trilogy in 4k with the updated graphics and bug fixes released in 2021 (though I didn’t get round to it until Autumn 2024). Hence the GPU allocated to Windows was a bit beefier (also, I bought it a year after the first one, and what you could get for €500 had improved by then).

The above table shows their price in euro today on eBay, and the llama2 7b performance numbers come from this list of llama.cpp benchmarks, which are for the Vulkan backend. As you can see, despite the RX 6700 XT being only a bit faster than the RX 6600 XT for games (about 12%), it is 50-100% faster at running a LLM which entirely fits inside VRAM. Had I known there would be such a performance differential, I’d have used the 6700 XT to run Gemma3 in the last diary entry and saved myself days of processing time. Oh well!

Unfortunately, running Q3CN with the Q4_K quantisation on the RX 6700 XT is not good:

  • Parse is 60 toks/sec.
  • Generation is 14.5 toks/sec.

Particularly the parse speed is the problem here: in any LLM, you need to feed all the context through parse on every turn of interaction. The context gets really big quickly, because it has to include all the source code for everything relevant to what you’re working on, plus all the accumulated steps so far. Modern models are able to cache, and not reprocess, the context from previous calls, so large contexts aren’t the problem per se; rather it’s whenever the model receives a lot of new content it hasn’t seen before, e.g. you just fed it the contents of a new source file. You could actually live just fine with slow generation speeds; it’s the parse speed of new content that is the problem.

To explain, if you examine a few hours of me doing work, you’ll find about 99.2% of tokens used are input tokens (parsing context), and just 0.8% are output tokens (emitting changes). Therefore, for speedy turnarounds, you don’t really care about token generation speed much at all. Of the input tokens, about 4% will be novel, and the other 96% will be cached due to having been seen before. Therefore the ratio is:

  • 3.9% novel input tokens
  • 95.3% cached input tokens
  • 0.8% output tokens

This is of course an average over many interactions, and so long as novel input tokens are few and cached input tokens are many, running Q3CN locally on the developer workstation is just fine. However, when it comes to large new input content, that 60 toks/sec parse speed becomes a problem: particularly at the beginning of each task, expect minutes for it to parse the context for the first time. After it is parsed, it trundles along at a fair clip and is nicely interactive with me, until it next reads a new file and then it needs more minutes. All that is fair enough: it’s got 12 Gb of VRAM running a ~40 Gb sized model, so it’s falling back onto slow main RAM a lot.
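To put a number on ‘expect minutes’: at the measured 60 toks/sec, the first-time parse wait scales linearly with how much novel content gets fed in one turn. A quick illustration, with the context sizes being purely assumed examples:

```python
# Illustrative only: first-time parse wait at the RX 6700 XT's measured
# 60 toks/sec, for a few assumed sizes of novel context fed in one turn.
PARSE_TOKS_PER_SEC = 60.0

for novel_tokens in (5_000, 30_000, 100_000):   # assumed sizes of freshly-read source
    minutes = novel_tokens / PARSE_TOKS_PER_SEC / 60.0
    print(f"{novel_tokens:>7} novel tokens -> {minutes:.1f} minutes before the first response")
# 5000 -> 1.4 min, 30000 -> 8.3 min, 100000 -> 27.8 min
```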

So what’s the best LLM running hardware bang for the buck in March 2026?

The budget LLM executing hardware market in March 2026

I assembled a list of all GPUs and data centre AI accelerator boards with 16 Gb or more of VRAM, currently available new or second hand shipped to Ireland, costing no more than €1,000 inc VAT. For the purposes of comparison, I threw in my existing AMD GPUs and the nVidia Tesla P4 I bought last year for the site video inferencing – these are the only 8 Gb VRAM boards below:

| GPU | Parse toks/sec per euro | Price (EUR) | Launch year | RAM Gb | Bandwidth Gb/sec | Full power watts | FP16 TFLOPS | llama2 7b parse | llama2 7b gen | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| Intel Arc Pro B50 | 0.55 | 350 | 2025 | 16 | 224 | 70 | 21.3 | 194 | 40 | |
| Intel Arc Pro B60 | 0.87 | 600 | 2025 | 24 | 456 | 200 | 24.5 | 522 | 69 | |
| nVidia P40 | 1.13 | 430 | 2016 | 24 | 345 | — | 10 | 488 | 59 | needs additional fan, high idle power consumption, low compute perf |
| nVidia Tesla P4 | 2.09 | 127 | 2016 | 8 | 192 | 75 | 5.7 | 266 | 28 | needs additional fan |
| AMD V340L | 2.40 | 100 | 2018 | 2x 8 | 410 | 300 | 21 | 240 | 48 | old, ensure vulkan shaders work |
| AMD RX 6600 XT | 2.87 | 200 | 2021 | 8 | 256 | 160 | 16.1 | 574 | 54 | assigned to linux |
| AMD RX 6700 XT | 3.28 | 320 | 2021 | 12 | 384 | 230 | 23.8 | 1051 | 84 | assigned to windows |
| AMD RX 7700 XT | 3.49 | 430 | 2023 | 16 | 624 | 263 | 26 | 1500 | 70 | |
| AMD RX 6900 XT | 4.41 | 400 | 2020 | 16 | 512 | 300 | 37 | 1762 | 106 | |
| Intel Arc A770 | 4.44 | 280 | 2022 | 16 | 560 | 225 | 34.4 | 1242 | 55 | |
| AMD RX 7900 XTX | 4.74 | 750 | 2022 | 24 | 960 | 355 | 46.7 | 3552 | 167 | |
| AMD Radeon VII | 5.07 | 180 | 2019 | 16 | 1024 | 300 | 26.9 | 912 | 106 | needs additional fan, old, ensure vulkan shaders work |
| AMD RX 9070 | 5.14 | 615 | 2025 | 16 | 640 | 220 | 72.3 | 3164 | 120 | |
| AMD Mi50 | 7.47 | 150 | 2018 | 16 | 1024 | 300 | 26.5 | 1120 | 108 | needs additional fan, old, ensure vulkan shaders work |
| nVidia RTX 5070 Ti | 8.86 | 950 | 2025 | 16 | 896 | 300 | 43.9 | 8420 | 182 | |
| nVidia RTX 5060 Ti | 9.54 | 440 | 2025 | 16 | 448 | 180 | 23.7 | 4196 | 94 | |
| nVidia RTX 3090 | 11.12 | 500 | 2021 | 24 | 936 | 350 | 29.3 | 5560 | 162 | |

Those prices, especially for the nVidia cards (even the old ones), are grim.

Despite how very depressing this table is, it’s actually much improved over this time last year when I last updated my spreadsheet for AI accelerators. Back then the only games in town were the expensive nVidia GPUs above, and the Intel GPUs which suck at parsing. AMD GPUs one year ago just weren’t viable because AMD ROCm only supported the very newest GPUs, all of which cost over a grand at the time if you wanted 16 Gb of VRAM (and even today, only the RX 9070 comes in under a grand).

Twelve months later, as I noted last diary entry, AMD ROCm now ‘just works’ even on technically unsupported GPUs from the previous generation like mine. It no longer crashes and blows up like the dumpster fire it was even six months ago. However, the llama2 7b benchmarks listed above aren’t actually from the ROCm backend for llama.cpp – they’re from the Vulkan backend, because that’s now usually faster than the ROCm backend if you’re using trunk llama.cpp. The Vulkan backend was started last summer and has made enormous strides in just the last three months, such that it’s now almost always the fastest backend on AMD GPUs, and it’s as fast as CUDA for token generation on nVidia GPUs. Parsing performance is still one third to one half slower than CUDA on nVidia GPUs, but that gap is closing quickly.

The reason why the Vulkan backend is so game changing is that GPUs have supported Vulkan shaders (which are for high performance games) for over a decade, which in turn means that all the ancient AMD datacenter AI accelerator boards suddenly come into play: they can all run Vulkan shaders no problem, even if they’ll never run ROCm. That expands the table above with some promising new options compared to twelve months ago. It also proves that AMD GPUs never actually sucked at LLMs as much as people thought until recently: the actual problem with them was lousy software support, not that the hardware wasn’t capable. This much improved story for running LLMs locally is 100% the result of recent runtime software improvements, and I’m very glad for the increased menu of choice.

The table above is ordered in terms of parse speed per euro, so the bottom of the table is where the standout bang for your buck boards are listed. Unsurprisingly all of those are nVidia GPUs: nothing else can parse tokens as well for your euro. But given that all of those are expensive even used on eBay, the next category of standout board is the 2018 era AMD Mi50 datacenter board, which is up there with the RTX 5xxx nVidia GPUs in terms of bang for the buck. Unlike those boards, the Mi50 can be sourced from China delivered for €150 inc VAT. So, naturally, I’ve ordered one and I’m looking forward to its delivery. It has a similar token parse speed to my existing 6700 XT, however it has one third more VRAM and that VRAM is nearly three times faster. I would expect maybe a +20% performance improvement at running Q3CN. I guess I’ll find out.

To get radically faster performance such that there is no waiting at all, one would need at least 48 Gb of VRAM I think, seeing as the model is ~40 Gb. That probably means two cards with 24 Gb VRAM each, and to keep under a €1,000 budget:

  1. €860 2x nVidia P40: probably not that much faster than my 6700 XT at parsing.
  2. €1000 2x nVidia RTX 3090: many times faster.

… of which clearly the RTX 3090 is by far the better option, plus it can be used for gaming. Still, that’s a cool €1,000. That’s a lot of money.

I am mindful that after this AI investment bubble bursts, there is going to be a flood of used AI accelerator hardware on the market which will depress prices. So now is a lousy time to buy, especially in my current financially straitened circumstances. Which then makes one ask: how much would it cost to rent the hardware instead?

Renting Q3CN

The idea of an ‘LLM marketplace’ is of course an obvious one, and as far as I am aware the first of these, and still the biggest, is OpenRouter, who got started in 2023. What they do is provide an OpenAI compatible REST API endpoint which proxies a marketplace of LLM providers for a +5.5% fee over whatever the underlying provider charges. You can set rules for which providers to choose, and in what order of preference – note that ‘cheapest first’ is NOT the default after account creation. You also don’t necessarily want the absolute cheapest, as I found they hang frequently, so with some trial and error you’ll figure out what to ban and what to allow.

You can of course open an account directly with the providers on OpenRouter and save yourself the +5.5% fee, and there are further providers who don’t list on OpenRouter. However, one enormous advantage of OpenRouter is automatic failover: when a provider gets overloaded – and in my experience, they do during peak times – OpenRouter reroutes you to the next cheapest provider with zero outage experienced by you. Maybe down the road when these providers get much better uptime this will change, but for now I think I’ll be happy to pay OpenRouter their fee for not suddenly being put on pause mid-flow.
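For reference, driving Q3CN through OpenRouter from Python looks something like the sketch below, using the standard openai package. The model slug and the provider-routing keys are assumptions on my part – consult OpenRouter’s documentation for the exact current names.

```python
# A sketch of calling Q3CN via OpenRouter's OpenAI-compatible endpoint
# (pip install openai). The model slug and the provider-routing keys are
# assumptions; check OpenRouter's documentation for the current names.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",                      # your OpenRouter API key
)

response = client.chat.completions.create(
    model="qwen/qwen3-coder-next",            # assumed model slug
    messages=[{"role": "user", "content": "Summarise what this function does: ..."}],
    extra_body={
        # route cheapest-first, with automatic failover, banning flaky providers
        "provider": {"sort": "price", "ignore": ["chutes"]},
    },
)
print(response.choices[0].message.content)
```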

To rent Q3CN today, these are the four cheapest providers I could find online:

| Model | Supplier | Max context | Input US$ per 1M tokens | Cached input US$ per 1M tokens | Output US$ per 1M tokens | Estimated US$ per day (10M tokens) |
|---|---|---|---|---|---|---|
| ~~qwen3-coder-next~~ | ~~Openrouter (chutes)~~ | ~~262k~~ | ~~0.12~~ | ~~0.06~~ | ~~0.75~~ | ~~0.6786~~ |
| qwen3-coder-next | Openrouter (parasail) | 262k | 0.15 | N/A | 0.8 | 1.552 |
| qwen3-coder-next | Openrouter (ionstream) | 262k | 0.15 | N/A | 0.8 | 1.552 |
| qwen3-coder-next | NanoGPT | 262k | 0.15 | N/A | 1.5 | 1.608 |

The chutes provider entry is struck out because they’re the provider who kept hanging the session or corrupting the context. They’re basically useless, so I list them for information only. I’ve had good experiences with Parasail and Ionstream; each has dropped out on occasion, but OpenRouter routed to the other so my work was uninterrupted. NanoGPT is a standalone provider not listed on OpenRouter; there are other standalone providers a LOT more expensive for Q3CN rental than those listed here, but they were so much more expensive they’re not really worth listing. In any case, the estimate is about US$1.50 per day assuming a use of ten million tokens – which, given that I easily chewed through seven million tokens in five hours, may well be an underestimate.
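For transparency, the ‘estimated US$ per day’ column is simply my earlier 3.9% novel / 95.3% cached / 0.8% output token split applied to each provider’s prices:

```python
# Reproduces the 'estimated US$ per day' column: ten million tokens per day,
# split 3.9% novel input / 95.3% cached input / 0.8% output, as measured earlier.
def usd_per_day(input_usd, cached_usd, output_usd, day_tokens=10_000_000):
    if cached_usd is None:        # provider offers no cache discount,
        cached_usd = input_usd    # so all input is billed at the full rate
    millions = day_tokens / 1_000_000
    return millions * (0.039 * input_usd + 0.953 * cached_usd + 0.008 * output_usd)

print(f"{usd_per_day(0.12, 0.06, 0.75):.4f}")  # chutes   -> 0.6786
print(f"{usd_per_day(0.15, None, 0.80):.4f}")  # parasail -> 1.5520
print(f"{usd_per_day(0.15, None, 1.50):.4f}")  # NanoGPT  -> 1.6080
```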

OpenRouter can supply detailed logs on request, so from those I calculated that whatever hardware Parasail is running has this performance:

  • Parse (uncached): 17,450 toks/sec
  • Parse (cached): 31,204 toks/sec
  • Generation (uncached input): 56 toks/sec
  • Generation (cached input): 81 toks/sec

… which smells to me like nVidia A100 cards, which I suppose makes sense as they’re older and therefore cost depreciated. In any case, they are more than fast enough: the agentic coding AI snaps along faster than I can read its log of actions – half the speed would also be more than plenty. I should remember that for later when buying GPU hardware.

Speaking of expensive … would you like to know how much the ‘big boy’ AI agentic coding services cost for comparison?

| Model | Supplier | Max context | Input US$ per 1M tokens | Cached input US$ per 1M tokens | Output US$ per 1M tokens | Estimated US$ per day (10M tokens) | Multiple of rented Q3CN cost above |
|---|---|---|---|---|---|---|---|
| qwen3-coder-plus | Openrouter (alibaba) | 1000k (though gets noticeably forgetful after 300k) | 1.17 | 0.13 | 5.85 | 2.1632 | 1.4x |
| gemini3.1-pro | Google | 200k | 2 | 0.2 | 12 | 3.646 | 2.3x |
| gpt-5.3-codex | OpenAI | 400k | 1.75 | 0.25 | 14 | 4.185 | 2.7x |
| claude-sonnet-4.6 | Anthropic | 200k | 3 | 0.3 | 15 | 5.229 | 3.4x |
| claude-opus-4.6 | Anthropic | 200k | 5 | 0.5 | 25 | 8.715 | 5.6x |

The cheapest frontier coding model is Alibaba’s Qwen via OpenRouter (where it is heavily discounted for some reason), followed by Google’s Gemini3.1 Pro and OpenAI’s GPT5.3 Codex, with a slightly larger price gap to Anthropic’s Claude Sonnet and Opus. Qwen Coder Plus is 1.4x the cost of rented Q3CN, which is a useful data point; Claude’s most capable model is 5.6x the cost for my usage patterns, which, if I’m honest, is less than I had expected.

Few devs pay for frontier models by the token; most have monthly subscriptions instead. I seem to consume 100 - 120 requests per hour, so that’s 500 - 600 requests and seven million tokens per five hours. That certainly needs the highest possible US$200 per month subscription: that buys you 800 requests per five hours, but there is also a weekly usage limit of 15 - 30 hours for their Opus model. If you want more, Anthropic want you to pay by per-token billing instead. To be honest, at an estimated US$8.72 per day for my usage pattern, paying by the token for their highest end model over an average 22 day working month would be US$191.84 per month, which is cheaper than their US$200 monthly subscription and has no usage limits – another useful data point. I read a lot online of people complaining about the usage limits built into their US$200 per month Claude subscriptions, yet for my AI use patterns the per-token billing would always be cheaper than the subscription. I guess a lot of people have Claude write a lot more output than I would have it do?
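The arithmetic behind that, spelled out for my own future reference:

```python
# Subscription vs per-token billing for Claude Opus at my usage pattern,
# using the per-day estimate from the table above.
opus_usd_per_day = 8.72        # per-token cost for ~10M tokens per day
working_days_per_month = 22

per_token_month = opus_usd_per_day * working_days_per_month
print(f"per-token: US${per_token_month:.2f}/month vs subscription: US$200/month")
# per-token: US$191.84/month vs subscription: US$200/month
```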

I’ll come back to that in the next section – I’m digressing, as this is a Q3CN focused diary entry. The point is that renting Q3CN would probably cost US$34 per month, which is under US$400 per year, and that’s if you’re using it full time. Use it only sporadically like I do as I’m unemployed, and spending €1,000 on your own hardware to run it looks like lunacy (unless that same hardware can also play the upcoming Grand Theft Auto VI at max graphics settings, in which case it becomes more of a ‘bird in the hand is worth two in the bush’ type of cost-benefit analysis).

Conclusions

I think I’m pretty much decided: I shall use OpenRouter for my Q3CN implementation until the AI investment bubble pops, at which point I can affordably pick up ideally a new powerful GPU also able to run GTA VI well, or else firesaled legacy AI hardware on eBay for cheap. But the well under five hundred euro sort of cheap – there’s no rational point dropping more than €500 on new hardware given the rental costs, as I’m better off renting until used component prices get under €500.

OpenRouter makes it super simple to flip over to Claude Opus for analysis, architecture and plan writing, and then flip back to Q3CN for implementation. I’m not opposed to paying tens of euro cents for analysis and planning if it reveals things I would likely have missed – especially as it’ll write all that out into documentation for me, which I can then manually review and strip out the wrong parts. I simply view that as good engineering: I welcome all good quality feedback, from any source.

I think this reveals what kind of AI using coder I will be; there appear to be two main categories:

  1. Devs who don’t like writing code much, so get AI to write as much of the code as possible, so they can focus on solving problems ASAP. The AI will therefore output lots of tokens, as it writes all the code.
  2. Devs who feel the whole point of coding is to emit high quality code, and AI isn’t good at that especially starting from a blank sheet, so they’ll always write the bulk of the initial implementation by hand, and then only use AI when appropriate to adjust and refine that codebase. In this category, the AI will mostly read tokens, and output very few as it never edits more than a few lines of existing code at a time.

The first category tends to use an AI focused IDE like Cursor, which is a fork of vscode, whilst the second category tends to use AI extensions like Roo Code installed into vscode – and to be specific about the difference here, Roo Code only appears when you open its tab. Otherwise it’s as if it weren’t installed, which is exactly what you want when you’re doing work you don’t want the AI to do. Whereas in Cursor, by choosing that IDE you’re basically saying ‘there is no work I don’t want AI to do’. In other words, it’s outside-in vs inside-out.

I am probably in category two for most of my open source work which is on reference implementation libraries. These set the standard for everybody else, and they have to be very carefully written and designed. So I like the AI to help, but I’m always going to be writing most of the code by hand most of the time.

However I’m not opposed to category one for some tasks: there are a number of Python scripts I’ve written to implement some part of a processing pipeline where I would be more than happy if the AI did as much of the work as possible, as I just want a solution ASAP and I don’t especially care how we get there. For example, if I needed some Android app to solve some itch or something, chances are very high I’d just vibe code that and call it a day.

I guess this is pretty much what Linus Torvalds said about this stuff: ‘use AI to write the code you don’t care about’. That’s pretty much where I’ve arrived at too, though I do find its analysis of what I’ve written quite insightful sometimes, as it sees with eyes which are not my own.

Anyway, that’s my analysis of agentic AI coding assistants written up! I do apologise for the wall of text, but I wanted to be comprehensive: I’ll almost certainly refer back to this, so I’ve condensed my scattered notes built up over many months into a single, albeit very long and dense, diary entry which will turn up in search in the future should I need it.

Next entry will almost certainly be about making the rented house internet go faster, but that’ll be at the earliest next week. Be happy everybody!

#AI #LLM #agentic #qwen




Click here to see older entries


Contact the webmaster: Niall Douglas @ webmaster2<at symbol>nedprod.com (Last updated: 2019-03-20 20:35:06 +0000 UTC)