Niall’s virtual diary archives – Thursday 11th August 2011


Thursday 11th August 2011: 6.05pm. Plenty of progress once again in my life since the last update, though still not much of it is yielding tangible results, which is becoming a little disheartening. In May I received news that the two memory allocation academic articles I wrote last year had been rejected, so I fired them onto arXiv; one can find both of them there or via Google Scholar along with my other academic writing. In June I took my summer exams for my MRes degree in London, and while I haven't got my grades back yet, I would assume that I received a C grade, much as with all the coursework I have undertaken to date (I am unusually consistent in this MRes course - in Hull, St. Andrews and U.C.C. I had a huge variance in grades, ranging from bare passes up to firsts. Not so with this MRes: it's a C grade every time!). If you're interested in reading the coursework I submitted, it is - as always - here with all the other coursework I have ever written.

Weirdly, considering how I was moaning last entry about not being valued by tech recruiters in the middle of another IT bubble, I was separately approached by both Google and Amazon to join teams in their main office locations in Mountain View and Seattle respectively. In order to find out whether they wanted to hire me specifically or just wanted a body to fill a slot, I opened the negotiations with Google by asking for their "20 percent time" to be written cast-iron into my contract, figuring that if every engineer gets it anyway it wouldn't be much to ask. It turns out that Google doesn't give its recruiters any scope to negotiate anything other than pay, and much worse, there is no clear line of management authority onto which recruiters can refer contractual negotiations. The poor recruiter was basically left dangling by her line manager, much to her evident frustration. Highly unimpressive.

Google is a bit different from most IT companies in having roomfuls of professional recruiters find talent and then match them to departments according to a generic needs analysis. This contrasts with the other approach, where team leaders specifically find talent to add to their personal teams. The former has advantages: it prevents team leaders from being distracted by recruitment, and it stops leaders from hiring their friends and (theoretically) creating political factions which distract from organisational goals. The latter has advantages too. A team leader knows specifically what kinds of person (rather than vacant roles) can be added to their team where an HR bod simply cannot. It allows the leader to form a personal relationship with the candidate, which makes getting actual talent past HR (which is remarkably good at filtering out extremes, leaving just the average to pass through) much easier. And of course a personal relationship enables much better team culture, work ethos and morale to be maintained over time. That last point, in particular, will keep an employee from being headhunted even when the competition are throwing money at team members (up to 30% over their existing salary according to surveys - after that even the most loyal team member will tend to get fidgety).

In short, I'd reckon that for most cases in a knowledge industry, having team leaders do their own recruitment is on balance superior for long-term organisational success. It does come with much added pressure on middle-upper and upper management to contain empire building among the ranks. But I do understand where Google is coming from, even if in my opinion on average it will wreak havoc with employee morale and retention over time - after all, they designed this system having watched the latter system cause IBM and many other giants to get into big trouble in the early 1990s by failing to rein in factionalisation and politics-playing in their ranks.

Anyway, the Amazon approach fell into the latter category: a specific team leader approached me directly. And they were not only able to negotiate on contract, but willing to do so. Unfortunately, we fell just short of a meeting of minds this time round, so it didn't happen for this year's H1B visa intake. But in short, I was impressed. Impressed enough that I could easily see myself working for them in the near future. It's human relationships that make a knowledge industry work sustainably - Google don't seem to get this. They think it's about the technology and engineering great technology, and that for that you need highly capable individuals. In truth, mediocre technology will sell just fine; what actually matters is building, maintaining and retaining a superior implementation team, none of whom need to be rockstar talent (though that helps). Microsoft, back when it was still not entirely dysfunctional, truly proved the truth of team before technology, and interestingly it was when it began to believe it could do sweeping technology building in the form of WinFS et al. that it became so managerially dysfunctional that progress ceased. To be honest, big technology can't be sustainably achieved outside a team of six rockstars in my opinion. Our culture isn't mature enough to scale higher yet.

That brings me onto the third potential employer I could have had since May. Out of all those PhD applications I put in since Christmas, just one turned into an interview: a University of Wales joint venture with Tinopolis, funded by UK government money, researching business deployment of new e-Learning platforms, i.e. right up my street, and one for which I ought to have been uniquely qualified to the exclusion of any other candidate, given how few people would be qualified in CompSci, Economics, Management and Education. Unfortunately, they autocratically set the interview date twice without consulting the applicants as to its suitability (both dates were bad for me) - already a bad sign, because in academic employment circles that's a strong hint that a role has been preselected for an internal candidate, and when you see a university do that it's a strong hint not to waste your money and time attending the interview. Still, I really liked the role, and I really wanted it despite knowing it was almost certainly a waste of time. So I attended despite the murderous car + ferry + train journey to get there, and after spending €400 of my own money I knew within two minutes of the interview starting that I was wasting my time. I was spending my money as a formality to let them demonstrate they had performed due diligence in finding the "best" candidate. Had they paid for the interview I'd just be annoyed, but when it's my own money they're wasting, I feel angry about it.

A similar thing happened at the interview two weeks ago for the Ignite business incubator programme. This is a local initiative to better support early-stage venture businesses because, due to the recession, Ireland is losing a lot of its talent overseas right now. In that it is highly laudable. However, within thirty seconds of the interview starting I realised that they thought I wasn't a local because I had been educated overseas, and god forbid I had worked in places beyond Dublin and London. Despite having lived here since I was two, I quickly realised that I didn't stand a chance: only one person on the panel was even remotely interested, and only one other bothered to try asking "bad cop" questions, and even then barely. A shame, not least because welfare have cut my dole by 20% because I started my own business (yeah, they took two years to arrive at that decision), and had I entered the incubator programme it would have moved me onto dole plus €20 per week, non-means-tested, which would be a big bump to my income. However, seeing as this waste of my time didn't cost me €400, I'm not complaining. You win some and you lose many.

What else have I been up to? Despite the interview with the University of Wales getting in the way, I managed to tele-present over the internet at the Institute of Education's Summer Doctoral Research Conference, and you can see its slides here. The presentation was about the work I was about to do on Luxubrations Oxydérkeia, my super-secret R&D project, which has made good progress since, so I can now afford to be much less secret about it. Most of the hard stuff is nearly done, the hard stuff being the interface layers with all major web browsers and with Microsoft Word. I've more or less finished the browser plugins - they were painful enough, even though modern web browsers are remarkably standards compliant nowadays, so much so that the per-browser coding was actually quite minimal outside the specific browser plugin support code. The BIG problem is that each browser is extremely slow at some operations at which other browsers are much faster, so for example capturing AJAX-induced web page updates will kill one browser using one technique but will fly on another. I haven't gone nuts on the optimisation here - browsers change too quickly - but it's a very different problem from even two years ago, when writing that Web 2.0 FIXatdl editor, where browser bugs made Web 2.0 programming very painful. Good!

As for the Microsoft Word plugin, capturing change in the browsers was actually far easier, because they expose what has changed. Believe it or not, there is no way of capturing change in Microsoft Word without hooking key and mouse presses, reading the entire document as XML, then running a diff routine over the last XML dump you had - i.e. Word won't tell you exactly what has changed. This absolutely destroys performance for any substantial document, even on a beefy computer. I can get away with it for student-length essays, but I'm unhappy with the solution. I need to think of something more sane, perhaps by limiting the XML dumping to what's currently on the screen, or perhaps I could configure a fake change tracker and watch what it stores. Not ideal, mind you, and it's annoying because obviously Word itself knows what was changed, as it needs to determine what to repaint on the screen. It just doesn't expose that to the outside world (as far as I can tell).
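The dump-and-diff workaround described above can be sketched in a few lines. The real plugin is written in C#/.NET against Word's object model; this is just a minimal Python illustration using the standard library's difflib, with made-up element names and content:

```python
import difflib

def capture_changes(old_xml: str, new_xml: str) -> list:
    """Diff two successive XML dumps of a document and keep only the
    changed lines - necessary because Word exposes no change events."""
    diff = difflib.unified_diff(
        old_xml.splitlines(keepends=True),
        new_xml.splitlines(keepends=True),
        fromfile="before", tofile="after", n=0,
    )
    # Keep added/removed lines; drop the "---"/"+++" file headers.
    return [line for line in diff
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]

# Two hypothetical successive dumps of a tiny document
before = "<w:p><w:t>Hello world</w:t></w:p>\n"
after_ = "<w:p><w:t>Hello there, world</w:t></w:p>\n"
print(capture_changes(before, after_))
```

As the entry says, this is O(document size) per keystroke, which is exactly why it only scales to student-length essays.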

Anyway, all changes get captured as XML diffs and fired via a JSON-RPC RESTful HTTP interface to a Python program, which then pushes them into a local NoSQL database which is XML native. At some near-future point it will construct graphs linking changes into an audit trail. I had to substantially improve a JSON-RPC library for Javascript which I found on the internet to get this to work at a satisfactory speed, and I also had to do some .NET 4.0 surgery on Jayrock (the JSON library for .NET) to add dynamic RPC method invocation, as amazingly Jayrock wants you to set up and tear down invocations as if one were programming in C rather than in the dynamic object environment that .NET is. None of this, obviously, is work on new Oxyderkeia features, hence my being rather behind schedule, but hey, this always happens in any cutting-edge software development.
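The wire format described above - each XML diff wrapped in a JSON-RPC envelope and posted to the local Python process - looks roughly like this sketch. The method and parameter names here are invented for illustration and are not Oxyderkeia's actual API:

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC requests carry a unique id

def make_rpc_request(method: str, params: dict) -> str:
    """Build a JSON-RPC 2.0 request envelope carrying one change event.
    The body would be POSTed over HTTP to the local Python endpoint."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    })

# Hypothetical change event: one XML diff from the Word plugin
req = make_rpc_request("store_diff", {
    "document": "essay.docx",
    "diff": "<del>Hello world</del><ins>Hello there, world</ins>",
})
decoded = json.loads(req)
print(decoded["method"])
```

On the receiving side, the Python program would unpack `params["diff"]` and push it into the native XML store as-is.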

In fact, I took a major detour from Oxyderkeia last week by spending six days writing and releasing BEurtle, an issue tracking GUI plugin for the TortoiseXXX series of VCS GUI interfaces. It wasn't supposed to be six days - in fact, BEurtle took just three days to write and polish to (in my opinion) a high standard, considering it is the first real program I have ever written in either C# or .NET. No, fully half the time writing BEurtle was spent slamming my head repeatedly against Windows Installer and WiX, a thin sanity wrapper around the mess which is Windows Installer. Windows Installer should never have been released to the public in the state it is in - it's unfinished, quite frankly. It's not even at an alpha release stage, it's so unfinished. I don't dispute that it is capable, nor that it is a valid solution to the historical problem of doing Windows installers right; it's just that it's less than a quarter completed. It also - for some unfathomable reason - has its own (highly inferior) GUI system and its own (highly incapable) scripting system, when as far as I can tell they should have used .NET for the GUI and perhaps a reduced subset of VBScript as the sandboxed scripting language (I'm no fan of VB, but it was already there and easily repurposed). That would have been a vastly superior - and quite frankly, much easier to implement for everybody involved - solution. While you're at it, clone the RPM or APT package system used by Linux. Hell, even Python's package system beats the pants off this mess on Windows.

Anyway, I put myself through that agony because, believe it or not, WiX is the only sane, non-obscenely expensive way of generating MSI files which are in any way more complex than installing a few files into a folder - and while WiX is tough to work with, it's far saner than the alternatives. I figured I'd need to master the technology anyway for Oxyderkeia's release, because I want to deploy Oxyderkeia as a self-deploying, self-updating, delta-driven, web based installer with modular parts using ClickOnce, DDay or preferably WiX ClickThrough, if they ever get around to releasing a working implementation. And for that, on Windows at least, there isn't a massive amount of choice without paying obscene fees - even the venerable NSIS isn't quite up to self-healing delta-driven self-updating, though it's still much superior to Windows Installer in terms of ease of writing against it.
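For the curious, a minimal WiX source file (a .wxs, compiled to an MSI with the candle and light tools) looks something like this. This is a generic sketch against the WiX 3 schema - the names, version and GUID placeholders are illustrative and are not BEurtle's actual installer:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Wix xmlns="http://schemas.microsoft.com/wix/2006/wi">
  <Product Id="*" Name="ExampleApp" Language="1033" Version="0.1.0.0"
           Manufacturer="Example" UpgradeCode="PUT-GUID-HERE">
    <Package InstallerVersion="200" Compressed="yes" />
    <Media Id="1" Cabinet="product.cab" EmbedCab="yes" />
    <Directory Id="TARGETDIR" Name="SourceDir">
      <Directory Id="ProgramFilesFolder">
        <Directory Id="INSTALLDIR" Name="ExampleApp">
          <Component Id="MainBinary" Guid="PUT-GUID-HERE">
            <File Id="MainExe" Source="ExampleApp.exe" KeyPath="yes" />
          </Component>
        </Directory>
      </Directory>
    </Directory>
    <Feature Id="Main" Level="1">
      <ComponentRef Id="MainBinary" />
    </Feature>
  </Product>
</Wix>
```

Even this trivial "install one file" case needs the full Directory/Component/Feature scaffolding, which gives some flavour of why anything more complex becomes a head-slamming exercise.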

So, so far so good. The native XML NoSQL database is quite fun - as a little exercise, I hacked together a Python script which takes this website (a collection of pages written against various technologies and HTML standards from 1998 onwards), sanitises them into XHTML, and stuffs them into the native XML database. You can then execute XQuery operations against them - XQuery is to a native XML database what SQL is to a traditional database. For example:

xquery declare namespace xhtml="http://www.w3.org/1999/xhtml"; collection()//xhtml:div[@class="diaryentry"]

This looks rather like good old XPath, and indeed XPath is a subset of XQuery. Here one asks for all <div> elements with a class attribute of "diaryentry" from all documents, i.e. the same thing as the Atom syndication feed supplied by this site. This returns 412KB of XHTML and some 54 items in about 200ms on a 1.6GHz Intel Atom - hugely slower than a traditional database, and far too slow to backend a website for example, but plenty fast enough for Oxyderkeia, where I think even three seconds would be okay for many operations. Most of Oxyderkeia is asynchronous, mainly for scalability across millions of simultaneous users, so you shouldn't notice your web browsing ever slowing down even if it's pushing megabytes of data around databases in the background - or rather, it's as fast as it can be made, and it can't be improved except by moving less data around. And we won't know what to thin out until we know what isn't important!
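Since XPath is that subset of XQuery, the same selection can be run over an in-memory XHTML document with Python's standard library, no XML database involved. A small sketch with made-up content:

```python
import xml.etree.ElementTree as ET

# XPath needs the XHTML namespace bound to a prefix, just as the
# XQuery's "declare namespace" clause does.
ns = {"xhtml": "http://www.w3.org/1999/xhtml"}

doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="diaryentry">Entry one</div>'
    '<div class="sidebar">Not an entry</div>'
    '<div class="diaryentry">Entry two</div>'
    '</body></html>'
)

# Same predicate as the XQuery: every div whose class is "diaryentry"
entries = doc.findall(".//xhtml:div[@class='diaryentry']", ns)
print([e.text for e in entries])  # -> ['Entry one', 'Entry two']
```

The difference, of course, is that the XML database evaluates this across its whole document collection with indexing, whereas this walks one parsed tree.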

So there we go. Not a bad three months, and a good summer so far. Next entry I almost certainly will talk about M2M clothing, because very kindly offered me a substantial discount on an M2M suit from them after my favourable review of their shirts below. So, till next time, be happy!


Contact the webmaster: Niall Douglas @ webmaster2<at symbol> (Last updated: 2011-08-11 18:05:00 +0000 UTC)