I found a second showstopping problem with Microsoft Expression Web,
besides the BOM breaking PHP, and it affects all versions up to v4:
a bug in the formatting engine slowly and randomly corrupts your HTML!
This is a rather serious problem, so I resolved to work around it by
having my preflight Python script validate new content to catch any
errors.
The following Python script catches this problem and is an
enhancement of RemoveBOM v1. It takes your Microsoft
Expression Web directory tree and copies it to another location,
performing the following operations as it goes:
It generates a full sitemap.xml in the document root.
It tests .html files (not .htm; naming a file .htm is an easy way to
exempt it from validation) for the UTF-8 BOM. If present, it removes
the BOM and checks that the remaining data is valid UTF-8.
For .html files it prepends and appends PHP
header-rewriting code which emits headers setting the content type
to UTF-8 if the BOM was present.
For .html files it also sets HTTP Last-Modified to the last
modified time of the PHP-containing HTML file, which ensures that an
HTTP 304 Not Modified response is given by Apache should the web
browser send an If-Modified-Since request (which most do),
thus greatly lowering bandwidth costs and indeed
the server load caused by idiotic spider robots.
For .html files it also uses PHP output buffering to determine a
correct Content-Length header and enables zlib compression should
the source file exceed 64 KB. This adds latency for the compression
and decompression, but halves or quarters the amount of data needing
to be transmitted.
It passes every file that declares itself as XHTML through a
validating XHTML parser and, after completion, opens a list of any
errors found. It uses an HTML5 microdata enabled XHTML
DTD, so you can use HTML5 microdata just fine.
It knows when to not copy files which are unchanged, so it is
fast to run just before you upload your changes.
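As a rough illustration of the sitemap step listed above, here is a minimal Python 3 sketch that walks a document root and builds the text of a sitemap.xml. The function name build_sitemap, the base_url parameter, and the choice to list only .html and .htm pages are assumptions made for this example, not details taken from the actual script.

```python
import os
from urllib.parse import quote
from xml.sax.saxutils import escape


def build_sitemap(doc_root, base_url):
    """Return sitemap.xml text for the pages under doc_root (hypothetical helper)."""
    locs = []
    for dirpath, dirnames, filenames in os.walk(doc_root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            # Assumption: only .html/.htm pages belong in the sitemap.
            if not name.lower().endswith((".html", ".htm")):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), doc_root)
            # Percent-encode the path, then XML-escape the complete URL.
            url = base_url.rstrip("/") + "/" + quote(rel.replace(os.sep, "/"))
            locs.append(escape(url))
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    lines += ["  <url><loc>%s</loc></url>" % loc for loc in locs]
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"
```

Writing the returned string to sitemap.xml in the destination document root would complete the step.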
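The BOM test in the list above can be sketched in a few lines of Python 3. This is only an approximation of the idea under stated assumptions: the helper names is_checked and strip_bom are invented for the example, and the real script may structure the check differently.

```python
import codecs


def is_checked(filename):
    """Only .html files are tested; .htm files are deliberately skipped."""
    return filename.lower().endswith(".html")


def strip_bom(data):
    """Return (payload, had_bom) for the raw bytes of a file.

    If a UTF-8 BOM is present it is removed and the remainder is
    decoded strictly, so invalid UTF-8 raises UnicodeDecodeError.
    """
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        data = data[len(codecs.BOM_UTF8):]
        data.decode("utf-8")  # validation only; the result is discarded
    return data, had_bom
```

The had_bom flag is what a caller would use to decide whether the UTF-8 content type header should be emitted for that page.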
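The Last-Modified behaviour described above is implemented in PHP on the server, but the decision it relies on is easy to model. The following Python 3 sketch is purely illustrative: the function names are hypothetical, and it only shows how a Last-Modified value compares with a client's If-Modified-Since header to choose between a full response and 304 Not Modified.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def last_modified_header(mtime):
    """Format a file modification time (epoch seconds) as an HTTP date."""
    return formatdate(mtime, usegmt=True)


def response_status(mtime, if_modified_since):
    """Return 304 if the client's cached copy is current, else 200."""
    if not if_modified_since:
        return 200
    try:
        when = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: send the full page
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    # HTTP dates have one-second resolution, so truncate the mtime.
    return 304 if int(mtime) <= when.timestamp() else 200
```

A 304 response carries no body, which is where the bandwidth saving comes from.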
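How the script decides that a file is unchanged is not stated here. One plausible approach, shown below as a Python 3 sketch, is to compare modification times between the source and the copy. The name needs_copy and the timestamp rule are assumptions for illustration only.

```python
import os


def needs_copy(src, dst):
    """Return True if dst is missing or older than src.

    Assumption: modification times are compared rather than sizes,
    because processed .html copies differ in size from their sources.
    """
    if not os.path.exists(dst):
        return True
    return os.path.getmtime(src) > os.path.getmtime(dst)
```

A content hash would be more robust than timestamps, at the cost of reading every file on each run.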
You may find this script useful as a base for writing your own. No guarantees or support are given with this code. Enjoy!
Contact the webmaster: Niall Douglas @ webmaster2<at symbol>nedprod.com (Last
updated: 08 July 2012 20:15:48 +0100)