Welcome to Geoffrey Swift's βlog. Please subscribe to the Atom feed.


ITV Player adverts skipped

Watching the adverts on the ITV Player was taking up enough time that it seemed worth blocking them in the web proxy.

I considered various ways of doing this, and found it's quite easy to block any URLs which start with http://sam.itv.com/XTSERVER/

ITV Player requests VAST XML files from there. The XML files describe each advert, and are created by the "Atlas AdManager" server software.

Since I started blocking these files, which contain the URLs for the advert videos, I don't see any adverts when watching this station online either.

Demand Five Adverts skipped

In an old post I detailed how to avoid advert videos on the 4oD service. I have again made use of the redirect_program feature of the Squid web proxy, but this time it is the adverts from Channel Five's Demand Five service that are being removed.

I set about analysing how Demand Five works, by using the Fiddler web debugger from Microsoft. It became apparent that similar to the Channel Four service, there was a play list of (Flash) video files in an XML format.

So I attempted using the exact same technique as for 4oD. I created a "man in the middle" to filter out the adverts from the Demand Five play list, using my own PHP and XSLT code. Unfortunately the doctored play list I generated was rejected by the Flash player as invalid, presumably because required elements (adverts) were lacking. It's not clear whether this is an intentional feature in Demand Five or not! Either way a different technique was needed.

I have avoided seeing banner adverts for some time, by having Squid serve up a 1 pixel by 1 pixel transparent GIF instead. Since modifying the XML play list on the fly didn't work out, I decided to get Squid to serve up a video of 0 seconds duration in place of an advertisement.

I searched the web and found a suitable dummy Flash video of 0 seconds length to use. So I just needed to let Squid know I wanted the dummy video instead of the advert videos. To do this I added the following line to my existing Perl script squid-redirector.pl:

s@http://[^ ]*\.akamai\.net/[^ ]*five\.tv[^ ]*\.flv@http://www.trollied.org/~blimey/fake/dummy.flv@;
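
To illustrate what that substitution matches, here is a rough JavaScript equivalent of the same rule. Only the pattern and the dummy video address come from the Perl line above; the function name, the sample URLs and the shape of the request line (URL first, then space separated fields) are assumptions made for this sketch.

```javascript
// Rough JavaScript equivalent of the Perl substitution above (illustrative only).
const DUMMY_VIDEO = 'http://www.trollied.org/~blimey/fake/dummy.flv';
const ADVERT_PATTERN = /http:\/\/[^ ]*\.akamai\.net\/[^ ]*five\.tv[^ ]*\.flv/;

// Assumption: each request line starts with the URL, followed by other
// space separated fields. Only the matching URL part is rewritten.
function redirect(requestLine) {
  return requestLine.replace(ADVERT_PATTERN, DUMMY_VIDEO);
}

// Hypothetical request lines, invented for this example.
console.log(redirect('http://a123.g.akamai.net/7/123/five.tv/adverts/soap.flv 10.0.0.2/- - GET'));
console.log(redirect('http://www.example.com/video.flv 10.0.0.2/- - GET'));
```

The first line comes back pointing at the dummy video, while the second is passed through untouched because it does not match the pattern.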

Now I can enjoy Neighbours without having to wait for the adverts to finish. This is particularly useful when the playback gets interrupted and you have to start from the beginning.

Sudoku Solver source code

I've made available the Delphi/Kylix source code, for the server part of my online Sudoku Solver.

The program can find solutions by brute force, and in many cases using logical deduction alone. The logical solver is considered adequate for its purpose, which is to give hints and validate against mistakes.

sodit.dpr
Main project file
SudokuSets.pas
Optimised library for set based operations, such as iterating permutations.
SudokuDeclarations.pas
Constants and type declarations.

The source code for "Sudoku Solver" is provided without warranty, and for non-commercial purposes only. "Sudoku solver" is copyright © 2007 - 2008 Geoffrey Swift. All rights are reserved.

Download: http://www.trollied.org/~blimey/sodit.zip (8 KB)

strace claims select() sets errno = ERESTARTNOHAND

The strace utility is well recommended for investigating all manner of Linux problems. But while using strace to debug a crash in the Linux version of my Talker application, I came across something which proved to be a red herring. It appeared that select() was failing and setting errno to ERESTARTNOHAND.

Setting aside what that would have meant, the actual problem was that a segmentation fault had happened in another thread. select() was really setting errno = EINTR, due to the corresponding SIGSEGV signal.

The error outside of the main thread was only apparent when using strace -f. The "follow forks" option is necessary, since on Linux pthreads creates each thread as a new Light Weight Process (LWP), which strace does not trace by default.

I found plenty of posts on the web detailing similar issues, although no solutions were offered. Hopefully this information will be helpful to someone else.

Converting XSPF to M3U and PLS with XSLT

I'm now syndicating another internet radio station playlist; the new one is from http://www.dnbstations.co.uk/. This didn't take long to implement, as I was able to reuse existing code which converts from XSPF to both M3U and PLS. I did write some new code to convert a Wimpy play list to XSPF, which was pretty easy as the formats are very similar.

The XSLT code to convert a Wimpy play list to XSPF, and from XSPF to M3U and XSPF to PLS is hereby released into the public domain.
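
As a rough illustration of what the XSPF to M3U and XSPF to PLS stylesheets have to produce, here is a JavaScript sketch. It is not the released XSLT: it works on an array of stream URLs that are assumed to have been read from the XSPF location elements already, and the function names and example URLs are invented for the sketch.

```javascript
// Sketch only: the real conversion is done in XSLT. The input is assumed to be
// the list of <location> values already extracted from an XSPF play list.

// A basic M3U play list is just one URL per line.
function toM3u(locations) {
  return locations.map((url) => url + '\n').join('');
}

// A PLS play list is an INI style file with numbered FileN entries.
function toPls(locations) {
  const lines = ['[playlist]', 'NumberOfEntries=' + locations.length];
  locations.forEach((url, i) => lines.push('File' + (i + 1) + '=' + url));
  lines.push('Version=2');
  return lines.join('\n') + '\n';
}

// Hypothetical stream URLs.
const streams = ['http://example.com:8000/', 'http://example.org:8010/live'];
console.log(toM3u(streams));
console.log(toPls(streams));
```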

Not only does www.dnbstations.co.uk provide the Wimpy playlist, they act as a proxy for all the streams. This seems to be wasteful of their bandwidth, when they could easily let listeners access the stream directly. I was quite surprised when I noticed this, thinking that the stream URLs were just using HTTP redirects. Maybe they'll redo this in the future, depending on how successful they are!

Bypass JavaScript with Squid

I mentioned previously how I used the redirect_program feature in the Squid web proxy to filter out adverts from Channel 4's 4oD service. I did that by intercepting and modifying ASX files. In this article I explain how similarly one can deactivate unwanted JavaScript code.

One motivation for intercepting JavaScript is to avoid advertising; I'm also bypassing Google Analytics. For an entirely self contained script, your own blank script can be substituted. This works well, for example, with Google's advertising script show_ads.js.

To use Google Analytics, web site authors must write their own code to invoke the functions Google supply in their JavaScript files. So if Google's code is simply replaced by a blank file, JavaScript errors will result as required functions have not been defined. I've therefore written my own versions of their scripts which mimic the necessary code entry points, but don't actually do anything.

A replacement for the original Google file urchin.js was trivial:

function urchinTracker(){}

but the recently updated Google Analytics script ga.js was a bit more complicated:

var _gat = { _getTracker:function(s) { return { _initData:function(){}, _trackPageview:function(){} } } }
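
As a quick check that the ga.js stand-in exposes the entry points a page expects, the sketch below runs the kind of tracking snippet sites typically embed against it. The account ID is a placeholder, and the calling snippet is reproduced from memory rather than copied from Google's documentation, so treat it as an assumption.

```javascript
// The stub from above: same entry points as ga.js, but nothing is ever sent.
var _gat = {
  _getTracker: function (account) {
    return {
      _initData: function () {},
      _trackPageview: function () {}
    };
  }
};

// Typical page code (placeholder account ID).
var pageTracker = _gat._getTracker('UA-0000000-0');
pageTracker._initData();
pageTracker._trackPageview();
console.log('tracking calls completed without errors');
```

Any page that calls further methods on the tracker would need matching empty functions added to the object returned by _getTracker.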

The scripts mentioned above are very popular on many websites. My replacement versions are obviously going to help speed up load times on several web pages, as the web browser has less work to do. Another bonus is that I avoid being included in my own Analytics data, and I get a better idea of who else is looking at my site!

XHTML DTD in XML catalog

I read an article, "W3C Gets Excessive DTD Traffic", on Slashdot last month. This struck a chord since I use XSLT to generate XHTML, and had noticed that the XHTML DTD is downloaded each time it is referenced. This both wastes bandwidth and slows down the XSLT considerably.

To remedy this problem, I downloaded the DTD files to my Linux box and added some XML catalogue entries. This ensures local copies of these files are used instead of repeatedly downloading them from w3.org.

Here are the shell commands I used on my Slackware 12 box to set things up:

mkdir /usr/share/xml/xhtml1
cd /usr/share/xml/xhtml1
wget http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd \
     http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd \
     http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd \
     http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent \
     http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent \
     http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
xmlcatalog --noout --add rewriteURI http://www.w3.org/TR/xhtml1/DTD/ \
     file:///usr/share/xml/xhtml1/ \
     /etc/xml/catalog

To confirm this was working as expected I ran xsltproc under strace. I did however discover that what I'd done had no effect on libxslt in PHP. I looked through the source code for PHP, and found no code to initialise libxml to use XML catalogues. This would be a nice feature; hopefully it will be included in a future release of PHP.

Tinyfugue 4 patched for GCC 4

The CHANGES file in Tinyfugue 5 explains that the user interface has changed: New screen handling. See "/help windows". What this means is that when changing between worlds, the whole screen is replaced to show only output from the selected world. Previously messages from each world would all appear in succession on screen.

As I multi-spod and hold conversations across several talkers at the same time, this new interface is not my personal preference. I would rather be using the time honoured classic, which is Tinyfugue 4. However I discovered recently that this older version no longer compiles on modern Linux distributions.

I had a look into this problem, and fixed the source code by retro fitting the relevant fixes already done in Tinyfugue 5. The changes were minimal, so it didn't take long to get this compiling for GCC 4 on Linux. I made a unified diff of these changes. This has come in handy twice for me now, so I hope this proves useful to someone else.

--- history.c.orig 2008-03-04 00:49:30.000000000 +0000
+++ history.c 2008-03-04 00:47:01.000000000 +0000
@@ -66,7 +66,9 @@
 static struct History input[1];
 static int wnmatch = 4, wnlines = 5, wdmatch = 2, wdlines = 5;

-struct History globalhist[1], localhist[1];
+struct History globalhist_buf, localhist_buf;
+struct History * const globalhist = &globalhist_buf;
+struct History * const localhist = &localhist_buf;
 int log_count = 0;
 int norecord = 0; /* supress history (but not log) recording */
 int nolog = 0; /* supress log (but not history) recording */
--- history.h.orig 2008-03-04 00:49:37.000000000 +0000
+++ history.h 2008-03-04 00:46:51.000000000 +0000
@@ -13,7 +13,7 @@
 # ifndef NO_HISTORY

 extern void NDECL(init_histories);
-extern struct History *FDECL(init_history,(struct History *hist, int maxsize));
+extern struct History *init_history(struct History *hist, int maxsize);
 extern void FDECL(free_history,(struct History *hist));
 extern void FDECL(recordline,(struct History *hist, Aline *aline));
 extern void FDECL(record_input,(CONST char *line, struct timeval *tv));
@@ -31,7 +31,7 @@
 #define record_global(aline) recordline(globalhist, (aline))
 #define record_local(aline) recordline(localhist, (aline))

-extern struct History globalhist[], localhist[];
+extern struct History * const globalhist, * const localhist;
 extern int log_count, norecord, nolog;

 # else /* NO_HISTORY */

Download: http://www.trollied.org/~blimey/tf4.diff (1 KB)

Streaming Radio Formats

Implementing my radio player took more experimentation and research than actual coding. As with most software, things turn out more complicated than first anticipated!

To tune in to an Internet radio station, normally you would click on a link on the station's website. From this link your browser will download a play list file, which in turn contains the URL that your favourite media player can use to access the station's live stream. My goal was to make this accessible in a more convenient manner, by allowing access to several radio stations in one place.

So I decided to create my own play list file that contained all the relevant URLs for my favourite stations. I wrote a PHP script that would generate this play list by downloading each station's play list and extracting the relevant stream URL.

Play list file formats

There are three main formats used for these play list files, which are ASX (Windows Media Player), M3U and PLS (WinAmp).

Easiest to parse was M3U, which is just an ordinary text file that has the URL as a line of text. As PLS is an INI file, parsing it with PHP's built in INI functions seemed easy. I quickly found that the PHP functions were unreliable, so I wrote a kludge to get the URL myself using a simple regular expression.
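
A minimal sketch of that kind of extraction follows, written in JavaScript rather than the PHP actually used. It assumes the first URL in the file is the one wanted; the function names and sample play lists are made up for illustration.

```javascript
// Illustrative only: the original was PHP. Returns the first stream URL found.

// M3U: first non-blank line that is not a '#' comment or directive.
function urlFromM3u(text) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  return lines.find((line) => line !== '' && !line.startsWith('#')) || null;
}

// PLS: skip INI parsing entirely and grab the first FileN=... entry.
function urlFromPls(text) {
  const match = /^File\d+=(.+)$/im.exec(text);
  return match ? match[1].trim() : null;
}

// Hypothetical play lists.
const m3u = '#EXTM3U\n#EXTINF:-1,Example Radio\nhttp://example.com:8000/\n';
const pls = '[playlist]\nNumberOfEntries=1\nFile1=http://example.com:8000/\nVersion=2\n';
console.log(urlFromM3u(m3u));
console.log(urlFromPls(pls));
```

Using a regular expression for PLS sidesteps the INI parser altogether, which is the point of the kludge.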

In a previous article I discussed problems with ASX files not being XML, so I avoided using this format unless the station did not also offer M3U or PLS. By sheer luck the few stations I am accessing via ASX are well formed XML. Unfortunately the structure of these ASX files varies, so it took a few iterations of the XSLT code to extract the stream URL reliably.

The file format I ultimately create my play list in is XSPF, as this is a well defined format based upon XML. From the master XSPF play list, I can readily generate ASX, M3U and PLS by using XSLT. Having the play lists available in all of these formats means good support for tuning in with various different media players. I would personally recommend using the XSPF play list in VLC Media Player.

Streaming from a web page

I felt the stations should ideally be available directly in a browser, without requiring an external application to be installed. After trying a few different options I opted to use the "XSPF Web Music Player" written in Flash. Getting Flash working is an exercise in itself, made easy using the UFO JavaScript library. This setup offers good compatibility with typical browser setups.

Fixing the stream URL

The biggest problem with this was presented by a strange behaviour of Shoutcast servers. When you access the stream URL for a Shoutcast server using a web browser, you are served up a web page about the stream. This HTML content is provided instead of the MP3 stream, on the basis of the User-Agent header of the HTTP request.

Generally browser plugins have their HTTP requests made for them by the browser, so HTML content is delivered to a Flash movie or a QuickTime plugin regardless. The exception to this is the Windows Media Player plugin, which bypasses the browser completely, disregarding proxy settings etc.

Shoutcast servers do provide a special URL that allows you to ensure the MP3 stream is returned regardless of the User-Agent. You just append ;stream.nsv so that, for example, http://example.com:1234/ becomes http://example.com:1234/;stream.nsv.

Appending this string confuses any servers that aren't using Shoutcast though. So it became necessary to probe each server, to adjust the URL where required. The probe impersonates Firefox; if HTML comes back, the server is Shoutcast.
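
The decision made by the probe can be sketched as below, in JavaScript for illustration. The sketch leaves out the network request itself: it takes the Content-Type that came back from a request sent with a browser-like User-Agent. The function name, and the choice to detect HTML by looking at the Content-Type header, are assumptions of this sketch rather than a description of the actual probe.

```javascript
// Decide which URL to publish, given the Content-Type a server returned when
// probed with a browser-like User-Agent. Assumption: HTML means Shoutcast.
function fixStreamUrl(url, probedContentType) {
  const isHtml = /^text\/html\b/i.test(probedContentType || '');
  if (!isHtml) {
    return url; // Not Shoutcast: leave the URL alone.
  }
  // Shoutcast served its status page, so ask for the stream explicitly.
  // Assumes the URL ends with '/', as in the example above.
  return url + ';stream.nsv';
}

// Hypothetical servers.
console.log(fixStreamUrl('http://example.com:1234/', 'text/html'));
console.log(fixStreamUrl('http://example.org:8000/live', 'audio/mpeg'));
```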

Memory usage

One significant problem appears to be that of memory consumption while listening to a stream. If you're listening for a few hours or so, expect your memory usage to be in the 100s of megabytes. This is enough to cause problems in a typical PC setup. A workaround for this is to stop and restart the stream occasionally.

The reason behind this is that the MP3 stream is treated as if a very long MP3 file is being downloaded. What you've heard so far remains in memory even though, with the exception of Real Player, you can't seek backwards. This is a well known problem that affects nearly all media players and browser/plugin combinations except VLC Media Player.

Recommended use

If you use the M3U or PLS play lists, you may well just see IP addresses and port numbers rather than meaningful station names. This is a limitation of the file formats used, and isn't fixable. Using the XSPF play list is recommended as it has the station name and website link. Also you can use the XSPF play list in VLC Media Player, without having to worry about your memory usage.

ASX and XML are incompatible

I have been working with various playlist file formats as part of my internet radio project. This has involved creating XSPF playlists from XML sources and using XSLT to convert from XSPF to the alternative PLS and M3U formats.

According to the Simple ASX article on MSDN: an ASX file is an eXtensible Markup Language (XML)-based text file which references a Uniform Resource Locator (URL) for a piece of media content. Having read this I felt that ASX files ought to fit neatly into my XML and XSLT based architecture. Only when implementing this did I discover that ASX actually has quite limited compatibility with XML.

The XML validation tool xmllint rejected a couple of ASX files, since they included the copyright symbol © encoded using ISO-8859-1. It seemed easy to solve this problem, by explicitly specifying the encoding in the XML declaration e.g. <?xml version="1.0" encoding="iso-8859-1"?>. Although the file could then be validated as XML using xmllint, Windows Media Player would simply not load this modified ASX file. It was then apparent that ASX is not compatible with XML.

Some URLs referenced in ASX files may well have query strings, which are of course delimited with ampersands. After seeing a few of these I noticed that none of the ampersand characters were escaped as they should be in XML. For a well formed XML document, a literal ampersand character should be represented using an escape sequence, e.g. &amp;.

I edited an ASX file to make it valid XML, by ensuring the appropriate escape sequences were used. The play list came up in Windows Media Player, but it did not play as expected. When looking at the properties of the relevant track in Windows Media Player, the text &amp; could be clearly seen as part of the URL. Although I had made an ASX file into a valid XML document, once again this was futile since the ASX file was not interpreted as intended.

To summarise, here are some quirks of ASX files that make them incompatible with XML:

  • The default encoding for ASX appears to be ISO-8859-1.
  • An XML declaration cannot be included in an ASX file.
  • Ampersand characters in ASX files are parsed as literal text and should not be escaped as &amp;.

To work around these problems, I wrote a PHP script that downloads an ASX file and converts it into an XML format. To achieve this it simply escapes all the ampersands, and serves this modified content with an appropriate HTTP header for ISO-8859-1 encoding: Content-Type: text/xml; charset=iso-8859-1. Aside from XML incompatibilities many ASX files are tag soup. My PHP script therefore includes kludges to address problems specific to the ASX files I'm working with.
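
The core of that conversion is the ampersand escaping. A JavaScript sketch of the idea follows; the real script is PHP and includes further kludges not shown here. It escapes every ampersand, as described above, so it assumes the ASX source never contains an ampersand that is already escaped. The function name and the sample entry are invented.

```javascript
// Turn raw ASX ampersands into XML safe ones. Every '&' is escaped, on the
// assumption that ASX files never use XML escape sequences themselves.
function escapeAsxAmpersands(asxText) {
  return asxText.replace(/&/g, '&amp;');
}

// Hypothetical ASX fragment with a query string in the URL.
const asx = '<ASX VERSION="3.0"><ENTRY><REF HREF="http://example.com/stream?id=1&fmt=wma"/></ENTRY></ASX>';
console.log(escapeAsxAmpersands(asx));
```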

This wasn't quite the clean solution I originally had in mind, but I am now able to import from multiple ASX files across the web. It was less trouble working around PHP's woeful ini file parsing routines to get the PLS file format supported, but that's another story!

Atom to RSS 2.0 feed conversion

I am pleased to say that there is now a valid RSS 2.0 feed available for my blog. Previously there was just the Atom 1.0 feed, since this is the format I am using for storage.

I couldn't find any available Atom to RSS conversion utilities, except for atom2rss.xsl from http://atom.geekhood.net/. I had tried using this a while ago, but found it would not produce valid RSS and even corrupted parts of my text through inappropriate handling of escape sequences.

Feeling a bit more confident in my XSLT, I set about addressing these issues. The feed validation service from w3.org now tells me that "this is a valid RSS feed", and so I consider this to be good enough for my purposes.

It is slightly disappointing that RSS limits you to a single enclosure (podcast) for each item in a channel, whereas Atom allows for multiple enclosures. Because of this it is not recommended to use the RSS feed since you will actually lose out on some of the mp3 content.
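
To make the loss concrete, here is a small JavaScript sketch of the choice a converter faces when an Atom entry has several enclosures. Keeping only the first one is an assumption made for this example and is not necessarily what atom2rss.xsl does. The link objects mimic the attributes of Atom link elements, and the function name and URLs are placeholders.

```javascript
// An Atom entry may carry several links with rel="enclosure", whereas the
// RSS item gets just one. This sketch keeps the first and reports the rest.
function pickRssEnclosure(atomLinks) {
  const enclosures = atomLinks.filter((link) => link.rel === 'enclosure');
  return {
    kept: enclosures.length > 0 ? enclosures[0] : null,
    dropped: enclosures.slice(1)
  };
}

// Hypothetical links from one Atom entry.
const links = [
  { rel: 'alternate', href: 'http://example.com/post.html', type: 'text/html' },
  { rel: 'enclosure', href: 'http://example.com/part1.mp3', type: 'audio/mpeg' },
  { rel: 'enclosure', href: 'http://example.com/part2.mp3', type: 'audio/mpeg' }
];
const result = pickRssEnclosure(links);
console.log(result.kept.href);
console.log(result.dropped.length);
```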

My Atom feed has brought to light some cases that cause a problem with the XSLT code, and there are probably other valid Atom feeds that would not be translated to a valid RSS feed. Nonetheless I feel this is an improvement and am releasing this updated version into the public domain.

I've emailed my fixes over to porneL who originally wrote atom2rss.xsl, and he's kindly updated his site to include my changes.

Radio Tuner in IE7 - XSLT meets conditional comments

Getting Internet Explorer 7 to work with the HTML version of my internet radio tuner was no walk in the park. Read further to learn how to avoid the same pitfalls.

The first problem was that using <object type="application/x-mplayer2" data= … > yields the error "Internet Explorer has blocked this site from using an ActiveX control in an unsafe manner. As a result, this page may not display correctly." Puzzlingly I found that there is no such warning when using <embed type="application/x-mplayer2" src= … > instead.

Using <embed> rather than <object> just for the sake of Internet Explorer didn't make me feel entirely happy, as this proprietary tag is not valid HTML. To solve this problem and retain the validity of my HTML, I chose to use the even more proprietary trick specific to Internet Explorer - conditional comments. This means that the IE specific <embed> is effectively commented out, and the <object> tag is ignored by IE. See some example HTML below:

<!--[if IE]>
<embed type="application/x-mplayer2" src="" width="290" height="64"></embed>
<![endif]-->
<!--[if !IE]><!-->
<object type="application/x-mplayer2" data="" width="290" height="64"></object>
<!--<![endif]-->

This was slightly tricky to get right in the XSLT. I could have tried using xsl:comment and CDATA sections, but xsl:text and disable-output-escaping seemed like less trouble:

<xsl:text disable-output-escaping="yes">&lt;!--[if IE]&gt;</xsl:text>
<embed type="application/x-mplayer2" src="{@href}" width="290" height="64"/>
<xsl:text disable-output-escaping="yes">&lt;![endif]--&gt;</xsl:text>
<xsl:text disable-output-escaping="yes">&lt;!--[if !IE]&gt;&lt;!--&gt;</xsl:text>
<object type="application/x-mplayer2" data="{@href}" width="290" height="64"/>
<xsl:text disable-output-escaping="yes">&lt;!--&lt;![endif]--&gt;</xsl:text>

Problem number two relates to the "fix" Microsoft have made to avoid infringement of Eolas Technologies' patent "distributed hypermedia method for automatically invoking external application providing interaction and display of embedded objects within a hypermedia document." This means that Internet Explorer displays a message saying "Click to activate and use the control". Surprisingly the music starts playing automatically anyway, but this is still a minor annoyance. The problem was solved by calling this JavaScript function via document.onload():

function eolasWorkaround() {
  var i;
  if ('Microsoft Internet Explorer' != navigator.appName) {
    return;
  }
  if (typeof document.getElementsByTagName == 'undefined') {
    return;
  }
  var embeds = document.getElementsByTagName("embed");
  for (i = 0; i < embeds.length; i++) {
    embeds[i].outerHTML = embeds[i].outerHTML;
  }
}

This complete solution seems to work quite well on all the Windows based browsers I have available now, even Internet Explorer! I have no facility to develop and test this on other platforms, which may not know which plugin to use for the MIME type "application/x-mplayer2". I intend to address this in due course.

In any case the MIME type should strictly speaking be "audio/mpegurl", but this is not supported by Windows. But in case of any difficulty, you can always just click the hypertext link to download the radio.m3u playlist and play it in an external application.

Eolas is a registered trademark of Eolas Technologies Inc. Microsoft, Windows, ActiveX and Internet Explorer are registered trademarks of Microsoft Corporation in the United States and other countries.

Goodbye 4oD adverts

DISCLAIMER: THIS HACK DOES NOT WORK WITH 4oD ANY LONGER

I quite like using the 4oD service from Channel 4, as I can watch their programs at a time that suits me without the regular ad breaks on live TV. The adverts appear just before the program starts instead, and although you can skip to the right part of a program once it's playing, you can't fast forward the adverts.

The program is delivered to the 4oD player as a play list for Windows Media Player in the ASX file format. The first few items in the play list are the URLs for adverts, and the last one is the actual TV program. To avoid having to sit through these adverts, I saw there was the potential to intercept this play list and remove the entries pertaining to advertising.

To achieve this, I made use of the redirect_program feature of the Squid web proxy. In this case, it allows me to tell Squid to fetch the play list from my website rather than directly from Channel 4. I wrote the following Perl script based on the example in the Squid documentation:

#!/usr/bin/perl
use URI::Escape;
$|=1;
while (<>) {
    s@(http://vodapp\.grid\.channel4\.com/c4site-web/playlist\.do\?[^ ]*)@"http://www.trollied.org/~blimey/4oD.php?url=" . uri_escape($1)@e;
    print;
}

So when the 4oD client requests a play list from the vodapp.grid.channel4.com server, it instead requests the playlist from my website. My website then downloads the desired play list, and does the required filtering using PHP and XSLT.

This PHP very simply downloads the play list into a DOM document and applies the required XSLT to it.

<?php
$basename = basename($_SERVER['SCRIPT_NAME'], '.php');
$xslfile = $basename . '.xsl';
$xmlfile = $_GET['url'];

$xml = new DOMDocument;
$xml->load($xmlfile);

$xsl = new DOMDocument;
$xsl->load($xslfile);

$xslProc = new XSLTProcessor();
$xslProc->importStylesheet($xsl);

header('Content-Type: video/x-ms-asf');
$doc = $xslProc->transformToDoc($xml);
echo $doc->saveXML($doc->firstChild);
?>

Here's the XSL, which simply copies everything in the document except for any ENTRY that has a TITLE starting with the word Advert.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" method="xml" media-type="video/x-ms-asf"/>

  <xsl:template match="/ASX/ENTRY">
    <xsl:if test="not(starts-with(TITLE, 'Advert'))">
      <xsl:copy-of select="."/>
    </xsl:if>
  </xsl:template>

  <xsl:template match="/ASX/*[name() != 'ENTRY']">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="/">
    <ASX VERSION="3.0">
      <xsl:apply-templates/>
    </ASX>
  </xsl:template>
</xsl:stylesheet>

Comparing techniques

There are lots of interesting tips and ideas relating to programming techniques and related principles bandied about. The people who advocate them have all sorts of different backgrounds, and undeniably it can be very useful to draw on the experience of others.

Some techniques are devised to solve problems that are common to all programmers alike, but the significance of an underlying problem addressed will vary for each project. While something may be critically important for you, it could well be of insignificant concern to somebody working on a different project.

Each programmer faces different challenges and has different experience, so the solutions they come up with are quite personal to the individual. For example a programmer may advocate a technique that merely addresses a problem specific to certain programming languages, or is specific to the size/complexity of their application or even their development team.

Techniques in computer programming have evolved significantly over time, and opinions are constantly changing. This is no surprise, considering the rapid pace of change in what software is used for and how it is used.

Because of this, I feel it is important to have an awareness of various different techniques. In each new scenario it's then possible to decide what is the most appropriate technique, treating each case on its merits.

Programming is a complex task and difficult to make generalisations about. There are many tips floating around that contradict one another, and have no explanation of what the basis for them is. It is more difficult to evaluate a technique, without some explanation of what its purpose is.

Windows Services in Delphi

The Win32 API methods for the Service Control Manager are powerful but rather unwieldy. Download the Delphi unit service.pas which contains wrapper functions for QueryServiceStatus(), StartService() and ControlService().

See the interface section for the unit below, hopefully it's all self explanatory.

unit service;

// Some useful wrapper functions for NT Services.
// Geoffrey Swift (c) 2006

interface

uses Windows;

function GetServiceStatus(const ServiceName: String): DWORD;
function StartService(const ServiceName: String; const Wait: Boolean): DWORD;
function StopService(const ServiceName: String; const Wait: Boolean): DWORD;

XHTML document.writeln() imposter functions.

I thought it'd be a nice idea to include my del.icio.us tagrolls on my homepage. This is made possible by some JavaScript served up by their website. Unfortunately their implementation relies on document.writeln() which is not compatible with XHTML.

Undeterred, I created imposter versions of document.write and document.writeln. My implementations progressively concatenate the parameters into a buffer. When the web page loads, the buffered XHTML is positioned appropriately by using an empty placeholder tag.
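
The buffering idea can be sketched as follows. This is a simplified illustration, not the code used on the site: it takes a stand-in document object so that it can run outside a browser, and the names installWriteImposters and flush are invented for the example. In a real page the buffered markup would then be attached at the placeholder element once the page has loaded.

```javascript
// Replace write/writeln on a document-like object with versions that buffer.
function installWriteImposters(doc) {
  let buffer = '';
  doc.write = function (...parts) {
    buffer += parts.join('');
  };
  doc.writeln = function (...parts) {
    buffer += parts.join('') + '\n';
  };
  // Hand back the collected markup and reset the buffer.
  return function flush() {
    const markup = buffer;
    buffer = '';
    return markup;
  };
}

// Stand-in for the browser's document object.
const fakeDocument = {};
const flush = installWriteImposters(fakeDocument);

// What a third party script might do.
fakeDocument.writeln('<ul>');
fakeDocument.write('<li>', 'tag', '</li>');
fakeDocument.writeln('</ul>');

console.log(flush());
```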

I originally tried parsing the buffer as an XML document, with the intention of merging the XML nodes into the XHTML document. I tried using xmlDoc.documentElement.cloneNode(true) as a parameter for appendChild() on an XHTML node.

While this worked like a charm in both Firefox and Opera, Internet Explorer complains of "Interface not supported" and Safari has a fatal error! The failing browsers have an incompatibility of XML and XHTML DOM nodes, even though they both support a common DOM interface.

In the end I had to write my own version of cloneNode() to copy elements between the incompatible schemas and ensure valid nesting of elements. There's an exception handler that fails over to using innerHTML complete with IE kludges just in case.

If you're interested to see how this was achieved, please feel free to view the JavaScript source code.