Welcome to Geoffrey Swift's βlog. Please subscribe to the Atom feed.


Converting XSPF to M3U and PLS with XSLT

I'm now syndicating another internet radio station playlist; the new one is from http://www.dnbstations.co.uk/. This didn't take long to implement, as I was able to reuse existing code which converts from XSPF to both M3U and PLS. I did write some new code to convert a Wimpy play list to XSPF, which was pretty easy as the formats are very similar.

The XSLT code to convert a Wimpy play list to XSPF, and from XSPF to M3U and XSPF to PLS is hereby released into the public domain.

Not only does www.dnbstations.co.uk provide the Wimpy playlist, they act as a proxy for all the streams. This seems to be wasteful of their bandwidth, when they could easily let listeners access the stream directly. I was quite surprised when I noticed this, thinking that the stream URLs were just using HTTP redirects. Maybe they'll redo this in the future, depending on how successful they are!

Bypass JavaScript with Squid

I mentioned previously how I used the redirect_program feature in the Squid web proxy to filter out adverts from Channel 4's 4oD service. I did that by intercepting and modifying ASX files. In this article I explain how similarly one can deactivate unwanted JavaScript code.

One motivation for intercepting JavaScript is to avoid advertising; I'm also bypassing Google Analytics. For an entirely self-contained script, your own blank script can be substituted. This works well, for example, with Google's advertising script show_ads.js.
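As a rough illustration of the substitution step, here is a minimal sketch of a redirect helper in JavaScript (Node). It assumes the classic Squid helper protocol: one request per line on standard input, with the URL as the first field, answered by either a replacement URL or an empty line meaning "leave unchanged". The localhost target URLs and the names REPLACEMENTS, rewrite and serve are made up for the example; this is not the helper actually used here.

```javascript
// Hypothetical mapping from intercepted script names to local replacements.
const REPLACEMENTS = [
  { pattern: /\/show_ads\.js(\?.*)?$/, target: 'http://localhost/js/blank.js' },
  { pattern: /\/urchin\.js(\?.*)?$/, target: 'http://localhost/js/urchin-stub.js' },
  { pattern: /\/ga\.js(\?.*)?$/, target: 'http://localhost/js/ga-stub.js' }
];

// Takes one request line from Squid and returns the reply line:
// a replacement URL, or '' to pass the request through untouched.
function rewrite(line) {
  const url = line.trim().split(/\s+/)[0] || '';
  for (const { pattern, target } of REPLACEMENTS) {
    if (pattern.test(url)) return target;
  }
  return '';
}

// Reads request lines from `input` and writes one reply line per request.
function serve(input, output) {
  let buffer = '';
  input.setEncoding('utf8');
  input.on('data', (chunk) => {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      output.write(rewrite(buffer.slice(0, newline)) + '\n');
      buffer = buffer.slice(newline + 1);
    }
  });
}
```

To run it as a helper, the script would end with a call to serve(process.stdin, process.stdout) and be named in the redirect_program line of squid.conf.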

To use Google Analytics, web site authors must write their own code to invoke the functions Google supply in their JavaScript files. So if Google's code is simply replaced by a blank file, JavaScript errors will result as required functions have not been defined. I've therefore written my own versions of their scripts which mimic the necessary code entry points, but don't actually do anything.

A replacement for the original Google file urchin.js was trivial:

function urchinTracker(){}

but the recently updated Google Analytics script ga.js was a bit more complicated:

var _gat = { _getTracker:function(s) { return { _initData:function(){}, _trackPageview:function(){} } } }
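To see why these entry points are enough, here is the same stub laid out over several lines, followed by the kind of calls a page typically made at the time. The account ID is a placeholder, and the calling code is an assumption about a typical page rather than taken from any particular site.

```javascript
// The ga.js stub from above, reformatted for readability.
var _gat = {
  _getTracker: function (account) {
    // Every tracker is a do-nothing object with the expected methods.
    return {
      _initData: function () {},
      _trackPageview: function () {}
    };
  }
};

// Typical page code: with the stub in place these calls succeed silently.
var pageTracker = _gat._getTracker('UA-XXXXXX-X'); // placeholder account ID
pageTracker._initData();
pageTracker._trackPageview();
```

A page that calls other tracker methods would still raise an error, so the stub may need extending for such sites.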

The scripts mentioned above are very popular on many websites. My replacement versions are obviously going to help speed up load times on several web pages, as the web browser has less work to do. Another bonus is that I avoid being included in my own Analytics data, and I get a better idea of who else is looking at my site!

XHTML DTD in XML catalog

I read an article, "W3C Gets Excessive DTD Traffic", on Slashdot last month. This struck a chord since I use XSLT to generate XHTML, and had noticed that the XHTML DTD is downloaded each time it is referenced. This both wastes bandwidth and slows down the XSLT considerably.

To remedy this problem, I downloaded the DTDs and their entity files to my Linux box and added an XML catalogue entry. This ensures local copies of these files are used instead of repeatedly downloading them from w3.org.

Here are the shell commands I used on my Slackware 12 box to set things up:

mkdir /usr/share/xml/xhtml1
cd /usr/share/xml/xhtml1
wget http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd \
     http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd \
     http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd \
     http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent \
     http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent \
     http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
xmlcatalog --noout --add rewriteURI http://www.w3.org/TR/xhtml1/DTD/ \
    file:///usr/share/xml/xhtml1/ \
    /etc/xml/catalog
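The effect of the rewriteURI entry is a prefix substitution: any URI starting with the w3.org DTD directory is mapped onto the local directory. The sketch below models that idea in JavaScript, with the longest matching prefix winning. It is a simplified model for illustration only, not libxml2's resolver, and the names REWRITES and resolveUri are made up.

```javascript
// One entry per rewriteURI rule: URIs starting with startString have that
// prefix swapped for rewritePrefix.
const REWRITES = [
  {
    startString: 'http://www.w3.org/TR/xhtml1/DTD/',
    rewritePrefix: 'file:///usr/share/xml/xhtml1/'
  }
];

// Returns the rewritten URI, or the original URI if no rule matches.
function resolveUri(uri, rewrites) {
  let best = null;
  for (const rule of rewrites) {
    const longer = !best || rule.startString.length > best.startString.length;
    if (uri.startsWith(rule.startString) && longer) best = rule;
  }
  return best ? best.rewritePrefix + uri.slice(best.startString.length) : uri;
}
```

With the single rule above, the strict DTD and all three entity files resolve to files under /usr/share/xml/xhtml1/, while unrelated URIs are left alone.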

To confirm this was working as expected, I ran xsltproc under strace. I did, however, discover that what I'd done had no effect on libxslt in PHP. I looked through the source code for PHP and found no code to initialise libxml to use XML catalogues. This would be a nice feature; hopefully it will be included in a future release of PHP.

Tinyfugue 4 patched for GCC 4

The CHANGES file in Tinyfugue 5 explains that the user interface has changed: New screen handling. See "/help windows". What this means is that when changing between worlds, the whole screen is replaced to show only output from the selected world. Previously messages from each world would all appear in succession on screen.

As I multi-spod and hold conversations across several talkers at the same time, this new interface is not my personal preference. I would rather be using the time-honoured classic, which is Tinyfugue 4. However, I discovered recently that this older version no longer compiles on modern Linux distributions.

I had a look into this problem, and fixed the source code by retrofitting the relevant fixes already made in Tinyfugue 5. The changes were minimal, so it didn't take long to get this compiling with GCC 4 on Linux. I made a unified diff of these changes. It has come in handy twice for me now, so I hope it proves useful to someone else.

--- history.c.orig 2008-03-04 00:49:30.000000000 +0000
+++ history.c 2008-03-04 00:47:01.000000000 +0000
@@ -66,7 +66,9 @@
 static struct History input[1];
 static int wnmatch = 4, wnlines = 5, wdmatch = 2, wdlines = 5;
-struct History globalhist[1], localhist[1];
+struct History globalhist_buf, localhist_buf;
+struct History * const globalhist = &globalhist_buf;
+struct History * const localhist = &localhist_buf;
 int log_count = 0;
 int norecord = 0; /* supress history (but not log) recording */
 int nolog = 0; /* supress log (but not history) recording */
--- history.h.orig 2008-03-04 00:49:37.000000000 +0000
+++ history.h 2008-03-04 00:46:51.000000000 +0000
@@ -13,7 +13,7 @@
 # ifndef NO_HISTORY
 extern void NDECL(init_histories);
-extern struct History *FDECL(init_history,(struct History *hist, int maxsize));
+extern struct History *init_history(struct History *hist, int maxsize);
 extern void FDECL(free_history,(struct History *hist));
 extern void FDECL(recordline,(struct History *hist, Aline *aline));
 extern void FDECL(record_input,(CONST char *line, struct timeval *tv));
@@ -31,7 +31,7 @@
 #define record_global(aline) recordline(globalhist, (aline))
 #define record_local(aline) recordline(localhist, (aline))
-extern struct History globalhist[], localhist[];
+extern struct History * const globalhist, * const localhist;
 extern int log_count, norecord, nolog;
 # else /* NO_HISTORY */

Download: http://www.trollied.org/~blimey/tf4.diff (1 KB)

Streaming Radio Formats

Implementing my radio player took more experimentation and research than actual coding. As with most software, things turn out more complicated than first anticipated!

To tune in to an Internet radio station, normally you would click on a link on the station's website. From this link your browser will download a play list file, which in turn contains the URL that your favourite media player can use to access the station's live stream. My goal was to make this more convenient by allowing access to several radio stations in one place.

So I decided to create my own play list file that contained all the relevant URLs for my favourite stations. I wrote a PHP script that would generate this play list by downloading each station's play list and extracting the relevant stream URL.

Play list file formats

There are three main formats used for these play list files, which are ASX (Windows Media Player), M3U and PLS (WinAmp).

Easiest to parse was M3U, which is just an ordinary text file that has the URL as a line of text. As PLS is an INI file, parsing it with PHP's built-in INI functions seemed easy. I quickly found that the PHP functions were unreliable, so I wrote a kludge to get the URL myself using a simple regular expression.
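The two extraction steps can be sketched in a few lines. The original was written in PHP; this JavaScript version is only an illustration under that caveat, and the function names are made up. It takes the first non-comment line of an M3U file, and the first FileN= entry of a PLS file.

```javascript
// M3U: the stream URL is the first line that is neither blank nor a
// "#" comment/directive line.
function streamUrlFromM3u(text) {
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line !== '' && !line.startsWith('#')) return line;
  }
  return null;
}

// PLS: rather than parsing the whole INI file, grab the first FileN= value
// with a regular expression.
function streamUrlFromPls(text) {
  const match = /^\s*File\d+\s*=\s*(\S+)/im.exec(text);
  return match ? match[1] : null;
}
```

Both functions return null when no URL is found, so a caller can fall back to another format offered by the station.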

In a previous article I discussed problems with ASX files not being XML, so I avoided using this format unless the station did not also offer M3U or PLS. By sheer luck the few stations I am accessing via ASX are well-formed XML. Unfortunately the structure of these ASX files varies, so it took a few iterations of the XSLT code to extract the stream URL reliably.

The file format I ultimately create my play list in is XSPF, as this is a well-defined format based upon XML. From the master XSPF play list, I can readily generate ASX, M3U and PLS by using XSLT. Having the play lists available in several formats means good support for tuning in with various media players. I would personally recommend using the XSPF play list in VLC Media Player.
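To show how little the target formats need, here is a sketch of the M3U and PLS output step. The real conversion is done with XSLT; this JavaScript stand-in assumes the tracks have already been read from the XSPF file into plain objects with a location property, and the function names are hypothetical.

```javascript
// M3U output: one stream URL per line.
function toM3u(tracks) {
  return tracks.map((track) => track.location + '\n').join('');
}

// PLS output: an INI-style file with numbered File entries.
function toPls(tracks) {
  const lines = ['[playlist]'];
  tracks.forEach((track, index) => {
    lines.push('File' + (index + 1) + '=' + track.location);
  });
  lines.push('NumberOfEntries=' + tracks.length, 'Version=2');
  return lines.join('\n') + '\n';
}
```

For two tracks, toPls produces File1 and File2 entries followed by NumberOfEntries=2 and Version=2.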

Streaming from a web page

I felt the stations should ideally be available directly in a browser, without requiring an external application to be installed. After trying a few different options I opted to use the "XSPF Web Music Player", which is written in Flash. Getting Flash working is an exercise in itself, made easy by the UFO JavaScript library. This setup offers good compatibility with typical browser setups.

Fixing the stream URL

The biggest problem with this was presented by a strange behaviour of Shoutcast servers. When you access the stream URL for a Shoutcast server using a web browser, you are served up a web page about the stream. This HTML content is provided instead of the MP3 stream, on the basis of the User-Agent header of the HTTP request.

Generally browser plugins have their HTTP requests made for them by the browser, so HTML content is delivered to a Flash movie or to QuickTime regardless. The exception to this is the Windows Media Player plugin, which bypasses the browser completely, disregarding proxy settings and the like.

Shoutcast servers do provide a special URL that ensures the MP3 stream is returned regardless of the User-Agent. You just append ;stream.nsv to the URL. For example, http://example.com:1234/ becomes http://example.com:1234/;stream.nsv.

Appending this string confuses any servers that aren't running Shoutcast, though, so it became necessary to probe each server and adjust the URL where required. The probe impersonates Firefox; if HTML comes back, the server is Shoutcast.
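The decision logic of the probe can be sketched as follows. This is an illustration rather than the original PHP: the names are made up, the User-Agent string is just an example of a Firefox one, and the HTTP request itself is left to a caller-supplied probe function that is assumed to return the Content-Type of the response.

```javascript
// Example Firefox User-Agent string; any browser-like value should do.
const FIREFOX_UA = 'Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0';

// HTML in response to a browser-like request is taken to mean Shoutcast.
function looksLikeHtml(contentType) {
  return /^text\/html\b/i.test(contentType || '');
}

// Appends ;stream.nsv, adding a trailing slash first when it is missing.
function withShoutcastSuffix(url) {
  if (url.endsWith(';stream.nsv')) return url;
  return (url.endsWith('/') ? url : url + '/') + ';stream.nsv';
}

// probe(url, headers) is a hypothetical function that performs the request
// and resolves to the Content-Type header of the response.
async function fixStreamUrl(url, probe) {
  const contentType = await probe(url, { 'User-Agent': FIREFOX_UA });
  return looksLikeHtml(contentType) ? withShoutcastSuffix(url) : url;
}
```

Keeping the request behind the probe parameter means the URL handling can be checked without touching the network.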

Memory usage

One significant problem appears to be memory consumption while listening to a stream. If you listen for a few hours or so, expect your memory usage to reach hundreds of megabytes. This is enough to cause problems in a typical PC setup. A workaround is to stop and restart the stream occasionally.

The reason behind this is that the MP3 stream is treated as if a very long MP3 file is being downloaded. What you've heard so far remains in memory even though, with the exception of Real Player, you can't seek backwards. This is a well known problem that affects nearly all media players and browser/plugin combinations except VLC Media Player.

Recommended use

If you use the M3U or PLS play lists, you may well just see IP addresses and port numbers rather than meaningful station names. This is a limitation of the file formats used, and isn't fixable. Using the XSPF play list is recommended as it has the station name and website link. Also you can use the XSPF play list in VLC Media Player, without having to worry about your memory usage.