Like most other users, I frequently post links on Twitter using one of the many URL shortening services. I’ll say at this point that I absolutely loathe these services; the recent demise of Tr.im and the debacle which has followed reinforce the view that replicating something which already exists is bound to lead to issues (the thinking behind the popular programming acronym DRY, “Don’t Repeat Yourself”).

It would be a serious step forward (IMHO) if Twitter allowed you to turn a section of a tweet into a link, in much the same way as one does in HTML, so we could avoid this. People might complain it would make things too “complicated”, but anyway, I’ve digressed. Rant over, back on track.

As the subject of this post suggests, this is just a random little code snippet I put together today. It posts my links to Delicious when I tweet them, which helps me hold onto them, makes them a little more searchable and stores the real link rather than a shortened version. It almost certainly exists elsewhere, but this is my take on it, and it doesn’t rely on me using a particular browser with a plug-in installed on every machine I use, which the otherwise excellent tweecious would.

Requirements

  • PHP5 webserver
  • SimplePie RSS parser (I’m using version 1.1.3; nothing has changed since which would break this, AFAIK)
  • Crontab or similar to make it automatic. If you don’t have cron on your server you can always just load the page in a browser window, or use webcron.

What it does

  • Any link you post on Twitter is resolved to its original address, the title is grabbed from the original site, and the link is posted to your Delicious account with your tweet in the description and any tags you want.
  • Links will only be posted once, so if a link is automatically posted and you then delete it from Delicious, it’s not going to reappear the next time the script is run.
  • You can optionally blacklist domains or sections of domains which you don’t want to be saved. For instance, I tweeted a link to a programme on BBC iPlayer; there’s no point saving it as iPlayer only stores stuff for a couple of weeks.
  • You can also prevent a tweet which has a specific hashtag from being posted. I use the tag “#ns”, which prompts the script to ignore that tweet.

To use it, just upload twittodel.php from the zip file below (or copy and paste the source below) to your webserver, add simplepie.inc and any other bits it needs, and make a cache folder matching whatever you’ve set the CACHE_DIR option to (by default it wants a folder called ‘cache’).

Download: twittodel.zip

As I say, it was put together fairly quickly so it’s not guaranteed to be perfect. Comments, improvements and criticism are welcome. It might see the addition of Zemanta-style automatic tagging like tweecious; I’ll update this post if it does.

// Author: Duncan Barnes (www.barnesdmd.co.uk)
// Updated: 16/08/09
// License: Creative Commons Attribution-Non-Commercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/2.0/)
 
//-----------------------------------------------------------------------------------------
//You'll need to enter your delicious credentials and a few other details here to make this work
//I've left in a few details as examples
 
 
//Delicious username
define('DELICIOUS_USERNAME','yourusernamehere'); 
//Delicious password
define('DELICIOUS_PASSWORD','yourpasswordhere'); 
//Tag/s to add to the delicious entry
define('DELICIOUS_TAGS','from_twitter'); 
//Your twitter account timeline rss feed
define('TWITTER_RSS','http://twitter.com/statuses/user_timeline/6752222.rss'); 
//You can put a hash tag here which you might want to use to prevent tweets being posted to delicious
define('TWITTER_OMIT','#ns'); 
//You can put bits of addresses in here which you don't want to be posted, separate with a comma, e.g 'bbc.co.uk/iplayer', no urls which have this in them would be posted
define('ADDRESS_OMIT','bbc.co.uk/iplayer,bbc.co.uk/programmes');
//Directory where this script can store cached data
define('CACHE_DIR','cache/');
//Max number of tweets to process each time the script is run, adjust based on how often you tweet and how often you run the cron controlling this script 
define('MAX_TWEETS',6); 
 
//-----------------------------------------------------------------------------------------
 
 
//Bit of quick error checking
if (!class_exists('SimplePie')){if(file_exists('simplepie.inc')){require_once 'simplepie.inc';}else{echo 'Simplepie not found, check simplepie.inc is in the same directory as this script!';exit();}}
$feed = new SimplePie(TWITTER_RSS,CACHE_DIR,3600) or die('Could not declare Simplepie, you might want to check your TWITTER_RSS and CACHE_DIR settings and also that you have any additional files which SimplePie wants.');
$curl_handle = curl_init() or die('Could not initiate curl, please check its enabled on your server!');
 
//Fetching our cache of previous things we've saved to delicious, means:
// a)we're not repeating a push to delicious 
// b) If you delete an auto created entry from delicious it won't be recreated the next time this is run!
// The script will still continue if there's a problem here, we're not going to worry about it too much as this might be the first run
if(file_exists(CACHE_DIR.'delicious.data')){$deliciouscache = unserialize(file_get_contents(CACHE_DIR.'delicious.data'));}else{$deliciouscache = array();}
 
//For ease of use, the ADDRESS_OMIT option is a constant, however we want its contents as an array for ease of processing so we'll convert it now
$address_omit = explode(',',ADDRESS_OMIT);
@array_walk($address_omit, 'trim_value');
 
//Lets do it
$i=0;
foreach ($feed->get_items() as $item){
	if($i==MAX_TWEETS){break;} //Stopping if we've reached the limit
	$tweet = $item->get_title();
	//Checking the tweet isn't already in our cache and doesn't contain our omit string, we use an md5 of the tweet as it saves us doing any further lookups
	if(in_array(md5($tweet),$deliciouscache) || (TWITTER_OMIT !== '' && strpos($tweet, TWITTER_OMIT) !== false)){$i++;continue;}
	if(!$page = getUrl($tweet)){$i++;continue;} //We've failed to get the target url, on to the next...
	if(!domain_check($page['url'])){$i++;continue;} //Checking the resolved url isn't in our list of urls we don't want to know about (ADDRESS_OMIT option)
	// At this point we should have an array ($page) containing the title and the url provided by the getUrl function
	// so lets go ahead and try and add to Delicious account
	$url = urlencode($page['url']);
	$title = urlencode($page['title']);
	$tweetenc = urlencode($tweet);
	$tag = urlencode(DELICIOUS_TAGS);
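	//Encoding the credentials so a username or password containing characters like @ or : doesn't break the address
	curl_setopt($curl_handle, CURLOPT_URL, 'https://'.rawurlencode(DELICIOUS_USERNAME).':'.rawurlencode(DELICIOUS_PASSWORD)."@api.del.icio.us/v1/posts/add?url=$url&description=$title&extended=$tweetenc&tags=$tag");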
	curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 10);
	curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 0);
	if($result = curl_exec($curl_handle)){
		array_unshift($deliciouscache, md5($tweet));
		if(count($deliciouscache) > MAX_TWEETS){ //One in one out, stopping the cache file growing indefinitely
			array_pop($deliciouscache);
		}
	}
	$i++;
}
//Writing our cache back to file ready for the next run
if($res = fopen(CACHE_DIR.'delicious.data','w')){fwrite($res,serialize($deliciouscache));fclose($res);}
curl_close($curl_handle);
//Done
 
//Helper functions
function domain_check($url){global $address_omit;if(is_array($address_omit)){foreach($address_omit as $address){if(!empty($address) && strstr($url, $address)){return false;}}}return true;}
function trim_value(&$value){$value = trim($value);} //Used to trim the $address_omit array
function getUrl($tweet){
	global $curl_handle;
	if(!$tweet){return false;}
	// grab the first link in the tweet, bail out if there isn't one
	if(!preg_match('/https?:\/\/\S+/', $tweet, $result)){return false;}
	curl_setopt($curl_handle, CURLOPT_URL, $result[0]);
	curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 10);
	curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($curl_handle, CURLOPT_FOLLOWLOCATION, 1);    // follow redirects (in the case of shortened urls)
	if(!$buffer = curl_exec($curl_handle)){return false;}
	$info = curl_getinfo($curl_handle);
	if($info['http_code'] !== 200){return false;} //If the page is gone then cancel
	// match the title of the page at the url, falling back to the url itself if there's no title tag
	if(preg_match('/<title[^>]*>(.*?)<\/title>/is', $buffer, $match)){
		$title = trim(preg_replace('/<\/?(title)[^>]*>/iU', '', $match[1]));
	}else{
		$title = $info['url'];
	}
	return array('title'=>$title,'url'=>$info['url']);
}
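
If you want to see what the link-matching step picks out of a tweet without hitting the network, something like the sketch below will do. It’s a standalone illustration rather than part of twittodel.php: first_link() is a made-up helper, the sample tweets are invented, and the pattern (http or https followed by a run of non-whitespace characters) is just my assumption of what a link in a tweet looks like.

```php
<?php
// Hypothetical helper, not part of twittodel.php:
// returns the first link found in a tweet, or false if there isn't one.
function first_link($tweet)
{
    // Assumed pattern: http:// or https:// followed by non-whitespace characters
    if (preg_match('/https?:\/\/\S+/', $tweet, $result)) {
        return $result[0];
    }
    return false;
}

// Invented sample tweets
var_dump(first_link('Nice read http://tr.im/abc #ns'));
var_dump(first_link('No links in this one'));
```

Note that a link followed directly by punctuation (a closing bracket, say) will have that punctuation swept up with it, so it’s worth leaving a space after links you want saved.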

Halfway through last year I decided that looking back on pictures from five years ago and not really being sure where they were taken sort of sucked, so I fervently went around my photo folder armed with Picasa and Google Earth, geotagging as many pictures as I could. This is all very well with UK pictures, but it gets more difficult with older sailing ones, for instance, as coastlines (where present) are often not too distinguishable. Of course it’s damn near impossible with the pictures from Northern Italy I’d taken in 2007, although I had a good go at it.

When I visited Cape Town last year I took what was then my new phone (HTC TyTN II), which thoughtfully includes an inbuilt GPS. Pairing this with an early version of LEM’s TrackMe logging application, I felt fairly well prepared to geotag all my pictures on my return. The trip itself went well and the phone managed to log most days’ movements (at 2 minute intervals) without the battery giving up (phone part switched off), giving me acceptable accuracy to tag the pictures by matching the timestamp on each photo with the nearest position fix.
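
The matching step itself is simple enough to sketch in PHP. This is only an illustration under my own assumptions, not how TrackMe or any tagging tool actually does it: nearest_fix() is a made-up helper, the coordinates are invented, and the 120 second cut-off simply mirrors the 2 minute logging interval.

```php
<?php
// Hypothetical helper (not from any of the tools mentioned): returns the track
// fix closest in time to the photo, or null if none is within $maxGap seconds.
function nearest_fix(array $fixes, $photoTime, $maxGap = 120)
{
    $best = null;
    $bestGap = null;
    foreach ($fixes as $fix) {
        $gap = abs($fix['time'] - $photoTime);
        if ($bestGap === null || $gap < $bestGap) {
            $best = $fix;
            $bestGap = $gap;
        }
    }
    // Reject the match if even the closest fix is too far away in time
    return ($best !== null && $bestGap <= $maxGap) ? $best : null;
}

// Invented sample track, one fix every 2 minutes (times in UTC)
$track = array(
    array('time' => strtotime('2008-03-01 10:00:00 UTC'), 'lat' => -33.9249, 'lon' => 18.4241),
    array('time' => strtotime('2008-03-01 10:02:00 UTC'), 'lat' => -33.9255, 'lon' => 18.4230),
    array('time' => strtotime('2008-03-01 10:04:00 UTC'), 'lat' => -33.9262, 'lon' => 18.4218),
);

// Photo taken at 10:02:50, so the 10:02:00 fix is the closest
$fix = nearest_fix($track, strtotime('2008-03-01 10:02:50 UTC'));
echo $fix === null ? "no fix close enough\n" : $fix['lat'].','.$fix['lon']."\n";
```

One thing this glosses over is time zones: GPS logs are normally in UTC, so the camera clock needs to be corrected to match or every photo ends up paired with the wrong fix.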

However, after this I decided that although the phone did the job OK, I’d rather not have the constant worry of the battery running flat, so I bought a Royaltek RBT-2300 GPS logger off eBay. It’s a basic little unit and the software is a little lacking, but it was cheap and it does the job admirably. It also uses Nokia phone batteries, meaning that if need be I can get a second battery for it.

It came with only a car charger, which isn’t great, but I’ve found that it’s quite happy running off 5V so I made an adapter cable at work so it can be charged from my standard travel USB charger.

So after experimentation, and some scares with software causing data corruption on the photos (always do backups!), my process goes something along these lines when returning with fresh pictures:

  • Copy photos to computer, leaving originals on SD card as a backup
  • Use Phil Harvey’s ExifTool to set common metadata on each picture: author, copyright and, if the photos were all taken on one occasion, a basic description of it. This is stored as EXIF and IPTC tags.
  • Extract tracklogs from the Royaltek logger using their software.
  • Convert this to the GPX file format using GPSBabel (I tend to use the version with the user interface for this, beats remembering command line options).
  • Geotag all pictures with GPicSync (this also produces a Google Earth KML file)
  • Done!

This may seem like quite a long, drawn-out, painful way of doing this, but in actual fact it takes me about 5 minutes each time. There are commercial bits of software which may be able to shorten some or all of this; in fact I’ve contemplated for a while buying Downloader Pro, which I hear good things about. The software used above, however, is all free/open source and does 99% of the process. For the rest, I tend to use IrfanView for image rotation.

My approach to photography now is pretty much to store as much information as possible about the photos and never to delete any, no matter how bad you think they are at the time (storage is cheap). There are software packages which can store more information; the only problem half the time is that it’s more data stored in a separate database which you then rely on to keep track of your pictures.

This method ensures that the data gathered is all stored in a standard format. It’s also great when uploading to Zooomr or Flickr; that’s when the metadata comes into its own, as you see the photo sharing site build the page around it, with a map, author information and the description already filled in, among others.

It’s a workflow which has done me well for the last few years; hopefully in 10-15 years’ time it’ll prove worthwhile when I want to remember where that picture was taken…

Saw a post about this on TorrentFreak last night and was very kindly sent an invite for the service by Ernesto. 

Spotify’s tagline is ‘Everyone loves music’. The interface is a media player with an iTunes feel to it, so it’s nice and familiar and easy to pick up for most people. Built in collaboration with most of the big record labels, it has a vast library of music streamed from the Internet (at around 160kbps according to the TorrentFreak article), and apart from some of the more indie stuff it seems to do quite well at finding anything you’re after.

It’s not a complete collection of the world’s music by any means but it’s growing. The only random things I can’t find so far are the Metallica S&M recording from the live set in San Francisco (I was listening to it in the car this morning so it sprang to mind as something to search for), and anything by 2manyDJ’s.

Music starts playing as soon as you click on it (I’m on almost exclusive use of a 100Mbps pipe at the moment so it might not be quite as fast on a normal connection!) and it’s easy to find more tracks by an artist, find what compilations they appear on, etc. Reviews and biographies are gathered from a range of sources including Wikipedia and All Music.

Of course the main advantages of Spotify are that a) it’s free (with some unobtrusive adverts) and b) it would appear to be the first credible, legal and usable music service to emerge. The guys behind it seem to be taking tips quite happily from the beta community; Last.fm integration is there and it would seem that more will follow.

The main disadvantage of the system at the moment is that there’s no connection with portable devices, although according to TorrentFreak this might come in the future. With better connected media devices including phones (iPhone, G1 etc.) and improved data plans, we might end up in a situation where we can just stream whatever music we want and don’t bother keeping it ourselves.

There’s a little introduction video here: https://www.spotify.com/en/about/press/concept-video/

(and no, I still haven’t had a chance to sort out the template issues on this site!)

I was thinking about putting in a complaint to the Advertising Standards Authority (ASA) ages ago about this but didn’t get around to it/couldn’t be bothered!

Tonight I saw Ian’s post mentioning he was considering changing to Virgin Media because of the advertised fast fibre optic connection and was reminded of my annoyance over the claims.

It turns out that BSkyB, TalkTalk and a few less idle members of the public did make a complaint. The two stories below cover the situation quite well, with ThinkBroadband also carrying quite a few user comments on its page. Essentially Virgin Media, as many have probably seen, have been advertising fibre optic internet connections direct to the home. The fibre actually only runs to one of the street boxes; it’s copper/steel/aluminium after that. A Virgin Media engineer I was speaking to recently, who works on the core fibre network around London, said that it could be anything from 500 to 1,000 households sharing the connection!

Amazingly, the ASA rejected the complaints, essentially deeming the copper portion of the network insignificant. By the same logic, as ThinkBroadband have pointed out, BT etc. could now make the same claim since, from the telephone exchange upwards, the ADSL network is fibre.

Wish the ASA had taken a stand on this and on the ‘Unlimited’ claims (there’s a nice little piece about this on one of the BBC blogs). It’s hardly surprising so many people are confused about broadband.

Update (13/04/08): In addition to the above truth twisting, a worrying blow for net neutrality comes with Virgin Media’s CEO Neil Berkett branding it “a load of bollocks”. Crude and worrying words; I’m glad I’m no longer with them and supporting his point of view.

We got a letter from them in the mail today saying, ‘Sorry to hear you’re leaving, how about sleeping on it?’, accompanied by the type of mask you get on night flights. It’d be nice if they quit with all the marketing gimmicks and actually delivered on the fibre to the home claim!

Ten months ago I was debating what route to go down with home servers. I’ve debated this in my own head for a while, and about 6 months ago I briefly fired up a spare mini-ITX board and a couple of drives as a headless Debian-based home server. While it worked, it wasn’t quite what I had envisaged, as one of my main aims was to have something which wasn’t going to cause my electricity bill to skyrocket while at the same time filling as many roles as possible. I discussed these in the previous post, but the main points were:

  1. File server
  2. BitTorrent capable
  3. Low power consumption (as low as possible)
  4. Secure external access

One of the embedded Linux boxes I had my eye on before was the Bubba. It had its upsides, including Debian as a base OS, which made it pretty flexible. However, I already had a 500GB SATA drive sitting around and the Bubba is IDE only.

A little look around and I found the Synology website and a fantastic list of compact low power devices. Add to this their forums, which include a dedicated section on modding their products, and I was pretty much sold on buying something from them. Certainly for me, a company which is willing to openly encourage modifications and further development of its product range is a big attraction.

I chose the DS107+ in the end. It has only a single internal drive, but with an eSATA port, 3 USB 2.0 ports, a 500MHz processor, 128MB RAM, Gigabit Ethernet, a selection of third party bootstraps, several glowing reviews and the aforementioned forum/wiki, it looked like the perfect solution.

I’ve been playing around with it for a week now and am very pleased. It’s quiet, plays well with Xbox Media Center and, while including an array of useful web apps, is flexible enough to let me replace them where necessary.

I’ve replaced the included BitTorrent client (rTorrent based) with Transmission/Clutch and set up a few cron jobs and shell scripts to control it, and it’s running very well. The main reason for doing this is that the included client, whilst having a handy desktop download redirector application, is based on an old version of rTorrent and doesn’t seed very much (not in the spirit of things really!). The OS is BusyBox Linux, which at the time of writing is not totally open on the 107; annoying, but currently not a big issue.

One other point to add: while putting the drive in I noticed motherboard headers for a second SATA drive as well. It looks like the DS107 and DS207 products share the same board. It’s not something I’m concerned with now, and it would without a doubt void the warranty, but if needed in the future I have a feeling it would just take the firmware for the 207 to give me a mirrored RAID.

Here’s a couple of the reviews I found helpful:

I was reading on the train this morning (I don’t think I could go back to commuting without mobile internet!) that the BBC have reported on a leaked draft paper from the UK Government, published by the Times, proposing to force ISPs to take action against illegal downloaders with a ‘three strikes and out’ style system. As has been pointed out, the government seem to be ignoring the serious privacy implications this has for internet access, and what would seem an insurmountable task for the ISPs to undertake economically and practically. I’ve got an image in my head of a satirical political cartoon showing an MPAA/RIAA figure pushing Gordon Brown along, but anyway.

Hero of the guitar by Unhindered by Talent on Flickr

Aside from these points, the question is: are the media industries as a collective gradually responding better to consumer desires and the internet in general as a result of piracy? The rapid rise in piracy of music, films and television has left the industry models looking increasingly dated and almost ridiculous; indeed, if it could be charted it would correlate quite well with broadband uptake. They were left with three choices: embrace it, fight it or ignore it. As we’ve seen, they’ve steadily fought it, and probably thought after Napster that they could go back to sleep again.

CD Brulé by *** Fanch The System !!! *** on Flickr

In the last couple of years we’ve seen numerous press releases by the RIAA and MPAA citing the massive sales losses they’ve sustained as a result of piracy, and the press has reported on numerous cases against kids, the elderly and mothers across America and elsewhere, in dubious attempts to deter the masses. Apple introduced iTunes and it’s worked fairly well, with some debate, but that wasn’t the industry’s idea, that was Apple’s. Many of the other major online music stores have likewise been third parties approaching the music industry, not the industry reaching out to the consumer. The same sort of scenario has been playing out for video content with the same issues; indeed, the upcoming Pirate Bay case is seeing the founders gain steady international notoriety as thieves and the worst of criminals in one light, and as justified and innocent in another.

Only really in the last year are we seeing perhaps the start of serious attempts at DRM-free, decent(ish) quality music and video from the industry, and this is encouraged in many ways by piracy (catering for the users so they don’t pirate the material). It’s argued so much more now that piracy is slowly forcing the industry to respond better to the consumer, that piracy is a positive driving force for change and that it will eventually lead to a better solution.

Perhaps arguing that the end necessarily justifies the means is not an argument to start, but I do think it’s valid. The cost to the ISPs, who already have looming demand-induced upgrade costs, as well as the time and effort for the government to pass legislation for this, ultimately mean it will accomplish little. Others such as Michael Arrington have suggested that you make money from associated products and live events as opposed to the music, but I’m far more inclined to agree with the general principles of Paul Glazowski’s rebuttal, and in particular the “linear chicken and egg” analogy he makes in his first comment on his post. It’s unfortunate, however, to read in the comments a lack of understanding of the item’s value as opposed to its distribution costs.

As with most people I know, if I like music I’ll buy it; aside from anything else, I still prefer having a physical copy (maybe I’m getting old). I’m not advocating illegal downloads as such. However, piracy is giving the media industries a good kick down the path of progress.

TechCrunch UK alerted me to this first; the Guardian followed up with a piece later on (edited after comment from Mike Butcher, see comments). Finally, the heavyweights of the UK broadcast industry are teaming up to offer a combined on-demand television service.

A little while ago I wrote, among other things, the excerpt below in a follow-up entry to my college dissertation:

The biggest problem in my opinion facing the large scale adoption of both download and streaming television services is that everyone is offering their own solutions, instead of flicking the TV channel to see something different you end up closing down and then starting up another proprietary application or browsing to another webpage to view content from that one provider which seems from a user perspective a most unworkable and undesirable solution.

Nice to see they got the message one way or another; it’s just a pity so much money had to be spent on the BBC iPlayer before this happened (I know ITV and C4 spent money to build their respective offerings as well, but they are commercial entities, not tax beneficiaries, and so are entitled to do what they want without having to justify it to the country).

The piece by the Guardian makes mention of third party content but also, interestingly, of ultimately delivering content to the TV. It would be interesting to see if there are plans to perhaps integrate this with FreeSat, which is due to launch next year. There is of course hope on the TechCrunch post that they’ll employ a more user friendly (read: non-existent) DRM system which might also be a bit more open to other platforms and browsers, but I rather doubt this will happen. Both the BBC iPlayer and the 4oD service from C4 use the Kontiki system, which is very restrictive and very in love with Windows XP. Add to this the fact that the chap who’s managing all of this is Lesley MacKenzie of Sky fame, who also use the Kontiki system, and that seems like the easiest solution to employ from their point of view and that of their developers.

What’s frustrating in many ways is why they can’t go with Joost or similar as a front end for this. Looking at the What’s On page on the Joost website there are a lot of known channels popping up here and there, and the mechanism is already built. The cynic in me thinks the reason for not going with something like this is slight desperation by the broadcasters to hang onto whatever control they can, plus of course not losing too much of the money two of them have previously invested in the aforementioned Kontiki-based system!

Update: Jeremy Stone (BBC) posted on the BBC Backstage mailing list about an article Ashley Highfield has just written which explains why they’re not going with Joost or similar as a distribution method. I’m not convinced by his argument. I can see a certain point of view with advertising revenue (will this be there for UK users?), however I believe having two offerings which at face value provide very similar services will actually only confuse users more.

Anyway, no point in speculating too much at this point, we’ll have to wait a bit for the facts to come out.

Well, I’ve just finished re-installing the latest stable version of WordPress with my new WordPress theme enabled on it. I shouldn’t really say new theme, I should say first, as the reason I’ve had to re-install WordPress is because, for my sins, I didn’t take the time to understand how WordPress templates were meant to function when I installed it two years ago, and as a result modified WordPress core files.

This has caused me no end of problems over the last two years and I’m pleased I’ve finally dealt with it. I’ll be building the site up a bit more over the next couple of days, but I think I’m almost there after dealing with issues with the SimplePie plugin (which has its quirks), adding a few choice entries to my .htaccess files to hopefully make sure that previous URLs are not broken, and a few other little issues.

Ah well, lessons learnt:

  • Take more time to understand applications before charging in.
  • Don’t modify application code unless really unavoidable.
  • Take time when starting out to think about a long term usable URL structure so that you don’t end up potentially breaking trackbacks and other linkage (important for preserving one’s PageRank, however low it is!).

All very obvious in many ways, but all the same I ignored them a couple of years ago (web development for me, I must admit, has been a learning-by-doing exercise so I’m excusing myself in some ways!).

Anyway, enough for tonight…

At first I thought this was just a bug in the latest v9.5 build of Opera, but no, it seems that I’m not alone. Since Google finally added search to Google Reader the whole thing has sort of stopped working in Opera; thank goodness for OPML and Bloglines (which has improved quite a bit since I last commented on it). I’m still trying to understand why search wasn’t built into Google Reader in the first place, seeing as it’s what Google is good at and all, and also how they managed to release without testing in Opera. Now I’m not going to go on a ‘mine’s better than yours’ rant over it, as I am aware of the market figures for Opera desktop; however, the mobile version has a very strong following and Google Reader has stopped working on that as well. Quite a short-sighted move: seeing as it’s taken so long to add search as a feature, one might have thought they would have had time to test it!

Just picked up this one from one of my work related RSS feeds: essentially a DVB to IP multicast converter. This is designed for corporate environments by the sound of things and definitely not the home network, but it can’t be too long before someone makes a more compact version to do similar things on a home network; it’d be a perfect add-on for a home server. I know someone who has done this in Linux, sending out the transport stream from a DVB-T receiver over multicast and then using VLC to act as the ‘tuner’. It’s certainly something I’ll be trying once I get my home server up and running…