<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Stacked Notion</title>
	<atom:link href="http://www.stackednotion.com/feed" rel="self" type="application/rss+xml" />
	<link>http://www.stackednotion.com</link>
	<description></description>
	<lastBuildDate>Wed, 11 Aug 2010 08:30:33 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Tweeted Links: An introduction, and a review</title>
		<link>http://www.stackednotion.com/2010/08/11/tweeted-links-an-introduction-and-a-review</link>
		<comments>http://www.stackednotion.com/2010/08/11/tweeted-links-an-introduction-and-a-review#comments</comments>
		<pubDate>Wed, 11 Aug 2010 08:30:33 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=297</guid>
		<description><![CDATA[If you follow me on Twitter you have most likely seen my rants about my new service, Tweeted Links. The method behind the madness of it is basically this: I post a lot of links to Twitter, and often want to revisit them at a later time. I usually don&#8217;t Tweet anything meaningful with them [...]]]></description>
			<content:encoded><![CDATA[<p>If you follow me on Twitter you have most likely seen my rants about my new service, <a href="http://tweetedlinks.com/">Tweeted Links</a>. The method behind the madness of it is basically this: I post a lot of links to Twitter, and often want to revisit them at a later time. I usually don&#8217;t Tweet anything meaningful with them that will help searching (e.g. KK) and usually shorten the URLs. Tweeted Links solves that problem! It keeps the links in a single location (a la del.icio.us) and fetches the title of the page. It also solves the problem of shortened URLs by attempting to expand those too.</p>
<p>Another big point that I rather liked was that, from design through development to deployment, it took about 6 hours to get the first release out! I fixed a few bugs a couple of days later, so altogether about 8 hours of work have gone into it. Not too shabby for something useful! I decided to use something new to build it, so I used the <a href="http://monkrb.com/">Monk &#8216;glue&#8217; framework</a>, which by default combines Sinatra, Ohm and Redis. It is a pretty nice system, however like most Ruby software the default skeletons (or at least the ones mentioned on the site) are rather opinionated. I prefer Bundler for dependency management, however none of them used it. I wouldn&#8217;t have thought it would be terribly difficult to build a system that lets you add or remove gems from the skeleton so you can customise it to your liking; that would be a nice feature for a future revision of the website. I then rounded it off by running on Ruby 1.9.2 and Unicorn.</p>
<p>In terms of actually using the tools, I have used Sinatra before and played around with Redis and Ohm a bit, however for those I mainly relearnt everything I (thought I) knew. This was the first time I had properly used a key-value storage engine, so some things were a bit strange. As an example, I wanted to &#8216;expire&#8217; old users after 6 hours so that their statuses would be refetched. In SQL this is dead simple, but how do you replicate that in Redis? In the end I fetched all the records and compared the created_at field in code, though I have doubts that this is the optimum solution&#8230;.</p>
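<p>The fetch-everything-and-compare approach can be sketched roughly as below. This is only an illustration and not the code behind Tweeted Links: the <code>User</code> struct, the <code>expired_users</code> helper and the <code>EXPIRY_SECONDS</code> constant are hypothetical names, and a plain in-memory array stands in for the records loaded from Redis.</p>

```ruby
# Hypothetical in-memory stand-in for the user records fetched from the store.
User = Struct.new(:name, :created_at)

EXPIRY_SECONDS = 6 * 60 * 60 # six hours, as described in the post

# Returns the users whose created_at is at least EXPIRY_SECONDS before `now`.
def expired_users(users, now)
  users.select { |user| now - user.created_at >= EXPIRY_SECONDS }
end

now = Time.utc(2010, 8, 11, 12, 0, 0)
users = [
  User.new("fresh", now - 60 * 60),    # created one hour ago
  User.new("stale", now - 7 * 60 * 60) # created seven hours ago
]

# Only the record older than six hours is selected for refetching.
puts expired_users(users, now).map { |user| user.name }.inspect
```

<p>Passing <code>now</code> in as an argument, rather than calling Time.now inside the helper, keeps the comparison easy to check by hand. The cost is the one mentioned above: every record has to be loaded before it can be filtered.</p>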
<p>I also had a few teething problems that went unnoticed (for about 15 days &gt;_&lt;) which I recently resolved. I&#8217;m not sure what happened however the expiry daemon just stopped working, even restarting it didn&#8217;t help. In the end I restarted Redis which sorted it, so I&#8217;m not sure what was to blame there. <a href="http://twitter.com/johnno_uk">JNo</a> was the last user to be added to the system, so I blame him. Everything seems to be working again now, at least for a while&#8230;.</p>
<p>Another major problem (which I still haven&#8217;t resolved) is an issue with the Twitter search. Certain users just return no search results. The first user I found with this was the BBC News Twitter account, it had over 70,000 tweets so I thought maybe the search just didn&#8217;t index large accounts. However I then found a few more accounts that only had a couple of hundred tweets. I still haven&#8217;t got this resolved, mainly because I have yet to ask for help from the gods of Twitter <img src='http://www.stackednotion.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<p>So that about sums up Tweeted Links. I&#8217;m quite happy with the results so far, and am looking to expand it a bit further when I get time (I&#8217;m currently focussing on some Android work, which I&#8217;m sure there will be a few posts on over the next couple of weeks). I expect I will probably rewrite the system from scratch though, as to be honest, I&#8217;m not really too happy about how this key-value stuff has worked out so far. The rewrites will also involve some big data restructuring, so I don&#8217;t think it is a bad time to do so. Back to MySQL I say, yep I just said no to NoSQL!</p>
<div style='display:none' id="post-refEl-297"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/08/11/tweeted-links-an-introduction-and-a-review/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unicorns and Rainbows</title>
		<link>http://www.stackednotion.com/2010/07/22/unicorns-and-rainbows</link>
		<comments>http://www.stackednotion.com/2010/07/22/unicorns-and-rainbows#comments</comments>
		<pubDate>Thu, 22 Jul 2010 19:00:14 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Presentation]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=288</guid>
		<description><![CDATA[Last week I gave what we call a &#8220;Munch &#8216;n&#8217; Watch&#8221; at work. If anyone comes across a new technology they like the look of, over lunch they give a presentation about it. I have been playing with the Unicorn server recently, and I am using it to serve one of my side projects.

It is [...]]]></description>
			<content:encoded><![CDATA[<p>Last week I gave what we call a &#8220;Munch &#8216;n&#8217; Watch&#8221; at work. If anyone comes across a new technology they like the look of, over lunch they give a presentation about it. I have been playing with the <a href="http://unicorn.bogomips.org/">Unicorn server</a> recently, and I am using it to serve one of my <a href="http://www.tweetedlinks.com/">side projects</a>.</p>
<p><iframe src="http://docs.google.com/present/embed?id=ajh57dctftcs_221gsvk4ff5&#038;size=m" frameborder="0" width="555" height="451"></iframe></p>
<p>It is also available as a <a href='http://www.stackednotion.com/wp-content/uploads/2010/07/Unicorns_and_Rainbows_Munch_And_Watch.pdf'>PDF version</a>.</p>
<p>(On a side note I have been using Google Docs for everything recently. The number of features has grown enormously since I first tried it out. No more Open Office. Yay!)</p>
<div style='display:none' id="post-refEl-288"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/07/22/unicorns-and-rainbows/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What are BFS and CFS?</title>
		<link>http://www.stackednotion.com/2010/06/04/what-are-bfs-and-cfs</link>
		<comments>http://www.stackednotion.com/2010/06/04/what-are-bfs-and-cfs#comments</comments>
		<pubDate>Fri, 04 Jun 2010 21:30:00 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=281</guid>
		<description><![CDATA[A couple of weeks ago I was browsing the XDA Developers forums looking at custom kernels for my Nexus One. I came across a kernel that looked good, however I was a bit confused about the two different versions on offer. One was called BFS and the other CFS. At the time I must have [...]]]></description>
			<content:encoded><![CDATA[<p>A couple of weeks ago I was browsing the XDA Developers forums looking at custom kernels for my Nexus One. I came across a kernel that looked good, however I was a bit confused about the two different versions on offer. One was called BFS and the other CFS. At the time I must have been having a bit of a blonde moment as the two acronyms completely passed me by&#8230;..</p>
<p>BFS and CFS are both task schedulers for the Linux kernel. CFS (Completely Fair Scheduler) is the default scheduler in the majority of distributions, however it isn&#8217;t thought to be great. I won&#8217;t go into the details, however being relatively old it has built up quite a lot of bulk and the algorithms used are rather complicated.</p>
<p>BFS (Brain Fuck Scheduler) is the new kid on the block. It was written in 2009 after the author became annoyed with the random stalls experienced while using a Linux-based desktop machine. The scheduler is designed to offer low latency when used interactively, for example on a desktop machine, or a phone!</p>
<p>As stated, latency and random stalls should both be reduced. However BFS also has another trick up its sleeve. In benchmarks it performed 80% better when encoding video with x264!</p>
<p>BFS however isn&#8217;t (yet) going to make your Linux based systems super fast. Benchmark results are rather mixed, and discussions around it are rather heated. It is not currently included in the mainline Linux tree, and doesn&#8217;t look likely to be included anytime soon.</p>
<p>Either way, if you want to try it on your Android device, it is a quick flash away. CyanogenMod includes it by default, and there are plenty of different kernels out there you can try. Go ahead and try it, YMMV!</p>
<p><b>Further reading</b></p>
<ul>
<li>
<a href="http://ck.kolivas.org/patches/bfs/sched-BFS.txt">BFS &#8211; The Brain Fuck Scheduler by Con Kolivas</a> &#8211; An introduction by the author
</li>
<li>
<a href="http://ck.kolivas.org/patches/bfs/bfs-faq.txt">FAQS about BFS</a> &#8211; More details
</li>
<li>
<a href="http://x264dev.multimedia.cx/?p=185">Open source collaboration done right</a> &#8211; Thoughts of a codec developer on BFS
</li>
</ul>
<div style='display:none' id="post-refEl-281"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/06/04/what-are-bfs-and-cfs/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>ext4 options for a media drive</title>
		<link>http://www.stackednotion.com/2010/05/25/ext4-options-for-a-media-drive</link>
		<comments>http://www.stackednotion.com/2010/05/25/ext4-options-for-a-media-drive#comments</comments>
		<pubDate>Tue, 25 May 2010 20:45:03 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=265</guid>
		<description><![CDATA[At the moment I am reinstalling my desktop PC, and setting it up to be a media server / HTPC. One of the things that has baffled me is the best options to use for ext4 on the media partition. A quick search revealed nothing definite, hence this post. It is mainly as a reminder [...]]]></description>
			<content:encoded><![CDATA[<p>At the moment I am reinstalling my desktop PC, and setting it up to be a media server / HTPC. One of the things that has baffled me is the best options to use for ext4 on the media partition. A quick search revealed nothing definite, hence this post. It is mainly as a reminder for when I reinstall (although I&#8217;ll probably use btrfs when it is more stable), however hopefully someone else will find it useful!</p>
<p><b>Intro</b><br />
I have a 1TB drive which I use as a media drive. I currently have it partitioned as follows:<br />
<code>
<pre># fdisk /dev/sdb -l

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xffffffff

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1         608     4883728+  83  Linux
/dev/sdb2             609      121601   971876272+  8e  Linux LVM
</pre>
<p></code></p>
<p>Yeah, nothing fancy: a small 5GB partition at the start and the rest set up in LVM. The small partition serves as a simple recovery partition; if I want to access the contents of the drive from another OS, I can do so just by booting it up in a virtual machine. 5GB is probably a bit big, however it isn&#8217;t as if space is scarce.</p>
<p>In terms of LVM I just have a volume taking up the full size of the disk:<br />
<code>
<pre># lvdisplay
  --- Logical volume ---
  LV Name                /dev/vg/media
  VG Name                vg
  LV UUID                2OJIN6-sAx8-KRIz-SNvl-RU0I-2Xgl-nzUH4V
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                926.85 GiB
  Current LE             237274
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0
</pre>
<p></code></p>
<p>I have previously set up different partitions for certain things, however in the end that just became a pain when one partition got too small. This setup kind of defeats the point of LVM, however if I want another partition (e.g. to dual boot) or to add another disk I can easily do so.</p>
<p><img src="http://www.stackednotion.com/wp-content/uploads/2010/05/default.aspx_.jpeg" alt="" title="default.aspx" width="499" height="374" class="aligncenter size-full wp-image-274" /></p>
<p><b>ext4</b><br />
Using the default options, on the face of it, ext4 doesn&#8217;t provide any significant advantages over ext3, or even ext2 when used for a media drive. However there are some nice features behind the scenes:</p>
<ul>
<li>Larger maximum filesystem size (ext3 is limited to 16TB)</li>
<li>Extents (a set of contiguous blocks; improves performance and leads to decreased fragmentation)</li>
<li>Multiblock allocation (as opposed to single block allocation; improves write performance for large files)</li>
<li>Delayed allocation (doesn&#8217;t write the file to disk immediately, waiting until more of the file is in the cache, and then allocating a larger number of blocks; leads to decreased fragmentation)</li>
<li>Faster fsck (Try running fsck on a 1TB ext2 volume ^_^)</li>
<li>Persistent preallocation (Reserves a large number of blocks, which aren&#8217;t going to be used yet but will be at some point; on par with what modern P2P applications do by filling a file with zeros, however implemented in the filesystem so it is much more efficient)</li>
</ul>
<p>A good introduction to ext4 (which these were taken from) is the <a href="http://kernelnewbies.org/Ext4">ext4 article on Kernel Newbies</a>.</p>
<p><b>Creating the filesystem</b></p>
<p>The options for mkfs.ext4 are basically unchanged from ext3, in order to enable the high performance options most of the work will be done in tune2fs and mount options. Even so for now</p>
<p><code>
<pre>
</pre>
<p></code></p>
<p>The b option sets the block size (this will most likely default to 4096 bytes), the m option sets the percentage of space reserved for the super user, and the L option sets the filesystem label. This command shouldn&#8217;t take more than a couple of minutes to run.</p>
<p><img src="http://www.stackednotion.com/wp-content/uploads/2010/05/funny-pictures-kitten-erases-your-hard-drive.jpeg" alt="" title="funny-pictures-kitten-erases-your-hard-drive" width="400" height="300" class="aligncenter size-full wp-image-276" /></p>
<p><b>Optimising</b></p>
<p>Next up is to pass some options to tune2fs to enable the advanced features. These options could have been passed to the previous command, however for ease of explanation they are listed separately. You can also run this command to upgrade an existing ext3 file system. A lot of these are enabled by default, however they are listed here for completeness&#8217; sake.</p>
<ul>
<li>has_journal &#8211; Enables the ext4 journal</li>
<li>extent &#8211; Enables extents (see above)</li>
<li>huge_file &#8211; Enables files over 2TB in size</li>
<li>flex_bg &#8211; Allows inode metadata to be placed anywhere on the partition (as opposed to traditionally at the start)</li>
<li>uninit_bg &#8211; Enables new blocks to not be initialised and enables block checksumming. Significantly speeds up creation and fsck</li>
<li>dir_nlink &#8211; Enables unlimited sub directories</li>
<li>dir_index &#8211; Enables a hashed B-tree for linking subdirectories</li>
<li>extra_isize &#8211; Enables nanoseconds to be stored in timestamps</li>
</ul>
<p><code>
<pre>
# tune2fs -O has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,dir_index,extra_isize /dev/vg/media
tune2fs 1.41.12 (17-May-2010)
</pre>
<p></code></p>
<p>As this just enables various options on the file system it will execute rather quickly, especially if the file system is empty. Note that if you are migrating an existing partition, most of these options will only apply to newly created files.</p>
<p><b>Mount options</b></p>
<p>The real killer performance can be obtained by setting certain mount options. Note that these seriously increase the likelihood of corruption if a system crash occurs, however in the case of a media drive, where there are a limited number of writes, I don&#8217;t feel that this is an issue.</p>
<ul>
<li>barrier=0 &#8211; Disables write barriers, a protection that ensures everything is written to the journal before being committed. Barriers are the default on the majority of systems, however they decrease performance by ~30%</li>
<li>commit=60 &#8211; Only commits the journal to disk every 60 seconds</li>
<li>noatime &#8211; Disables the access time attribute which causes the metadata to be updated every time a file is accessed</li>
<li>data=writeback &#8211; Enables journaling of metadata only; can cause files to become corrupt if a system crash occurs</li>
<li>journal_async_commit &#8211; Write the journal to disk asynchronously</li>
</ul>
<p>These can be set in your /etc/fstab, e.g.:</p>
<p><code>
<pre>/dev/vg/media	/media	ext4	defaults,user,barrier=0,commit=60,noatime,data=writeback,journal_async_commit	0	0
</pre>
<p></code></p>
<p>So that&#8217;s it! If you find any other options that you deem useful, feel free to post them in the comments!</p>
<div style='display:none' id="post-refEl-265"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/05/25/ext4-options-for-a-media-drive/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting Social</title>
		<link>http://www.stackednotion.com/2010/04/19/getting-social</link>
		<comments>http://www.stackednotion.com/2010/04/19/getting-social#comments</comments>
		<pubDate>Mon, 19 Apr 2010 20:30:24 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Random]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[Updates]]></category>
		<category><![CDATA[XAuth]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=239</guid>
		<description><![CDATA[If you are reading this post on the full site, you will notice a new bar at the bottom of the page. I&#8217;ve added the Meebo Bar which lets you do cool social-shary stuff.
At the moment it isn&#8217;t really that impressive, however it looks like they are going to be doing some cool stuff with [...]]]></description>
			<content:encoded><![CDATA[<p>If you are reading this post on the full site, you will notice a new bar at the bottom of the page. I&#8217;ve added the <a href="http://bar.meebo.com/">Meebo Bar</a> which lets you do cool social-shary stuff.</p>
<p>At the moment it isn&#8217;t really that impressive, however it looks like they are going to be doing some cool stuff with <a href="http://www.xauth.org/">XAuth</a>, a service that shares what social connections you have with a site you are visiting. So instead of a trillion &#8216;Share on &lt;enter site&gt;&#8217; buttons, you will just have one to share on everything. Nice!</p>
<p>Check out the video below for an introduction to XAuth by the Meebo guys:</p>
<p><object width="600" height="400"><param name="movie" value="http://www.youtube.com/v/-UjXswWs7xg&#038;color1=0xb1b1b1&#038;color2=0xcfcfcf&#038;hl=en_US&#038;feature=player_embedded&#038;fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowScriptAccess" value="always"></param><embed wmode="transparent" src="http://www.youtube.com/v/-UjXswWs7xg&#038;color1=0xb1b1b1&#038;color2=0xcfcfcf&#038;hl=en_US&#038;feature=player_embedded&#038;fs=1" type="application/x-shockwave-flash" allowfullscreen="true" allowScriptAccess="always" width="600" height="400"></embed></object></p>
<div style='display:none' id="post-refEl-239"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/04/19/getting-social/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Watch your UTCing&#8230;</title>
		<link>http://www.stackednotion.com/2010/04/15/watch-your-utcing</link>
		<comments>http://www.stackednotion.com/2010/04/15/watch-your-utcing#comments</comments>
		<pubDate>Thu, 15 Apr 2010 14:00:58 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[BST]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[DST]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Time]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=234</guid>
		<description><![CDATA[After a good few hours of investigating a bug at work yesterday, I came across a weird Ruby feature&#8230;. The #utc method on Time objects is destructive and modifies the receiver.
$ irb
ruby-1.8.6-p383 > a = Time.now
 => Wed Apr 14 15:36:21 +0100 2010
ruby-1.8.6-p383 > a.utc
 => Wed Apr 14 14:36:21 UTC 2010
ruby-1.8.6-p383 > a
 => [...]]]></description>
			<content:encoded><![CDATA[<p>After a good few hours of investigating a bug at work yesterday, I came across a weird Ruby feature&#8230;. The #utc method on Time objects is destructive and modifies the receiver.</p>
<p><code>$ irb<br />
ruby-1.8.6-p383 > a = Time.now<br />
 => Wed Apr 14 15:36:21 +0100 2010<br />
ruby-1.8.6-p383 > a.utc<br />
 => Wed Apr 14 14:36:21 UTC 2010<br />
ruby-1.8.6-p383 > a<br />
 => Wed Apr 14 14:36:21 UTC 2010</code></p>
<p>This is actually <a href="http://ruby-doc.org/core/classes/Time.html#M000265">in the Ruby docs</a>, but who reads those? This is kind of unexpected behaviour, as in most cases in Ruby (although this is more of a Rails thing), if there are two versions of a method the more dangerous of the two is suffixed with an exclamation mark. A good example is String#strip and String#strip!. The version without an exclamation mark doesn&#8217;t modify the original object, whereas the version with one does:</p>
<p><code>$ irb<br />
ruby-1.8.6-p383 > a = " hello "<br />
 => " hello "<br />
ruby-1.8.6-p383 > a.strip<br />
 => "hello"<br />
ruby-1.8.6-p383 > a<br />
 => " hello "<br />
ruby-1.8.6-p383 > a.strip!<br />
 => "hello"<br />
ruby-1.8.6-p383 > a<br />
 => "hello" </code></p>
<p>So if you are going to use #utc, be careful and make sure you aren&#8217;t reliant on the time offset later. You should really be using #getutc, or calling #utc on a #dup of the object, neither of which modifies the receiver:</p>
<p><code>$ irb<br />
ruby-1.8.6-p383 > b = Time.now<br />
 => Wed Apr 14 15:36:59 +0100 2010<br />
ruby-1.8.6-p383 > b.getutc<br />
 => Wed Apr 14 14:36:59 UTC 2010<br />
ruby-1.8.6-p383 > b<br />
 => Wed Apr 14 15:36:59 +0100 2010</code></p>
<p>This is present in all current versions of Ruby, so upgrading won&#8217;t help either!</p>
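<p>The #dup route mentioned above can be sketched as follows. This is a minimal illustration rather than code from the original bug: it uses a fixed instant instead of Time.now so the result is repeatable, and it assumes Ruby 1.9.2 or later, because passing an offset string to #getlocal is not available in the 1.8.6 sessions shown above.</p>

```ruby
# A fixed instant displayed with a +01:00 offset, standing in for Time.now during BST.
local = Time.utc(2010, 4, 14, 14, 36, 21).getlocal("+01:00")

# Calling #utc on a copy converts the copy and leaves the original untouched.
copy_in_utc = local.dup.utc

puts local.utc_offset       # offset of the original, in seconds
puts copy_in_utc.utc_offset # offset of the converted copy, in seconds
puts local == copy_in_utc   # Time#== compares the instant, not the offset
```

<p>Both objects describe the same moment; only the copy has had its offset changed, so anything later relying on the original&#8217;s local offset still works.</p>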
<div style='display:none' id="post-refEl-234"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/04/15/watch-your-utcing/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How I used Twitter to harvest 8000 email addresses in twelve hours</title>
		<link>http://www.stackednotion.com/2010/04/07/how-i-used-twitter-to-harvest-8000-email-addresses-in-twelve-hours</link>
		<comments>http://www.stackednotion.com/2010/04/07/how-i-used-twitter-to-harvest-8000-email-addresses-in-twelve-hours#comments</comments>
		<pubDate>Wed, 07 Apr 2010 14:00:28 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=220</guid>
		<description><![CDATA[After arriving home from work this evening I turned to the Twittersphere to see what was happening in the Blogosphere. I came across an article detailing how some guy got sued by Facebook for scraping their site. It is a rather interesting read, and rather eye-opening at how easy it is to do such things [...]]]></description>
			<content:encoded><![CDATA[<p>After arriving home from work this evening I turned to the Twittersphere to see what was happening in the Blogosphere. I came across an article detailing <a href="http://petewarden.typepad.com/searchbrowser/2010/04/how-i-got-sued-by-facebook.html">how some guy got sued by Facebook</a> for scraping their site. It is a rather interesting read, and rather eye-opening at how easy it is to do such things on social networking sites. Facebook has to be one of the worst in terms of security, or at least apparent security; at least on Twitter you know everyone can access what you post unless your profile is private.</p>
<p>Back to the topic, the idea I had was to investigate how (ridiculously) easy it would be to scrape email addresses from Twitter. Having already created <a href="http://twitter.com/Rainbow_bot">the Rainbow Bot</a>, I knew that searching for particular terms on Twitter using <a href="http://github.com/jnunemaker/twitter">a certain gem</a> was a piece of cake. So I started searching for &lsquo;@gmail&rsquo;.</p>
<p>The results were pretty much as expected: rather a lot of users Tweeted their email address. Most of them used some sort of simple obfuscation (a space between the user and domain; non-word characters before the domain), and it was rather easy to tell when an address was invalid (three consecutive dots is a sure-fire no-no). Also interesting was the number of users who replaced the user part of the address with &lsquo;username&rsquo; or &lsquo;twitter&rsquo;, for example &#8216;Send CVs to my twitter @gmail.com&#8217;.</p>
<p>I then decided to expand this for a couple of other free email services (Hotmail and Yahoo), and as expected, the number of addresses went up. A simple search gave me <a href="http://www.zemskov.net/free-email-domains.html">a list of free email domains</a> (NB: I have doubts that is a list of &#8216;every possible email domains&#8217; <img src='http://www.stackednotion.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> ), which would allow me to expand this further. I didn&#8217;t even bother investigating handling of pagination, and without special permissions the Twitter search API only returns a limited subset of the total number of Tweets.</p>
<p>In total I managed to scrape 8000 addresses over a period of approximately twelve hours. So in conclusion, social networks appear to be a spammer&#8217;s paradise. <img src='http://www.stackednotion.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<div style='display:none' id="post-refEl-220"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/04/07/how-i-used-twitter-to-harvest-8000-email-addresses-in-twelve-hours/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Nice Loader</title>
		<link>http://www.stackednotion.com/2010/03/03/nice-loader</link>
		<comments>http://www.stackednotion.com/2010/03/03/nice-loader#comments</comments>
		<pubDate>Wed, 03 Mar 2010 21:51:11 +0000</pubDate>
		<dc:creator>Luca</dc:creator>
				<category><![CDATA[Canvas]]></category>
		<category><![CDATA[HTML5]]></category>
		<category><![CDATA[Javascript]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/?p=213</guid>
		<description><![CDATA[Since the start of Web 2.0, there have been a number of fancy loaders for Ajax requests and the like. Most of them are based on the spinner type, which, to be honest, is getting a bit old. Today I saw something original in Tweetdeck, and decided to create an HTML5 / Canvas version. [...]]]></description>
			<content:encoded><![CDATA[<p>Since the start of Web 2.0, there have been a number of fancy loaders for Ajax requests and the like. Most of them are based on the spinner type, which, to be honest, is getting a bit old. Today I saw something original in Tweetdeck, and decided to create an HTML5 / Canvas version. Enjoy! </p>
<p><center><br />
<canvas id="block-loader" width="200" height="100"></canvas><br />
</center></p>
<p><script src="/uploads/prototype.js" type="text/javascript"></script><br />
<script src="/uploads/2009/08/rgbcolor.js" type="text/javascript"></script><br />
<script src="/uploads/2010/03/block-loader.js" type="text/javascript"></script></p>
<div style='display:none' id="post-refEl-213"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2010/03/03/nice-loader/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Daylight Savings Time Woes</title>
		<link>http://www.stackednotion.com/2009/12/22/daylight-savings-time-woes</link>
		<comments>http://www.stackednotion.com/2009/12/22/daylight-savings-time-woes#comments</comments>
		<pubDate>Tue, 22 Dec 2009 00:00:00 +0000</pubDate>
		<dc:creator></dc:creator>
				<category><![CDATA[BST]]></category>
		<category><![CDATA[DST]]></category>
		<category><![CDATA[Time]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/articles/2009/12/22/dst_woes</guid>
		<description><![CDATA[Daylight Savings Time (DST) is one of those things you don&#8217;t really think about. You get an hour less in bed in the Spring and an extra hour in the Autumn; other than that and having to change the clocks you don&#8217;t have to think about much. Oh my, just wait until you have [...]]]></description>
			<content:encoded><![CDATA[<p>Daylight Savings Time (DST) is one of those things you don&#8217;t really think about. You get an hour less in bed in the Spring and an extra hour in the Autumn; other than that and having to change the clocks you don&#8217;t have to think about much. Oh my, just wait until you have to work on a system that has to deal with it&#8230;</p>
<p>Here in Blighty we call it British Summer Time (BST). This year (2009) BST started on Sunday, March 29th at 01:00 GMT, and ended on Sunday, October 25th at 01:00 GMT (02:00 BST).</p>
<p>If your system is dealing with dates that are in GMT everything is fine and dandy. Not quite so if you are given dates in BST, however that is still fairly easy to handle if you have a good time library. The issues pop up when you are given times in local time and have no idea whether they are BST or GMT. When the clocks go back in the autumn there is an hour during which 01:33 could be BST or GMT.</p>
<p>The solution for this is quite simple though: NEVER EVER SEND TIMES IN LOCAL TIME, ESPECIALLY NOT WITHOUT A TIMEZONE.</p>
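<p>As a rough sketch of that ambiguity (in Python, with hand-written fixed offsets standing in for a real time zone database, and variable names that are purely illustrative), the same wall-clock reading of 01:33 on 25 October 2009 maps to two instants an hour apart unless the offset travels with it:</p>

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones for illustration only; a real system would rely on a
# time zone database rather than hard-coded offsets.
GMT = timezone(timedelta(hours=0), "GMT")
BST = timezone(timedelta(hours=1), "BST")

# The wall-clock reading from the example: 01:33 on 25 October 2009,
# with no zone attached.
naive = datetime(2009, 10, 25, 1, 33)

# The same reading interpreted in each zone.
as_bst = naive.replace(tzinfo=BST)
as_gmt = naive.replace(tzinfo=GMT)

# The two interpretations are a full hour apart in absolute time.
gap = as_gmt - as_bst

# Sending the offset, or converting to UTC first, removes the ambiguity.
wire_bst = as_bst.isoformat()
wire_gmt = as_gmt.isoformat()
wire_utc = as_bst.astimezone(timezone.utc).isoformat()

print(naive.isoformat(), "is ambiguous on its own")
print(wire_bst, "and", wire_gmt, "differ by", gap)
print("BST reading in UTC:", wire_utc)
```

<p>Either of the offset-bearing strings, or the UTC one, can be parsed back without guessing; the bare local reading cannot.</p>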
<div style='display:none' id="post-refEl-162"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2009/12/22/daylight-savings-time-woes/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Off To The City</title>
		<link>http://www.stackednotion.com/2009/09/12/off-to-the-city</link>
		<comments>http://www.stackednotion.com/2009/09/12/off-to-the-city#comments</comments>
		<pubDate>Sat, 12 Sep 2009 23:00:00 +0000</pubDate>
		<dc:creator></dc:creator>
				<category><![CDATA[City]]></category>
		<category><![CDATA[Moving]]></category>
		<category><![CDATA[Real Life]]></category>
		<category><![CDATA[Updates]]></category>

		<guid isPermaLink="false">http://www.stackednotion.com/articles/2009/09/13/off_to_the_city</guid>
		<description><![CDATA[Well, tomorrow I am going to be moving up to London. Job interview last Monday, told the next day that I am starting on Tuesday. It is all a bit surreal really. I haven&#8217;t yet got any accommodation sorted out, so I am going to be crashing with family who live just outside London, and [...]]]></description>
			<content:encoded><![CDATA[<p>Well, tomorrow I am going to be moving up to London. Job interview last Monday, told the next day that I am starting on Tuesday. It is all a bit surreal really. I haven&#8217;t yet got any accommodation sorted out, so I am going to be crashing with family who live just outside London, and commuting in every day. I hope to have my own place sorted by this time next week.</p>
<div class="image">
          <img src="/uploads/city_map.jpg" /><br />
          Original by <a href="http://www.flickr.com/photos/travelmatt/514467812/">travelmatt</a>.
        </div>
<p>I am rather excited about the job though. My first proper job with a proper company! The guys at the interview seemed really cool, so it should be good working with them. The actual role is going to involve mainly backend Ruby development, so expect to see lots of Ruby posts in the future! They run a full Scrum environment, so it should be good to see how that works in practice. I don&#8217;t think it was even mentioned during my degree, let alone put into practice&#8230; The offices are in a great location as well, right next to Waterloo station. A quick hunt after the interview turned up a Sainsbury&#8217;s, a Starbucks, and lots of pubs; what more do I need?</p>
<p>Regarding the accommodation, I have had a quick look online, however I figured it would be better to wait until a) I have a phone signal and b) I can go to places in person. As such, tomorrow is going to be the prime house hunt day. As recommended by the recruitment company, I am looking in the South West, around the Clapham Junction area. This apparently has lots of young professionals, and isn&#8217;t too pricey. I can&#8217;t afford anything in the North (yet) and the East is a bit far out from everything.</p>
<p>Prices aren&#8217;t too bad either: for a shared house you are looking at about £125, and, depending on the area, about double that for a small flat / bedsit. I am looking on sites such as <a href='http://www.gumtree.com/london_houses_to_rent_offered.html'>Gumtree</a>, as most of the letting agencies seem to have the more expensive properties. I am not really sure of the best way to go about this house hunt, so that will be another post when I have a better clue.</p>
<div style='display:none' id="post-refEl-163"></div>]]></content:encoded>
			<wfw:commentRss>http://www.stackednotion.com/2009/09/12/off-to-the-city/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
