<?xml version="1.0"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
   <channel>
      <pubDate>Sat, 11 Oct 2008 07:13:33 GMT</pubDate>
      <lastBuildDate>Sat, 11 Oct 2008 07:13:33 GMT</lastBuildDate>
      <language>en</language>
      <docs>http://www.rssboard.org/rss-specification</docs>
      <title>CrunchBang ~ language</title>
      <link>http://crunchbang.org/tags/language/</link>
      <description>Code, Design &amp; GNU/Linux</description>

<item>
    <title>Contractions, Waffle, Mrs Briggs &amp; Data</title>
    <link>http://crunchbang.org/archives/2008/06/16/contractions-waffle-mrs-briggs-and-data/</link>
    <pubDate>Mon, 16 Jun 2008 15:04:06 GMT</pubDate>
    <dc:creator>Philip Newborough</dc:creator>
    <guid>http://crunchbang.org/archives/2008/06/16/contractions-waffle-mrs-briggs-and-data/</guid>
    <description><![CDATA[
    <p><img src="http://crunchbang.org/uploads/061508192407-data.png" alt="A stylised image of the Star Trek character Data." /></p>

<p>For the last month or so <strike>I&#39;ve</strike> I have been attempting to eliminate contractions from my blog posts. Initially I found the process quite difficult and <strike>I&#39;d</strike> I would often find myself struggling with basic English. One word which troubled me was, &#34;cannot&#34;, which for a while at least, existed in my head as two separate words; I <strike>can&#39;t</strike> <strike>can not</strike> cannot imagine why? Anyhow, I think <strike>I&#39;m</strike> I am finally beginning to get the hang of it.</p>

<p><strike>I&#39;m</strike> I am not entirely sure why I decided to stop using contractions; maybe <strike>it&#39;s</strike> it has got something to do with my need to experiment? Or, maybe <strike>I&#39;d</strike> I had previously read somewhere that contractions cause issues with non-human translation services. Either way, <strike>I&#39;m</strike> I am quite enjoying the experience, although I fear that it <strike>doesn&#39;t</strike> does not aid the flow of my written gibberish.</p>

<p>While <strike>I&#39;m</strike> I am on the subject of my poorly scribed waffle, <strike>it&#39;s</strike> it has got to be said that writing <strike>doesn&#39;t</strike> does not come naturally to me. The reason my writing <strike>isn&#39;t</strike> is not often easy to read <strike>isn&#39;t</strike> is not entirely due to my recent sans-contraction experiment, no, I believe <strike>it&#39;s</strike> it has more to do with Mrs Briggs, who was both my secondary school English teacher and the biggest distraction throughout my secondary education. Actually, <strike>that&#39;s</strike> that is not completely true, the distractions were her long legs, short skirts and fancy knickers [<em><strike>don&#39;t</strike> do not ask</em>]; which in my humble opinion, <strike>isn&#39;t</strike> is not suitable attire for a secondary school English teacher. Maybe I <strike>should&#39;ve</strike> should have said something at the time? Thinking about it now, <strike>I&#39;m</strike> I am glad I <strike>didn&#39;t</strike> did not say anything because <strike>I&#39;m</strike> I am sure <strike>she&#39;d&#39;ve</strike> she would have flipped out; besides, no normal hormonal teenage boy is going to complain about such things.</p>

<p>Anyway, back to the subject of contractions; if <strike>you&#39;re</strike> you are wondering how all this relates to <a href="http://memory-alpha.org/en/wiki/Data" title="Star trekking across the universe..">Data</a>, well, <strike>it&#39;s</strike> it is a known fact that <strike>Data&#39;s</strike> Data has got issues with verbal contractions in ordinary speech, which is amusing when you consider <strike>he&#39;s</strike> he has got a total linear computational speed rated at sixty trillion operations per second, yet he <strike>can&#39;t</strike> <strike>can not</strike> cannot say, &#34;can&#39;t&#34;. Silly android.</p>

<p>P.S. I thought <strike>it&#39;d</strike> it would be fun to write like this, but to be honest, <strike>&#39;tisn&#39;t</strike> it is not. <strike>&#39;tisn&#39;t</strike> It is not going to happen again ;-)</p>

    <p style="font-size:smaller;">Tags: <a href="http://crunchbang.org/tags/fun/" title="Browse all posts tagged with &#8220;fun&#8221;">fun</a>, <a href="http://crunchbang.org/tags/language/" title="Browse all posts tagged with &#8220;language&#8221;">language</a>, <a href="http://crunchbang.org/tags/life/" title="Browse all posts tagged with &#8220;life&#8221;">life</a>, <a href="http://crunchbang.org/tags/random/" title="Browse all posts tagged with &#8220;random&#8221;">random</a></p>
    ]]></description>
</item>

<item>
    <title>5 Or More Consecutive Consonants</title>
    <link>http://crunchbang.org/archives/2008/05/10/5-or-more-consecutive-consonants/</link>
    <pubDate>Sat, 10 May 2008 09:29:11 GMT</pubDate>
    <dc:creator>Philip Newborough</dc:creator>
    <guid>http://crunchbang.org/archives/2008/05/10/5-or-more-consecutive-consonants/</guid>
    <description><![CDATA[
    <p><img src="http://crunchbang.org/uploads/051008084936-carol_vorderman.jpg" alt="Carol Vorderman" style="float:left;border:0px;margin-right:20px;margin-bottom:10px; padding:4px; background: #babdb6;" /></p>

<p>&#34;A consonant please Carol, and another, and another, and another, and another.&#34; &#8212; actually, this post is not about <a href="http://en.wikipedia.org/wiki/Carol_Vorderman" title="Wikipedia - Carol Vorderman">Carol Vorderman</a> or <a href="http://en.wikipedia.org/wiki/Countdown_%28game_show%29" title="Wikipedia - Countdown">Countdown</a>; it is about some interesting[<em>?</em>] script output I came across when attempting to write a new spam filter. I will explain&#8230;</p>

<p>Just lately my website has been receiving some rather odd junk comments. The comments make no sense and they have quite obviously been sent by some automated junk-flinging robot. They make no sense because they seem to be constructed from random characters. Apart from making no sense, these comments <em>were</em> also becoming a nuisance as they <em>were</em> easily slipping past my existing keyword filters.</p>

<p>So, the other night I decided to sit down and write a new filter to catch these random-character junk comments. I started by analysing some previously submitted comments to find any common patterns. One such pattern was that many of them contained strings of 5 or more consecutive consonants. Thinking this to be unusual, I ran some tests against a <a href="http://crunchbang.org/misc/common_words.txt" title="flat file containing 21110 common English words">flat file containing 21110 common English words</a>. I thought the results were interesting. Here is what I found:</p>

<ul>
<li><a href="http://crunchbang.org/wiki/strings-containing-5-or-more-consecutive-consonants/" title="85 unique strings containing 5 or more consecutive consonants.">85 unique strings containing 5 or more consecutive consonants</a>.</li>
<li><a href="http://crunchbang.org/wiki/words-containing-5-or-more-consecutive-consonants/" title="113 words containing 5 or more consecutive consonants.">113 words containing 5 or more consecutive consonants</a>.</li>
<li>9 words containing 5 or more consecutive consonants and no vowels: <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=crypt" title="dict.org - crypt">crypt</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=lymph" title="dict.org - lymph">lymph</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=lynch" title="dict.org - lynch">lynch</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=myrrh" title="dict.org - myrrh">myrrh</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=nymph" title="dict.org - nymph">nymph</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=pygmy" title="dict.org - pygmy">pygmy</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=rhythm" title="dict.org - rhythm">rhythm</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=sylph" title="dict.org - sylph">sylph</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=tryst" title="dict.org - tryst">tryst</a></li>
<li>10 words containing 6 or more consecutive consonants: <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=latchstring" title="dict.org - latchstring">latchstring</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=metempsychosis" title="dict.org - metempsychosis">metempsychosis</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=polysyllabic" title="dict.org - polysyllabic">polysyllabic</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=polysyllable" title="dict.org - polysyllable">polysyllable</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=porphyry" title="dict.org - porphyry">porphyry</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=rhythm" title="dict.org - rhythm">rhythm</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=skyscraper" title="dict.org - skyscraper">skyscraper</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=strychnine" title="dict.org - strychnine">strychnine</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=synchronize" title="dict.org - synchronize">synchronize</a>, <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=synchronous" title="dict.org - synchronous">synchronous</a></li>
<li>1 word containing 6 consecutive consonants and no vowels: <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=rhythm" title="dict.org - rhythm">rhythm</a></li>
<li>1 word containing 7 consecutive consonants: <a href="http://www.dict.org/bin/Dict?Form=Dict1&amp;Strategy=*&amp;Database=*&amp;Query=strychnine" title="dict.org - strychnine">strychnine</a></li>
</ul>
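
<p>For anyone curious how a test like this can be run, here is a minimal Python sketch. It is not the script used for this post: the function names and the sample words are made up for illustration, and it assumes that &#34;y&#34; counts as a consonant, which the results above imply.</p>

```python
import re

# Assumption: "y" is treated as a consonant, so the only vowels are a, e, i, o, u.
CONSONANT_RUN = re.compile(r"[b-df-hj-np-tv-z]+")

def longest_consonant_run(word):
    """Return the length of the longest run of consecutive consonants in a word."""
    runs = CONSONANT_RUN.findall(word.lower())
    return max((len(run) for run in runs), default=0)

def words_with_run(words, minimum=5):
    """Return the words that contain at least `minimum` consecutive consonants."""
    return [w for w in words if longest_consonant_run(w) >= minimum]

# Hypothetical sample; a real run would read the words from a flat file instead.
sample = ["rhythm", "strychnine", "skyscraper", "banana", "crypt"]
print(words_with_run(sample))
print(longest_consonant_run("strychnine"))
```

<p>Raising <code>minimum</code> to 6 or 7 would narrow the list in the same way as the later items above.</p>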

<p>I should state that the above results are in no way definitive. I know this because I also ran the same test against another file containing 311141 words found in the <a href="http://en.wikipedia.org/wiki/An_American_Dictionary_of_the_English_Language" title="Wikipedia - Merriam-Webster dictionary">Merriam-Webster dictionary</a>. Still, by using the results of the initial test I was able to construct a list of safe words to use with my new spam filter.</p>
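
<p>To give an idea of how a safe-word list might fit into such a filter, here is a rough Python sketch. The <code>SAFE_WORDS</code> set, the <code>looks_like_junk</code> function and the threshold of 5 are assumptions for illustration only; this is not the filter running on this site.</p>

```python
import re

# Assumption: "y" counts as a consonant, as the word lists in this post imply.
LONG_RUN = re.compile(r"[b-df-hj-np-tv-z]{5,}")
WORD = re.compile(r"[a-z]+")

# Hypothetical safe list: a handful of the real words mentioned in this post.
SAFE_WORDS = {"rhythm", "strychnine", "skyscraper", "synchronize", "crypt"}

def looks_like_junk(comment, safe_words=SAFE_WORDS):
    """Flag a comment if any word outside the safe list has 5+ consecutive consonants."""
    for word in WORD.findall(comment.lower()):
        if word not in safe_words and LONG_RUN.search(word):
            return True
    return False

print(looks_like_junk("I love the rhythm of this skyscraper"))
print(looks_like_junk("xkcdfgh qwrtplk zzbnmw"))
```

<p>A check like this would only be one signal among several, since a short safe list will never cover every legitimate word.</p>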

<p><em>Finally, yes, I did consider not writing this post; however, I am sure my publishing of these results will not change anything. Besides, Arthur, my 80-year-old neighbour, is the biggest Countdown fan on the planet. He is also quite Internet savvy and definitely thinks <a href="http://images.google.com/images?q=carol+vorderman+countdown" title="Carol Vorderman, hot or not?">Carol Vorderman is hot</a> &#8212; he may find these results quite useful in increasing his daily Countdown score!</em></p>

    <p style="font-size:smaller;">Tags: <a href="http://crunchbang.org/tags/antispam/" title="Browse all posts tagged with &#8220;antispam&#8221;">antispam</a>, <a href="http://crunchbang.org/tags/language/" title="Browse all posts tagged with &#8220;language&#8221;">language</a>, <a href="http://crunchbang.org/tags/programming/" title="Browse all posts tagged with &#8220;programming&#8221;">programming</a>, <a href="http://crunchbang.org/tags/projects/" title="Browse all posts tagged with &#8220;projects&#8221;">projects</a>, <a href="http://crunchbang.org/tags/whird/" title="Browse all posts tagged with &#8220;whird&#8221;">whird</a></p>
    ]]></description>
</item>

 </channel>
</rss>