<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>crunchlife: Tag CJ5</title>
    <link>http://crunchlife.com/articles/tag/cj5</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description></description>
    <item>
      <title>Another Ruby Image Scraper</title>
      <description>&lt;p&gt;I&amp;#8217;ve been poring over a lot of vintage Willys pictures since starting the restoration of &lt;a href="http://crunchlife.com/articles/2008/10/27/the-cj-5" target="_blank"&gt;my &amp;#8217;58 CJ-5&lt;/a&gt;, and anyone who has worked with me knows that I tend to obsess over detail. Finding so few quality images has been driving me crazy, and I&amp;#8217;m amazed at how much contradictory information I&amp;#8217;ve found about a vehicle that is &lt;strong&gt;only&lt;/strong&gt; 50 years old. Given my career in technology, I&amp;#8217;m always surprised when a Google search returns little or nothing of value.&lt;/p&gt;

&lt;p&gt;My hard drive is steadily filling with what I have found and the old &lt;i&gt;&amp;#8220;Right-click, Save Image As&amp;#8230;&amp;#8221;&lt;/i&gt; has become tedious. Late last night I remembered a little &lt;a href="http://crunchlife.com/articles/2007/08/13/code-snippet-ruby-image-scraper" target="_blank"&gt;image scraping script&lt;/a&gt; I wrote back in August of 2007. I&amp;#8217;ve since cleaned it up, added a nifty progress bar, and replaced scrAPI with the &lt;a href="http://github.com/why/hpricot/tree/master" target="_blank"&gt;Hpricot&lt;/a&gt; HTML parser. Neat!&lt;/p&gt;

&lt;p&gt;I plan on doing some web crawling with it soon. Stay tuned for that. Without further ado:&lt;/p&gt;

&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="comment"&gt;# RB&lt;/span&gt;

&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;rubygems&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;fileutils&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;hpricot&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;open-uri&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;require&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;progressbar&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;

&lt;span class="ident"&gt;attributes&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;['&lt;/span&gt;&lt;span class="string"&gt;href&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;src&lt;/span&gt;&lt;span class="punct"&gt;']&lt;/span&gt;
&lt;span class="ident"&gt;file_extensions&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="punct"&gt;['&lt;/span&gt;&lt;span class="string"&gt;jpg&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;jpeg&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;gif&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;png&lt;/span&gt;&lt;span class="punct"&gt;',&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;tiff&lt;/span&gt;&lt;span class="punct"&gt;']&lt;/span&gt;

&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;fetch_extension&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;      
  &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;split&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;.&lt;/span&gt;&lt;span class="punct"&gt;').&lt;/span&gt;&lt;span class="ident"&gt;last&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;fetch_file&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;uri&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="ident"&gt;progress_bar&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;nil&lt;/span&gt; 
  &lt;span class="ident"&gt;open&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;uri&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:proxy&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="constant"&gt;nil&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt;
    &lt;span class="symbol"&gt;:content_length_proc&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="ident"&gt;lambda&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;length&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;length&lt;/span&gt; &lt;span class="punct"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt; &lt;span class="punct"&gt;&amp;lt;&lt;/span&gt; &lt;span class="ident"&gt;length&lt;/span&gt;
        &lt;span class="ident"&gt;progress_bar&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;ProgressBar&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;new&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;uri&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_s&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;length&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt; 
    &lt;span class="punct"&gt;},&lt;/span&gt;
    &lt;span class="symbol"&gt;:progress_proc&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="ident"&gt;lambda&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;progress&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="ident"&gt;progress_bar&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;set&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;progress&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;progress_bar&lt;/span&gt;
    &lt;span class="punct"&gt;})&lt;/span&gt; &lt;span class="punct"&gt;{|&lt;/span&gt;&lt;span class="ident"&gt;file&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="ident"&gt;file&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;}&lt;/span&gt;        
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;save_file&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;file_uri&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;  
  &lt;span class="ident"&gt;open&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;file_uri&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;to_s&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;gsub!&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;[&lt;span class="escape"&gt;\/&lt;/span&gt;:]&lt;/span&gt;&lt;span class="punct"&gt;/,&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;_&lt;/span&gt;&lt;span class="punct"&gt;'),&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;wb&lt;/span&gt;&lt;span class="punct"&gt;')&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;file&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; 
    &lt;span class="ident"&gt;file&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;fetch_file&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;file_uri&lt;/span&gt;&lt;span class="punct"&gt;));&lt;/span&gt; &lt;span class="ident"&gt;puts&lt;/span&gt;
  &lt;span class="punct"&gt;}&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;scrape_urls&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;attributes&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;      
  &lt;span class="constant"&gt;Hpricot&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;buffer_size&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="number"&gt;262144&lt;/span&gt;
  &lt;span class="ident"&gt;attributes&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;attribute&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
    &lt;span class="constant"&gt;Hpricot&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;).&lt;/span&gt;&lt;span class="ident"&gt;search&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;[@&lt;span class="expr"&gt;#{attribute}&lt;/span&gt;]&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;).&lt;/span&gt;&lt;span class="ident"&gt;map&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;tag&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="keyword"&gt;yield&lt;/span&gt; &lt;span class="ident"&gt;tag&lt;/span&gt;&lt;span class="punct"&gt;[&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{attribute}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;]&lt;/span&gt;
    &lt;span class="punct"&gt;}&lt;/span&gt;
  &lt;span class="punct"&gt;}&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;to_absolute_uri&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;original_uri&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="ident"&gt;url&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;URI&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;parse&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;downcase&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;     
  &lt;span class="ident"&gt;url&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;original_uri&lt;/span&gt; &lt;span class="punct"&gt;+&lt;/span&gt; &lt;span class="ident"&gt;url&lt;/span&gt; &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;relative?&lt;/span&gt;  
  &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;normalize&lt;/span&gt;        
&lt;span class="keyword"&gt;end&lt;/span&gt;

&lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="punct"&gt;'&lt;/span&gt;&lt;span class="string"&gt;Enter a URL:&lt;/span&gt;&lt;span class="punct"&gt;'&lt;/span&gt;
&lt;span class="ident"&gt;original_uri&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;URI&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;parse&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;gets&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;chomp!&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;

&lt;span class="ident"&gt;html&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;nil&lt;/span&gt;

&lt;span class="keyword"&gt;begin&lt;/span&gt;
  &lt;span class="ident"&gt;open&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;original_uri&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="symbol"&gt;:proxy&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="constant"&gt;nil&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;{|&lt;/span&gt;&lt;span class="ident"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt; &lt;span class="ident"&gt;html&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;source&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;&lt;span class="punct"&gt;()}&lt;/span&gt;

  &lt;span class="ident"&gt;scrape_urls&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;html&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;attributes&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
    &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="ident"&gt;file_extensions&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;include?&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;fetch_extension&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt; &lt;span class="keyword"&gt;then&lt;/span&gt;
      &lt;span class="ident"&gt;save_file&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;to_absolute_uri&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;original_uri&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;url&lt;/span&gt;&lt;span class="punct"&gt;))&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="punct"&gt;}&lt;/span&gt;
&lt;span class="keyword"&gt;rescue&lt;/span&gt; &lt;span class="punct"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="ident"&gt;e&lt;/span&gt;
  &lt;span class="ident"&gt;puts&lt;/span&gt; &lt;span class="ident"&gt;e&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description>
      <pubDate>Wed, 07 Jan 2009 16:22:00 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:32992b2b-63b9-4751-bdc2-cfc4aae59484</guid>
      <author>Ryan Baxter</author>
      <link>http://crunchlife.com/articles/2009/01/07/another-ruby-image-scraper</link>
      <category>Code Snippets</category>
      <category>CJ5</category>
      <category>Ruby</category>
    </item>
    <item>
      <title>The CJ-5</title>
      <description>&lt;p&gt;A couple of weeks ago I became the owner of a 1958 Willys CJ-5.  I&amp;#8217;ve always wanted a Jeep but only seriously started looking for one about a month ago.  &lt;a href="/files/cowboy.jpg" target="_blank"&gt;Being the web-savvy guy that I am&lt;/a&gt;, I started my search with eBay and craigslist.  Not having any luck online, I contacted a friend and fellow Jeepster for advice.  Apparently I should have started my search a little closer to home.  Sitting in a back lot of &lt;a href="http://www.tripleamotors.com/" target="_blank"&gt;Triple A Motors&lt;/a&gt; in Williamsport, Pennsylvania, was the CJ-5.&lt;/p&gt;

&lt;p&gt;&lt;center&gt;
&lt;img src="/files/cj5_trailer_1.jpg" class="photo"&gt;
&lt;/center&gt;
&lt;br /&gt;&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve begun busting my knuckles and, as a Web Developer, I find it a very welcome diversion.  There is something extremely gratifying in wrenching on a vehicle and hearing its engine roar to life for the first time.  Well, sputter and die in my case, but it did run briefly.  I&amp;#8217;ve already replaced the distributor cap and rotor, plugs, wires, and fuel pump.  Hopefully, with some new vacuum lines, she&amp;#8217;ll be ready for a proper test drive.&lt;/p&gt;

&lt;p&gt;Since this will be an ongoing project, I&amp;#8217;ll have more pictures as progress is made.&lt;/p&gt;</description>
      <pubDate>Mon, 27 Oct 2008 10:12:00 -0700</pubDate>
      <guid isPermaLink="false">urn:uuid:f731966a-8d77-419f-935c-417fa70c3f2d</guid>
      <author>Ryan Baxter</author>
      <link>http://crunchlife.com/articles/2008/10/27/the-cj-5</link>
      <category>CJ5</category>
      <category>Life</category>
      <enclosure type="image/jpeg" length="102440" url="http://crunchlife.com/files/cj5_trailer_1.jpg"/>
    </item>
  </channel>
</rss>
