Ruby on Rails

How to write code efficiently (and this has nothing to do with code)


The world is overflowing with books, methodologies, techniques, tips and tricks on how to be more efficient in life, work, and everything else you can think of. What isn’t often discussed is how your desktops, both real and virtual, can be tailored to act as a strong foundation and take things to the next level.

I’ve been thinking about this ever since I tidied my (physical) desktop a couple of weeks ago. Wow, what a difference. Keeping it tidy is not my strength, but it got me thinking about my virtual desktop too. As an Ubuntu user with multiple virtual desktops available to me, I’ve always had a strong sense of standard placement for specific applications, but after some thought I have taken that to the next level.

The Physical Desktop

I love my laptop, and there are times when I get in the habit of sitting in front of the fire for a few days with it on my lap, but nothing beats the productivity benefits of a desk, a second monitor, a real keyboard and mouse, kick-ass sound, and somewhere to set your coffee.

desk

I’m not really going to say much more about it than that; the picture says 1000 words.

The Virtual Desktop(s)

Now things get interesting.  If you are a Windows user then, unless things have changed since I last braved using one, you are SOL when it comes to virtual desktops.  For Mac and Linux users, multiple virtual desktops are things that we’ve been using (or haven’t bothered to use) for years.

Anyhoo. Ubuntu and Mac users can have as many virtual desktops as they like.  Not sure about Mac, but with Ubuntu you can configure how they are arranged, and maybe because of my old Cube days, I like 4 virtual desktops side by side, configured so that when you get to one end you wrap straight around to the other.

That means I have four full desktops that I can flick back and forward between, and each has 2 monitors: 8 distinct areas.

So the secret to making this a haven of joy and efficiency is always keeping things in the same place relative to each other.  For example, say all you use all day is a browser and a word processor for your core job, plus an email client and a music player.

You might do something like this:

Virtual Desktop 1

Laptop: Music Player

Monitor: Email

Virtual Desktop 2

Laptop: Browser

Monitor: Word Processor

Why is that good?  If you are working on a document and you need to send an email, you know that the email client is just over there on the next Virtual Desktop.  Not a great example, but what happens when things get a bit more complex?  I’m a Rails developer, and at the very least that involves:

  • a browser
  • a terminal running the rails server (and putting out useful information)
  • an editor
  • a terminal running a rails console (used constantly while writing code)
  • often a MySQL GUI

As if this isn’t enough, in development one is typically working with many open files at the same time, so multiple tabs within the editor.  It’s this last part that made me really rethink my old strategy and for the first time move editing and the browser to different virtual desktops.. and try something that has turned out to be incredibly valuable.

I now have 4 different editors open, with the folder tree in each open to a specific folder of a rails project.

  1. Models
  2. Views
  3. Controllers
  4. Project Root (for everything else: migrations, configs, helpers, CSS / JS)

Each of these editors

  • is not full screen
  • has a corner visible no matter what (so you can get to it with a click)
  • is always in the same place on the screen (I go M/V/C/Other clockwise starting from the top corner).

It’s all kinds of awesome.  The result of this separation of browser and editor left some really great gaps for other things.  Here’s a rundown of my 4 Virtual Desktops going left to right.

face1

face2

face3

face4

I have to say, after running with this for a couple of weeks, I can’t imagine going back.  My fingers have absolutely learned sequences like CTRL-S, CTRL-ALT-LEFT, F5 (save code, move left one desktop, refresh browser), and having the Rails server beside the browser and the Rails console beside the editor makes so much sense.

Give it a go!

How to add some anonymity to your data scraping Ruby and Rails apps


I have no idea how often people ACTUALLY look at their logs looking for someone scraping their pages, but sometimes you want to just fly under the radar.  I generally don’t agree with stealing web content by scraping, but I do believe that if someone is in the data distribution business, but they suck at it,  it’s ok to bend the rules a little. For example, if they offer an RSS feed that is buggy, slow, huge etc. but their homepage offers the same information more reliably – go for it.

There are basically two things at play here:

  1. Spoofing a user-agent  (pretending to be a browser not a script)
  2. Spoofing the source of the request.

Here’s a little function you can call to get a random user agent, based on a list of really common user agents.. Thanks to the guy who posted this to a blog, sorry, I don’t have the reference anymore.

def self.random_desktop_user_agent
    user_agents = [
      "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
      "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/536.26.17 (KHTML, like Gecko) Version/6.0.2 Safari/536.26.17",
      "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
      "Mozilla/5.0 (Windows NT 5.1; rv:13.0) Gecko/20100101 Firefox/13.0.1",
      "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; PeoplePal 6.2)",
      "Mozilla/5.0 (Windows NT 5.1; rv:5.0.1) Gecko/20100101 Firefox/5.0.1",
      "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1",
      "Opera/9.80 (Windows NT 5.1; U; en) Presto/2.10.289 Version/12.01",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11",
      "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)",
      "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) )",
      "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)",
      "Mozilla/5.0 (Windows NT 6.1; rv:2.0b7pre) Gecko/20100921 Firefox/4.0b7pre",
      "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322)",
      "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11",
      "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 3.5.30729)",
      "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11",
      "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
      "Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.0) Opera 7.02 Bork-edition [en]",
      "Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.02",
      "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
      "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0",
      "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MRA 5.8 (build 4157); .NET CLR 2.0.50727; AskTbPTV/5.11.3.15590)",
      "Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0",
      "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
      "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0",
      "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2",
      "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
      "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.91 Safari/537.11",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/536.26.17 (KHTML, like Gecko) Version/6.0.2 Safari/536.26.17",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:16.0) Gecko/20100101 Firefox/16.0",
      "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0",
      "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; TencentTraveler ; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 2.0.50727)",
      "Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25",
      "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20100101 Firefox/16.0",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11",
      "Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20100101 Firefox/17.0"]
    # Array#sample returns one randomly chosen element
    return user_agents.sample
end

I usually put such things in a model, utility.rb, so I can just call it with Utility.random_desktop_user_agent
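For what it’s worth, here is a minimal sketch of how that helper could look as a plain Ruby class. The constant name DESKTOP_USER_AGENTS is my own assumption, the list is cut down to three of the strings above, and nothing here needs ActiveRecord.

```ruby
# Hypothetical plain-Ruby version of the Utility helper; the list is shortened here.
class Utility
  DESKTOP_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
    "Opera/9.80 (Windows NT 5.1; U; en) Presto/2.10.289 Version/12.01"
  ].freeze

  # Array#sample picks one element at random on each call.
  def self.random_desktop_user_agent
    DESKTOP_USER_AGENTS.sample
  end
end

puts Utility.random_desktop_user_agent
```

Keeping the list in a frozen constant means the array is built once rather than on every call.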

That takes care of the user agent; now on to the proxy.  You’ll need to get your own list of proxies, whether you find some reliable free ones or pay for a service.  With the best services you call a single proxy of theirs, and it cycles through a bunch of IP addresses for you: each call goes out through a different one, round robin.  They then dump those IPs every 30 minutes or so. Not bad.

def self.random_proxy_server
    # Each entry is [host, port, user, password]; swap the placeholders for your own proxies
    proxies = [
      ["proxy_server1", "proxy_port", "proxyuser_if_authenticated", "proxypassword_if_authenticated"],
      ["proxy_server2", "proxy_port", "proxyuser_if_authenticated", "proxypassword_if_authenticated"]
    ]
    return proxies.sample
end

Again – I put that in the Utility.rb model.
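If you are running your own list rather than a rotating service, you could mimic the round-robin behaviour locally instead of picking at random. This is only a sketch: the ProxyRotation class, its next_proxy method and the host and port values are all made up for illustration.

```ruby
# Hypothetical client-side rotation; the hosts and ports are placeholders.
class ProxyRotation
  def initialize(proxies)
    # Array#cycle without a block returns an Enumerator that loops forever.
    @cycle = proxies.cycle
  end

  # Returns the next proxy entry, wrapping back to the first after the last.
  def next_proxy
    @cycle.next
  end
end

rotation = ProxyRotation.new([
  ["proxy_server1", "8080"],
  ["proxy_server2", "8080"]
])
3.times { p rotation.next_proxy }
```

The third call lands back on the first entry, so every proxy gets used evenly, which a random pick does not guarantee.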

So put it all together..

Calling a page looks something like this with open-uri

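require 'open-uri'
proxy = Utility.random_proxy_server
# On Ruby 2.5+ call URI.open; since Ruby 3.0 a bare open(url) no longer fetches URLs
URI.open(url, :proxy_http_basic_authentication => ["http://#{proxy[0]}:#{proxy[1]}", proxy[2], proxy[3]], "User-Agent" => Utility.random_desktop_user_agent)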
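Not every proxy needs a username and password, so one way to handle both cases is a small helper that builds the options hash for open-uri. The scrape_options name and the sample values are my own inventions; the :proxy and :proxy_http_basic_authentication options are the ones open-uri documents.

```ruby
require 'open-uri'

# Hypothetical helper: turns one [host, port, user, password] entry into
# the options hash that open-uri accepts.
def scrape_options(proxy, user_agent)
  host, port, user, password = proxy
  proxy_uri = "http://#{host}:#{port}"
  options = { "User-Agent" => user_agent }
  if user && password
    # Authenticated proxy: open-uri expects [proxy_uri, user, password].
    options[:proxy_http_basic_authentication] = [proxy_uri, user, password]
  else
    # Open proxy: the plain :proxy option is enough.
    options[:proxy] = proxy_uri
  end
  options
end

options = scrape_options(["proxy_server1", "8080", "someuser", "secret"], "Mozilla/5.0 (example)")
p options
# The fetch itself would then be something like:
#   html = URI.open(url, options).read
```

Building the hash separately also makes it easy to check what you are about to send without hitting the network.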

That’s about it.

K