Thursday, December 13, 2012

Refinements in Ruby: an ingenuous implementation

UPDATE: I've worked on Namebox, an improved way to protect methods from changes, inspired by this implementation of Refinements.

I'm back to programming after a pause of some months. The last thing I heard about Ruby before pausing was Refinements. And I fell in love with it.

I found that idea so smart that I couldn't continue programming without it. I couldn't wait for Ruby 2.0. I had to implement it on my own.

Ruby 1.8.7 gives us enough tools to design a lexically scoped activation of refinements. I could have used using with set_trace_func to detect the end of blocks (scopes), but I preferred enable and disable, because:

  • it's simpler to implement;
  • it's explicit and easy to read;
  • the programmer has the freedom to enable and disable the refinements whenever he/she considers it necessary.

My solution is so simple that I called it "an ingenuous implementation". It differs from the original proposal in several ways, as I will discuss later, but it brings what I consider the most important feature: the refinements are limited to physical ranges within the text file. There are no outside consequences. Anyone can use my refined libraries with no (unpleasant) surprises. And the unrefined methods are not affected (if you're thinking about performance impact).
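
The idea of limiting refinements to physical ranges of a file can be sketched in isolation. The snippet below is only an illustration, not the code of this post: LineScope and its methods are hypothetical names. It parses one entry of Kernel#caller into a file and a line number, and checks whether that line falls inside a recorded range.

```ruby
# Hypothetical helper, for illustration only: it records enabled line
# ranges per file and tells whether a given file:line is inside one.
class LineScope
  def initialize
    @ranges = Hash.new { |hash, file| hash[file] = [] }
  end

  # Record that lines first..last of file are "enabled"
  def add(file, first, last)
    @ranges[file] << (first..last)
  end

  # True when the line lies inside any enabled range of that file
  def active?(file, line)
    @ranges[file].any? { |range| range.include?(line) }
  end

  # Parse one entry of Kernel#caller, e.g. "app.rb:12:in `foo'"
  def self.parse(entry)
    m = entry.match(/\A(.+?):(\d+)(?::in .*)?\z/)
    m && { :file => m[1], :line => m[2].to_i }
  end
end

scope = LineScope.new
scope.add("app.rb", 10, 20)

site = LineScope.parse("app.rb:12:in `foo'")
scope.active?(site[:file], site[:line])   # inside the range 10..20
scope.active?("app.rb", 25)               # outside of it
```

The real implementation below does the same kind of lookup, but it fills the ranges from the calls to enable and disable.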

# Refinements for Ruby: an ingenuous implementation
#
# (c) 2012 Sony Fermino dos Santos
# http://rubychallenger.blogspot.com/2012/12/refinements-in-ruby-ingenuous.html
# 
# License: Public Domain
# This software is released "AS IS", without any warranty.
# The author is not responsible for the consequences of use of this software.
#
# This code is not intended to look professional,
# provided that it does what it is supposed to do.
#
# This software was lightly tested on Ruby 1.8.7 and 1.9.3, with success.
# However, no heavy tests were made, e.g. threads, continuation, benchmarks, etc.
#
# The intended use is in a straightforward flow of execution.
#
# Instead of using +using+ as in the original proposal, here we use
# Module#enable and Module#disable. They're lexically scoped by the
# file:line of where they're called from.
#
# E.g.: Let StrUtils be a module which refines the String class.
# module StrUtils
#   refine String do
#     def foo
#       #...
#     end
#   end
# end
#
# Using it in the code snippets:
#
# StrUtils.enable
# "abc".foo                       #=> works (foo is "visible")
# def bar; puts "abc".foo; end    #=> bar is defined where foo is "visible"
# StrUtils.disable
# "abc".foo                       #=> doesn't work (foo is "invisible")
# bar                             #=> works, as bar was defined where foo is "visible"
# def baz; puts "abc".foo; end
# baz                             #=> doesn't work.
#
# You can enable and disable a module at any time, provided that you:
# * enable and disable in this order, in the file AND in the execution flow;
# * disable all modules that you enabled in the same file;
# * don't reenable (or redisable) an already enabled (or disabled) module.
#
# See refine_test.rb for more examples.

# Refinements are meant to avoid monkey patches, but
# we need some minimal patching to implement them.
class Module

  # Opens an enabled range for this module's refinements
  def enable
    info = ranges_info

    # there should be no open range
    raise "Module #{self} was already enabled in #{info[:file]}:#{info[:last]}" if info[:open]

    # range in progress
    info[:ranges] << info[:line]
  end

  # Close a previously opened enabled range
  def disable
    info = ranges_info

    # there must be an open range in progress
    raise "Module #{self} was not enabled in #{info[:file]} before line #{info[:line]}" unless info[:open]

    # beginning of range must be before end
    r_beg = info[:last]
    r_end = info[:line]
    raise "#{self}.disable in #{info[:file]}:#{r_end} must be after #{self}.enable (line #{r_beg})" unless r_end >= r_beg
    r = Range.new(r_beg, r_end)

    # replace the single initial line with the range, making sure it's unique
    info[:ranges].pop
    info[:ranges] << r unless info[:ranges].include? r
  end

  # Check whether a refined method is called from an enabled range
  def enabled?
    info = ranges_info
    info[:ranges].each do |r|
      case r
      when Range
        return true if r.include?(info[:line])
      when Integer
        return true if info[:line] >= r
      end
    end
    false
  end

  private

  # Stores enabled line ranges of caller files for this module
  def enabled_ranges
    @enabled_ranges ||= {}
  end

  # Get the caller info in a structured way (hash)
  def caller_info
    # ignore internal calls (using skip would differ from 1.8.7 to 1.9.3)
    c = caller.find { |s| !s.start_with?(__FILE__, '(eval)') } and
        m = c.match(/^([^:]+):(\d+)(:in `(.*)')?$/) and
        {:file => m[1], :line => m[2].to_i, :method => m[4]} or {}
  end

  # Get line ranges info for the caller file
  def ranges_info
    ci = caller_info
    ranges = enabled_ranges[ci[:file]] ||= []
    ci[:ranges] = ranges

    # check whether there is an opened range in progress for the caller file
    last = ranges[-1]
    if last.is_a? Integer
      ci[:last] = last
      ci[:open] = true
    end

    ci
  end

  # Here the original methods will be replaced with one which checks
  # whether the method is called from an enabled or disabled region,
  # and then decides which method to call.
  def refine klass, &blk
    modname = to_s
    mdl = Module.new &blk

    klass.class_eval do

      # Rename the klass's original (affected) methods
      mdl.instance_methods.each do |m|
        if method_defined? m
          alias_method "_#{m}_changed_by_#{modname}", m
          remove_method m
        end
      end

      # Include the refined methods
      include mdl
    end

    # Rename the refined methods and replace them with
    # a method which will check what method to call.
    mdl.instance_methods.each do |m|
      klass.class_eval <<-STR
        alias_method :_#{modname}_#{m}, :#{m}

        def #{m}(*args, &b)
          if #{modname}.enabled?
            _#{modname}_#{m}(*args, &b)
          else
            begin
              _#{m}_changed_by_#{modname}(*args, &b)
            rescue NoMethodError
              raise NoMethodError.new("Undefined method `#{m}' for #{klass}")
            end
          end
        end
      STR
    end
  end
end


Here are some examples:

#!/usr/bin/ruby

require "./refine"

class A
  def a
    'a'
  end
  def b
    a + 'b'
  end
end

module A2
  refine A do
    def a
      a + '2'     # A#a, since here A2 is disabled
    end

    A2.enable     # You must make A2 explicit here
    def d
      a + 'd'     # A2#a, since here A2 is enabled
    end
    A2.disable
  end

  refine String do
    def length
      length + 1  # Original String#length, since A2 is disabled here
    end
  end
end

a = A.new
str = 'abc'

puts a.a        # a
puts a.b        # ab
puts str.length # 3

A2.enable

class A
  def c
    a + 'c'   # a2c, as A2 is enabled
  end
end

puts ''
puts a.a      # a2
puts a.b      # ab (b was not refined nor defined where A2 is enabled)
puts a.c      # a2c
puts a.d      # a2d
puts str.length

A2.disable

puts ''
puts a.a      # a
puts a.b      # ab
puts a.c      # a2c (it was defined where A2 is enabled)
# puts a.d      # NoMethodError, since A2 is disabled
puts str.length

# In-method enabling test

def x(y)
  A2.enable
  puts y.a    # a2
  A2.disable
end

x(a)          # a2
x(a)          # enabling multiple times at same line with no error

# Lazy enabling test

def e
  A2.enable
end

def z(y)
  puts y.a    # defined between enable and disable, but affected only after running e() and d()
end

def d
  A2.disable
end

z(a)          # a
e             # now, activating the range for refinements
d
z(a)          # a2

def e2
  A2.enable
end

e2            # running before d(), but...
d             # error, as you are enabling *after* the disable (physically in the text file)


Differences from the original proposal:
  • enable and disable instead of using;
  • calls to refined methods work only if they're within the enabled range in the file; so subclasses won't be affected unless their code is in an enabled range;
  • super doesn't work for calling the original methods, but you can call them by name from an un-enabled range, or by calling the renamed methods (see the code for refine).
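
For comparison, here is a minimal sketch of the refinements that ship with Ruby itself (Module#using inside a module body needs Ruby 2.1 or later). StrUtils, Demo and loud are hypothetical names chosen for the example. It shows the two points listed above: activation with using, and super reaching the original method.

```ruby
# Built-in refinements, for comparison with enable/disable.
# StrUtils, Demo and Demo.loud are made-up names for this sketch.
module StrUtils
  refine String do
    def upcase(*args)
      super + "!"          # super calls the original String#upcase
    end
  end
end

module Demo
  using StrUtils           # active until the end of this module body

  def self.loud(str)
    str.upcase             # refined version, as it is called under `using`
  end
end

refined   = Demo.loud("hi")   # uses the refinement
unrefined = "hi".upcase       # outside Demo: the original method
```

With the built-in version, the end of the scope is implicit (end of the module body or of the file), whereas here it is stated with disable.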


I think this solution is good enough for me, and I guess it won't have the evil side of refinements which was very well discussed in this post (and I agree).

I'm open to discussing errors, consequences and improvements to my code; feel free. ;-)

Monday, July 23, 2012

Problems with permission in Apache

If you're getting Error 403 Forbidden on your site even though the ownership and the filesystem permissions of your files are correctly set, check your Apache configuration files: make sure that the DirectoryIndex directive includes the file you want as the index, and that the Directory directive for your project's path allows the requests.

Saturday, July 14, 2012

RVM + Apache + CGI Scripts

Hello!

I've configured a new server on Ubuntu 12.04 and I started to use RVM, an excellent version manager which lets you have multiple versions of Ruby installed on a single server (and many versions of the gems - see gemsets), and makes it easy to switch among them.

I've installed RVM under my user (as myself, not as root with sudo) by following Ryan Bigg's guide, with no previous system-wide Ruby installed. So, I didn't have any Ruby under /usr/bin. My first task then was to replace the shebang line of all my CGI scripts, from

#!/usr/bin/ruby

to

#!/usr/bin/env ruby
# encoding: utf-8

(The second line is needed to define the encoding of the string literals in my code, for Ruby 1.9.)

However, my scripts didn't run under Apache. In the terminal I could run them (by typing ./index.cgi, for example), but not through a browser. A relevant note: the user is the same in both cases, i.e., the Apache user is the same as the one logged in on the terminal. Through PHP tests, I've checked that the RVM environment was not loaded under Apache. (If anyone can solve that, please let me know.)

I saw this tip for running CGI scripts with RVM, which suggests putting the complete path of a specific version of Ruby in the shebang line. That can be useful if you have scripts which run on different versions of Ruby. But that solution doesn't work for me, because my scripts must run on different machines, with different users, different Ruby versions and different paths.

The solution which works for me is to put a symlink of the desired Ruby version under /usr/bin:

sudo ln -s /home/sony/.rvm/rubies/ruby-1.8.7-p370/bin/ruby /usr/bin/ruby

(Notes: sony is my username and I chose 1.8.7 because my scripts aren't 1.9-compliant yet.)

So, in the end, I didn't need to change the shebang lines. :) But I guess that change will be useful in the future.
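
To see which interpreter and environment a CGI script really gets under Apache, a tiny diagnostic script can help. This is only a sketch: interpreter_report is a hypothetical helper name, and GEM_HOME is checked on the assumption that it is one of the variables RVM sets in a login shell.

```ruby
#!/usr/bin/env ruby
require 'rbconfig'

# Hypothetical helper: describes the running interpreter and a few
# environment variables, one "key=value" pair per line.
def interpreter_report(env = ENV)
  ruby_path = File.join(RbConfig::CONFIG['bindir'],
                        RbConfig::CONFIG['ruby_install_name'])
  [
    "ruby_path=#{ruby_path}",
    "ruby_version=#{RUBY_VERSION}",
    "PATH=#{env['PATH']}",
    "GEM_HOME=#{env['GEM_HOME']}"
  ].join("\n")
end

# When run directly (terminal or CGI), print the report as plain text
if __FILE__ == $0
  print "Content-Type: text/plain\r\n\r\n"
  puts interpreter_report
end
```

Running it once in the terminal and once through the browser makes the differences between both environments visible.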

Tuesday, May 22, 2012

How to UPDATE a table using data from another row

The Problem: I have a table with name, num, and diff. The unusual case here is that the diff must be updated with the difference between two sibling nums for the same name. So I'll need to know the num of the next row (for the same name) to update the current one. AND I want to do that in SQL with a single UPDATE.

Here I'll assume the rows are sorted by num, but they could be sorted by another field, such as id or timestamp.

The solution in MySQL is to use an inline temporary table (a derived table) to get the num of the next row and associate it with the current id, and then to use that table in the UPDATE statement.

Here is the code. Enjoy!
 
-- the table used for test (MySQL syntax)
CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(30) NOT NULL,
  `num` int(11) NOT NULL,
  `diff` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=9 ;

-- some values to play on
INSERT INTO `test` (`id`, `name`, `num`, `diff`) VALUES
(1, 'a', 10, NULL),
(2, 'b', 8, NULL),
(3, 'a', 18, NULL),
(4, 'a', 21, NULL),
(5, 'b', 14, NULL),
(6, 'a', 32, NULL),
(7, 'b', 20, NULL),
(8, 'b', 21, NULL);

-- a select to test the ability to retrieve the desired diff value
select id, name, num, (select min(num) from test where name = t1.name and num > t1.num)-num
from test t1
where diff is null
order by name, num;

-- updating test with the calculated diff using another row on same table.
update test t2, (
  select id, (select min(num) from test where name = t1.name and num > t1.num)-num as diff
  from test t1
) t3
set t2.diff = t3.diff
where t2.id = t3.id
and t2.diff is null;
 
-- an alternative query to get more than one column from t2
select t1.id, t1.name, t1.num, t2.name, t2.num
from test t1, test t2
where t2.id = (select id from test where name = t1.name and num > t1.num order by num limit 1)
order by t1.name, t1.num;
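
To check what the UPDATE should produce, the same rule can be computed in memory. The Ruby sketch below is not part of the solution; next_diffs is a hypothetical helper that mirrors the subquery (the smallest greater num for the same name, minus the current num) over the test rows above.

```ruby
# The same rows inserted into the test table above
ROWS = [
  { :id => 1, :name => 'a', :num => 10 },
  { :id => 2, :name => 'b', :num => 8 },
  { :id => 3, :name => 'a', :num => 18 },
  { :id => 4, :name => 'a', :num => 21 },
  { :id => 5, :name => 'b', :num => 14 },
  { :id => 6, :name => 'a', :num => 32 },
  { :id => 7, :name => 'b', :num => 20 },
  { :id => 8, :name => 'b', :num => 21 }
]

# Hypothetical helper: maps each id to (next num for the same name) - num,
# or nil when there is no greater num, as the SQL subquery does.
def next_diffs(rows)
  rows.each_with_object({}) do |row, diffs|
    greater = rows.select { |r| r[:name] == row[:name] && r[:num] > row[:num] }
    next_num = greater.map { |r| r[:num] }.min
    diffs[row[:id]] = next_num && next_num - row[:num]
  end
end

next_diffs(ROWS)
# => {1=>8, 2=>6, 3=>3, 4=>11, 5=>6, 6=>nil, 7=>1, 8=>nil}
```

The last row of each name keeps a NULL diff, since it has no next row.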

See this post in Portuguese.

Tuesday, March 13, 2012

Capitalized method names

You can define methods with capitalized names:

#!/usr/bin/ruby

def Foo
  puts 'foo'
end

def Bar x
  puts x
end

At call time, to keep Ruby from interpreting them as constants, you must make clear that you are using them as methods, by using parentheses or arguments:

Foo()      # prints foo
Bar 3      # prints 3
Foo        # NameError: uninitialized constant Foo

Sequel uses this feature to define methods named String, Integer, etc. for creating/altering tables.
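
Ruby's own Kernel uses the same convention for its conversion methods, such as Integer() and String(). The sketch below shows a name being resolved both as a constant and as a method; Meters is a hypothetical method defined only for the example.

```ruby
# Hypothetical capitalized method, for illustration
def Meters(value)
  "#{value} m"
end

with_parens = Meters(5)       # parentheses: method call
with_arg    = Meters 7        # an argument is enough too

# Integer alone is the class (a constant); Integer(...) is Kernel#Integer
klass  = Integer
number = Integer("42")

# A bare capitalized name is looked up as a constant
bare = begin
  Meters
rescue NameError => e
  e.class
end
```

Here bare ends up holding NameError, since no constant named Meters exists.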

Tuesday, March 6, 2012

How to install sqlite3-ruby gem on linux

If you are having trouble installing the sqlite3-ruby gem, try the following:

In systems with apt-get:

# apt-get install libsqlite3-dev

In systems with yum:

# yum install sqlite-devel

How to install bson_ext gem on linux

MongoDB requires the bson_ext gem to increase performance. However, it's common for the system not to be ready to install it.

In ubuntu (10.04 32bit) I had to run:

# apt-get install ruby1.8-dev

In systems with yum I had to run:

# yum install gcc
# yum install make
# yum install ruby-devel


Finally in both systems I could run, with no errors:

# gem install bson_ext

A little bit on require behavior

require filename will:
  • return true when it finds filename;
  • return false when it already loaded filename;
  • raise LoadError when it doesn't find filename.
That was not properly documented; the official documentation leads us to guess that it returns false when the file is not found, and that's not the case.
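
The three outcomes above can be checked with a throwaway file. This is a small sketch: require_outcomes and the file names are hypothetical, chosen only for the test.

```ruby
require 'tmpdir'

# Hypothetical helper: requires a fresh file twice, then a missing one,
# and returns the three outcomes in that order.
def require_outcomes
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'require_demo_lib.rb')
    File.open(path, 'w') { |f| f.puts '$require_demo_loaded = true' }

    first  = require path        # found and loaded now
    second = require path        # already loaded
    missing = begin
      require File.join(dir, 'no_such_file')
    rescue LoadError
      :load_error                # not found: an exception, not false
    end

    [first, second, missing]
  end
end

require_outcomes   # => [true, false, :load_error]
```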

Monday, February 27, 2012

Changing aliased method does not alter the original one

Changing aliased method does not alter the original one, and vice-versa.

So, if you need to alter a method that you know is aliased, you don't need to worry: you won't affect the other aliased methods, and you can use them if you need the original behavior.

See my tests and the results below.

#!/usr/bin/ruby

class A
  def original_method
    puts "original content"
  end
  alias aliased_method original_method
  alias_method :alias_methoded_method, :original_method
end

class B < A
  def original_method
    puts "modified content"
  end
end

class C < A
  def aliased_method
    puts "modified content"
  end
end

class D < A
  def alias_methoded_method
    puts "modified content"
  end
end

[A, B, C, D].each do |klass|
  puts "#{klass}:"
  obj = klass.new
  [:original_method, :aliased_method, :alias_methoded_method].each do |meth|
    print "#{meth}: "
    obj.send meth
  end
  puts
end

The results:

A:
original_method: original content
aliased_method: original content
alias_methoded_method: original content

B:
original_method: modified content
aliased_method: original content
alias_methoded_method: original content

C:
original_method: original content
aliased_method: modified content
alias_methoded_method: original content

D:
original_method: original content
aliased_method: original content
alias_methoded_method: modified content
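
The tests above override the methods in subclasses. The same holds when the method is redefined in the class itself, as this short sketch suggests (Greeter and its methods are hypothetical names):

```ruby
class Greeter
  def hello
    "original content"
  end
  alias_method :hi, :hello
end

# Reopen the same class and redefine only the original method
class Greeter
  def hello
    "modified content"
  end
end

g = Greeter.new
g.hello   # => "modified content"
g.hi      # => "original content" (the alias keeps the old body)
```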

Friday, February 17, 2012

RewriteRule running twice

Sometimes it seems that RewriteRule is running twice, even when we use the [L] flag.

The truth is: it really runs twice (or even more times)! But only when the URL is changed.

The [L] flag stops the processing of the rules that follow it, but if the URL was changed (as happens in a per-directory context such as .htaccess), the new URL will be parsed again from the beginning.

There are many solutions to that (e.g., by using RewriteCond), but the one I used is to put, as the first rule, one to tell the RewriteEngine to do nothing if the URL is what I want:


RewriteEngine on

# index is the final target - it is what I want, so don't change anything
# and stop there (thanks to [L])!
RewriteRule ^index\.php$ - [L,QSA]

# get user id - URL changed, so [L] will cause the new URL to be reparsed
# - and so it will be matched on the above rule.
RewriteRule ^user/(.*)$ index.php?user=$1 [L,QSA]

# in case of user/..., following rules don't apply,
# since the above rule has [L]
# ...
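
The restart behavior can be pictured with a toy model in Ruby. This is not how Apache is implemented, only a sketch under simplifying assumptions: every rule is treated as having [L], rules match the path only, and rewrite is a hypothetical function that returns the final path, the query string and the number of passes.

```ruby
# Toy model of the two rules above: [pattern, substitution].
# '-' means "leave the URL unchanged", as in mod_rewrite.
RULES = [
  [%r{^index\.php$}, '-'],
  [%r{^user/(.*)$},  'index.php?user=\1']
]

# Hypothetical simulation: apply the first matching rule, and start
# over from the first rule whenever the URL was changed.
def rewrite(path, rules = RULES)
  passes = 0
  query = nil
  loop do
    passes += 1
    rule = rules.find { |pattern, _| path =~ pattern }
    # No match, or a match that changes nothing: processing ends here
    return [path, query, passes] if rule.nil? || rule[1] == '-'

    new_path, new_query = path.sub(rule[0], rule[1]).split('?', 2)
    query = [new_query, query].compact.join('&')
    path = new_path
  end
end

rewrite('user/42')   # => ["index.php", "user=42", 2]
```

The second pass is the one that hits the first rule and stops, which is why that rule must come first. A rule set that keeps changing the URL would loop forever in this model.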

See this post in Portuguese.