
I want to refactor this code to support multiple incoming feeds without having to launch it once per feed.
The idea is to split out the polling of the identi.ca feed and share that information with as many source feeds as necessary.
I haven't thought about it much yet. Info and updates about this code will be available at http://hackerspaces.org/wiki/P0wn1e
Thanks,
==
hk

#!/usr/bin/env ruby1.8
#
# p0wn1e is the hackerspaces.org notifier.
# It reads an ATOM feed and sends updates to identi.ca
#

$options = {}
# Identi.ca credentials
$options[:username] = 'p0wn1e'
$options[:password] = 'I_can_haz_sekrit?'
# URI of the ATOM feed for the wiki pages
$options[:feed_uri] = 'http://hackerspaces.org/w/index.php?title=Special:NewPages&feed=atom'
# Comment out the following line to turn off debugging output
$DEBUG = true

%w(rubygems atom net/https yaml).each { |lib| require(lib) }

 
class P0wn1e
  # sleep time in seconds
  @@delay = 42*2
  # dent message prototype
  @@dent = "!hs %s created '%s' at %s"
  # and corresponding regex prototype
  @@dentr = '!hs.*created\s.(.+).\sat.*'
 
  # P0wn1e haz atomz
  DENT_URI = "http://identi.ca/api/statuses/user_timeline/__USER__.atom"
  # P0wn1e posts updates to identi.ca
  POST_URI = URI.parse('https://identi.ca/api/statuses/update.xml')
  
  def initialize(options = {})
    @user = options[:username]
    @pass = options[:password]
    @feed_uri = options[:feed_uri]
    @all_dents = []
    @all_pages = []
    @new_pages = []
    debug("initialization complete")
  end
  
  # Run p0wn1e! Run!
  def run!
    @http = http_post_connect
    loop do
      refresh_all_feeds
      update if pending_updates?
      debug("batch done... Sleeping 10 minutes")
      sleep 600 # sleep 10 minutes between batches
    end
  rescue Exception => e
    debug("failed to start! [#{e.class}] #{e.message}")
    raise e
  end
  
  # We only send to identi.ca as the account will forward to twitter
  def dent(message)
    debug("denting #{message}")
    req = Net::HTTP::Post.new(POST_URI.path)
    req.form_data=({ 'status' => message })
    req.basic_auth(@user, @pass)
    debug("authenticated")
    res = @http.request(req)
    debug("sent request")
    if res.body =~ %r{<text>#{@@dentr}</text>.*<id>(\d+)</id>}
      @last_update = [$1, $2.to_i]
      debug("sent new message: #{@last_update.inspect} #{message}")
    end
    res
  rescue Exception => e
    error = "dent failed for message: #{message}\nwith error [#{e.class}] #{e.message}"
    debug(error)
    sleep @@delay
    retry
  end
  
  # Return the cached [ page_name, notice_id ] of the last dent, fetching it if needed
  def last_update
    @last_update ||= find_last_update
    debug("last_update: #{@last_update.inspect}")
    @last_update
  end
  
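  # Return [ title, author, link ] for every page created after the last
  # dented page, oldest first (all pages if the last dent is not in the feed)
  def new_pages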
    candidates = @all_pages.last.map { |p| [p.title, p.authors.first.name, p.links.first.href] }.reverse
    start_idx = 0
    last_dent_page_name = last_update.first
    candidates.each_with_index do |c, i|
      if c.first == last_dent_page_name
        start_idx = [i+1, candidates.length].min
        break
      end
    end
    debug("new_pages start at #{start_idx} / #{candidates.size}")
    candidates[start_idx..candidates.length]
  end
  
  # True when the newest wiki page has not been dented yet
  def pending_updates?
    !@all_dents.map(&:first).include?(@all_pages.first)
  end
  
  private
  
  # Return an <tt>Atom::Feed</tt> object from a given +url+
  def atom_read(url)
    Atom::Feed.new(Net::HTTP.get(URI.parse(url)))
  end
  
  # Print a debugging message to +STDERR+
  def debug(message = "debug")
    $stderr.puts("p0wn1e " << message) if $DEBUG
  end
  
  # Return [ page_name, notice_id ] from upstream
  def find_last_update
    read_identica_feed.first
  end
  
  # Open HTTP connection to identi.ca API for posting updates
  def http_post_connect
    debug("preparing HTTPS connection to POST")
    http = Net::HTTP.new(POST_URI.host, POST_URI.port)
    http.use_ssl = true
    # NOTE: VERIFY_NONE disables certificate verification
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE
    http
  end
 
  # Return an Array of [ page_name, notice_id ]
  # for all entries corresponding to wiki new pages dents
  def read_identica_feed
    feed = atom_read(DENT_URI.sub(/__USER__/,@user))
    debug("got feed #{feed.title} with #{feed.entries.size} entries")
    feed.entries.map do |entry|
      if entry.title =~ %r{^#{@@dentr}$}
        [ $1, entry.id.sub(/.*\//,'').to_i ]
      end
    end.compact || []
  rescue Exception => e
    debug("P0wn1e can't haz atoms: [#{e.class}] #{e.message}")
    sleep(@@delay)
    retry
  end
  
  # Return an Array of [ last_updated_page_name, all_entries ]
  def read_new_pages_feed
    feed = atom_read(@feed_uri)
    debug("got feed #{feed.title} with #{feed.entries.size} entries")
    [feed.entries.first.title, feed.entries]
  rescue Exception => e
    debug("P0wn1e can't haz atoms: [#{e.class}] #{e.message}")
    sleep(@@delay)
    retry
  end
  
  # Fetch identi.ca and wiki atom feeds from upstream
  def refresh_all_feeds
    debug("refreshing identi.ca feed")
    @all_dents = read_identica_feed
    @last_update = @all_dents.first || ['',0]
    debug("refreshing new pages feed")
    @all_pages = read_new_pages_feed
    debug("computing new pages")
    @new_pages = new_pages
    debug("refresh_all_feeds done")
  end
  
  # Send updates about all newly created pages since last check
  def update
    debug("starting update")
    while (page = @new_pages.shift) != nil
      page_name, author, link = page
      dent(@@dent % [author, page_name, link])
      # debug(@@dent % [author, page_name, link])
      sleep(@@delay) # Don't post too often, but post all!
    end
    debug("done updating")
  end
end
 
# Run P0wn1e! Run!
P0wn1e.new($options).run!
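
The rescue blocks in dent and the two feed readers catch Exception and retry forever, so a persistent failure never surfaces and Ctrl-C during a retry is swallowed. One possible alternative, as a rough sketch only: a small helper that retries a bounded number of times and rescues StandardError. The name with_retries and its parameters are made up for illustration and are not part of p0wn1e. On Ruby 1.8, as far as I know, Timeout::Error descends from Interrupt rather than StandardError, so it would have to be rescued explicitly there.

```ruby
# Hypothetical helper, not part of p0wn1e: run a block, retrying it at most
# max_attempts times. Only StandardError is rescued, so Interrupt and
# SystemExit still stop the process.
def with_retries(max_attempts, delay)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError => e
    # Give up and re-raise the original error once the budget is spent
    raise if attempts >= max_attempts
    $stderr.puts("attempt #{attempts} failed: [#{e.class}] #{e.message}")
    sleep(delay)
    retry
  end
end

# Usage: a simulated fetch that fails twice, then succeeds.
calls = 0
result = with_retries(3, 0) do |attempt|
  calls += 1
  raise IOError, "simulated network failure" if attempt < 3
  "fetched on attempt #{attempt}"
end
puts result
```

With something like this, dent and the feed readers could each wrap their network call and let the error propagate to run! after a few failed attempts, instead of looping silently.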

Refactorings



hellekin

September 14, 2009 23:27


There are a number of problems in this code:

* first, it's really _shy_: it's afraid of breaking everywhere and rescues every situation that might come up, so there are too many rescues.
* then, the two feeds are too tightly coupled. They should be handled by different classes, so that:
  * one identi.ca watcher can share its info with multiple sources (e.g. wiki, blog, you name it)
  * the find_last_update method would cache its result on the first call and not ask every time (although it should ask again between batches)
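
To make the split concrete, here is a minimal sketch under a few assumptions. DentWatcher, FeedSource and Notifier are hypothetical names, and the fetchers are plain lambdas returning canned data in place of the Atom and HTTP code, so the example runs without network access. Like the current code, it matches on the page title alone.

```ruby
# Sketch only: the class names are hypothetical and the fetchers are
# injected callables standing in for the real feed-reading code.

# Polls the identi.ca timeline and caches the result for one batch.
class DentWatcher
  def initialize(fetcher)
    @fetcher = fetcher   # callable returning an Array of [page_name, notice_id]
    @dents = nil
  end

  # Drop the cache; meant to be called once between batches.
  def refresh!
    @dents = nil
  end

  # Fetched on the first call, served from the cache afterwards.
  def dents
    @dents ||= @fetcher.call
  end

  def announced?(title)
    dents.any? { |name, _id| name == title }
  end
end

# One incoming feed (wiki, blog, ...).
class FeedSource
  attr_reader :name

  def initialize(name, fetcher)
    @name = name
    @fetcher = fetcher   # callable returning entry titles, newest first
  end

  # Titles the watcher has not seen yet, oldest first.
  def pending(watcher)
    @fetcher.call.reject { |title| watcher.announced?(title) }.reverse
  end
end

# Ties one shared watcher to any number of sources.
class Notifier
  def initialize(watcher, sources)
    @watcher = watcher
    @sources = sources
  end

  # One batch: refresh the watcher once, then ask every source.
  def batch
    @watcher.refresh!
    @sources.map { |source| [source.name, source.pending(@watcher)] }
  end
end

# Usage with canned data; polls counts how often identi.ca would be hit.
polls = 0
watcher = DentWatcher.new(lambda { polls += 1; [["Old Page", 1]] })
wiki = FeedSource.new("wiki", lambda { ["New Page", "Old Page"] })
blog = FeedSource.new("blog", lambda { ["Post B", "Post A"] })
batch = Notifier.new(watcher, [wiki, blog]).batch
batch.each { |name, titles| puts "#{name}: #{titles.join(', ')}" }
```

The point of the design is that identi.ca is polled once per batch no matter how many sources are attached. Actually posting the pending titles would stay in something like the existing dent method.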

These are my late-night thoughts. Hopefully someone can propose a change or a suggestion before I get the chance to work on it again (I'm very busy(tm) ATM).

