Is this a good way to extract a pattern match from a string?

import re

def process_html(str):
    pattern = re.compile('<object ([\w="\d+"]|\s)+>([\x20-\x7E\s])+</object>')
    match   = pattern.match(str)
    return match.group()

Refactorings

nicerobot

March 17, 2010 12:46

1. Regular Expressions Are Not A Good Idea for Parsing XML, HTML, or e-mail Addresses: http://wiki.tcl.tk/4164
2. Your code refers to the match object as 'm' (line 6), but it is named 'match' (line 5).
3. Groups are referenced by their one-based group number; group(0), or group() with no argument, is the whole match.
Note: I didn't change your regex. I only changed lines 5 and 6.

import re

def process_html(str):
    pattern = re.compile('<object ([\w="\d+"]|\s)+>([\x20-\x7E\s])+</object>')
    m = pattern.match(str)
    return m.group(1)
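
To make the group numbering concrete, here is a minimal sketch that runs the same expression against a made-up input. The `sample` string is hypothetical and not from the thread, and the pattern is written as a raw string only so that the backslashes reach the `re` module unchanged. Note that a repeated group such as `(...)+` keeps only its last repetition, so `group(1)` is a single character here, not the whole attribute list.

```python
import re

# Hypothetical sample input, not taken from the thread.
sample = '<object width="400" height="300">movie</object>'

# Same expression as above, as a raw string.
pattern = re.compile(r'<object ([\w="\d+"]|\s)+>([\x20-\x7E\s])+</object>')

m = pattern.match(sample)
if m is None:
    # match() only succeeds at the start of the string and returns None otherwise.
    raise ValueError("no <object> element at the start of the input")

whole = m.group(0)   # the entire matched text
first = m.group(1)   # last repetition of the first group: the closing quote
second = m.group(2)  # last repetition of the second group: the final 'e' of "movie"
```

If the goal is the whole element, `group(0)` is probably the value to return rather than `group(1)`.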

rullon.myopenid.com

March 17, 2010 13:33

@nicerobot, thanks for the reply!
The goal was to clean up the embed player code from Vimeo (or any other service), so I decided not to parse anything.

we have: 
--------

<object width="400" height="300"><param name="allowfullscreen" value="true" />
    <param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=9851483&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" />
    <embed src="http://vimeo.com/moogaloop.swf?clip_id=9851483&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"></embed>
</object>
<p><a href="http://vimeo.com/9851483">Gorillaz - Stylo</a> from <a href="http://vimeo.com/uccimaru">mario ucci</a> on <a href="http://vimeo.com">Vimeo</a>.</p>

we want:
--------

<object width="400" height="300"><param name="allowfullscreen" value="true" />
    <param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=9851483&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" />
    <embed src="http://vimeo.com/moogaloop.swf?clip_id=9851483&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"></embed>
</object>
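
One way to get from the first snippet to the second without a full parser is a non-greedy search for the `<object>` element. This is only a sketch: `OBJECT_RE` and `extract_object` are made-up names, and it assumes the embed contains no nested `<object>` elements, since the match stops at the first `</object>`.

```python
import re

# Non-greedy match from the first "<object" to the nearest "</object>".
# re.DOTALL lets "." cross line breaks; re.IGNORECASE tolerates "<OBJECT>".
OBJECT_RE = re.compile(r'<object\b.*?</object>', re.DOTALL | re.IGNORECASE)

def extract_object(html):
    """Return the first <object>...</object> block, or None if there is none."""
    found = OBJECT_RE.search(html)
    return found.group(0) if found else None

# Shortened, hypothetical version of the embed code above.
snippet = (
    '<object width="400" height="300"><param name="allowfullscreen" value="true" />\n'
    '</object>\n'
    '<p><a href="http://vimeo.com">Vimeo</a>.</p>'
)
cleaned = extract_object(snippet)
```

Unlike `match()`, `search()` does not require the element to sit at the very start of the input.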

nicerobot

March 17, 2010 17:28

If that's always the entire file, I'd replace "<p><a.*" with "".
And I'd do it with Perl!

perl -pi -e 's|<p><a.*||s'
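
For anyone staying in Python, a rough equivalent of the same substitution might look like the sketch below. `strip_credit` is a made-up name, and two details are assumptions rather than part of the Perl one-liner: the whole text is processed at once instead of line by line, and trailing whitespace is trimmed afterwards.

```python
import re

def strip_credit(html):
    """Drop everything from the first "<p><a" to the end of the text."""
    # re.DOTALL lets ".*" run across line breaks to the end of the input.
    # rstrip() removes the whitespace left behind (an added assumption).
    return re.sub(r'<p><a.*', '', html, flags=re.DOTALL).rstrip()

# Hypothetical, shortened input.
page = '<object></object>\n<p><a href="http://vimeo.com">Vimeo</a>.</p>\n'
result = strip_credit(page)
```

Like the Perl version, this only works when the credit paragraph is the last thing in the input.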