[CrackMonkey] PyDeCSS

Aaron Malone aaron at mancala.semo.net
Thu Feb 17 15:52:20 PST 2000


This is how bored I am.

I ported Mr. Bad's DeCSS to python.  The code's cheap as hell, but it
gave me something mildly amusing to do, and I fixed a couple of the
deficiencies from the original version (namely, its lack of case
insensitivity and inability to strip CSS tags that used
single-quotes).
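
For the curious, here is a minimal sketch of what those two fixes amount to. It is not the script itself: it uses only the style= branch of the pattern, it is written for Python 3, and the names style_attr and strip_style are made up for illustration.

```python
import re

# Simplified stand-in for the full pattern: only the style="..." branch.
# re.I lets STYLE= match as well as style=, and ("|') accepts either
# kind of quote around the attribute value.
style_attr = re.compile(r"""style=("|').*?("|')""", re.I)


def strip_style(html):
    """Remove inline style attributes from an HTML fragment."""
    return style_attr.sub("", html)


print(strip_style('<p STYLE="color: red">hi</p>'))
print(strip_style("<p style='color: red'>hi</p>"))
```

Note that the pattern does not insist that the closing quote be the same kind as the opening one, so a value containing the other quote character will be cut short.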

-- 
Aaron Malone (aaron at semo.net)
System Administrator
Poplar Bluff Internet, Inc.
http://www.semo.net

----------------------------------
PyDeCSS, v0.01, now without MIME! 
----------------------------------

import sys, getopt, string, re

USAGE="""
PyDeCSS 0.01: a utility to strip Cascading Style Sheets (CSS) tags 
            from HTML documents

USAGE: PyDeCSS [-h] [-i input file] [-o output file]

options:
        -h              print this help message
        -i input file   input file to strip (default: standard input)
        -o output file  place to put the output (default: standard output)

Hm.  I just copied Mr. Bad's usage message and changed three characters.
         Damn, I'm lazy.
"""

OUT=0  # is there an 'if undef'-type test in python?
IN=0   #      ...see below.

optlist, args = getopt.getopt(sys.argv[1:], 'hi:o:')

for opt in optlist:
    if opt[0] == '-h':
        print USAGE
        sys.exit(0)
    if opt[0] == '-i':
        try:
            IN = open(opt[1], 'r').readlines()
        except:
            pass
    if opt[0] == '-o':
        try:
            OUT = open(opt[1], 'w')
        except:
            pass

if not IN:
    IN = sys.stdin.readlines()  # this is what I meant.
if not OUT:                     # 'twould be much simpler with undef.
    OUT = sys.stdout

css = re.compile(
    """(<link.*?rel=(\"|\')stylesheet(\"|\').*?>)|(<style>.*?</style>)|(style=(\"|\').*?(\"|\'))|(class=(\"|\').*?(\"|\'))|(id=(\"|\').*?(\"|\'))""",
    re.I)
# readlines() keeps each newline, so join on "" (string.join defaults to " ")
output = re.sub(css, "", string.join(IN, ""))

OUT.write(output)
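
On the 'if undef' question in the comments above: the usual Python spelling is to start the variable at None and test it with "is None". A rough sketch under assumptions follows. It is Python 3, parse_args is a hypothetical helper rather than part of the script, and it only collects the file names instead of opening them.

```python
import getopt


def parse_args(argv):
    """Return (in_path, out_path); None means the option was not given."""
    in_path = None   # None plays the role of 'undef'
    out_path = None
    optlist, _rest = getopt.getopt(argv, "hi:o:")
    for flag, value in optlist:
        if flag == "-i":
            in_path = value
        elif flag == "-o":
            out_path = value
    return in_path, out_path


in_path, out_path = parse_args(["-i", "page.html"])
if out_path is None:
    # No -o given: this is where a caller would fall back to sys.stdout.
    print("writing to standard output")
```

Testing with "is None" rather than "if not" also keeps an empty input file (an empty list of lines) from being mistaken for a missing -i option.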




