[CrackMonkey] PyDeCSS
Aaron Malone
aaron at mancala.semo.net
Thu Feb 17 15:52:20 PST 2000
This is how bored I am.
I ported Mr. Bad's DeCSS to Python. The code's cheap as hell, but it
gave me something mildly amusing to do, and I fixed a couple of the
deficiencies from the original version (namely, its case sensitivity
and its inability to strip CSS tags that used single quotes).
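As a rough illustration of those two fixes (not the script itself), the sketch below strips a style attribute with a cut-down pattern. The names STYLE_ATTR and strip_style are made up for this example; re.I supplies the case insensitivity, and the ("|') alternation accepts either quote character.

```python
import re

# Hypothetical, cut-down pattern: only the style="..." attribute,
# not the full set of alternatives used by the script.
STYLE_ATTR = re.compile(r"""style=("|').*?("|')""", re.I)

def strip_style(html):
    """Remove style attributes, whatever their case or quote character."""
    return STYLE_ATTR.sub("", html)

# Upper-case attribute with single quotes, then the lower-case,
# double-quoted form; both attributes are stripped.
print(strip_style("<P STYLE='color: red'>hi</P>"))
print(strip_style('<p style="color: red">hi</p>'))
```

Only the attribute is removed, so the space that preceded it stays behind in the tag.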
--
Aaron Malone (aaron at semo.net)
System Administrator
Poplar Bluff Internet, Inc.
http://www.semo.net
----------------------------------
PyDeCSS, v0.01, now without MIME!
----------------------------------
import sys, getopt, string, re
USAGE="""
PyDeCSS 0.01: a utility to strip Cascading Style Sheets (CSS) tags
from HTML documents

USAGE: PyDeCSS [-h] [-i input file] [-o output file]

options:
    -h              print this help message
    -i input file   input file to strip (default: standard input)
    -o output file  place to put the output (default: standard output)

Hm. I just copied Mr. Bad's usage message and changed three
characters.
Damn, I'm lazy.
"""
OUT=0 # is there an 'if undef'-type test in python?
IN=0 # ...see below.
optlist, args = getopt.getopt(sys.argv[1:], 'hi:o:')
for opt in optlist:
    if opt[0] == '-h':
        print(USAGE)
        sys.exit(0)
    if opt[0] == '-i':
        try:
            IN = open(opt[1], 'r').readlines()
        except IOError:
            pass # unreadable file: fall back to standard input
    if opt[0] == '-o':
        try:
            OUT = open(opt[1], 'w')
        except IOError:
            pass # unwritable file: fall back to standard output
if not IN:
    IN = sys.stdin.readlines() # this is what I meant.
if not OUT: # 'twould be much simpler with undef.
    OUT = sys.stdout
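# One alternative per kind of CSS hook; re.I makes the match case-insensitive.
css = re.compile(
    r"""(<link.*?rel=("|')stylesheet("|').*?>)"""
    r"""|(<style>.*?</style>)"""
    r"""|(style=("|').*?("|'))"""
    r"""|(class=("|').*?("|'))"""
    r"""|(id=("|').*?("|'))""",
    re.I)
# readlines() keeps each line's newline, so join without a separator.
output = re.sub(css, "", "".join(IN))
OUT.write(output)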
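
For trying the pattern without the command-line wrapper, here is a minimal sketch that applies the same expression to a string. The function name decss and the sample markup are assumptions for illustration, not part of PyDeCSS. One thing it shows: since "." does not match a newline unless re.S is given, a <style> block is only removed when it sits on a single line.

```python
import re

# Same alternatives as the script's pattern, split up for readability.
CSS = re.compile(
    r"""(<link.*?rel=("|')stylesheet("|').*?>)"""
    r"""|(<style>.*?</style>)"""
    r"""|(style=("|').*?("|'))"""
    r"""|(class=("|').*?("|'))"""
    r"""|(id=("|').*?("|'))""",
    re.I)

def decss(html):
    """Hypothetical helper: return html with the CSS-related bits removed."""
    return CSS.sub("", html)

# A one-line <STYLE> block plus class and id attributes: all are stripped.
one_line = "<STYLE>p { color: red }</STYLE><p class='x' id=\"y\">text</p>"
print(decss(one_line))

# A <style> block spread over several lines is left untouched.
multi_line = "<style>\np { color: red }\n</style>"
print(decss(multi_line) == multi_line)
```

The first call leaves just the bare paragraph tag and its text; the second comparison prints True, which is the multi-line limitation described above.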