import sys
import re

inifile = sys.argv[1]
quotes = ["'", '"']
d = {}

def get_quote_char(line):
    #return the first quote character in the line, or None if there is none
    for char in line:
        if char in quotes:
            return char

def getkey(line):
    #swallow everything up to the =
    return line[ : line.find('=') ].strip()

def getval(line):
    #swallow everything after the =
    line = line[ line.find('=') + 1 : ].strip()
    q = get_quote_char(line)
    #unquoted value: take it as it is
    if q is None:
        return line
    startq = line.find(q)
    #start scanning the line just after the opening quote
    position = startq + 1
    for char in line[ startq + 1 : ]:
        #stop at the first matching quote that is not escaped
        #might hit some remote corner-case with this (e.g. an escaped backslash)
        if char == q and line[ position - 1 ] != '\\':
            return line[ startq + 1 : position ]
        position += 1
    #no closing quote: take the rest of the line
    return line[ startq + 1 : ]

section_name = None
f = open(inifile)
for line in f:
    line = line.strip()
    #skip comments and empty lines
    if line.startswith(';') or line == '':
        pass
    #store sections as dicts
    elif line.startswith('['):
        section_name = line[ 1 : len(line) - 1 ].strip()
        d[section_name] = {}
    else:
        k = getkey(line)
        v = getval(line)
        try:
            d[section_name].update( {k:v} )
        except KeyError:
            print('The ini file contains a key outside of any section')
f.close()
print(d)
[foo]
greeting = 'hello'
;this is a comment
name = 'Eddie'
[bar]
lastname = 'Vedder';that's another comment;
[ malformed section ]
city='Prague'
country="\'Czech Republic\'"
whatever='this ; is nasty'
[bad]
dog='bau'
cat = 'miao'
mouse = "squeak"
[tabbed section]
dogname = '\Oliver'
catname = 'Barbara'
[one more]
appliance='lcd \'monitor\''
car = "Alfa \"Romeo\" - Giulietta";"foo"
Refactorings
jaredgrubb
November 11, 2007 20:50
I know this may not be the answer you're looking for, but if someone asked me that in an interview, I would say "Well, I can't give the complete code off the top of my head, but it would start with 'import ConfigParser', a built-in module for Python."
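For reference, a minimal sketch of the library route mentioned above, written for Python 3, where the module is named `configparser` (it was `ConfigParser` in Python 2). The sample text and the helper name `parse_with_configparser` are made up for illustration. Note that with default settings the library lowercases keys and keeps the surrounding quotes as part of each value.

```python
import configparser

# A small, well-formed excerpt in the style of the test file (illustrative).
SAMPLE = """\
[foo]
greeting = 'hello'
;this is a comment
name = 'Eddie'
"""

def parse_with_configparser(text):
    """Parse ini text into a plain dict of dicts using the standard library."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}

result = parse_with_configparser(SAMPLE)
print(result)
```

The quotes survive in the values here, so a caller who wants them removed would still have to strip them afterwards.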
lbolognini
November 13, 2007 09:18
Hi Jared,
that solution wouldn't apply. It didn't even cross my mind to say something like "I'll use a library", because the point of the question, as I assumed, was to see how I would solve a problem that I was unlikely to have solved before (because of the availability of libraries).
Besides, I believe that my version, while not perfect, goes to some length to ensure that no matter how badly formatted the ini file is, it will be parsed anyway ;)
Thanks anyway,
L.
jaredgrubb
November 17, 2007 18:34
If what you're looking for is robustness... then I would recommend using regular expressions (which it looks like you thought of with the 'import re'). You can trim this program down to a dozen lines that way... Maybe if I get ambitious and no one else beats me to it, I'll give it a shot soon.
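To make the regular-expression idea concrete, here is a rough sketch (Python 3) of a single pattern that pulls a quoted value out of the text after the `=`, honouring backslash-escaped quotes. The names `QUOTED` and `extract_value` are hypothetical, and the handling of unquoted values (cut at the first `;`) is an assumption, not something stated in the thread.

```python
import re

# Opening quote, then any run of backslash-escaped characters or characters
# that are not the opening quote, then the matching closing quote.
QUOTED = re.compile(r"""(['"])((?:\\.|(?!\1).)*)\1""")

def extract_value(raw):
    """Return the text inside the leading quotes, escapes left untouched."""
    raw = raw.strip()
    m = QUOTED.match(raw)
    if m:
        return m.group(2)
    # Assumption: in an unquoted value, everything after ';' is a comment.
    return raw.split(';', 1)[0].strip()

for raw in (r"'Vedder';that's another comment;",
            r"'this ; is nasty'",
            r'"Alfa \"Romeo\" - Giulietta";"foo"'):
    print(extract_value(raw))
```

Anything after the closing quote, such as a trailing comment, is simply ignored because the pattern is anchored at the start of the value.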
John
January 9, 2008 06:04
import sys
import re

SECTION = re.compile(r'^\s*\[\s*([^\]]*?)\s*\]\s*$')
PARAM = re.compile(r'^\s*(\w+)\s*=\s*(.*?)\s*$')
COMMENT = re.compile(r'^\s*;.*$')

d = {}
section = None
f = open(sys.argv[1])
for line in f:
    #skip comment lines
    if COMMENT.match(line):
        continue
    m = SECTION.match(line)
    if m:
        section, = m.groups()
        d[section] = {}
        continue
    m = PARAM.match(line)
    #ignore parameters that appear before the first section
    if m and section is not None:
        key, val = m.groups()
        d[section][key] = val
f.close()
for k, v in d.items():
    print('%s %s' % (k, v))
This is an exercise I was given at an interview: implement an .ini parser. It needs to be able to read even some pretty malformed .ini files.
Things to note:
1) ; (semicolon) is the start of a comment
2) string values need to be taken verbatim from the file
3) quotes can be either 'single' or "double"
4) test.ini is your test file of course ;)
Can you make it smarter/more robust?
Thanks,
Lorenzo
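One possible way to apply the four notes above, sketched in Python 3 with a small character scanner instead of regular expressions. All names here (`strip_comment`, `unquote`, `parse_ini`, `SAMPLE`) are hypothetical, and two readings are assumed: "verbatim" means the text between the outer quotes is kept untouched (backslashes included), and a `;` inside quotes is not a comment.

```python
def strip_comment(line):
    """Cut the line at the first ';' that is not inside a quoted string."""
    quote = None      # the quote character we are currently inside, if any
    escaped = False   # True when the previous character was a backslash
    for i, ch in enumerate(line):
        if escaped:
            escaped = False
        elif ch == '\\':
            escaped = True
        elif quote:
            if ch == quote:
                quote = None
        elif ch in '\'"':
            quote = ch
        elif ch == ';':
            return line[:i]
    return line

def unquote(value):
    """Drop one pair of matching outer quotes; keep the inside verbatim."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in '\'"':
        return value[1:-1]
    return value

def parse_ini(text):
    """Return {section: {key: value}} for possibly malformed ini text."""
    result = {}
    section = None
    for raw in text.splitlines():
        line = strip_comment(raw).strip()
        if not line:
            continue
        if line.startswith('['):
            # Tolerate padding inside the brackets and a missing ']'.
            section = line[1:].rstrip(']').strip()
            result.setdefault(section, {})
        elif '=' in line and section is not None:
            key, _, value = line.partition('=')
            result[section][key.strip()] = unquote(value.strip())
    return result

# A few lines taken from test.ini.
SAMPLE = r'''
[foo]
greeting = 'hello'
;this is a comment
[bar]
lastname = 'Vedder';that's another comment;
[ malformed section ]
whatever='this ; is nasty'
[one more]
car = "Alfa \"Romeo\" - Giulietta";"foo"
'''

print(parse_ini(SAMPLE))
```

Keys that appear before any section header are silently skipped in this sketch; raising an error instead would be an equally reasonable choice.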