Friday, July 31, 2015

XML Parsing to .txt file Python

I need to parse this XML document, convert the date and time to %Y-%m-%d %H:%M:%S format, and write them, together with the hourly-qpf and probability-of-precipitation values, as columns in a tab-delimited .txt file.
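To make the target concrete, here is a minimal sketch of how one output row could be built. The sample timestamp and values are made up for illustration, the helper name format_row is hypothetical, and the trailing UTC offset on the timestamp is an assumption about how the feed writes its times.

```python
from datetime import datetime

def format_row(stamp, qpf, pop):
    """Return one tab-delimited output line for a single forecast hour."""
    # Keep only 'YYYY-MM-DDTHH:MM:SS' (19 characters); any trailing
    # UTC offset (an assumption about the feed) is simply dropped.
    when = datetime.strptime(stamp[:19], '%Y-%m-%dT%H:%M:%S')
    return '\t'.join([when.strftime('%Y-%m-%d %H:%M:%S'), str(qpf), str(pop)])

# Made-up sample values, not taken from the real feed.
print(format_row('2015-07-31T14:00:00-07:00', '0.01', '20'))
```

Dropping the offset keeps the local forecast time as written in the file; if UTC times are wanted instead, the offset would have to be parsed and applied.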

All I have managed to do is read in the XML file using this code:

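import urllib2  # Python 2 standard library

page = urllib2.urlopen('http://ift.tt/1SP09yO')
page_content = page.read()
# Write in binary mode so the downloaded bytes are saved unchanged.
with open('KBFI.xml', 'wb') as fid:
    fid.write(page_content)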

I am at a loss after this. I've only parsed one XML doc before, and it looked completely different from this.

EDIT

Sorry for not having anything to give you guys before, but I wasn't sure which module to use, as I only have experience with minidom and it didn't seem like the right choice. I've been messing around with ElementTree and have come up with this:

data = []
import xml.etree.ElementTree as ET
tree = ET.parse('KBFI.xml')
root = tree.getroot()
for data in root.findall('data'):
    for time-layout in root.findall('time-layout'):
        start-valid-time = time-layout.find('start-valid-time')
        time = datetime.datetime.strptime(start-valid-time, '%Y-%m-%dT%H:%M:%S')
    for parameters in root.findall('parameters'):
        for probability-of-precipitation in root.findall('probability-of-precipitation'):
            value = probability-of-precipitation.find('value')
    for hourly-qpf in root.findall('hourly-qpf'):
            value2 = hourly-qpf.find('value')
data = data.append([time,
                    value,
                    value2])
with open('KBFI.txt','w') as file:
    file.writelines('\t'.join(map(str,i)) + '\n' for i in data)

However, there is a problem because the variables are hyphenated and I do not know how to change them to underscores or remove them. Also, because of this, I have no idea if my code is any good!
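One point worth separating out: hyphens are only illegal in Python variable names, not in the tag strings passed to find() and findall(). The sketch below illustrates that under stated assumptions: the file has a single time-layout whose start-valid-time entries line up one-to-one with the value entries under hourly-qpf and probability-of-precipitation. The inline SAMPLE document is made up to mirror that assumed layout, and extract_rows and write_rows are hypothetical helper names; the real feed may differ.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Made-up sample mirroring the assumed layout; the real feed may differ.
SAMPLE = """<dwml>
  <data>
    <time-layout>
      <start-valid-time>2015-07-31T14:00:00-07:00</start-valid-time>
      <start-valid-time>2015-07-31T15:00:00-07:00</start-valid-time>
    </time-layout>
    <parameters>
      <probability-of-precipitation>
        <value>20</value>
        <value>35</value>
      </probability-of-precipitation>
      <hourly-qpf>
        <value>0.00</value>
        <value>0.02</value>
      </hourly-qpf>
    </parameters>
  </data>
</dwml>"""

def extract_rows(root):
    """Return [timestamp, qpf, pop] rows, one per forecast hour."""
    # Hyphens are fine inside tag strings; only the Python names use underscores.
    # './/tag' searches the whole tree, so the exact nesting depth does not matter.
    times = [t.text for t in root.findall('.//start-valid-time')]
    qpf_el = root.find('.//hourly-qpf')
    pop_el = root.find('.//probability-of-precipitation')
    # An empty element has text None; store it as an empty string instead.
    qpf = [v.text or '' for v in qpf_el.findall('value')]
    pop = [v.text or '' for v in pop_el.findall('value')]
    rows = []
    for stamp, q, p in zip(times, qpf, pop):
        # Drop any trailing UTC offset before parsing (assumed timestamp shape).
        when = datetime.strptime(stamp[:19], '%Y-%m-%dT%H:%M:%S')
        rows.append([when.strftime('%Y-%m-%d %H:%M:%S'), q, p])
    return rows

def write_rows(rows, path):
    """Write the rows to path as tab-delimited text, one row per line."""
    with open(path, 'w') as out:
        out.writelines('\t'.join(row) + '\n' for row in rows)

rows = extract_rows(ET.fromstring(SAMPLE))
for row in rows:
    print('\t'.join(row))
```

For the downloaded file, the equivalent call would be write_rows(extract_rows(ET.parse('KBFI.xml').getroot()), 'KBFI.txt'). Note that zip() stops at the shortest list, so if the three sequences turn out to have different lengths the extra entries are silently dropped.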
