Parsing XML from a web response

2024-01-06

I'm trying to get responses from Nominatim in order to geocode a few thousand cities.

import os
import requests
import xml.etree.ElementTree as ET

txt = open('input.txt', 'r').readlines()
for line in txt:
 lp, region, district, municipality, city = line.split('\t')
 baseUrl = 'http://nominatim.openstreetmap.org/search/gb/'+region+'/'+district+'/'+municipality+'/'+city+'/?format=xml' 
 # eg. http://nominatim.openstreetmap.org/search/pl/podkarpackie/stalowowolski/Bojan%C3%B3w/Zapu%C5%9Bcie/?format=xml
 resp = requests.get(baseUrl)
 resp.encoding = 'UTF-8' # special diacritics
 msg = resp.text
 # parse response to get lat & long
 tree = ET.parse(msg)
 root = tree.getroot()
 print tree

But the result is:

Traceback (most recent call last):
File "geo_miasta.py", line 17, in <module>
    tree = ET.parse(msg)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 647, in parse
    source = open(source, "rb")    
IOError: [Errno 2] No such file or directory: u'<?xml version="1.0" encoding="UTF-8" ?>\n<searchresults timestamp=\'Tue, 11 Feb 14 21:13:50 +0000\' attribution=\'Data \xa9 OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright\' querystring=\'\u015awierczyna, Drzewica, opoczy\u0144ski, \u0142\xf3dzkie, gb\' polygon=\'false\' more_url=\'http://nominatim.openstreetmap.org/search?format=xml&amp;exclude_place_ids=&amp;q=%C5%9Awierczyna%2C+Drzewica%2C+opoczy%C5%84ski%2C+%C5%82%C3%B3dzkie%2C+gb\'>\n</searchresults>'

What is wrong with this?

Edit: Thanks to @rob, my solution is:

#! /usr/bin/env python2.7
# -*- coding: utf-8 -*-

import os
import requests
import xml.etree.ElementTree as ET

txt = open('input.txt', 'r').read().split('\n')

for line in txt:
    lp, region, district, municipality, city = line.split('\t')
    baseUrl = 'http://nominatim.openstreetmap.org/search/pl/'+region+'/'+district+'/'+municipality+'/'+city+'/?format=xml'
    resp = requests.get(baseUrl)
    msg = resp.content
    tree = ET.fromstring(msg)
    for place in tree.findall('place'):
        location = '{:5f}\t{:5f}'.format(
            float(place.get('lat')),
            float(place.get('lon')))

    f = open('result.txt', 'a')
    f.write(location+'\t'+region+'\t'+district+'\t'+municipality+'\t'+city)
    f.close()

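You are using xml.etree.ElementTree.parse() http://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.parse, which takes a filename or a file object as its argument. But you are passing neither a filename nor a file object; you are passing a unicode string. parse() therefore treats that string as a filename and tries to open it, which is what raises the IOError in your traceback.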

Try xml.etree.ElementTree.fromstring(text) http://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.fromstring.

Like this:

 tree = ET.fromstring(msg)
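
To make the difference between the two calls concrete, here is a minimal offline sketch. The SAMPLE document is made up for illustration (it only mimics the searchresults/place shape of the response, and the coordinates are not real data), and it assumes the body is available as bytes, as resp.content would be.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical response body, shaped like the <searchresults> document
# in the traceback; the place and its coordinates are invented.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8" ?>'
    b'<searchresults>'
    b'<place display_name="Example Town" lat="50.5" lon="22.25"/>'
    b'</searchresults>'
)

# fromstring() parses XML that is already in memory and returns the
# root Element directly.
root = ET.fromstring(SAMPLE)

# parse() expects a filename or a file object, so in-memory data has to
# be wrapped in a file-like object first. It returns an ElementTree,
# and getroot() gives the same kind of root Element.
root_from_file = ET.parse(io.BytesIO(SAMPLE)).getroot()

# Collect (lat, lon) pairs from every <place> child of the root.
coords = [(float(p.get('lat')), float(p.get('lon')))
          for p in root.findall('place')]
print(coords)
```

Note that fromstring() hands back the root element itself, so there is no getroot() step as there is after parse().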

Here is a complete example program:

import os
import requests
import xml.etree.ElementTree as ET

baseUrl = 'http://nominatim.openstreetmap.org/search/pl/podkarpackie/stalowowolski/Bojan%C3%B3w/Zapu%C5%9Bcie/?format=xml'
resp = requests.get(baseUrl)
msg = resp.content
tree = ET.fromstring(msg)
for place in tree.findall('place'):
  print u'{:s}: {:+.2f}, {:+.2f}'.format(
    place.get('display_name'),
    float(place.get('lon')),
    float(place.get('lat'))).encode('utf-8')
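
One more thing to watch for when the URL is built from input.txt: readlines() leaves the line ending on the last field, and the names contain diacritics. A possible way to handle both is sketched below. build_url is a hypothetical helper, not part of requests or Nominatim; the URL layout is simply copied from the question, and whether the server still accepts that layout is not checked here. The sketch is written for Python 3 (on Python 2.7, quote lives in urllib rather than urllib.parse).

```python
from urllib.parse import quote

# URL prefix copied from the question; treat it as an assumption.
BASE = 'http://nominatim.openstreetmap.org/search/pl/'


def build_url(line):
    """Turn one tab-separated input line into a search URL (hypothetical helper)."""
    # Drop the line ending that readlines() leaves on the last field.
    fields = line.rstrip('\r\n').split('\t')
    lp, region, district, municipality, city = fields
    # Percent-encode each path segment so diacritics and spaces are safe.
    parts = [quote(p, safe='') for p in (region, district, municipality, city)]
    return BASE + '/'.join(parts) + '/?format=xml'


print(build_url('1\tpodkarpackie\tstalowowolski\tBojanów\tZapuście\n'))
```

The resulting string can be passed to requests.get() exactly as baseUrl is in the example above.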