TL;DR
The root problem is that there is no LibKML library available for Windows. My workaround is a function that extracts the data directly from the KML file.
Problem
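I ran into the same problem, and after some googling it seems to be related to LibKML and Windows. Running the same code on my Ubuntu machine gives a different result: the ExtendedData is retrieved when the saved KML file is loaded.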
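library(rgdal)  # also attaches sp (Polygon, Polygons, SpatialPolygons, ...)
library(dplyr)  # provides the %>% pipe

# Build a unit square and wrap it in a SpatialPolygonsDataFrame
poly_df <- data.frame(x = c(1, 1, 0, 0), y = c(1, 0, 0, 1))
poly <- poly_df %>%
  Polygon %>%
  list %>%
  Polygons(ID = "1") %>%
  list %>%
  SpatialPolygons(proj4string = CRS("+init=epsg:4326")) %>%
  SpatialPolygonsDataFrame(data = data.frame(test = "this is a test"))

# Write the polygon to KML, read it back, and inspect the attribute table
writeOGR(poly, "test.kml", driver = "KML", layer = "poly")
poly2 <- readOGR("test.kml")
poly2@data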
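If someone manages to build LibKML [1], they will be able to load KML files with ExtendedData [2].
On Windows, LibKML has to be built with Visual Studio 2005 [1]. That Visual Studio version is no longer supported [3]. In [3], user2889419 provides links to the 2005 version.
I downloaded and installed that version, but building LibKML eventually failed with a large number of errors and warnings (some files did not exist). That is where I stopped, because I was far outside my comfort zone, but I wanted to share the results of my chase.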
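Before going down the build path, it may help to check whether the GDAL build behind rgdal already includes the LibKML driver. A minimal sketch, assuming the driver is listed under the short name "LIBKML" in the output of ogrDrivers(); has_libkml is a hypothetical helper name, not part of rgdal:

```r
# Hypothetical helper: TRUE if "LIBKML" appears among the OGR driver names
has_libkml <- function(driver_names) {
  "LIBKML" %in% driver_names
}

# Only query rgdal when it is actually installed
if (requireNamespace("rgdal", quietly = TRUE)) {
  print(has_libkml(rgdal::ogrDrivers()$name))
}

# Toy input: a build that only has the basic KML driver
has_libkml(c("KML", "ESRI Shapefile"))
```

If the check comes back FALSE, readOGR is presumably falling back to the basic KML driver, which seems to be the situation described in [2].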
Solution in R
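My solution is to read the KML file directly and extract the extended data from it, while the spatial object itself is loaded through rgdal's readOGR. This assumes that readOGR reads features from the top of the file, in the same order as my extraction routine. The two are then merged, and the output is a SpatialPolygonsDataFrame.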
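The ordering assumption can be checked before merging. The following is only a sketch: merge_by_position and its arguments are hypothetical names, and it assumes the attribute table returned by readOGR carries a "Name" column that can be compared with the Placemark names taken from the XML.

```r
# Hypothetical helper, not part of rgdal or xml2.
# sp_data:   attribute table from readOGR (assumed to have a "Name" column)
# kml_names: Placemark <name> values read from the XML, in document order
# extended:  data frame of ExtendedData values, one row per Placemark
merge_by_position <- function(sp_data, kml_names, extended) {
  stopifnot(nrow(sp_data) == nrow(extended),
            length(kml_names) == nrow(extended))
  if (!identical(as.character(sp_data$Name), as.character(kml_names))) {
    stop("Feature order differs between readOGR and the XML.")
  }
  cbind(sp_data, extended)
}

# Toy data standing in for readOGR output and the extracted XML values
sp_data   <- data.frame(Name = c("a", "b"), Description = c("", ""))
extended  <- data.frame(test = c("first", "second"))
merged    <- merge_by_position(sp_data, c("a", "b"), extended)
merged
```

If all names are empty the comparison is vacuous, so this only helps for files whose Placemarks are actually named.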
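At first I had some trouble extracting nodes from the KML file, because I was not aware of the concept of namespaces [4]. (I have since edited the function below, because I ran into problems with KML files from other sources.)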
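To illustrate the namespace issue: a KML file declares a default namespace, and xml2 exposes it under the prefix d1, so an XPath query without that prefix finds nothing. A small sketch with an inline KML string (the document content is made up for illustration):

```r
library(xml2)

# Made-up minimal KML document with the usual default namespace
kml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark><name>a</name></Placemark>
  </Document>
</kml>'

doc <- read_xml(kml_text)

# Shows the prefix that xml2 assigns to the default namespace
xml_ns(doc)

# Without the prefix the query matches no nodes
without_prefix <- xml_find_all(doc, ".//Placemark")
# With the d1 prefix the Placemark is found
with_prefix <- xml_find_all(doc, ".//d1:Placemark")

length(without_prefix)
length(with_prefix)
```

An alternative would be to call xml_ns_strip(doc) first and then query without prefixes; the function below keeps the d1 prefix instead.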
library(rgdal)
library(xml2)
library(magrittr)  # for %>%

readKML <- function(file, keep_name_description = FALSE, layer, ...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  # in the resulting SpatialPolygonsDataFrame. Only works when there is
  # ExtendedData in the kml file.
  sp_obj <- readOGR(file, layer, ...)
  xml1 <- read_xml(file)
  # If a layer is given, restrict the search to the matching <Folder>
  if (!missing(layer)) {
    different_layers <- xml_find_all(xml1, ".//d1:Folder")
    layer_names <- different_layers %>%
      xml_find_first(".//d1:name") %>%
      xml_contents() %>%
      xml_text()
    selected_layer <- layer_names == layer
    if (!any(selected_layer)) stop("Layer does not exist.")
    xml2 <- different_layers[selected_layer]
  } else {
    xml2 <- xml1
  }
  # extract name and type of variables: descend from the first ExtendedData
  # node until the children carry a "name" attribute
  variable_names1 <-
    xml_find_first(xml2, ".//d1:ExtendedData") %>%
    xml_children()
  while (any(is.na(xml_attr(variable_names1, "name"))) &&
         length(xml_children(variable_names1)) > 0) {
    variable_names1 <- xml_children(variable_names1)
  }
  variable_names <- variable_names1 %>%
    xml_attr("name") %>%
    unique()
  # return sp_obj if no ExtendedData is present
  if (length(variable_names) == 0 || all(is.na(variable_names))) return(sp_obj)
  # collect the leaf nodes of every ExtendedData block; they hold the values
  data1 <- xml_find_all(xml2, ".//d1:ExtendedData") %>%
    xml_children()
  while (length(xml_children(data1)) > 0) {
    data1 <- xml_children(data1)
  }
  # one row per feature, one column per variable
  data <- data1 %>%
    xml_text() %>%
    matrix(., ncol = length(variable_names), byrow = TRUE) %>%
    as.data.frame()
  colnames(data) <- variable_names
  if (keep_name_description) {
    # keep "Name" and "Description" and append the extended columns
    try(sp_obj@data <- cbind(sp_obj@data, data), silent = TRUE)
  } else {
    sp_obj@data <- data
  }
  sp_obj
}
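The extraction step on its own, without readOGR, can be tried on an inline document. This is a sketch of a variation rather than the function above: it goes Placemark by Placemark instead of reshaping one long vector, and it assumes every Placemark stores its values in SimpleData nodes with the same field names. The KML content is made up.

```r
library(xml2)

# Made-up KML with two features, each carrying two SimpleData fields
kml_text <- '<kml xmlns="http://www.opengis.net/kml/2.2"><Document><Folder>
<name>poly</name>
<Placemark><ExtendedData><SchemaData schemaUrl="#poly">
  <SimpleData name="test">this is a test</SimpleData>
  <SimpleData name="id">1</SimpleData>
</SchemaData></ExtendedData></Placemark>
<Placemark><ExtendedData><SchemaData schemaUrl="#poly">
  <SimpleData name="test">another test</SimpleData>
  <SimpleData name="id">2</SimpleData>
</SchemaData></ExtendedData></Placemark>
</Folder></Document></kml>'

doc <- read_xml(kml_text)
placemarks <- xml_find_all(doc, ".//d1:Placemark")

# Build a one-row data frame per Placemark from its name/value pairs
rows <- lapply(placemarks, function(pm) {
  fields <- xml_find_all(pm, ".//d1:ExtendedData//d1:SimpleData")
  values <- setNames(xml_text(fields), xml_attr(fields, "name"))
  as.data.frame(as.list(values), stringsAsFactors = FALSE)
})

# Stack the rows in document order
extended <- do.call(rbind, rows)
extended
```

All values come back as character, so numeric columns would still need converting afterwards.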
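Old: extraction via readLines
This earlier version follows the same idea (readOGR for the geometry, ExtendedData merged in by position, assuming both read the file from the top), but it pulls the values out of the raw text lines with readLines instead of using an XML parser.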
library(tidyverse)
library(rgdal)

readKML <- function(file, keep_name_description = FALSE, ...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  # in the resulting SpatialPolygonsDataFrame. Only works when there is
  # ExtendedData in the kml file.
  if (!grepl("\\.kml$", file)) stop("File is not a KML file.")
  if (!file.exists(file)) stop("File does not exist.")
  map <- readOGR(file, ...)
  f1 <- readLines(file)
  # lines that contain an opening or closing ExtendedData tag
  exdata_lines <- grep("ExtendedData", f1)
  # if there is no ExtendedData return the object as read by readOGR
  if (length(exdata_lines) == 0) return(map)
  # get positions of the data lines between each pair of ExtendedData tags
  exdata_position <- exdata_lines %>%
    matrix(ncol = 2, byrow = TRUE) %>%
    apply(1, function(x) {
      pos <- x[1]:x[2]
      pos[2:(length(pos) - 1)]
    }) %>%
    t %>%
    as.data.frame
  # Get the column names from the first ExtendedData block
  extract1 <- f1[exdata_position[1, ] %>% unlist]
  names_of_data <- extract1 %>%
    strsplit("name=\"") %>%
    lapply(function(x) strsplit(x[[2]], split = "\"")) %>%
    unlist(recursive = FALSE) %>%
    lapply(function(x) x[1]) %>%
    unlist
  # Extract the values of every ExtendedData block, one row per block
  dat <- lapply(seq(nrow(exdata_position)), function(x) {
    extract2 <- f1[exdata_position[x, ] %>% unlist]
    extract2 %>%
      strsplit(">") %>%
      lapply(function(x) strsplit(x[[2]], split = "<")) %>%
      unlist(recursive = FALSE) %>%
      lapply(function(x) x[1]) %>%
      unlist %>%
      matrix(nrow = 1) %>%
      as.data.frame
  }) %>%
    do.call(rbind, .)
  # Rename columns
  colnames(dat) <- names_of_data
  # Check if Name and Description should be dropped
  if (keep_name_description) {
    map@data <- cbind(map@data, dat)
  } else {
    map@data <- dat
  }
  map
}
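To see what the nested strsplit calls do, here is the same splitting applied to a single line. The sample line is made up, and it assumes each SimpleData element sits on one line of the file, as the function above also does:

```r
# Made-up example of one data line from a KML file
line <- '<SimpleData name="test">this is a test</SimpleData>'

# Column name: text after name=" and before the next quote
after_name <- strsplit(line, "name=\"")[[1]][2]
field_name <- strsplit(after_name, "\"")[[1]][1]

# Value: text after the first > and before the next <
after_tag <- strsplit(line, ">")[[1]][2]
field_value <- strsplit(after_tag, "<")[[1]][1]

field_name
field_value
```

This breaks as soon as an element spans several lines or a value contains a literal angle bracket, which is one reason the XML-based version is the more robust of the two.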
[1] https://github.com/google/libkml/wiki/Building-and-installing-libkml
[2] https://github.com/r-spatial/sf/issues/499
[3] Where to download Visual Studio Express 2005? https://stackoverflow.com/questions/1330852/where-to-download-visual-studio-express-2005
[4] Parsing XML in R: Incorrect namespaces https://stackoverflow.com/questions/29170161/parsing-xml-in-r-incorrect-namespaces