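The table in question has an "Export to CSV" link.
If you click it, you get the 6.36 MB CSV file directly, which is fine. I assume you need/want to do this programmatically, so here is what worked for me:
Steps to programmatically "click Export to CSV"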
- I'm using Firefox, but Chrome has a similar capability: Inspector. I opened it (Ctrl-Shift-I) and went to the "Network" tab.
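- Click the "Export to CSV" button. You should see a new "POST" row appear in the inspector frame. When it has finished, continue with the next step.
- Right-click the "POST" row and select "Copy POST Data"; this gives: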
__EVENTTARGET
__EVENTARGUMENT
__VIEWSTATE=...
ctl00$ctl00$ctl00$ctl00$ctl00$ContentPlaceHolderDefault$pageBody$pageBody$rightColumn$ctl01$AlligatorHarvestExport_6$RadGrid1$ctl00$ctl02$ctl00$ExportToCsvButton=+
ctl00$ctl00$ctl00$ctl00$ctl00$ContentPlaceHolderDefault$pageBody$pageBody$rightColumn$ctl01$AlligatorHarvestExport_6$RadGrid1$ctl00$ctl02$ctl03$FilterTextBox_Year
ctl00$ctl00$ctl00$ctl00$ctl00$ContentPlaceHolderDefault$pageBody$pageBody$rightColumn$ctl01$AlligatorHarvestExport_6$RadGrid1$ctl00$ctl02$ctl03$FilterTextBox_AreaNumber
ctl00$ctl00$ctl00$ctl00$ctl00$ContentPlaceHolderDefault$pageBody$pageBody$rightColumn$ctl01$AlligatorHarvestExport_6$RadGrid1$ctl00$ctl02$ctl03$FilterTextBox_AreaName
ctl00$ctl00$ctl00$ctl00$ctl00$ContentPlaceHolderDefault$pageBody$pageBody$rightColumn$ctl01$AlligatorHarvestExport_6$RadGrid1$ctl00$ctl03$ctl01$PageSizeComboBox=20
ctl00_ctl00_ctl00_ctl00_ctl00_ContentPlaceHolderDefault_pageBody_pageBody_rightColumn_ctl01_AlligatorHarvestExport_6_RadGrid1_ctl00_ctl03_ctl01_PageSizeComboBox_ClientState
ctl00_ctl00_ctl00_ctl00_ctl00_ContentPlaceHolderDefault_pageBody_pageBody_rightColumn_ctl01_AlligatorHarvestExport_6_RadGrid1_rfltMenu_ClientState
ctl00_ctl00_ctl00_ctl00_ctl00_ContentPlaceHolderDefault_pageBody_pageBody_rightColumn_ctl01_AlligatorHarvestExport_6_RadGrid1_ClientState
__VIEWSTATEGENERATOR=CA0B0334
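(I replaced the long base64 string with "...".) The line of note is the fourth one, the one ending with $ExportToCsvButton=+. This is the parameter you need to include in your POST data (param).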
- Using the code above, including the definition of param, continue with:
param$`ctl00$ctl00$ctl00$ctl00$ctl00$ContentPlaceHolderDefault$pageBody$pageBody$rightColumn$ctl01$AlligatorHarvestExport_6$RadGrid1$ctl00$ctl02$ctl00$ExportToCsvButton` <- "+"
request <- httr::POST(url, body = param, encode = 'form')
You will now have:
request
# Response [http://myfwc.com/wildlifehabitats/managed/alligator/harvest/data-export/]
# Date: 2017-06-01 18:09
# Status: 200
# Content-Type: text/csv; charset-UTF-8;
# Size: 6.36 MB
# <U+FEFF>"Year","Area Number","Area Name","Carcass Size","Harvest Date","Location"
# "2000","101","LAKE PIERCE","11 ft. 5 in.","09-22-2000",""
# "2000","101","LAKE PIERCE","9 ft. 0 in.","10-02-2000",""
# "2000","101","LAKE PIERCE","8 ft. 10 in.","10-06-2000",""
# "2000","101","LAKE PIERCE","8 ft. 0 in.","09-25-2000",""
# "2000","101","LAKE PIERCE","8 ft. 0 in.","10-07-2000",""
# "2000","101","LAKE PIERCE","8 ft. 0 in.","09-22-2000",""
# "2000","101","LAKE PIERCE","7 ft. 2 in.","09-21-2000",""
# "2000","101","LAKE PIERCE","7 ft. 1 in.","09-21-2000",""
# "2000","101","LAKE PIERCE","6 ft. 11 in.","09-25-2000",""
# ...
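Side note: the website starts the file with <U+FEFF>, a Unicode character (the byte order mark). This throws off read.csv and gives you a column named X.U.FEFF.Year; the effect is purely cosmetic.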
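If the X.U.FEFF.Year column name bothers you, one option is to drop the leading U+FEFF from the text before parsing it. What follows is only a minimal sketch: it uses a small made-up two-line sample in place of the real download, and txt, clean and df are illustrative names, not part of the code above.

```r
# Made-up sample that starts with U+FEFF, standing in for the downloaded text
txt <- "\ufeff\"Year\",\"Area Number\"\n\"2000\",\"101\"\n"

# Remove the byte order mark only if it is the very first character
clean <- sub("^\ufeff", "", txt)

# Parse the cleaned text directly, with no intermediate file
df <- read.csv(text = clean)
names(df)
```

With the real response, the same sub() call could presumably be applied to as.character(request) before handing the result to read.csv.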
Save to a file
If you don't care about the suggested filename, you can simply do:
write(as.character(request), file="quux.csv")
If you want to use the filename the website suggests, you can find it with:
httr::headers(request)$`content-disposition`
# [1] "inline;filename=\"FWCAlligatorHarvestData.csv\""
Parsing it should be straightforward.
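As a rough sketch of that parsing step, the following pulls the filename out of a Content-Disposition value with a regular expression. The header string is copied from the output above; cd and fname are names made up for illustration, and the pattern only covers the simple filename="..." form, not every variant the header allows.

```r
# Header value as returned by httr::headers(request)$`content-disposition`
cd <- "inline;filename=\"FWCAlligatorHarvestData.csv\""

# Capture whatever follows filename=, with or without surrounding quotes
fname <- sub('.*filename="?([^";]+)"?.*', "\\1", cd)
fname
```

The resulting fname could then be passed as the file argument in place of "quux.csv".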
Consume immediately
If you don't want/need to save to an intermediate file, you can always consume it right away:
head(read.csv(textConnection(as.character(request))))
# Invalid encoding : defaulting to UTF-8.
# X.U.FEFF.Year Area.Number Area.Name Carcass.Size Harvest.Date Location
# 1 2000 101 LAKE PIERCE 11 ft. 5 in. 09-22-2000
# 2 2000 101 LAKE PIERCE 9 ft. 0 in. 10-02-2000
# 3 2000 101 LAKE PIERCE 8 ft. 10 in. 10-06-2000
# 4 2000 101 LAKE PIERCE 8 ft. 0 in. 09-25-2000
# 5 2000 101 LAKE PIERCE 8 ft. 0 in. 10-07-2000
# 6 2000 101 LAKE PIERCE 8 ft. 0 in. 09-22-2000