I am using the script below to scrape the "Stock Quote" data from http://fortune.com/fortune500/xcel-energy/, but it comes back blank.
I have also tried the Selenium driver, with the same result. Please help me solve this.
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
r = requests.get('http://fortune.com/fortune500/xcel-energy/')
soup = bs(r.content, 'lxml')  # also tried 'html.parser'
data = pd.DataFrame(columns=['C1','C2','C3','C4'], dtype='object', index=range(0,11))
for table in soup.find_all('div', {'class': 'stock-quote row'}):
    row_marker = 0
    for row in table.find_all('li'):
        column_marker = 0
        columns = row.find_all('span')
        for column in columns:
            data.iat[row_marker, column_marker] = column.get_text()
            column_marker += 1
        row_marker += 1
print(data)
Output obtained:
                     C1    C2   C3   C4
0       Previous Close:        NaN  NaN
1           Market Cap:  NaNB  NaN    B
2   Next Earnings Date:        NaN  NaN
3                 High:        NaN  NaN
4                  Low:        NaN  NaN
5         52 Week High:        NaN  NaN
6          52 Week Low:        NaN  NaN
7     52 Week Change %:  0.00  NaN  NaN
8            P/E Ratio:   n/a  NaN  NaN
9                  EPS:        NaN  NaN
10      Dividend Yield:   n/a  NaN  NaN
The data you are looking for appears to be available from this API endpoint, http://fortune.com/api/v2/company/xel/expand/1:
import requests
response = requests.get("http://fortune.com/api/v2/company/xel/expand/1")
data = response.json()
print(data['ticker'])
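Since the shape of the JSON payload is not shown here beyond the 'ticker' key, it may help to fail loudly on HTTP errors and then list the top-level keys before picking fields. The following is only a minimal sketch: the helper names fetch_company and describe_keys, and the optional session parameter, are hypothetical and not part of the original answer.

```python
def fetch_company(url, session=None, timeout=10):
    """Return the decoded JSON body of `url`, raising on HTTP errors."""
    if session is None:
        # Imported lazily so that a stub session can be passed in instead.
        import requests
        session = requests.Session()
    response = session.get(url, timeout=timeout)
    # Surface 4xx/5xx responses instead of trying to decode an error page.
    response.raise_for_status()
    return response.json()


def describe_keys(data):
    """Map each top-level key to the type name of its value."""
    return {key: type(value).__name__ for key, value in data.items()}
```

Calling describe_keys(fetch_company("http://fortune.com/api/v2/company/xel/expand/1")) should then show which keys hold nested dictionaries or lists, assuming the endpoint still responds with a JSON object.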
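FYI, when opening the page in a Selenium-automated browser, you only need to make sure that you wait for the desired data to appear before parsing the HTML (see explicit waits: http://selenium-python.readthedocs.io/waits.html#explicit-waits). Working code: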
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
url = 'http://fortune.com/fortune500/xcel-energy/'
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get(url)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".stock-quote")))
page_source = driver.page_source
driver.close()
# HTML parsing part
soup = BeautifulSoup(page_source, 'lxml')  # also tried 'html.parser'
data = pd.DataFrame(columns=['C1','C2','C3','C4'], dtype='object', index=range(0,11))
for table in soup.find_all('div', {'class': 'stock-quote'}):
    row_marker = 0
    for row in table.find_all('li'):
        column_marker = 0
        columns = row.find_all('span')
        for column in columns:
            data.iat[row_marker, column_marker] = column.get_text()
            column_marker += 1
        row_marker += 1
print(data)
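The parsing loop above writes into a DataFrame preallocated with 11 rows and 4 columns, so it would raise an IndexError if the page ever had more rows or spans than that. One possible alternative, sketched here under the assumption that the markup is still a 'stock-quote' div containing li items with span cells, is to collect the rows into plain lists first. The function name extract_rows and its container_class parameter are hypothetical.

```python
from bs4 import BeautifulSoup


def extract_rows(html, container_class="stock-quote"):
    """Return one list of <span> texts per <li> in the first matching <div>."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=container_class)
    if container is None:
        # Container missing, for example because the page was not rendered yet.
        return []
    rows = []
    for item in container.find_all("li"):
        # strip=True trims the whitespace around each cell's text.
        cells = [span.get_text(strip=True) for span in item.find_all("span")]
        rows.append(cells)
    return rows
```

The result of extract_rows(page_source) can be passed to pd.DataFrame, which pads shorter rows with missing values, so no fixed row or column count has to be guessed in advance.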