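The Google Patents API is deprecated ("The Google Patent Search API has been officially deprecated as of May 26, 2011." https://developers.google.com/patent-search/). I don't think the data you're getting from it is reliable.
I'm not sure whether the Google terms of service permit scraping the individual Google Patents pages, but one strategy might be to use the search to get a list of results, then use something like Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/) to parse each result page.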
Example:
# Python 3 version (urllib2 became urllib.request; print is a function).
import json
import urllib.request

from bs4 import BeautifulSoup

# Deprecated Google Patent Search API endpoint.
url = ('https://ajax.googleapis.com/ajax/services/search/patent?' +
       'v=1.0&q=barack%20obama')
request = urllib.request.Request(url, None, {})
response = urllib.request.urlopen(request)
jsonResponse = json.load(response)
responseData = jsonResponse['responseData']
results = responseData["results"]

# First pass: read the assignee straight from the API results.
print("This doesn't work, no assignee data...")
for result in results:
    print("patent no.: ", result["patentNumber"])
    print("assignee: ", result["assignee"])
    print(" ")

# Second pass: fetch each patent page and read the fields from its HTML.
print("...but this seems to.")
for result in results:
    URL = "https://www.google.com/patents/" + result["patentNumber"]
    req = urllib.request.Request(URL, headers={'User-Agent': "python"})
    _file = urllib.request.urlopen(req)
    patent_html = _file.read()
    soup = BeautifulSoup(patent_html, 'html.parser')
    patentNumber = soup.find("span", {"class": "patent-number"}).text
    assigneeMetaTag = soup.find("meta", {"scheme": "assignee"})
    patentAssignee = assigneeMetaTag.attrs["content"]
    print("patent no.: ", patentNumber)
    print("assignee: ", patentAssignee)
    print(" ")
For me this prints out:
This doesn't work, no assignee data...
patent no.: US20110022394
assignee:
patent no.: US20140089323
assignee:
patent no.: US8117227
assignee:
patent no.: CA2702937C
assignee:
...but this seems to.
patent no.: US 20110022394 A1
assignee: Thomas Wide
patent no.: US 20140089323 A1
assignee: Appinions Inc.
patent no.: US 8117227 B2
assignee: Scuola Normale Superiore Di Pisa
patent no.: CA 2702937 C
assignee: Neil S. Roseman
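One caveat: I believe you will only get the assignee as of the date the patent was issued, not the current assignee if the patent has since been transferred.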
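One more thing about the page-parsing step: the lookups for the patent-number span and the assignee meta tag will raise if a page is missing either element. Below is a rough sketch of a more defensive version that returns None instead. To keep it free of third-party dependencies I've swapped Beautiful Soup for the standard library's html.parser here. The names PatentPageParser and extract_patent_fields are my own, and the sample HTML is a made-up fragment modeled only on the two selectors used in the example, not a copy of a real Google Patents page.

```python
from html.parser import HTMLParser


class PatentPageParser(HTMLParser):
    """Collects the patent number and assignee from a patent page's HTML.

    Hypothetical helper: it looks for the same two elements as the example,
    <span class="patent-number"> and <meta scheme="assignee" content="...">.
    """

    def __init__(self):
        super().__init__()
        self.patent_number = None
        self.assignee = None
        self._in_number_span = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "patent-number" in classes:
            self._in_number_span = True
        elif tag == "meta" and attrs.get("scheme") == "assignee":
            # Keep only the first assignee tag found.
            if self.assignee is None:
                self.assignee = attrs.get("content")

    def handle_data(self, data):
        text = data.strip()
        # Record the first non-empty text chunk inside the number span.
        if self._in_number_span and text and self.patent_number is None:
            self.patent_number = text

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_number_span = False


def extract_patent_fields(patent_html):
    """Return (patent_number, assignee); either may be None if not found."""
    if isinstance(patent_html, bytes):
        # urlopen(...).read() returns bytes; UTF-8 is an assumption here.
        patent_html = patent_html.decode("utf-8", errors="replace")
    parser = PatentPageParser()
    parser.feed(patent_html)
    return parser.patent_number, parser.assignee


# Made-up fragment, for illustration only.
sample = (
    '<html><head><meta scheme="assignee" content="Appinions Inc."></head>'
    '<body><span class="patent-number">US 20140089323 A1</span></body></html>'
)
number, assignee = extract_patent_fields(sample)
print("patent no.: ", number)
print("assignee: ", assignee)
```

In the loop you would call extract_patent_fields(patent_html) and skip or log any result where a field comes back as None, rather than letting one odd page abort the whole run.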