I have a database with the following tables:
books (primary key: bookID)
characterNames (foreign key: books.bookID)
locations (foreign key: books.bookID)
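The text positions of the character names and locations are stored in the corresponding tables.
I am writing a Python script using psycopg2 that finds all occurrences of a given character name and location in books. I only want the occurrences in books where both the character name and the location are found.
Here https://stackoverflow.com/q/10036645/1315186 I already found a solution for searching for one location and one character: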
WITH b AS (
   SELECT bookid
   FROM   characternames
   WHERE  name = 'XXX'
   GROUP  BY 1
   INTERSECT
   SELECT bookid
   FROM   locations
   WHERE  locname = 'YYY'
   GROUP  BY 1
   )
SELECT bookid, position, 'char' AS what
FROM   b
JOIN   characternames USING (bookid)
WHERE  name = 'XXX'
UNION  ALL
SELECT bookid, position, 'loc' AS what
FROM   b
JOIN   locations USING (bookid)
WHERE  locname = 'YYY'
ORDER  BY bookid, position;
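The CTE "b" contains all bookids in which both the character name "XXX" and the location "YYY" appear.
Now I am also wondering how to search for 2 locations and one name (or 2 names and one location, respectively). If all searched entities must occur in the same book, that is simple, but what about this:
Searching for: Tim, Al, Toolshop
Result: books including
(Tim, Al, Toolshop) or
(Tim, Al) or
(Tim, Toolshop) or
(Al, Toolshop)
The problem could repeat for 4, 5, 6... conditions.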
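To make the intended rule concrete, here is a minimal Python sketch with invented data: a book qualifies when at least two of the searched entities occur in it. The dictionary `hits`, its book IDs, and the helper `books_with_at_least` are hypothetical and only stand in for the per-entity lookups against the real tables.

```python
from collections import Counter

# Hypothetical lookup results: searched entity -> set of bookids containing it.
hits = {
    "Tim": {1, 2, 3},
    "Al": {1, 2, 4},
    "Toolshop": {1, 3, 5},
}


def books_with_at_least(hits, minimum=2):
    """Return the sorted bookids in which at least `minimum` searched entities occur."""
    # Each entity contributes every bookid at most once (the values are sets),
    # so the counter holds the number of distinct entities found per book.
    counts = Counter(bookid for books in hits.values() for bookid in books)
    return sorted(bookid for bookid, n in counts.items() if n >= minimum)


result = books_with_at_least(hits)
```

Raising `minimum` to the number of searched entities gives the "all entities must occur" case again, so the same counting idea would cover 4, 5, 6... conditions.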
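I thought about intersecting more subqueries, but that would not work.
Instead I would union the found bookIDs, group them and select the bookids that occur more than once: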
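WITH b AS (
   SELECT bookid, count(*) AS occurrences
   FROM  (
      SELECT DISTINCT bookid
      FROM   characternames
      WHERE  name = 'XXX'
      UNION  ALL  -- plain UNION would remove the duplicates that are counted below
      SELECT DISTINCT bookid
      FROM   characternames
      WHERE  name = 'YYY'
      UNION  ALL
      SELECT DISTINCT bookid
      FROM   locations
      WHERE  locname = 'ZZZ'
      ) AS hits
   GROUP  BY bookid
   HAVING count(*) > 1  -- an aggregate must be filtered with HAVING, not WHERE
   )
SELECT bookid, occurrences
FROM   b;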
I think this would work, but I cannot test it at the moment. Is this the best way to do it?
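For what it is worth, the counting idea could be generalized to any number of names and locations along the following lines. This is only a sketch under assumptions: the helper `build_query`, its `minimum` and `placeholder` parameters, and the sample rows are invented for illustration, and an in-memory sqlite3 database stands in for PostgreSQL so the example can run without a server. With psycopg2 the default `%s` placeholder would be used and the result passed to `cursor.execute(sql, params)`.

```python
import sqlite3


def build_query(names, locations, minimum=2, placeholder="%s"):
    """Build (sql, params) returning all positions in books where at least
    `minimum` distinct searched names/locations occur.

    Assumes both lists are non-empty, because an empty IN () list is not valid SQL.
    """
    if not names or not locations:
        raise ValueError("need at least one name and one location")
    name_list = ", ".join([placeholder] * len(names))
    loc_list = ", ".join([placeholder] * len(locations))
    sql = f"""
        WITH hits AS (
            SELECT bookid, position, 'char' AS what, name AS entity
            FROM   characternames
            WHERE  name IN ({name_list})
            UNION ALL
            SELECT bookid, position, 'loc' AS what, locname AS entity
            FROM   locations
            WHERE  locname IN ({loc_list})
        ), b AS (
            -- books containing at least `minimum` different searched entities
            SELECT bookid
            FROM   hits
            GROUP  BY bookid
            HAVING count(DISTINCT what || ':' || entity) >= {placeholder}
        )
        SELECT bookid, position, what, entity
        FROM   hits
        JOIN   b USING (bookid)
        ORDER  BY bookid, position
    """
    # Parameter order matches the placeholder order in the SQL text.
    return sql, [*names, *locations, minimum]


# Demo with made-up rows; sqlite3 uses "?" placeholders instead of "%s".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE characternames (bookid INTEGER, name TEXT, position INTEGER);
    CREATE TABLE locations (bookid INTEGER, locname TEXT, position INTEGER);
    INSERT INTO characternames VALUES
        (1, 'Tim', 10), (1, 'Al', 20), (2, 'Tim', 5),
        (3, 'Al', 7), (4, 'Tim', 1), (4, 'Tim', 9);
    INSERT INTO locations VALUES
        (1, 'Toolshop', 30), (2, 'Toolshop', 15), (3, 'Garage', 8);
""")
sql, params = build_query(["Tim", "Al"], ["Toolshop"], placeholder="?")
rows = conn.execute(sql, params).fetchall()
for row in rows:
    print(row)
```

In this made-up data, book 3 contains only one searched entity and book 4 contains the same name twice, so counting distinct entities keeps both out of the result. Collecting the matching rows once in `hits` also avoids repeating every condition in the outer query.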