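How can I elegantly port the recursive SQL query below to pandas code in Python?
I can't see a straightforward way to do it without writing my own recursive function.
Python sample code: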
import datetime
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
data = {
'ID': [1,2,3,4,5,6,7,8],
'Name': ['Keith','Josh','Robin','Raja','Tridip','Arijit','Amit','Dev'],
'MgrID': [0,1,1,2,0,0,5,6]  # 0 = no manager (NULL in the SQL version)
}
df = pd.DataFrame.from_dict(data)
df.set_index('ID', inplace=True, drop=False, append=False)
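# Look up the manager's row for every employee that has a manager
# (.ix was removed from pandas; .loc does the label-based lookup)
df.loc[df.query('MgrID > 0')['MgrID']]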
I'm trying to get this:
nLevel  ID  Name
================================
1       6    Arijit
2       8       Dev
1       1    Keith
2       2       Josh
2       3       Robin
3       4         Raja
1       5    Tridip
2       7       Amit
Recursive SQL query:
;WITH Employee (ID, Name, MgrID) AS
(
    SELECT 1, 'Keith',  NULL UNION ALL
    SELECT 2, 'Josh',   1    UNION ALL
    SELECT 3, 'Robin',  1    UNION ALL
    SELECT 4, 'Raja',   2    UNION ALL
    SELECT 5, 'Tridip', NULL UNION ALL
    SELECT 6, 'Arijit', NULL UNION ALL
    SELECT 7, 'Amit',   5    UNION ALL
    SELECT 8, 'Dev',    6
)
,Hierarchy AS
(
    -- Anchor
    SELECT ID
          ,Name
          ,MgrID
          ,nLevel = 1
          ,Family = ROW_NUMBER() OVER (ORDER BY Name)
    FROM Employee
    WHERE MgrID IS NULL
    UNION ALL
    -- Recursive query
    SELECT E.ID
          ,E.Name
          ,E.MgrID
          ,H.nLevel + 1
          ,H.Family
    FROM Employee E
    JOIN Hierarchy H ON E.MgrID = H.ID
)
SELECT nLevel
      ,ID
      ,SPACE(nLevel + (CASE WHEN nLevel > 1 THEN nLevel ELSE 0 END)) + Name AS Name
FROM Hierarchy
ORDER BY Family, nLevel
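For comparison, here is a minimal sketch of how the anchor and recursive steps of the CTE above might be mirrored in pandas with a loop of merges instead of a hand-written recursive function. It is only one possible approach, not an established pandas idiom. It assumes that MgrID 0 stands in for NULL, that the hierarchy contains no cycles (otherwise the loop would never end), and that the names `build_hierarchy`, `employees`, `frontier` and `children` are purely illustrative.

```python
import pandas as pd


def build_hierarchy(df: pd.DataFrame) -> pd.DataFrame:
    """Expand an ID/MgrID table level by level (hypothetical helper)."""
    cols = ["ID", "Name", "MgrID"]

    # Anchor: employees without a manager (assumption: MgrID == 0 means NULL).
    anchor = df.loc[df["MgrID"] == 0, cols].sort_values("Name").copy()
    anchor["nLevel"] = 1
    # Mirrors ROW_NUMBER() OVER (ORDER BY Name) on the anchor rows.
    anchor["Family"] = range(1, len(anchor) + 1)

    levels = [anchor]
    frontier = anchor
    while True:
        # Recursive step: JOIN Employee E to the previous level ON E.MgrID = H.ID.
        parents = frontier[["ID", "nLevel", "Family"]].rename(columns={"ID": "MgrID"})
        children = df[cols].merge(parents, on="MgrID")
        if children.empty:
            break  # no new rows, so the "recursion" has finished
        children["nLevel"] += 1
        levels.append(children)
        frontier = children

    result = pd.concat(levels, ignore_index=True)
    # ORDER BY Family, nLevel; ID is added only to make ties deterministic.
    result = result.sort_values(["Family", "nLevel", "ID"]).reset_index(drop=True)
    return result[["nLevel", "ID", "Name"]]


employees = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Name": ["Keith", "Josh", "Robin", "Raja", "Tridip", "Arijit", "Amit", "Dev"],
    "MgrID": [0, 1, 1, 2, 0, 0, 5, 6],  # 0 = no manager
})
result = build_hierarchy(employees)
print(result)
```

On the sample data the rows come out in the ID order 6, 8, 1, 2, 3, 4, 5, 7, matching the desired output. Each pass of the loop handles one whole level of the hierarchy at once, so the number of merges equals the depth of the tree rather than the number of employees. The name indentation produced by SPACE() in the SQL is left out of this sketch.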