好的,我已经编写了扩展 Zend Framework DB 表、行和行集类的 PHP 类。无论如何,我一直在开发这个,因为我正在演讲PHP Tek-X http://tek.phparch.com/在几周内了解分层数据模型。
我不想将我的所有代码发布到 Stack Overflow,因为如果我这样做,它们就会隐式获得知识共享许可。update:我将我的代码提交给Zend Framework 额外孵化器 http://framework.zend.com/svn/framework/extras/incubator/我的演讲是使用 SQL 和 PHP 的分层数据模型 http://www.slideshare.net/billkarwin/models-for-hierarchical-data在幻灯片共享。
我将用伪代码描述解决方案。我使用动物分类学作为测试数据,从下载ITIS.gov http://www.itis.gov/。该表是longnames
:
CREATE TABLE `longnames` (
`tsn` int(11) NOT NULL,
`completename` varchar(164) NOT NULL,
PRIMARY KEY (`tsn`),
KEY `tsn` (`tsn`,`completename`)
)
我创建了一个封闭表对于分类层次结构中的路径:
CREATE TABLE `closure` (
`a` int(11) NOT NULL DEFAULT '0', -- ancestor
`d` int(11) NOT NULL DEFAULT '0', -- descendant
`l` tinyint(3) unsigned NOT NULL, -- levels between a and d
PRIMARY KEY (`a`,`d`),
CONSTRAINT `closure_ibfk_1` FOREIGN KEY (`a`) REFERENCES `longnames` (`tsn`),
CONSTRAINT `closure_ibfk_2` FOREIGN KEY (`d`) REFERENCES `longnames` (`tsn`)
)
给定一个节点的主键,您可以通过以下方式获取其所有后代:
SELECT d.*, p.a AS `_parent`
FROM longnames AS a
JOIN closure AS c ON (c.a = a.tsn)
JOIN longnames AS d ON (c.d = d.tsn)
LEFT OUTER JOIN closure AS p ON (p.d = d.tsn AND p.l = 1)
WHERE a.tsn = ? AND c.l <= ?
ORDER BY c.l;
加入到closure AS p
就是包含每个节点的父节点 id。
该查询很好地利用了索引:
+----+-------------+-------+--------+---------------+---------+---------+----------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+-----------------------------+
| 1 | SIMPLE | a | const | PRIMARY,tsn | PRIMARY | 4 | const | 1 | Using index; Using filesort |
| 1 | SIMPLE | c | ref | PRIMARY,d | PRIMARY | 4 | const | 5346 | Using where |
| 1 | SIMPLE | d | eq_ref | PRIMARY,tsn | PRIMARY | 4 | itis.c.d | 1 | |
| 1 | SIMPLE | p | ref | d | d | 4 | itis.c.d | 3 | |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+-----------------------------+
鉴于我有 490,032 行longnames
和 4,299,883 行closure
,它运行得相当好:
+--------------------+----------+
| Status | Duration |
+--------------------+----------+
| starting | 0.000257 |
| Opening tables | 0.000028 |
| System lock | 0.000009 |
| Table lock | 0.000013 |
| init | 0.000048 |
| optimizing | 0.000032 |
| statistics | 0.000142 |
| preparing | 0.000048 |
| executing | 0.000008 |
| Sorting result | 0.034102 |
| Sending data | 0.001300 |
| end | 0.000018 |
| query end | 0.000005 |
| freeing items | 0.012191 |
| logging slow query | 0.000008 |
| cleaning up | 0.000007 |
+--------------------+----------+
现在,我对上面 SQL 查询的结果进行后处理,根据层次结构将行排序为子集(伪代码):
while ($rowData = fetch()) {
$row = new RowObject($rowData);
$nodes[$row["tsn"]] = $row;
if (array_key_exists($row["_parent"], $nodes)) {
$nodes[$row["_parent"]]->addChildRow($row);
} else {
$top = $row;
}
}
return $top;
我还定义了行和行集的类。 Rowset 基本上是一个行数组。 Row 包含行数据的关联数组,还包含其子项的 Rowset。叶节点的子行集为空。
行和行集还定义了称为toArrayDeep()
它将数据内容递归地转储为普通数组。
然后我可以像这样一起使用整个系统:
// Get an instance of the taxonomy table data gateway
$tax = new Taxonomy();
// query tree starting at Rodentia (id 180130), to a depth of 2
$tree = $tax->fetchTree(180130, 2);
// dump out the array
var_export($tree->toArrayDeep());
输出如下:
array (
'tsn' => '180130',
'completename' => 'Rodentia',
'_parent' => '179925',
'_children' =>
array (
0 =>
array (
'tsn' => '584569',
'completename' => 'Hystricognatha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '552299',
'completename' => 'Hystricognathi',
'_parent' => '584569',
),
),
),
1 =>
array (
'tsn' => '180134',
'completename' => 'Sciuromorpha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '180210',
'completename' => 'Castoridae',
'_parent' => '180134',
),
1 =>
array (
'tsn' => '180135',
'completename' => 'Sciuridae',
'_parent' => '180134',
),
2 =>
array (
'tsn' => '180131',
'completename' => 'Aplodontiidae',
'_parent' => '180134',
),
),
),
2 =>
array (
'tsn' => '573166',
'completename' => 'Anomaluromorpha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '573168',
'completename' => 'Anomaluridae',
'_parent' => '573166',
),
1 =>
array (
'tsn' => '573169',
'completename' => 'Pedetidae',
'_parent' => '573166',
),
),
),
3 =>
array (
'tsn' => '180273',
'completename' => 'Myomorpha',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '180399',
'completename' => 'Dipodidae',
'_parent' => '180273',
),
1 =>
array (
'tsn' => '180360',
'completename' => 'Muridae',
'_parent' => '180273',
),
2 =>
array (
'tsn' => '180231',
'completename' => 'Heteromyidae',
'_parent' => '180273',
),
3 =>
array (
'tsn' => '180213',
'completename' => 'Geomyidae',
'_parent' => '180273',
),
4 =>
array (
'tsn' => '584940',
'completename' => 'Myoxidae',
'_parent' => '180273',
),
),
),
4 =>
array (
'tsn' => '573167',
'completename' => 'Sciuravida',
'_parent' => '180130',
'_children' =>
array (
0 =>
array (
'tsn' => '573170',
'completename' => 'Ctenodactylidae',
'_parent' => '573167',
),
),
),
),
)
回复您关于计算深度或每条路径的实际长度的评论。
假设您刚刚将一个新节点插入到保存实际节点的表中(longnames
在上面的示例中),新节点的 id 由以下命令返回LAST_INSERT_ID()
在 MySQL 中,否则你可以通过某种方式得到它。
INSERT INTO Closure (a, d, l)
SELECT a, LAST_INSERT_ID(), l+1 FROM Closure
WHERE d = 5 -- the intended parent of your new node
UNION ALL SELECT LAST_INSERT_ID(), LAST_INSERT_ID(), 0;