Given a binary MxN matrix and the ability to toggle columns, maximize row sameness?

2023-12-29

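If you have a binary matrix of 1s and 0s, and you are able to toggle columns (change all 1s in the column to 0s, and all 0s to 1s), how do you find the combination of column toggles that gives the maximum number of "pure" rows? "Pure" means the row is all 0s or all 1s.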

Ex:

1 0

1 1

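You would toggle either column to get 2 "pure" rows, which is the best you can do (toggling both columns isn't any better), so you return 2 (the max number of "pure" rows).

I can't seem to figure out an efficient way to do this. The only way I've got so far is a bunch of loops and brute force, checking for sameness by testing whether the sum of a row is 0 (all 0s) or N (the number of elements in a row).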
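For reference, the brute-force approach described above could look roughly like the sketch below. The function name `max_pure_rows_brute_force` and the list-of-lists input format are assumptions made for illustration, not the asker's actual code. It tries all 2^N toggle masks, so it is only practical for small N.

```python
from itertools import product


def max_pure_rows_brute_force(matrix):
    """Try every subset of column toggles and return the best pure-row count."""
    if not matrix:
        return 0
    n = len(matrix[0])
    best = 0
    # mask[c] == 1 means column c is toggled.
    for mask in product((0, 1), repeat=n):
        pure = 0
        for row in matrix:
            total = sum(bit ^ flip for bit, flip in zip(row, mask))
            # A row is pure when its sum is 0 (all 0s) or n (all 1s).
            if total == 0 or total == n:
                pure += 1
        best = max(best, pure)
    return best
```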

Update

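After the clarification from the OP, the max-pure-row problem is to find the maximum number of rows that become 00...0 or 11...1 after toggling. I have updated my solution accordingly.

Note that we have the following facts: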

1. If two rows r_i and r_j reduce to the same pure row after toggling, then we must have r_i = r_j to start with.
2. If r_i ≠ r_j and r_i overlaps r_j (i.e. they agree in at least one column), then they cannot both map to a pure row.
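The two facts can be checked exhaustively on small rows. The sketch below is only an illustration: the helper names `overlaps`, `is_pure` and `can_both_be_pure` are made up for this example, and a toggle set is represented as a 0/1 mask where 1 means "flip this column".

```python
from itertools import product


def overlaps(r_i, r_j):
    """True if the two rows agree in at least one column."""
    return any(a == b for a, b in zip(r_i, r_j))


def is_pure(row):
    """True if the row is all 0s or all 1s."""
    return all(bit == row[0] for bit in row)


def can_both_be_pure(r_i, r_j):
    """Try every toggle mask and report whether one makes both rows pure."""
    for mask in product((0, 1), repeat=len(r_i)):
        t_i = [b ^ f for b, f in zip(r_i, mask)]
        t_j = [b ^ f for b, f in zip(r_j, mask)]
        if is_pure(t_i) and is_pure(t_j):
            return True
    return False


print(can_both_be_pure([0, 1, 0], [0, 1, 0]))  # identical rows -> True
print(can_both_be_pure([0, 1, 0], [1, 0, 1]))  # no overlap -> True
print(can_both_be_pure([0, 1, 0], [0, 1, 1]))  # different, overlapping -> False
```

Two different rows that do not overlap are exact complements of each other, which is why one of them can go to 00...0 while the other goes to 11...1.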

The two facts above follow directly from the following observation:

The max number of rows that can be reduced to the same "pure" row is the same as the max number of identical rows

Proof

We claim that all rows in a solution of the max-pure problem that reduce to the same pure row must be identical in the matrix M.

Suppose we are given an m-by-n matrix M, and we have found a solution of the max-pure-row problem. Let r_i and r_j be two arbitrary rows that are reduced to the same pure row after toggling.

Observe that after all the necessary toggling operations on the columns (denoted by σ₁, σ₂, ..., σ_k), r_i and r_j are both the same "pure" row, i.e. one of the following holds:

σ₁(σ₂(...σ_k(r_i)...)) = σ₁(σ₂(...σ_k(r_j)...)) = 00...0

σ₁(σ₂(...σ_k(r_i)...)) = σ₁(σ₂(...σ_k(r_j)...)) = 11...1

So after applying all these toggling operations, r_i and r_j equal each other. If we undo the very last toggle (i.e. we toggle the same column of both rows once more), it is obvious that r_i and r_j still map to the same output, i.e. we have the following:

σ₂(σ₃(...σ_k(r_i)...)) = σ₂(σ₃(...σ_k(r_j)...))

If we continue undoing the toggling operations, we can conclude that r_i = r_j. In other words, if you pick any rows from a solution of the max-pure problem that reduce to the same pure row, these rows must have been identical in the beginning.
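The "undo" step in the proof relies on the fact that toggling a column twice restores it. A small sketch of that, with toggles written as a 0/1 mask applied by XOR (the name `apply_toggles` and the sample rows are assumptions for illustration only):

```python
def apply_toggles(row, mask):
    """Flip every column c of row where mask[c] == 1."""
    return tuple(bit ^ flip for bit, flip in zip(row, mask))


mask = (1, 0, 1)
row_a = (1, 0, 1)
row_b = (1, 0, 1)

after_a = apply_toggles(row_a, mask)
after_b = apply_toggles(row_b, mask)
# Both rows end up as the same pure row 00...0.
assert after_a == after_b == (0, 0, 0)

# Applying the same toggles again undoes them, so rows that are equal
# after toggling must have been equal before toggling.
assert apply_toggles(after_a, mask) == row_a
assert apply_toggles(after_b, mask) == row_b
```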

Idea

Given a row r_i, if it can be reduced to a pure row, say 00...0, then we know that another row r_j cannot be reduced to 11...1 if r_i overlaps r_j (from fact 2 above). We can only hope that another row r_k, which does not overlap r_i, reduces to 11...1.

Algorithm

Based on the previous idea, we can use the following simple algorithm to solve the max-pure-row problem.

We first scan the rows of matrix M and find all the unique rows of the matrix (denoted by s₁, s₂, ..., s_k). Let count(s_i) denote the number of times s_i appears in M. We then loop over all the pairs (s_i, s_j) to determine the max-pure-row number as below:

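int maxCount = 0;

for each unique row si:
    // si on its own: all copies of si can be made pure together
    maxCount = max(maxCount, count(si));

    for each unique row sj ≠ si:
        if (sj overlaps si)
            continue;   // si and sj cannot both become pure
        // sj does not overlap si: si -> 00...0 and sj -> 11...1
        maxCount = max(maxCount, count(si) + count(sj));

return maxCount;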
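A possible Python rendering of the pairwise algorithm is sketched below. The names `max_pure_rows` and `overlaps` and the list-of-lists input are assumptions for illustration. The sketch also counts a unique row on its own, so a matrix in which no two rows are non-overlapping still gets a non-zero answer.

```python
from collections import Counter


def overlaps(s_i, s_j):
    """True if the two rows agree in at least one column."""
    return any(a == b for a, b in zip(s_i, s_j))


def max_pure_rows(matrix):
    """Pairwise check over unique rows, O(n * m^2) in the worst case."""
    counts = Counter(tuple(row) for row in matrix)
    unique_rows = list(counts)
    max_count = 0
    for s_i in unique_rows:
        # All copies of s_i can always be made pure together.
        max_count = max(max_count, counts[s_i])
        for s_j in unique_rows:
            if s_j == s_i or overlaps(s_i, s_j):
                continue
            # s_j shares no column value with s_i, so both can be pure.
            max_count = max(max_count, counts[s_i] + counts[s_j])
    return max_count
```

For instance, `max_pure_rows([[0, 1, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1]])` pairs the two copies of `0 1 0` with `1 0 1` and returns 3.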

The inner for loop does O(n) work (an entry-wise check of whether two rows overlap), and the loops run over O(m²) pairs of rows in the worst case, so the running time of the algorithm is O(nm²).
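If the O(nm²) bound matters, one variation (not part of the answer above, just a sketch) is to look up each unique row's complement in a hash map instead of scanning all pairs. A different row that does not overlap s_i must be its exact complement, so a single lookup per unique row is enough, giving roughly O(nm) expected time. The name `max_pure_rows_fast` is made up for this example.

```python
from collections import Counter


def max_pure_rows_fast(matrix):
    """Count each unique row together with its exact complement, if present."""
    counts = Counter(tuple(row) for row in matrix)
    best = 0
    for row, count in counts.items():
        complement = tuple(1 - bit for bit in row)
        # Guard against zero-width rows, whose complement is the row itself.
        partner = counts.get(complement, 0) if complement != row else 0
        best = max(best, count + partner)
    return best
```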
