答案是假设这些条件,因为任何 NFA 都可以修改以满足这些要求。
For any kind of NFA, you can add a new initial state q0 that has an epsilon-transition to the original initial state, and also using an additional transition symbol called ∅ (they call it empty set symbol, assumed to be a symbol which does not match any symbol from the original NFA) from it to any other states, then use this new state as the new initial state. Note that this does not change the language accepted by the original NFA. This would make your NFA satisfies the first condition.
For any kind of NFA, you can add a new acceptance state qa that has an epsilon-transition from all acceptance state in the original NFA. Then mark this as the only acceptance state. Note that this does not change the language accepted by the original NFA. This would make your NFA satisfies the second condition.
By the above construction, by setting q0 != qa, it satisfies the third condition.
在您提供的链接中,第四个条件是通过一个称为 ∅(空集符号)的特殊转换符号来解释的,原始 NFA 中的实际字母表无法匹配该符号。因此,您可以使用这个新符号添加从每个状态到任何其他状态的转换。请注意,这不会更改原始 NFA 接受的语言。
现在 NFA 已被修改以满足四个要求,您可以应用那里的算法将 NFA 转换为正则表达式,它将接受与原始 NFA 相同的语言。
编辑以回答进一步的问题:
To answer your question in the comment, consider the NFA with two states, qA and qB. qA is the initial state as well as the only acceptance state. We have a transition from qA to itself with symbol 0,1. We also have transition from qA to qB with symbol 1. Lastly we have transition from qB to qA with symbol 0.
可视化:
0,1
| 1
->qA----->qB
^ |
|-------|
0
Step 2. When we normalize the NFA, just put the new init state (qinit) that points to qA, and put a new acceptance state (qacc) from qA.
Step 3. We want to remove qA. So qA is the qrip in the algorithm (in page 3). Now we need to consider every states that enters qA and every states that exits from qA. In this case, there are two states pointing to qA, that are qinit and qB. There are two states that are pointed to by qA, that are qB and qacc. By the algorithm, we replace the transitions qin->qrip->qout with a transition qin->qout, having the transition symbol Rdir+Rin(Rrip)*Rout, where:
- Rdir is the original transition from qin to qout
- Rin is the original transition from qin to qrip
- Rrip is the original loop at qrip
- Rout is the original transition from qrip to qout
So in this case we replace the transition qinit->qA->qB with qinit->qB with transition symbol (0+1)*1. Continuing this process, we will create in total 4 new transitions:
- qinit->qB: (0+1)*1
- qinit->qacc: (0+1)*
- qB->qB: 0(0+1)*1
- qB->qacc: 0(0+1)*
Then we can remove qA.
Step 4. We want to remove qB. Again, we identify the qin and qout. There is only one state coming to qB here, which is qinit, and there is only one state departing from qB, which is qacc. So we have:
- Rdir = (0+1)*
- Rin = (0+1)*1
- Rrip = 0(0+1)*1
- Rout = 0(0+1)*
So the new transition qinit->qacc will be:
Rdir+Rin(Rrip)*Rout
(0+1)* + (0+1)*1 (0(0+1)*1)* 0(0+1)*
And we can remove qB.
步骤 5. 由于原始 NFA 中的每个状态都已被删除,我们就完成了。所以最终的正则表达式如上所示。
请注意,最终的正则表达式可能不是最佳的(并且在大多数情况下它不会是最佳的),这是算法所期望的。一般来说,为 NFA(甚至 DFA)找到最短的正则表达式是非常困难的(尽管在这个例子中很容易看出第一个组件已经涵盖了所有可能的字符串)
为了完整起见,接受相同语言的最短正则表达式将是:
(0+1)*