I want to read a text file sentence by sentence. My problem is that the code below splits the text on periods only.
#!/usr/bin/perl
use strict;
use warnings;
my $file = "data.txt";
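open(my $fh, '<', $file) or die "Could not open file '$file': $!";
my @buffer;
$/ = '.';    # input record separator: a literal string, so only periods end a record
while ( my $sentence = <$fh> ) {
    # do_something
}
close $fh;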
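Is there any way to make $/ behave like a regular expression such as /[.?!]/, so that sentences are also split on question marks and exclamation marks, not just periods?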
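As perlvar notes, the value of $/ is a string, not a regex, so it cannot hold a pattern like /[.?!]/. One possible workaround, shown here only as a minimal sketch, is to slurp the whole text and split it with a regular expression. The helper name split_sentences and the sample text are hypothetical, and the in-memory filehandle stands in for a real file such as data.txt.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any library): naive sentence splitter.
sub split_sentences {
    my ($text) = @_;
    # Split on whitespace that follows '.', '?' or '!'.
    # The lookbehind keeps the punctuation attached to its sentence.
    return grep { length } split /(?<=[.?!])\s+/, $text;
}

# Assumed sample text; an in-memory filehandle stands in for data.txt.
my $data = "Is this one? Yes! It is.\n";
open(my $fh, '<', \$data) or die "Could not open in-memory file: $!";
my $text = do { local $/; <$fh> };    # undefined $/ reads everything at once
close $fh;

print "$_\n" for split_sentences($text);
```

This treats every '.', '?' or '!' followed by whitespace as a sentence boundary, so abbreviations are split incorrectly: "Hello, Mr. Smith." comes back as two pieces.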
This can be done more correctly with Lingua::Sentence (https://metacpan.org/pod/Lingua::Sentence):
use feature qw(say);
use strict;
use warnings;
use Lingua::Sentence;
my $fn = "data.txt";
open (my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
my $str = do {local $/; <$fh>};
close $fh;
for my $sentence (Lingua::Sentence->new("en")->split_array( $str)) {
say $sentence;
}
With data.txt:
'How often do you come here?', asked Mr. Smith.
This is a paragraph. It contains several sentences. "But why," you ask?
we get the following output:
'How often do you come here?', asked Mr. Smith.
This is a paragraph.
It contains several sentences.
"But why," you ask?