Here are a couple of functions that should help you do what you want:
// nl2p
// This function will convert newlines to HTML paragraphs
// without paying attention to HTML tags. Feed it a raw string and it will
// simply return that string sectioned into HTML paragraphs
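function nl2p($str) {
    $out='';
    // Split on newlines; trim() also strips the "\r" left over from Windows line endings
    foreach(explode("\n",$str) as $line) {
        $line=trim($line);
        // Skip blank lines so that no empty paragraphs are produced
        if(strlen($line)>0) {
            $out.='<p>'.$line.'</p>';
        }
    }
    return $out;
}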
// nl2p_html
// This function will add paragraph tags around textual content of an HTML file, leaving
// the HTML itself intact
// This function assumes that the HTML syntax is correct and that the '<' and '>' characters
// are not used in any of the values for any tag attributes. If these assumptions are not met,
// mass paragraph chaos may ensue. Be safe.
function nl2p_html($str) {
    // Initialize the output so that it is defined even when the input has no head section
    $out='';
    // If we find the end of an HTML header, assume that this is part of a standard HTML file. Cut off everything including the
    // end of the head and save it in our output string, then trim the head off of the input. This is mostly because we don't
    // want to surround anything like the HTML title tag or any style or script code in paragraph tags.
    $head_end=strpos($str,'</head>');
    if($head_end!==false) {
        $out=substr($str,0,$head_end+7);
        $str=substr($str,$head_end+7);
    }
    // First, we explode the input string based on wherever we find HTML tags, which start with '<'
    $arr=explode('<',$str);
    // Next, we loop through the array that is broken into HTML tags and look for textual content, or
    // anything after the >
    for($i=0;$i<count($arr);$i++) {
        if(strlen(trim($arr[$i]))>0) {
            if($i==0) {
                // The first element is whatever came before the first '<', so it is all text and has no tag
                $html='';
                $text=$arr[$i];
            } else {
                // Add the '<' back on since it became collateral damage in our explosion as well as the rest of the tag
                $tag_end=strpos($arr[$i],'>');
                $html='<'.substr($arr[$i],0,$tag_end+1);
                // Take the portion of the string after the end of the tag. Since this is after
                // the end of the HTML tag, this must be textual content.
                $text=substr($arr[$i],$tag_end+1);
            }
            // Explode the textual content by newline
            $sub_arr=explode("\n",$text);
            // Initialize the output string for this next loop
            $paragraph_text='';
            // Loop through this new array and add paragraph tags (<p>...</p>) around any element that isn't empty
            for($j=0;$j<count($sub_arr);$j++) {
                if(strlen(trim($sub_arr[$j]))>0)
                    $paragraph_text.='<p>'.trim($sub_arr[$j]).'</p>';
            }
            // Put the text back onto the end of the HTML tag and put it in our output string
            $out.=$html.$paragraph_text;
        }
    }
    // Throw it back into our program
    return $out;
}
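The first one, nl2p(), takes a string as input and splits it into an array wherever there is a newline ("\n") character. It then loops over each element, and whenever it finds one that is not empty, it wraps <p></p> tags around it and appends it to a string, which is returned at the end of the function.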
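To make the behaviour concrete, here is a minimal sketch of the same steps run on a small input. The name nl2p_sketch() and the sample string are made up for illustration, so that the block runs on its own without clashing with the nl2p() defined above.

```php
<?php
// Hypothetical stand-alone copy of the nl2p() logic, for illustration only.
function nl2p_sketch(string $str): string
{
    $out = '';
    foreach (explode("\n", $str) as $line) {
        // trim() removes surrounding spaces and any stray "\r"
        $line = trim($line);
        if ($line !== '') {
            $out .= '<p>' . $line . '</p>';
        }
    }
    return $out;
}

// Sample input with a blank line, padded text and a Windows line ending
$input = "First line\n\n   Second line  \r\nThird line\n";
echo nl2p_sketch($input), "\n";
// prints: <p>First line</p><p>Second line</p><p>Third line</p>
```

The blank line and the trailing newline produce no paragraphs, and the padding around the second line is stripped.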
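The second, nl2p_html(), is a more involved version of the former. Pass it the contents of an HTML file as a string and it will wrap <p> and </p> tags around any non-HTML text. It does this by exploding the string into an array where the delimiter is the < character, which is the start of any HTML tag. The code then iterates over each element, finds the end of the HTML tag, and puts everything that comes after it into a new string.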
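This new string is itself exploded into an array where the delimiter is a newline ("\n"). Looping over this new array, the code looks for non-empty elements. When it finds some data, it wraps it in paragraph tags and adds it to an output string. When this loop finishes, that string is appended back onto the HTML tag, and the two together are added to an output buffer string, which is returned once the function completes.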
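The two-level split described above can be traced on a small fragment. This is only a sketch under the same assumptions (well-formed HTML, no '<' or '>' inside attribute values); nl2p_html_sketch() is a hypothetical name, and it leaves out the head handling to keep the example short.

```php
<?php
// Hypothetical condensed version of the tag/text split, for illustration only.
function nl2p_html_sketch(string $str): string
{
    $out = '';
    foreach (explode('<', $str) as $i => $chunk) {
        if ($i === 0) {
            // Anything before the first '<' is plain text with no tag
            $tag = '';
            $text = $chunk;
        } else {
            $end = strpos($chunk, '>');
            if ($end === false) {
                // Stray '<' without a closing '>': keep it untouched
                $out .= '<' . $chunk;
                continue;
            }
            // Restore the '<' that explode() consumed
            $tag = '<' . substr($chunk, 0, $end + 1);
            $text = substr($chunk, $end + 1);
        }
        $out .= $tag;
        // Second split: one paragraph per non-empty line of text
        foreach (explode("\n", $text) as $line) {
            $line = trim($line);
            if ($line !== '') {
                $out .= '<p>' . $line . '</p>';
            }
        }
    }
    return $out;
}

echo nl2p_html_sketch("<div>Hello\nworld</div>\n<a href=\"#\">link</a>"), "\n";
// prints: <div><p>Hello</p><p>world</p></div><a href="#"><p>link</p></a>
```

Whitespace that sits only between tags, such as the newline after the closing div, is dropped because it contains no text to wrap.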
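tl;dr: nl2p() converts a string into HTML paragraphs without leaving any empty paragraphs, and nl2p_html() wraps paragraph tags around the content in the body of an HTML document.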
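The body-only behaviour comes from cutting the document at the closing head tag before any wrapping happens. A minimal sketch of that step, assuming a hypothetical helper named split_after_head() that is not part of the functions above, and using a case-insensitive stripos() match, which the original code does not do:

```php
<?php
// Hypothetical helper: returns [everything up to and including </head>, the rest].
function split_after_head(string $html): array
{
    $pos = stripos($html, '</head>');
    if ($pos === false) {
        // No head section: treat the whole input as body content
        return ['', $html];
    }
    $cut = $pos + strlen('</head>');
    return [substr($html, 0, $cut), substr($html, $cut)];
}

[$head, $rest] = split_after_head('<head><title>t</title></head><body>x</body>');
echo $head, "\n"; // prints: <head><title>t</title></head>
echo $rest, "\n"; // prints: <body>x</body>
```

Only the second part would then be passed through the paragraph wrapping, so the title, style and script content in the head stay as they are.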
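I tested this on a few small sample HTML files to make sure that spacing and the like wouldn't break the output. The code generated by nl2p_html() may also not be W3C-compliant, since it will wrap anchors and similar elements around paragraphs rather than the other way around.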
Hope this helps.