How to strip HTML tags with regular expressions



The code below strips HTML tags from a string using regular expressions. It can be copied and used as is.

Code:

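using System.Text.RegularExpressions;

// HtmlCleaner is a hypothetical wrapper class name; the original shows only the method.
public static class HtmlCleaner
{
    // Found by searching Google for "StripHTML".
    public static string StripHTML(string HTML)
    {
        // Patterns are applied in order; Regexs[i] is replaced by Replaces[i].
        string[] Regexs = {
            @"<script[^>]*?>.*?</script>",   // script blocks that sit on one line
            @"<(\/\s*)?!?((\w+:)?\w+)(\w+(\s*=?\s*(([""'])(\\[""'tbnr]|[^\7])*?\7|\w+)|.{0})|\s)*?(\/\s*)?>",   // tags
            @"([\r\n])[\s]+",                // a line break followed by whitespace
            @"&(quot|#34);", @"&(amp|#38);", @"&(lt|#60);",   // the trailing ';' is part of each entity
            @"&(gt|#62);", @"&(nbsp|#160);",
            @"&(iexcl|#161);",
            @"&(cent|#162);",
            @"&(pound|#163);",
            @"&(copy|#169);",
            @"&#(\d+);",                     // other numeric entities (assumed form; a bare (\d+) would delete every digit)
            @"-->", @"<!--.*\n"              // comment remnants
        };

        string[] Replaces = {
            "", "", "", "\"", "&",
            "<", ">", " ",
            "\xa1", // chr(161)
            "\xa2", // chr(162)
            "\xa3", // chr(163)
            "\xa9", // chr(169)
            "",
            "\r\n", ""
        };

        string s = HTML;
        for (int i = 0; i < Regexs.Length; i++)
        {
            s = new Regex(Regexs[i], RegexOptions.Multiline | RegexOptions.IgnoreCase).Replace(s, Replaces[i]);
        }

        // Strings are immutable, so the result of Replace must be assigned back.
        s = s.Replace("<", "");
        s = s.Replace(">", "");
        s = s.Replace("\r\n", "");
        return s;
    }
}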
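
The C# method above works by walking a list of patterns and replacing each match with a fixed string. As a rough sketch of the same idea in JavaScript, the patterns and replacements can be kept together as pairs. The names `RULES` and `stripHtml` are hypothetical, and the tag pattern is a deliberately simplified one rather than a port of the C# regex, so it will misfire on attribute values that contain `>`.

```javascript
// Hypothetical sketch: each rule is a [pattern, replacement] pair, applied in order.
const RULES = [
  [/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, ""], // script blocks, including multi-line ones
  [/<!--[\s\S]*?-->/g, ""],                       // comments
  [/<\/?[a-z][^>]*>/gi, ""],                      // remaining opening and closing tags (simplified)
  [/&(quot|#34);/gi, '"'],
  [/&(lt|#60);/gi, "<"],
  [/&(gt|#62);/gi, ">"],
  [/&(nbsp|#160);/gi, " "],
  [/&(amp|#38);/gi, "&"],                         // decoded last so "&amp;lt;" becomes "&lt;", not "<"
];

function stripHtml(html) {
  // Feed the output of each replacement into the next rule.
  return RULES.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    html
  );
}

console.log(stripHtml("<p>Hello <b>world</b></p>"));
```

Keeping each pattern next to its replacement avoids the two parallel arrays drifting out of step. For untrusted input, a real HTML parser is a safer choice than any regex list.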

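// Turn each <span>...</span> into a text input whose value is the span's text.
var s = "<span>a</span>.... <span>z</span>";

// $1 in the replacement string stands for the text captured by (.*?).
s = s.replace(/<span>(.*?)<\/span>/g, "<input type='text' value='$1' />");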
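
When the captured text goes into an attribute value, a quote inside the span would end the attribute early. One possible way around this, sketched here with hypothetical helper names `escapeAttr` and `spansToInputs`, is to pass a function to `replace` instead of a `$1` string so the captured text can be escaped first.

```javascript
// Hypothetical helper: escape characters that are unsafe inside an attribute value.
function escapeAttr(text) {
  return text
    .replace(/&/g, "&amp;")   // must come first so later entities are not re-escaped
    .replace(/'/g, "&#39;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function spansToInputs(html) {
  // The replacer function receives the whole match first, then each capture group.
  return html.replace(/<span>(.*?)<\/span>/g, (match, inner) =>
    "<input type='text' value='" + escapeAttr(inner) + "' />"
  );
}

console.log(spansToInputs("<span>a</span> <span>it's</span>"));
```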

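// Replace a long XHTML doctype declaration with the short HTML5 form.
var str = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" \"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">";

// Match the literal prefix "<!DOCTYPE html", then everything up to the closing ">".
str = str.replace(/^<!DOCTYPE html[^>]*>$/, "<!DOCTYPE html>");

alert(str);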
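
In practice the doctype is usually the first line of a longer document, so a pattern anchored to the end of the string with `$` would not match. A minimal sketch for that case, assuming the declaration contains no `>` before its end (the function name `normalizeDoctype` is hypothetical):

```javascript
function normalizeDoctype(html) {
  // Match a leading <!DOCTYPE html ...> declaration, case-insensitively,
  // up to its first ">", and leave the rest of the document untouched.
  // Any whitespace before the declaration is dropped as part of the match.
  return html.replace(/^\s*<!DOCTYPE\s+html\b[^>]*>/i, "<!DOCTYPE html>");
}

var page =
  "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" " +
  "\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">\n<html></html>";

console.log(normalizeDoctype(page));
```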