Filtering malicious scripts in PHP while allowing normal HTML tags

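The goal is to filter out harmful JavaScript while letting normal HTML tags through. After searching online for a long time, I found no good solution. XSS problems are common wherever a rich text editor is used.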

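There is a remove_xss function circulating online that works relatively well. In my own tests, however, it still has a hole under IE. It also filters out tags and attributes that we need, such as style and iframe.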

The remove_xss function found online

function remove_xss($val) {
   // remove all non-printable characters. LF(0a), CR(0d) and TAB(09) are allowed
   // this prevents some character re-spacing such as <java\0script>
   // note that you have to handle splits with \n, \r, and \t later since they *are* allowed in some inputs
   $val = preg_replace('/([\x00-\x08,\x0b-\x0c,\x0e-\x19])/', '', $val);
   // straight replacements, the user should never need these since they're normal characters
   // this prevents entity-encoded payloads such as <IMG SRC=&#x6A;avascript:alert('XSS')>
   $search = 'abcdefghijklmnopqrstuvwxyz';
   $search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
   $search .= '1234567890!@#$%^&*()';
   $search .= '~`";:?+/={}[]-_|\'\\';
   for ($i = 0; $i < strlen($search); $i++) {
      // ;? matches the ;, which is optional
      // 0{0,8} matches any padded zeros, which are optional (up to eight)
      // hex form, e.g. &#x0040 for @
      $val = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $val); // with or without a ;
      // decimal form, e.g. &#00064 for @
      $val = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $val); // with or without a ;
   }
   // now the only remaining whitespace attacks are \t, \n, and \r
   $ra1 = array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link', 'style', 'script', 'embed', 'object', 'iframe', 'frame', 'frameset', 'ilayer', 'layer', 'bgsound', 'title', 'base');
   $ra2 = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
   $ra = array_merge($ra1, $ra2);
   $found = true; // keep replacing as long as the previous round replaced something
   while ($found == true) {
      $val_before = $val;
      for ($i = 0; $i < sizeof($ra); $i++) {
         $pattern = '/';
         for ($j = 0; $j < strlen($ra[$i]); $j++) {
            if ($j > 0) {
               $pattern .= '(';
               $pattern .= '(&#[xX]0{0,8}([9ab]);)';
               $pattern .= '|';
               $pattern .= '|(&#0{0,8}([9|10|13]);)';
               $pattern .= ')*';
            }
            $pattern .= $ra[$i][$j];
         }
         $pattern .= '/i';
         $replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <> to nerf the tag
         $val = preg_replace($pattern, $replacement, $val); // filter out the hex tags
         if ($val_before == $val) {
            // no replacements were made, so exit the loop
            $found = false;
         }
      }
   }
   return $val;
}
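
One detail of the first preg_replace call in remove_xss is easy to miss: inside a character class, commas are literal members rather than separators, so the pattern also deletes every comma from the input. The following is a minimal sketch that shows the effect next to a variant without the commas. The variant and the variable names are illustrative assumptions, not part of the original function.

```php
<?php
// Character class exactly as it appears in remove_xss: the commas are
// literal members of the class, so they are stripped as well.
$original = '/([\x00-\x08,\x0b-\x0c,\x0e-\x19])/';
// Hypothetical variant: same ranges, no commas inside the class.
$without_commas = '/[\x00-\x08\x0b-\x0c\x0e-\x19]/';

// A style value with a comma, followed by a NUL byte.
$input = "font-family: Arial, sans-serif\x00";

$stripped_original = preg_replace($original, '', $input);
$stripped_variant  = preg_replace($without_commas, '', $input);

echo $stripped_original, "\n"; // font-family: Arial sans-serif
echo $stripped_variant, "\n";  // font-family: Arial, sans-serif
```

For rich text this matters because commas are common in ordinary prose and in CSS values such as font lists.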

Test code

$test='<p>测试测试<span style="color: #ff6600">带颜色字<span style="font-size: small">小字体</span></span><span style="font-size: small">字是小字体</span></p>
<p>测试<span style="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“javascript:”</p>
<img src="javas	cript:alert(/xss2/)" width=100>
<iframe src="http://www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
';
echo remove_xss($test);

Test result

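As you can see, not only does the alert dialog pop up, the normal HTML style attributes are mangled as well!
The code after filtering looks like this: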

<p>测试测试<span st<x>yle="color: #ff6600">带颜色字<span st<x>yle="font-size: small">小字体</span></span><span st<x>yle="font-size: small">字是小字体</span></p>
<p>测试<span st<x>yle="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“ja<x>vasc<x>ript:</p>
<img src="javas	cript:alert(/xss2/)" width=100>
<if<x>rame src="http://www.baidu.com" width="600px" height="330px" st<x>yle="margin: -33px 0 0 -23px; display=inline;" fr<x>ameborder="0" marginheight="0" marginwidth="0" scrolling="no"></if<x>rame>

Why does the dialog pop up? Because the character in the middle of javascript is not a space but a [tab]!

<img src="javas[tab]cript:alert(/xss2/)" width=100>
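
A keyword check that looks for the literal string javascript never sees it once a tab splits the word. One way to close that gap, sketched below, is to drop whitespace and control characters from the attribute value before comparing the scheme. The helper name looks_like_script_url is hypothetical, and the sketch assumes the value has already been entity-decoded.

```php
<?php
// Hypothetical helper: report whether a URL attribute value uses a
// script scheme once whitespace and control characters are ignored.
function looks_like_script_url(string $url): bool
{
    // Drop ASCII control characters and spaces (tab, CR and LF included).
    $compact = preg_replace('/[\x00-\x20]+/', '', $url);
    // Compare the scheme case-insensitively.
    return preg_match('/^(javascript|vbscript):/i', $compact) === 1;
}

var_dump(looks_like_script_url("javas\tcript:alert(/xss2/)")); // bool(true)
var_dump(looks_like_script_url("http://www.baidu.com"));       // bool(false)
```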

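Next we modify the function to let tags such as style and iframe through, so that HTML styling displays correctly, and to filter out [tab] as well. Then we test again, this time with more XSS cases: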

$test='<p>测试测试<span style="color: #ff6600">带颜色字<span style="font-size: small">小字体</span></span><span style="font-size: small">字是小字体</span></p>
<p>测试<span style="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“javascript:”</p>
<img src="javascrip&#116&#58alert(/xss1/)" width=100>
<img src="javas	cript:alert(/xss2/)" width=100>
<img src="#" onerror=alert(/xss3/)>
<img src="#" style="Xss:expression(alert(/xss4/))">
<img src="#"/**/onerror=alert(/xss5/) width=100>
<img src="#" onerror=alert(/xss6/)>
<img src="#" style="Xss:expression(alert(/xss7/));">
<img src="#"/**/onerror=alert(/xss8/) width=100>
<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">
<iframe src="http://www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
';
echo no_xss($test);
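
The first payload in the test string hides the characters t and : behind numeric entities without trailing semicolons. The filter handles this with one preg_replace per character. The same idea can be sketched as a single callback, shown here under the assumption that only printable ASCII needs decoding. The function name decode_ascii_entities is hypothetical.

```php
<?php
// Hypothetical helper: decode decimal and hex numeric character
// references for printable ASCII, with or without the trailing ";".
function decode_ascii_entities(string $s): string
{
    return preg_replace_callback(
        '/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?/',
        function (array $m): string {
            // Group 1 holds hex digits, group 2 holds decimal digits.
            $code = $m[1] !== '' ? hexdec($m[1]) : (int) $m[2];
            // Leave anything outside printable ASCII untouched.
            return ($code >= 0x20 && $code <= 0x7e) ? chr($code) : $m[0];
        },
        $s
    );
}

echo decode_ascii_entities('javascrip&#116&#58alert(/xss1/)'), "\n";
// javascript:alert(/xss1/)
```

The digit runs are matched greedily, so a hex reference without a semicolon that is directly followed by a letter from a to f is ambiguous. Treat this as an illustration of the decoding step rather than a complete decoder.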

Test result:

[Image: xss]

This time the HTML tags get through, but a new problem appears: code like this can execute again!

<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">

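How can this be filtered? I do not know of a clean way. The only option is the brute-force method: replace every case one by one.
The complete cross-site scripting filter function no_xss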

function no_xss($str){
	// turn literal tabs into spaces so a tab cannot hide inside a keyword
	$str = str_replace("\t",' ',$str);
	// when a CSS comment opens right after a prefix of "expression" that
	// follows a colon, drop the ':' and the '/*' so the keyword cannot form
	$str = str_replace(':e/*','e',$str);
	$str = str_replace(':ex/*','ex',$str);
	$str = str_replace(':exp/*','exp',$str);
	$str = str_replace(':expr/*','expr',$str);
	$str = str_replace(':expre/*','expre',$str);
	$str = str_replace(':expres/*','expres',$str);
	$str = str_replace(':express/*','express',$str);
	$str = str_replace(':expressi/*','expressi',$str);
	$str = str_replace(':expressio/*','expressio',$str);
	$val=$str;
	$val = preg_replace('/([\x00-\x08,\x0b-\x0c,\x0e-\x19])/', '', $val);
	// straight replacements, the user should never need these since they're normal characters
	// this prevents entity-encoded payloads such as <IMG SRC=&#x6A;avascript:alert('XSS')>
	$search = 'abcdefghijklmnopqrstuvwxyz';
	$search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
	$search .= '1234567890!@#$%^&*()';
	$search .= '~`";:?+/={}[]-_|\'\\';
	for ($i = 0; $i < strlen($search); $i++) {
		// ;? matches the ;, which is optional
		// 0{0,8} matches any padded zeros, which are optional (up to eight)
		// hex form, e.g. &#x0040 for @
		$val = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $val); // with or without a ;
		// decimal form, e.g. &#00064 for @
		$val = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $val); // with or without a ;
	}
	// now the only remaining whitespace attacks are \t, \n, and \r
	$ra1 = Array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link', 'frameset', 'ilayer', 'bgsound', 'title', 'base');
	$ra2 = Array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
	$ra = array_merge($ra1, $ra2);
	$found = true; // keep replacing as long as the previous round replaced something
	while ($found == true) {
		$val_before = $val;
		for ($i = 0; $i < sizeof($ra); $i++) {
			$pattern = '/';
			for ($j = 0; $j < strlen($ra[$i]); $j++) {
				if ($j > 0) {
					$pattern .= '(';
					$pattern .= '(&#[xX]0{0,8}([9ab]);)';
					$pattern .= '|';
					$pattern .= '|(&#0{0,8}([9|10|13]);)';
					$pattern .= ')*';
				}
				$pattern .= $ra[$i][$j];
			}
			$pattern .= '/i';
			$replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <> to nerf the tag
			$val = preg_replace($pattern, $replacement, $val); // filter out the hex tags
			if ($val_before == $val) {
				// no replacements were made, so exit the loop
				$found = false;
			}
		}
	}
	return $val;  
}
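
The chain of str_replace calls in no_xss only catches a comment that directly follows a prefix of expression after a colon. A more general check, sketched here under the assumption that CSS comments can be ignored when inspecting a style value, removes every /* ... */ first and then looks for the keyword. The function name style_has_expression is hypothetical and is not part of no_xss.

```php
<?php
// Hypothetical helper: detect an IE "expression(" call in a style value
// even when CSS comments are used to split the keyword.
function style_has_expression(string $style): bool
{
    // Remove every CSS comment; an unterminated comment runs to the end.
    $clean = preg_replace('#/\*.*?(?:\*/|$)#s', '', $style);
    // Match the keyword case-insensitively, allowing whitespace before "(".
    return preg_match('/expression\s*\(/i', $clean) === 1;
}

var_dump(style_has_expression('Xss:ex/**/pression(alert(/xss9/));')); // bool(true)
var_dump(style_has_expression('color: #ff6600'));                     // bool(false)
```

A caller could then drop or neutralize the whole style attribute when the check returns true, instead of enumerating every position where a comment might be inserted.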

Test again:

[Image: xss test]

Good, no more pop-up.
The code after filtering:

<p>测试测试<span style="color: #ff6600">带颜色字<span style="font-size: small">小字体</span></span><span style="font-size: small">字是小字体</span></p>
<p>测试<span style="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“ja<x>vascript:</p>
<img src="ja<x>vascript:alert(/xss1/)" width=100>
<img src="javas cript:alert(/xss2/)" width=100>
<img src="#" on<x>error=alert(/xss3/)>
<img src="#" style="Xss:ex<x>pression(alert(/xss4/))">
<img src="#"/**/on<x>error=alert(/xss5/) width=100>
<img src="#" on<x>error=alert(/xss6/)>
<img src="#" style="Xss:ex<x>pression(alert(/xss7/));">
<img src="#"/**/on<x>error=alert(/xss8/) width=100>
<img src="#" style="Xssex*/pression(alert(/xss9/));">
<iframe src="http://www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>