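Filtering malicious scripts in PHP while allowing normal HTML tags
The goal is to filter out harmful JavaScript while letting normal HTML tags through. I searched online for ages and found no good solution. Wherever a site has a rich text editor, XSS problems are almost always present.
There is a remove_xss function circulating online that is relatively usable. In my tests, though, it still has holes under IE, and it also filters out tags and attributes we actually need, such as style and iframe.
The remove_xss function from the web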
function remove_xss($val) {
    // Remove all non-printable characters. TAB (0x09), LF (0x0a) and CR (0x0d) are allowed.
    // This prevents some character re-spacing such as <java\0script>.
    // Note that you have to handle splits with \n, \r, and \t later since they *are* allowed in some inputs.
    // Caution: the commas inside the character class are literal, so this call also strips commas.
    $val = preg_replace('/([\x00-\x08,\x0b-\x0c,\x0e-\x19])/', '', $val);

    // Straight replacements; the user should never need these since they're normal characters.
    // This prevents payloads spelled with character entities, which decode to
    // something like <IMG SRC=@avascript:alert('XSS')>.
    $search = 'abcdefghijklmnopqrstuvwxyz';
    $search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $search .= '1234567890!@#$%^&*()';
    $search .= '~`";:?+/={}[]-_|\'\\';
    for ($i = 0; $i < strlen($search); $i++) {
        // ;? matches the ;, which is optional
        // 0{0,8} matches any padding zeros, which are optional (up to eight)

        // &#x0040 -> @ : search for the hex values
        $val = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $val);
        // &#00064 -> @ : search for the decimal values
        $val = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $val);
    }

    // Now the only remaining whitespace attacks are \t, \n, and \r.
    $ra1 = array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link',
        'style', 'script', 'embed', 'object', 'iframe', 'frame', 'frameset', 'ilayer', 'layer',
        'bgsound', 'title', 'base');
    $ra2 = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate',
        'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste',
        'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange',
        'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut',
        'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate',
        'ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop',
        'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin',
        'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload',
        'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout',
        'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste',
        'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend',
        'onresizestart', 'onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll',
        'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
    $ra = array_merge($ra1, $ra2);

    $found = true; // keep replacing as long as the previous round replaced something
    while ($found == true) {
        $val_before = $val;
        for ($i = 0; $i < sizeof($ra); $i++) {
            // Build a pattern that matches the keyword even when entity-encoded
            // tab/LF/CR characters are inserted between its letters.
            $pattern = '/';
            for ($j = 0; $j < strlen($ra[$i]); $j++) {
                if ($j > 0) {
                    $pattern .= '(';
                    $pattern .= '(&#[xX]0{0,8}([9ab]);)';
                    $pattern .= '|';
                    $pattern .= '|(&#0{0,8}([9|10|13]);)';
                    $pattern .= ')*';
                }
                $pattern .= $ra[$i][$j];
            }
            $pattern .= '/i';
            $replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <x> to nerf the tag
            $val = preg_replace($pattern, $replacement, $val); // filter out the hex tags
            if ($val_before == $val) {
                // no replacements were made, so exit the loop
                $found = false;
            }
        }
    }
    return $val;
}
Test code
// The whitespace inside "javas cript" below is a tab character, written explicitly as "\t".
$test = '<p>测试测试<span style="color: #ff6600">带颜色字<span style="font-size: small">小字体</span></span><span style="font-size: small">字是小字体</span></p>
<p>测试<span style="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“javascript:”</p>
<img src="javas' . "\t" . 'cript:alert(/xss2/)" width=100>
<iframe src="//www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
';
echo remove_xss($test);
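Test result
See that? Not only did the alert dialog pop up, the normal HTML style attributes were filtered out too!
After filtering, the code looks like this: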
<p>测试测试<span st<x>yle="color: #ff6600">带颜色字<span st<x>yle="font-size: small">小字体</span></span><span st<x>yle="font-size: small">字是小字体</span></p>
<p>测试<span st<x>yle="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“ja<x>vasc<x>ript:”</p>
<img src="javas cript:alert(/xss2/)" width=100>
<if<x>rame src="//www.baidu.com" width="600px" height="330px" st<x>yle="margin: -33px 0 0 -23px; display=inline;" fr<x>ameborder="0" marginheight="0" marginwidth="0" scrolling="no"></if<x>rame>
Why did the dialog pop up? Because the character in the middle of "javascript" is not a space but a [tab]!
<img src="javas[tab key]cript:alert(/xss2/)" width=100>
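To make the failure concrete, here is a minimal sketch that rebuilds the per-keyword pattern the same way remove_xss does and runs it against a plain and a tab-split payload. The helper name keyword_pattern and the sample strings are hypothetical, introduced only for illustration. The pattern tolerates entity-encoded tab/LF/CR between letters, but not a literal tab.

```php
<?php
// Hypothetical helper: mirrors how remove_xss assembles its keyword pattern.
// Between letters it accepts only entity-encoded whitespace, never a raw tab.
function keyword_pattern(string $word): string
{
    $gap = '((&#[xX]0{0,8}([9ab]);)||(&#0{0,8}([9|10|13]);))*';
    return '/' . implode($gap, str_split($word)) . '/i';
}

$pattern = keyword_pattern('javascript');

$plain  = '<img src="javascript:alert(1)">';
$tabbed = "<img src=\"javas\tcript:alert(1)\">";

var_dump(preg_match($pattern, $plain));   // int(1): the keyword is caught
var_dump(preg_match($pattern, $tabbed));  // int(0): the literal tab hides it

// Replacing the raw tab with a space before filtering leaves "javas cript:",
// which no longer contains the tab that the browser would have ignored.
$normalized = str_replace("\t", ' ', $tabbed);
var_dump(strpos($normalized, "\t"));      // bool(false)
```

The same check can be repeated for any keyword in the filter list to see which raw whitespace splits slip past it.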
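Next we modify the function a little: let style, iframe and similar tags through so that HTML styling displays correctly, and filter out [tab] as well. Then we test again, this time with more XSS cases: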
// The whitespace inside "javas cript" (xss2) is a tab character, written explicitly as "\t".
$test = '<p>测试测试<span style="color: #ff6600">带颜色字<span style="font-size: small">小字体</span></span><span style="font-size: small">字是小字体</span></p>
<p>测试<span style="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“javascript:”</p>
<img src="javascript:alert(/xss1/)" width=100>
<img src="javas' . "\t" . 'cript:alert(/xss2/)" width=100>
<img src="#" onerror=alert(/xss3/)>
<img src="#" style="Xss:expression(alert(/xss4/))">
<img src="#"/**/onerror=alert(/xss5/) width=100>
<img src="#" onerror=alert(/xss6/)>
<img src="#" style="Xss:expression(alert(/xss7/));">
<img src="#"/**/onerror=alert(/xss8/) width=100>
<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">
<iframe src="//www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
';
echo no_xss($test);
Test result:
This time the HTML tags get through, but a new problem appears: code like this can now execute!
<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">
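The sketch below illustrates why a keyword search misses this payload. It assumes that the browser's CSS parser ignores the /**/ comment, which is what the test result suggests. The comment-stripping regex is only an illustration of that assumption and is not part of the functions in this post.

```php
<?php
// The style value from the failing test case.
$css = 'Xss:ex/**/pression(alert(/xss9/));';

// Remove /* ... */ comments, roughly the way a CSS parser would skip them.
// Illustrative only: this regex is an assumption, not the author's method.
$withoutComments = preg_replace('#/\*.*?\*/#s', '', $css);

var_dump(stripos($css, 'expression'));             // bool(false): the filter sees nothing
var_dump(stripos($withoutComments, 'expression')); // int(4): the keyword reappears
echo $withoutComments, "\n";                       // Xss:expression(alert(/xss9/));
```

In other words, the filter and the browser disagree about where the keyword is, and the comment is what hides it from the filter.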
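How do you filter this? I don't know either. The only option is the brute-force one: replace every possible case.
The complete cross-site scripting filter function, no_xss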
function no_xss($str) {
    // Turn literal tabs into spaces so a tab can no longer split a keyword.
    $str = str_replace("\t", ' ', $str);

    // Brute force: collapse every prefix of "expression" that is followed by a CSS comment opener.
    $str = str_replace(':e/*', 'e', $str);
    $str = str_replace(':ex/*', 'ex', $str);
    $str = str_replace(':exp/*', 'exp', $str);
    $str = str_replace(':expr/*', 'expr', $str);
    $str = str_replace(':expre/*', 'expre', $str);
    $str = str_replace(':expres/*', 'expres', $str);
    $str = str_replace(':express/*', 'express', $str);
    $str = str_replace(':expressi/*', 'expressi', $str);
    $str = str_replace(':expressio/*', 'expressio', $str);
    $val = $str;

    // Remove non-printable characters. The commas of the original character class
    // are dropped here: inside [...] they are literal and would strip commas from the text.
    $val = preg_replace('/([\x00-\x08\x0b-\x0c\x0e-\x19])/', '', $val);

    // Straight replacements; the user should never need these since they're normal characters.
    // This prevents payloads spelled with character entities, which decode to
    // something like <IMG SRC=@avascript:alert('XSS')>.
    $search = 'abcdefghijklmnopqrstuvwxyz';
    $search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $search .= '1234567890!@#$%^&*()';
    $search .= '~`";:?+/={}[]-_|\'\\';
    for ($i = 0; $i < strlen($search); $i++) {
        // ;? matches the ;, which is optional
        // 0{0,8} matches any padding zeros, which are optional (up to eight)

        // &#x0040 -> @ : search for the hex values
        $val = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $val);
        // &#00064 -> @ : search for the decimal values
        $val = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $val);
    }

    // Now the only remaining whitespace attacks are \n and \r.
    // style, script, embed, object, iframe, frame and layer are no longer in this list.
    $ra1 = array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link',
        'frameset', 'ilayer', 'bgsound', 'title', 'base');
    $ra2 = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate',
        'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste',
        'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange',
        'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut',
        'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate',
        'ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop',
        'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin',
        'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload',
        'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout',
        'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste',
        'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend',
        'onresizestart', 'onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll',
        'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
    $ra = array_merge($ra1, $ra2);

    $found = true; // keep replacing as long as the previous pass replaced something
    while ($found) {
        $val_before = $val;
        for ($i = 0; $i < count($ra); $i++) {
            // Build a pattern that matches the keyword even when entity-encoded
            // tab/LF/CR characters are inserted between its letters.
            $pattern = '/';
            for ($j = 0; $j < strlen($ra[$i]); $j++) {
                if ($j > 0) {
                    $pattern .= '(';
                    $pattern .= '(&#[xX]0{0,8}([9ab]);)';
                    $pattern .= '|';
                    $pattern .= '|(&#0{0,8}([9|10|13]);)';
                    $pattern .= ')*';
                }
                $pattern .= $ra[$i][$j];
            }
            $pattern .= '/i';
            $replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <x> to nerf the tag
            $val = preg_replace($pattern, $replacement, $val); // filter out the hex tags
        }
        // Compare once per full pass over all keywords, not after each keyword.
        if ($val_before == $val) {
            // no replacements were made, so exit the loop
            $found = false;
        }
    }
    return $val;
}
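The nine hand-written str_replace calls at the top of no_xss follow one rule, so they can also be generated in a loop. This is a minimal sketch of that idea; the helper name strip_split_expression is hypothetical, and it is meant to behave the same as the nine calls, nothing more.

```php
<?php
// Hypothetical helper: same effect as the nine str_replace() calls in no_xss.
function strip_split_expression(string $str): string
{
    $keyword = 'expression';
    // Prefixes 'e', 'ex', ..., 'expressio' (lengths 1 through 9), in the same order.
    for ($len = 1; $len < strlen($keyword); $len++) {
        $prefix = substr($keyword, 0, $len);
        $str = str_replace(':' . $prefix . '/*', $prefix, $str);
    }
    return $str;
}

echo strip_split_expression('<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">');
// prints: <img src="#" style="Xssex*/pression(alert(/xss9/));">
```

Like the hand-written version, this only catches a comment that directly follows a colon plus a prefix of the keyword. Input such as "Xss: ex/**/pression" (with a space after the colon) does not contain any of the searched strings and passes through unchanged.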
Test once more:
OK, no more popups.
The filtered code:
<p>测试测试<span style="color: #ff6600">带颜色字<span style="font-size: small">小字体</span></span><span style="font-size: small">字是小字体</span></p>
<p>测试<span style="background-color: #99cc00">背景色</span>测试</p>
<p>正常的字符:“ja<x>vascript:”</p>
<img src="ja<x>vascript:alert(/xss1/)" width=100>
<img src="javas cript:alert(/xss2/)" width=100>
<img src="#" on<x>error=alert(/xss3/)>
<img src="#" style="Xss:ex<x>pression(alert(/xss4/))">
<img src="#"/**/on<x>error=alert(/xss5/) width=100>
<img src="#" on<x>error=alert(/xss6/)>
<img src="#" style="Xss:ex<x>pression(alert(/xss7/));">
<img src="#"/**/on<x>error=alert(/xss8/) width=100>
<img src="#" style="Xssex*/pression(alert(/xss9/));">
<iframe src="//www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>