Filtering malicious scripts in PHP while allowing normal HTML tags

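The goal is to filter out harmful JS code while letting normal HTML tags through. I searched online for a long time and found no good solution. Wherever a rich-text editor is used on the web, XSS problems are widespread.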

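There is a remove_xss function circulating online that is relatively usable. In actual testing, though, it still has holes under IE. It also filters out tags and attributes that we need, such as style and iframe.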

The remove_xss function from the web

function remove_xss($val) {
   // remove all non-printable characters. TAB (0x09), LF (0x0a), and CR (0x0d) are allowed
   // this prevents some character re-spacing such as <java\0script>
   // note that you have to handle splits with \n, \r, and \t later since they *are* allowed in some inputs
   // (no commas inside the character class: a comma there is a literal and would strip commas from the input)
   $val = preg_replace('/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $val);
   // straight replacements, the user should never need these since they're normal characters
   // this undoes entity encoding such as <IMG SRC=&#x6A;avascript:alert('XSS')>
   $search = 'abcdefghijklmnopqrstuvwxyz';
   $search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
   $search .= '1234567890!@#$%^&*()';
   $search .= '~`";:?+/={}[]-_|\'\\';
   for ($i = 0; $i < strlen($search); $i++) {
      // ;? matches the ;, which is optional
      // 0{0,8} matches any padded zeros, which are optional and go up to 8 chars
      // hex form, e.g. &#x0040; for @
      $val = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $val); // with or without a ;
      // decimal form, e.g. &#00064; for @
      $val = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $val); // with or without a ;
   }
   // now the only remaining whitespace attacks are \t, \n, and \r
   $ra1 = array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link', 'style', 'script', 'embed', 'object', 'iframe', 'frame', 'frameset', 'ilayer', 'layer', 'bgsound', 'title', 'base');
   $ra2 = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
   $ra = array_merge($ra1, $ra2);
   $found = true; // keep replacing as long as the previous round replaced something
   while ($found == true) {
      $val_before = $val;
      for ($i = 0; $i < sizeof($ra); $i++) {
         $pattern = '/';
         for ($j = 0; $j < strlen($ra[$i]); $j++) {
            if ($j > 0) {
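               // allow entity-encoded TAB, LF, or CR between the letters of a keyword
               $pattern .= '(';
               $pattern .= '(&#[xX]0{0,8}[9ad];?)';  // hex: &#x9; &#xa; &#xd;
               $pattern .= '|(&#0{0,8}(9|10|13);?)'; // decimal: &#9; &#10; &#13;
               $pattern .= ')*';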
            }
            $pattern .= $ra[$i][$j];
         }
         $pattern .= '/i';
         $replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <> to nerf the tag
         $val = preg_replace($pattern, $replacement, $val); // filter out the hex tags
      }
      if ($val_before == $val) {
         // a full pass over every keyword changed nothing, so exit the loop
         $found = false;
      }
   }
   return $val;
}
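
To make the keyword step easier to follow, here is a minimal sketch of how the pattern for a single keyword can be assembled and applied. The helpers `build_keyword_pattern` and `nerf_keyword` are hypothetical names introduced for illustration, and the gap pattern assumes the intent is to tolerate entity-encoded TAB, LF, or CR between letters.

```php
<?php
// Hypothetical helper: build the regex used for one keyword.
// Between every pair of letters it tolerates entity-encoded TAB, LF, or CR
// (assumed intent of the original pattern).
function build_keyword_pattern(string $keyword): string
{
    $gap = '(?:&#[xX]0{0,8}[9ad];?|&#0{0,8}(?:9|10|13);?)*';
    $letters = [];
    foreach (str_split($keyword) as $c) {
        $letters[] = preg_quote($c, '/');
    }
    return '/' . implode($gap, $letters) . '/i';
}

// Hypothetical helper: insert "<x>" after the first two letters of each match.
function nerf_keyword(string $html, string $keyword): string
{
    $replacement = substr($keyword, 0, 2) . '<x>' . substr($keyword, 2);
    return preg_replace(build_keyword_pattern($keyword), $replacement, $html);
}

echo nerf_keyword('<img src="#" onerror=alert(1)>', 'onerror'), "\n";
// prints: <img src="#" on<x>error=alert(1)>
echo nerf_keyword('<a href="java&#10;script:alert(1)">x</a>', 'javascript'), "\n";
// prints: <a href="ja<x>vascript:alert(1)">x</a>
```

Note that a literal tab between the letters is not covered by this pattern, which is exactly the gap the test below runs into.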

Test code

$test='<p>Test test<span style="color: #ff6600">colored text<span style="font-size: small">small font</span></span><span style="font-size: small">this text is in a small font</span></p>
<p>Test<span style="background-color: #99cc00">background color</span>test</p>
<p>Normal characters: “javascript:”</p>
<img src="javas	cript:alert(/xss2/)" width=100>
<iframe src="http://www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
';
echo remove_xss($test);

Test result

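As you can see, not only did the alert dialog pop up, but the normal HTML style attributes were filtered out as well!
The code after filtering looks like this: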

<p>Test test<span st<x>yle="color: #ff6600">colored text<span st<x>yle="font-size: small">small font</span></span><span st<x>yle="font-size: small">this text is in a small font</span></p>
<p>Test<span st<x>yle="background-color: #99cc00">background color</span>test</p>
<p>Normal characters: “ja<x>vasc<x>ript:</p>
<img src="javas	cript:alert(/xss2/)" width=100>
<if<x>rame src="http://www.baidu.com" width="600px" height="330px" st<x>yle="margin: -33px 0 0 -23px; display=inline;" fr<x>ameborder="0" marginheight="0" marginwidth="0" scrolling="no"></if<x>rame>

Why did the dialog pop up? Because the character in the middle of "javascript" is not a space but a [tab]!

<img src="javas[tab]cript:alert(/xss2/)" width=100>
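
The bypass can be reproduced in isolation. The following is a minimal sketch with a simplified keyword check, not the filter itself; `contains_js_scheme` is a hypothetical helper. It shows that a plain case-insensitive search misses the payload until TAB, LF, and CR are stripped first.

```php
<?php
// Hypothetical helper: report whether a string contains "javascript:"
// once TAB, LF, and CR have been removed.
function contains_js_scheme(string $value): bool
{
    $normalized = str_replace(["\t", "\n", "\r"], '', $value);
    return stripos($normalized, 'javascript:') !== false;
}

$payload = "<img src=\"javas\tcript:alert(/xss2/)\" width=100>";

// A naive search on the raw string misses the keyword because of the tab.
var_dump(stripos($payload, 'javascript:') !== false); // bool(false)

// After stripping the whitespace the keyword is visible again.
var_dump(contains_js_scheme($payload));               // bool(true)
```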

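Next we make some changes to the function: tags and attributes such as style and iframe are all let through so that HTML styling displays normally, and [tab] is filtered out as well. Then we test again, this time with more XSS cases: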

$test='<p>Test test<span style="color: #ff6600">colored text<span style="font-size: small">small font</span></span><span style="font-size: small">this text is in a small font</span></p>
<p>Test<span style="background-color: #99cc00">background color</span>test</p>
<p>Normal characters: “javascript:”</p>
<img src="javascrip&#116&#58alert(/xss1/)" width=100>
<img src="javas	cript:alert(/xss2/)" width=100>
<img src="#" onerror=alert(/xss3/)>
<img src="#" style="Xss:expression(alert(/xss4/))">
<img src="#"/**/onerror=alert(/xss5/) width=100>
<img src="#" onerror=alert(/xss6/)>
<img src="#" style="Xss:expression(alert(/xss7/));">
<img src="#"/**/onerror=alert(/xss8/) width=100>
<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">
<iframe src="http://www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
';
echo no_xss($test);

Test result:

[Screenshot: xss]

This time the HTML tags are let through, but a new problem appears: code like this can execute again!

<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">

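How can this be filtered? I do not know either. The only option is a clumsy one: replace every case by hand.
The complete cross-site scripting filter function, no_xss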

function no_xss($str){
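	// turn literal tabs into spaces so that "javas[tab]cript" no longer reads as one keyword
	$str = str_replace("\t", ' ', $str);
	// clumsy workaround: when a CSS comment opens right after any prefix of ":expression",
	// drop the colon and the "/*" so the split keyword no longer follows a property colon
	$str = str_replace(':e/*','e',$str);
	$str = str_replace(':ex/*','ex',$str);
	$str = str_replace(':exp/*','exp',$str);
	$str = str_replace(':expr/*','expr',$str);
	$str = str_replace(':expre/*','expre',$str);
	$str = str_replace(':expres/*','expres',$str);
	$str = str_replace(':express/*','express',$str);
	$str = str_replace(':expressi/*','expressi',$str);
	$str = str_replace(':expressio/*','expressio',$str);
	$val=$str;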
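	// remove non-printable characters; TAB, LF, and CR are kept
	// (no commas inside the character class: a comma there is a literal and would strip commas from the input)
	$val = preg_replace('/[\x00-\x08\x0b\x0c\x0e-\x1f]/', '', $val);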
	// straight replacements, the user should never need these since they're normal characters
	// this undoes entity encoding such as <IMG SRC=&#x6A;avascript:alert('XSS')>
	$search = 'abcdefghijklmnopqrstuvwxyz';
	$search .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
	$search .= '1234567890!@#$%^&*()';
	$search .= '~`";:?+/={}[]-_|\'\\';
	for ($i = 0; $i < strlen($search); $i++) {
		// ;? matches the ;, which is optional
		// 0{0,8} matches any padded zeros, which are optional and go up to 8 chars
		// hex form, e.g. &#x0040; for @
		$val = preg_replace('/(&#[xX]0{0,8}'.dechex(ord($search[$i])).';?)/i', $search[$i], $val); // with or without a ;
		// decimal form, e.g. &#00064; for @
		$val = preg_replace('/(&#0{0,8}'.ord($search[$i]).';?)/', $search[$i], $val); // with or without a ;
	}
	// now the only remaining whitespace attacks are \t, \n, and \r
	$ra1 = Array('javascript', 'vbscript', 'expression', 'applet', 'meta', 'xml', 'blink', 'link', 'frameset', 'ilayer', 'bgsound', 'title', 'base');
	$ra2 = Array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavailable', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterchange', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowenter', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
	$ra = array_merge($ra1, $ra2);
	$found = true; // keep replacing as long as the previous round replaced something
	while ($found == true) {
		$val_before = $val;
		for ($i = 0; $i < sizeof($ra); $i++) {
			$pattern = '/';
			for ($j = 0; $j < strlen($ra[$i]); $j++) {
				if ($j > 0) {
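					// allow entity-encoded TAB, LF, or CR between the letters of a keyword
					$pattern .= '(';
					$pattern .= '(&#[xX]0{0,8}[9ad];?)';  // hex: &#x9; &#xa; &#xd;
					$pattern .= '|(&#0{0,8}(9|10|13);?)'; // decimal: &#9; &#10; &#13;
					$pattern .= ')*';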
				}
				$pattern .= $ra[$i][$j];
			}
			$pattern .= '/i';
			$replacement = substr($ra[$i], 0, 2).'<x>'.substr($ra[$i], 2); // add in <> to nerf the tag
			$val = preg_replace($pattern, $replacement, $val); // filter out the hex tags
		}
		if ($val_before == $val) {
			// a full pass over every keyword changed nothing, so exit the loop
			$found = false;
		}
	}
	return $val;  
}

Testing once more:

[Screenshot: XSS test]

Good, the dialog no longer pops up.
The code after filtering:

<p>Test test<span style="color: #ff6600">colored text<span style="font-size: small">small font</span></span><span style="font-size: small">this text is in a small font</span></p>
<p>Test<span style="background-color: #99cc00">background color</span>test</p>
<p>Normal characters: “ja<x>vascript:</p>
<img src="ja<x>vascript:alert(/xss1/)" width=100>
<img src="javas cript:alert(/xss2/)" width=100>
<img src="#" on<x>error=alert(/xss3/)>
<img src="#" style="Xss:ex<x>pression(alert(/xss4/))">
<img src="#"/**/on<x>error=alert(/xss5/) width=100>
<img src="#" on<x>error=alert(/xss6/)>
<img src="#" style="Xss:ex<x>pression(alert(/xss7/));">
<img src="#"/**/on<x>error=alert(/xss8/) width=100>
<img src="#" style="Xssex*/pression(alert(/xss9/));">
<iframe src="http://www.baidu.com" width="600px" height="330px" style="margin: -33px 0 0 -23px; display=inline;" frameborder="0" marginheight="0" marginwidth="0" scrolling="no"></iframe>
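
The str_replace calls at the top of no_xss only cover a comment that opens directly after a prefix of ":expression". A less enumerative option, offered here as a sketch under stated assumptions rather than a drop-in fix, is to delete every CSS comment inside style attribute values before the keyword pass, so that `ex/**/pression` collapses back to `expression` and is caught by the keyword list. The helper `strip_css_comments` is hypothetical, and its regex only handles double-quoted style attributes with properly closed comments.

```php
<?php
// Hypothetical helper: delete /* ... */ comments inside double-quoted
// style="..." attribute values so that split keywords are rejoined.
function strip_css_comments(string $html): string
{
    return preg_replace_callback(
        '/(\bstyle\s*=\s*")([^"]*)(")/i',
        static function (array $m): string {
            // Remove every closed CSS comment from the attribute value only.
            $css = preg_replace('~/\*.*?\*/~s', '', $m[2]);
            return $m[1] . $css . $m[3];
        },
        $html
    );
}

$in = '<img src="#" style="Xss:ex/**/pression(alert(/xss9/));">';
echo strip_css_comments($in), "\n";
// prints: <img src="#" style="Xss:expression(alert(/xss9/));">
```

Running this before the keyword loop would let the existing `expression` entry do the neutralizing, instead of relying on one replacement per prefix.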