2. HTTP Request & Response

  • People communicate through natural language: question and answer
  • Machines communicate through the HTTP protocol: Request and Response

Request

GET /docs/index.html HTTP/1.1
Host: www.nowhere123.com
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

Response

HTTP/1.1 200 OK
Date: Sun, 18 Oct 2009 08:56:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT
ETag: "10000000565a5-2c-3e94b66c2e680"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug

<html><body><h1>It works!</h1></body></html>
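
The response above has three parts: a status line, header lines, and a body after the first blank line. A minimal sketch of splitting a raw response along those lines follows; parseHttpResponse is a hypothetical helper, not part of any library, and it ignores details such as chunked encoding and repeated headers.

```javascript
// Hypothetical helper: split a raw HTTP/1.x response into its parts.
function parseHttpResponse(raw) {
  // Headers and body are separated by the first blank line.
  const normalized = raw.replace(/\r\n/g, "\n");
  const split = normalized.indexOf("\n\n");
  const head = split === -1 ? normalized : normalized.slice(0, split);
  const body = split === -1 ? "" : normalized.slice(split + 2);

  const lines = head.split("\n");
  // Status line, e.g. "HTTP/1.1 200 OK".
  const match = /^(HTTP\/\d\.\d) (\d{3}) ?(.*)$/.exec(lines[0]);
  if (!match) throw new Error("not an HTTP status line: " + lines[0]);

  const headers = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip malformed header lines
    // Header names are case-insensitive, so store them lower-cased.
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return {
    version: match[1],
    status: Number(match[2]),
    reason: match[3],
    headers: headers,
    body: body,
  };
}
```

Applied to the response above, this should give status 200, a content-type of text/html, and a 44-character body that matches Content-Length.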

3. Common Redirection Implementations

  1. HTTP protocol status code (easy)
  2. meta refresh (easy)
  3. client side script (tough)
  4. server redirect (not to be discussed in this article)

3.1 HTTP status codes 3xx

  • 300 Multiple Choices
  • 301 Moved Permanently
  • 302 Found
  • 303 See Other
  • 307 Temporary Redirect

Example HTTP response for a 301 redirect:

HTTP/1.1 301 Moved Permanently
Location: http://www.example.org/
Content-Type: text/html
Content-Length: 174

<html>
<head>
<title>Moved</title>
</head>
<body>
<h1>Moved</h1>
<p>This page has moved to <a href="http://www.example.org/">http://www.example.org/</a>.</p>
</body>
</html>
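
Tracking a status-code redirect comes down to reading the Location header and repeating the request. The sketch below assumes a caller-supplied lookup(url) function (hypothetical) that returns { status, headers } with lower-cased header names. It handles 301, 302, 303 and 307 only; 300 is left out because it offers several choices rather than one target.

```javascript
// Status codes from the list above that point at a single new location.
const REDIRECT_CODES = new Set([301, 302, 303, 307]);

// Return the absolute redirect target, or null if this is not a redirect.
function redirectTarget(status, headers, baseUrl) {
  if (!REDIRECT_CODES.has(status) || !headers.location) return null;
  // Location may be relative, so resolve it against the current URL.
  return new URL(headers.location, baseUrl).href;
}

// Follow redirects from startUrl and return every URL visited, in order.
// lookup(url) is an assumed callback returning { status, headers }.
function followRedirects(startUrl, lookup, maxHops = 10) {
  const chain = [startUrl];
  let current = startUrl;
  for (let hop = 0; hop < maxHops; hop++) {
    const response = lookup(current);
    const next = redirectTarget(response.status, response.headers, current);
    if (next === null) return chain;
    chain.push(next);
    current = next;
  }
  throw new Error("too many redirects");
}
```

The maxHops limit guards against redirect loops, which a tracker would otherwise follow forever.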

3.2 Refresh Meta tag and HTTP refresh header

Refresh meta tag:

<html>
<head>
  <meta http-equiv="Refresh" content="0; url=http://www.example.com/" />
</head>
<body>
  <p>Please follow <a href="http://www.example.com/">this link</a>.</p>
</body>
</html>

HTTP Refresh header (introduced by Netscape):

HTTP/1.1 200 OK
Refresh: 0; url=http://www.example.com/
Content-Type: text/html
Content-Length: 78

Please follow <a href="http://www.example.com/">this link</a>.
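
Both forms carry the same value: a delay in seconds, optionally followed by a target URL. A minimal sketch of parsing that value, assuming the common "N; url=..." shape; parseRefresh is a hypothetical helper and browsers accept more variants than this handles.

```javascript
// Hypothetical helper: parse a Refresh value such as
// "0; url=http://www.example.com/" into a delay and a target URL.
function parseRefresh(value) {
  const match = /^\s*(\d+)\s*(?:[;,]\s*(?:url\s*=\s*)?(.*))?$/i.exec(value);
  if (!match) return null; // not a recognisable Refresh value

  let url = match[2] === undefined ? null : match[2].trim();
  // Some pages wrap the URL in quotes; strip one matching pair.
  if (url && /^(['"]).*\1$/.test(url)) url = url.slice(1, -1);

  return { delaySeconds: Number(match[1]), url: url || null };
}
```

A value with no URL part, such as "5", means "reload this page after 5 seconds" and comes back with url set to null.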

3.3 JavaScript redirects

TODO

4. Redirection using JavaScript

4.1 Manual Redirect

The simplest technique is to ask the visitor to follow a link to the new page, usually using an HTML anchor like:

Please follow <a href="http://www.example.com/">this link</a>.

This method is often used as a fallback: if the browser does not support the automatic redirect, the visitor can still reach the target document by following the link.

4.2 Frame redirects

<iframe height="100%" width="100%" src="http://www.example.com/">
Please follow <a href="http://www.example.com/">link</a>.
</iframe>

4.3 Location property

  • window.location
  • window.location.replace
  • window.location.href
  • self.location
  • location.href
  • document.location.replace

window.location='http://www.example.com/'
window.location.replace('http://www.example.com/')
...
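
When the target is written as a plain string literal, these forms can be found by scanning the script source. A rough sketch, assuming a regular expression is good enough for unobfuscated code; findLocationRedirects is a hypothetical helper and it will miss anything built at run time.

```javascript
// Matches e.g. window.location='...', location.href = "...",
// and document.location.replace('...') with a literal string target.
const LOCATION_REDIRECT =
  /\b(?:(?:window|self|document)\s*\.\s*)?location(?:\s*\.\s*(?:href|replace))?\s*(?:=(?!=)|\()\s*(['"])(.*?)\1/g;

// Return every literal redirect target found in the script source.
function findLocationRedirects(source) {
  const targets = [];
  for (const match of source.matchAll(LOCATION_REDIRECT)) {
    targets.push(match[2]);
  }
  return targets;
}
```

A scan like this only sees what is literally in the text, which is why the hiding techniques in section 5 are effective against it.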

5. JavaScript Features that Facilitate Hiding

Obfuscation and eval, dynamic code generation, self-modifying code

5.1 String Manipulation and Eval

var a1="win", a2="dow.", a3="loca", a4="tion.", a5="replace",
    a6="('http://www.partypoker.com/index.htm?wm=2501068')";
var i, str="";
// Concatenate a1..a6 into "window.location.replace('...')", then run it.
for (i=1; i<=6; i++) {
    str += eval("a"+i);
}
eval(str);
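
The point of splitting the string is that no single fragment contains a recognisable keyword. A small sketch that makes this visible by joining the same fragments without calling eval; the function names are hypothetical.

```javascript
// The six fragments from the example above.
const fragments = [
  "win", "dow.", "loca", "tion.", "replace",
  "('http://www.partypoker.com/index.htm?wm=2501068')",
];

// Join the fragments the way the loop does, but return the string
// instead of passing it to eval.
function assemble(parts) {
  return parts.join("");
}

// True if any single fragment already contains the keyword.
function anyFragmentContains(parts, keyword) {
  return parts.some((part) => part.includes(keyword));
}

console.log(anyFragmentContains(fragments, "location")); // false
console.log(assemble(fragments));
// window.location.replace('http://www.partypoker.com/index.htm?wm=2501068')
```

Printing instead of evaluating is the usual first step when inspecting such code by hand.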

5.2 Unescape

URL encoding, or percent-encoding, is a common mechanism intended for encoding reserved characters. Unreserved characters can be encoded the same way, however: "A" can be written as "%41" and is expected to be decoded correctly.
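
A minimal sketch of percent-encoding applied to every character, not just reserved ones. It assumes ASCII input; percentEncodeAll is a hypothetical helper.

```javascript
// Percent-encode every character of an ASCII string, including
// letters that normally need no encoding.
function percentEncodeAll(text) {
  return Array.from(text, (ch) => {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return "%" + hex.padStart(2, "0");
  }).join("");
}

console.log(percentEncodeAll("A"));            // %41
console.log(decodeURIComponent("%77%69%6E"));  // win
```

The encoded form decodes back to the original text, so a script can carry keywords like "window" without them appearing in the source.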

var s = '%5CBEOD%5C%05GDHJ_BDE%16%0CC__%5B%11%04%04%5C%5C%5C%05SMYNNFD%5DBNX%05HDF%04%0C';
var e = '', i;
eval(unescape('s%3Dunescape%28s%29%3Bfor%28i%3D0%3Bi%3Cs.length%3Bi%2B%2B%29%7Be%2B%3DString.fromCharCode%28s.charCodeAt%28i%29%5E43%29%3B%7D%3Beval%28e%29%3B'));
// The outer unescape() yields this code, which XORs each character of s with 43:
s=unescape(s); for(i=0;i<s.length;i++){e+=String.fromCharCode(s.charCodeAt(i)^43);}; eval(e)
// The inner eval(e) then runs:
window.location='http://www.xfreemovies.com/'
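
The same two steps (percent-decode, then XOR with 43) can be run without eval to recover the payload. A sketch under that reading of the sample; xorDecode is a hypothetical helper, and the %XX replacement stands in for unescape on two-digit sequences.

```javascript
// Decode %XX sequences, then XOR every character code with the key.
function xorDecode(encoded, key) {
  const bytes = encoded.replace(/%([0-9A-Fa-f]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes.charCodeAt(i) ^ key);
  }
  return out;
}

const sample =
  "%5CBEOD%5C%05GDHJ_BDE%16%0CC__%5B%11%04%04%5C%5C%5C%05SMYNNFD%5DBNX%05HDF%04%0C";
console.log(xorDecode(sample, 43));
// window.location='http://www.xfreemovies.com/'
```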

5.3 Decode

A machine learning system [16, 17] can learn to recognize URL-encoded substring patterns just as easily as the original patterns. To evade this, spammers employ custom decoding schemes.

5.4 Script Injection

The following example uses URL encoding to represent binary data, which is then custom-decoded to build a script tag containing redirection code.

var s,q,e,d,i;s=String.fromCharCode;q='script'+s(62);
var e = unescape('%BD%AD%BF%EE%BA%ED%F3%BA%A7%A0%BB%F6%E2%E1%B8%A7%A6%FC%A5%B1%B9%AD%AE%A7%BB%B5%BA%FC%B0%BB%A6%E3%AC%AA%9B%A2%B0%B1%B8%B1%B9%E3%F2%BD%A0%A5%A3%B1%B6%E9%FA%F5%FA%F8%E9%A7%EC%BA%A7%A0%BB%E9%FE%8F%EA%E2%97%F7%E1%92%BB%A3%BF%A7%E2%B3%B9%A7%B7%B5%A2%E2%BE%BA%AE%A2%FC%B5%BC%A7%B8%A5%BD%E0%AC%BF%BC%F7%E1%92%BF%AD%A6%BA%AE%AA%F3%FE%B6%E9%AE%BF%AE%AF%BF%B5%FD%B6%EE%B0%A4%AF%B8%A3%AA%BE%A5%E9%B7%FA%A7%A3%AE%AF%BB%B9%BE%BC%EE%A1%F0');
var d = '';
for(i=0;i<e.length;++i) d+=s(e.charCodeAt(i)^(((i%10)+203)&255));
document.write(s(60)+q+d+s(60)+'/'+q);
// document.write() emits the following markup:
<script>var u="http://www.veracitek.com/adTracker/?source=1976&w=http%3A%2F%2Fpori-chudai.star-gossip.com%2Ftaktaz",e=escape,d=document;d.location=u;</script>
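
The custom decoder here is an XOR with a key that cycles through 203..212 by position. A sketch of the same decoding as a plain function that returns the text instead of writing it into the page; rollingXorDecode is a hypothetical name, and only the first 20 encoded bytes of the sample are used.

```javascript
// Decode %XX sequences, then XOR character i with ((i % 10) + 203) & 255,
// mirroring the loop in the sample above.
function rollingXorDecode(encoded) {
  const bytes = encoded.replace(/%([0-9A-Fa-f]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const key = ((i % 10) + 203) & 255;
    out += String.fromCharCode(bytes.charCodeAt(i) ^ key);
  }
  return out;
}

const prefix =
  "%BD%AD%BF%EE%BA%ED%F3%BA%A7%A0%BB%F6%E2%E1%B8%A7%A6%FC%A5%B1";
console.log(rollingXorDecode(prefix));
// var u="http://www.ve
```

Because the key depends on position, the encoded bytes show no fixed substring for "http" or "location", unlike plain URL encoding.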

Headless Browser

What?

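Some obfuscated JS code is hard to analyze statically. A headless browser can render the page, execute the JS code, and follow the resulting redirects. A headless browser is a browser without a GUI that can be driven programmatically, for example PhantomJS or Headless Chrome.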

Why not?

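A headless browser can simulate a real user, so it can also be used to fake ad clicks. Ad platforms have countermeasures against this and will refuse the connection or refuse to redirect.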

Detecting Headless Chrome

webdriver

if(navigator.webdriver) {
    console.log("Chrome headless detected");
}

window.chrome

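// isChrome should be true if the browser is Chrome, Chromium or Opera.
// Assumption for illustration: derive isChrome from the user-agent string.
var isChrome = /Chrome/.test(navigator.userAgent);
if(isChrome && !window.chrome) {
    console.log("Chrome headless detected");
}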

permissions

navigator.permissions.query({name:'notifications'}).then(function(permissionStatus) {
    if(Notification.permission === 'denied' && permissionStatus.state === 'prompt') {
        console.log('This is Chrome headless')
    } else {
        console.log('This is not Chrome headless')
    }
});
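
The three checks can be combined into one function that reports which signals fired. The sketch below takes a window-like object as a parameter so it can also be exercised outside a browser; headlessSignals and its parameters are hypothetical, and the user-agent test for Chrome is an assumption.

```javascript
// Collect headless-Chrome signals from a window-like object.
// notificationPermission and permissionState are the two values compared
// by the permissions check; pass undefined to skip that check.
function headlessSignals(win, notificationPermission, permissionState) {
  const signals = [];
  const nav = win.navigator || {};

  if (nav.webdriver) signals.push("webdriver");

  // Assumption: a user-agent test stands in for the isChrome check.
  const isChrome = /Chrome/.test(nav.userAgent || "");
  if (isChrome && !win.chrome) signals.push("window.chrome");

  if (notificationPermission === "denied" && permissionState === "prompt") {
    signals.push("permissions");
  }
  return signals;
}

// In a browser, the call would look roughly like this:
// navigator.permissions.query({ name: "notifications" }).then(function (status) {
//   console.log(headlessSignals(window, Notification.permission, status.state));
// });
```

Returning a list instead of logging lets a site decide how many signals it needs before refusing a redirect.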

References:

  1. https://en.wikipedia.org/wiki/URL_redirection
  2. Kumar Chellapilla, Alexey Maykov. A taxonomy of JavaScript redirection spam. May 08-08, 2007
  3. https://antoinevastel.com/bot%20detection/2018/01/17/detect-chrome-headless-v2.html