
I was evaluating how the big search engines implement auto-suggest. These are my observations (the search string used was "ab"). Questions are at the end.

Yahoo returns JSONP. The response is readable and serves the purpose.

Yahoo's response

yasearch({"q":"ab ","gprid":"Y435dN7TRFqnYqQhnBueJA","f":["k","m"],"r":[["ab de villiers",0],["ab exercises",0],["ab king pro",0],
["ab infi-net internet banking",0],["ab mujhe raat din",0],["ab workouts",0],["ab mp3",0],["ab meri bari",0],["ab ke baras",0],["ab meri baari",0]]})
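A minimal sketch of how a page could consume a JSONP response like the one above. The response text here is a shortened, hypothetical copy, and `new Function` only stands in for the `<script>` tag a browser would use to execute it; how Yahoo's page actually defines `yasearch` is not known from the response alone.

```javascript
// Hypothetical, shortened copy of the Yahoo response text.
const responseText =
  'yasearch({"q":"ab ","r":[["ab de villiers",0],["ab exercises",0]]})';

// The page defines the callback before the script is loaded.
// In a browser it would be a global; here it is passed in explicitly.
const received = [];
function yasearch(payload) {
  // payload.r is a list of [suggestion, number] pairs.
  for (const [text] of payload.r) {
    received.push(text);
  }
}

// A <script src="..."> tag would execute the response as code.
// new Function plays that role here, with yasearch supplied as a parameter.
new Function("yasearch", responseText)(yasearch);

console.log(received);
```

The point of the sketch is that the response is executed, not parsed: the data only reaches the page because the callback it names already exists.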

Bing takes a similar approach. It returns an "if" statement; sa_inst.apiCB() appears to be the function that processes the JSON. Again, the response is readable and legitimate.

Bing's response

if(typeof sa_inst.apiCB == 'function') sa_inst.apiCB({"AS":{"Query":"ab","FullResults":1,"Results":[{"Type":"AS","Suggests": [{"Txt":"ab<strong>p</strong> <strong>news</strong>","Type":"AS","Sk":""}, 
{"Txt":"ab<strong>bottapp</strong>.ab<strong>bott</strong>.<strong>in</strong>","Type":"AS","Sk":"AS1"},{"Txt":"ab<strong>t</strong> <strong>travels</strong>","Type":"AS","Sk":"AS2"},{"Txt":"ab<strong>p</strong> <strong>ananda</strong>","Type":"AS","Sk":"AS3"},
{"Txt":"ab<strong>hibus</strong>","Type":"AS","Sk":"AS4"},{"Txt":"ab<strong>p</strong> <strong>maza</strong>","Type":"AS","Sk":"AS5"},  {"Txt":"ab<strong>b</strong>","Type":"AS","Sk":"AS6"},
{"Txt":"ab<strong>outgoogle</strong>","Type":"AS","Sk":"AS7"}]}]}} /* pageview_candidate */);
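A rough sketch of what the `typeof` guard in Bing's response buys. The response text is a shortened, hypothetical copy, `runResponse` is a made-up helper, and `new Function` again stands in for a `<script>` tag; the real `sa_inst` object is not visible in the response, so the one below is an assumption.

```javascript
// Hypothetical, shortened copy of the Bing response text.
const bingText =
  "if(typeof sa_inst.apiCB == 'function') sa_inst.apiCB(" +
  '{"AS":{"Query":"ab","Results":[{"Type":"AS","Suggests":[' +
  '{"Txt":"ab<strong>hibus</strong>","Type":"AS","Sk":"AS4"}]}]}}' +
  " /* pageview_candidate */);";

// Hypothetical helper: run the response with a given sa_inst object.
function runResponse(sa_inst) {
  new Function("sa_inst", bingText)(sa_inst);
}

const suggestions = [];

// With a callback present, the data is delivered to it.
runResponse({
  apiCB(data) {
    for (const group of data.AS.Results) {
      for (const s of group.Suggests) {
        // Strip the highlighting markup to get the plain suggestion.
        suggestions.push(s.Txt.replace(/<\/?strong>/g, ""));
      }
    }
  },
});

// Without a callback, the guard turns the whole response into a no-op.
runResponse({});

console.log(suggestions);
```

Note that the guard only covers a missing `apiCB`; if `sa_inst` itself were undefined, the property access would still throw.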

Now comes Google. The response is sent as two JSON objects (separated by /*""*/). Most of it is unreadable.

Google's response

{e:"XteVUYKqDoKHrAfdz4D4Aw",c:0,u:"https://www.google.com/s?hl\x3den\x26gs_rn\x3d14\x26gs_ri\x3dpsy-ab\x26tok\x3dvsobDhICRmdcnY7ayKTGng\x26cp\x3d2\x26gs_id\x3dd\x26xhr\x3dt\x26q\x3dab\x26es_nrs\x3dtrue\x26pf\x3dp\x26output\x3dsearch\x26sclient\x3dpsy-ab\x26oq\x3d\x26gs_l\x3d\x26pbx\x3d1\x26bav\x3don.2,or.r_cp.r_qf.\x26bvm\x3dbv.46751780,d.bmk\x26fp\x3d2647af89de6b6c61\x26biw\x3d1366\x26bih\x3d453\x26tch\x3d1\x26ech\x3d2\x26psi\x3dVteVUcYuzOGsB6LpgdgB.1368774484351.1",p:true,d:"[\x22ab\x22,     
[[\x22ab\\u003Cb\\u003Ec\\u003C\\/b\\u003E\x22,0,[]],[\x22ab\\u003Cb\\u003Ec news\\u003C\\/b\\u003E\x22,0,[]],[\x22ab\\u003Cb\\u003Eercrombie\\u003C\\/b\\u003E\x22,0,[]],[\x22ab\\u003Cb\\u003Ecya\\u003C\\/b\\u003E\x22,0,[]]],
{\x22j\x22:\x22d\x22,\x22q\x22:\x22t8z6h8KhWvbkEX6xablxgYxDUq4\x22,\x22t\x22:
{\x22bpc\x22:false,\x22tlw\x22:false}}]"}
/*""*/
{e:"XteVUYKqDoKHrAfdz4D4Aw",c:-1,u:"https://www.google.com/searchdata?hl\x3den\x26gs_rn\x3d14\x26gs_ri\x3dpsy-ab\x26tok\x3dvsobDhICRmdcnY7ayKTGng\x26cp\x3d2\x26gs_id\x3dd\x26xhr\x3dt\x26q\x3dab\x26es_nrs\x3dtrue
\x26pf\x3dp\x26output\x3dsearch\x26sclient\x3dpsy-ab\x26oq\x3d\x26gs_l\x3d\x26pbx\x3d1\x26bav\x3don.2,or.r_cp.r_qf.\x26bvm\x3dbv.46751780,d.bmk\x26fp\x3d2647af89de6b6c61\x26biw\x3d1366\x26bih\x3d453\x26tch\x3d1\x26ech\x3d2\x26psi\x3dVteVUcYuzOGsB6LpgdgB.1368774484351.1",
p:true,d:"{\x22snp\x22:1}"}
/*""*/
  1. Are those hex codes, or what do you call them?
  2. Why do two objects need to be returned?
  3. Why does the JSON need to be encoded?
  4. Which of these three formats is the ideal one for JSON?

Any thoughts on this are welcome.

Uzair
    Yahoo and Bing don't return JSON but JSONP, in order to allow cross domain requests in old browsers. Optimally, you should work with JSON and not JSONP and add the correct headers. As for Google that's encoding. – Benjamin Gruenbaum May 17 '13 at 10:18

1 Answer


JSONP is not technically a format of its own; it's just JavaScript: a function call whose argument is an object initializer. So what Yahoo and Bing return is not JSON but JavaScript.

Notice that Google's response neither runs as JavaScript code nor is valid JSON, so it does nothing if a third-party site includes it as a script. Those are definitely not two JSON objects, because the keys are not quoted. They are JavaScript object initializers, each with a property (d) that holds a string which can be parsed as JSON.
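These claims can be checked with a small sketch. The chunk below is a hypothetical, heavily shortened stand-in for one part of Google's response, and `attempt` is a made-up helper; `new Function` stands in for including the text as a script.

```javascript
// Hypothetical, shortened stand-in for one chunk of the Google response.
// The doubled backslashes are only needed because this is a string literal.
const chunk = '{e:"X",c:0,p:true,d:"{\\x22snp\\x22:1}"}';

// Hypothetical helper: run fn and report success or the error name.
function attempt(fn) {
  try {
    return { ok: true, value: fn() };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// 1. As a script: the leading "{" starts a block statement, not an object,
//    so the text fails to parse and nothing runs.
const asScript = attempt(() => new Function(chunk)());

// 2. As JSON: the keys are not quoted, so JSON.parse rejects it.
const asJson = attempt(() => JSON.parse(chunk));

// 3. As an expression: wrapped in parentheses it is an object initializer,
//    and its d property holds a JSON string.
const asExpression = attempt(() => new Function("return (" + chunk + ")")());

console.log(asScript, asJson);
console.log(JSON.parse(asExpression.value.d));
```

Only the third form yields data, and that form requires having the response text in hand, which a page that merely includes the URL as a script does not have.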

You can get the JSON out of the first part like so:

var res = '{e:"XteVUYKqDoKHrAfdz4D4Aw",c:0,u:"https://www.google.com/s?hl\\x3den\\x26gs_rn\\x3d14\\x26gs_ri\\x3dpsy-ab\\x26tok\\x3dvsobDhICRmdcnY7ayKTGng\\x26cp\\x3d2\\x26gs_id\\x3dd\\x26xhr\\x3dt\\x26q\\x3dab\\x26es_nrs\\x3dtrue\\x26pf\\x3dp\\x26output\\x3dsearch\\x26sclient\\x3dpsy-ab\\x26oq\\x3d\\x26gs_l\\x3d\\x26pbx\\x3d1\\x26bav\\x3don.2,or.r_cp.r_qf.\\x26bvm\\x3dbv.46751780,d.bmk\\x26fp\\x3d2647af89de6b6c61\\x26biw\\x3d1366\\x26bih\\x3d453\\x26tch\\x3d1\\x26ech\\x3d2\\x26psi\\x3dVteVUcYuzOGsB6LpgdgB.1368774484351.1",p:true,d:"[\\x22ab\\x22,[[\\x22ab\\\\u003Cb\\\\u003Ec\\\\u003C\\\\/b\\\\u003E\\x22,0,[]],[\\x22ab\\\\u003Cb\\\\u003Ec news\\\\u003C\\\\/b\\\\u003E\\x22,0,[]],[\\x22ab\\\\u003Cb\\\\u003Eercrombie\\\\u003C\\\\/b\\\\u003E\\x22,0,[]],[\\x22ab\\\\u003Cb\\\\u003Ecya\\\\u003C\\\\/b\\\\u003E\\x22,0,[]]],{\\x22j\\x22:\\x22d\\x22,\\x22q\\x22:\\x22t8z6h8KhWvbkEX6xablxgYxDUq4\\x22,\\x22t\\x22:{\\x22bpc\\x22:false,\\x22tlw\\x22:false}}]"}'

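// The parentheses make eval treat the text as an object expression rather than a block.
// The d property of the resulting object is a JSON string, which JSON.parse then decodes.
console.log(JSON.parse(eval("(" + res + ")").d));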

Note that I doubled every backslash so that res holds exactly the text Google sent. Writing the response inside a string literal puts it through an extra round of escape processing that it otherwise wouldn't go through.
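One possible way to avoid doubling the backslashes is `String.raw`, which leaves them untouched. The sketch below uses a hypothetical, heavily shortened copy of the two-part response, and `parseChunks` is a made-up helper name; it splits on the `/*""*/` separator and decodes both levels of escaping. Like the `eval` version, it executes the text as code, so it is only reasonable for text you trust.

```javascript
// Hypothetical, heavily shortened copy of the two-part response.
// String.raw keeps every backslash as typed, so nothing is doubled.
const rawResponse = String.raw`{e:"X",c:0,u:"https://www.google.com/s?hl\x3den\x26q\x3dab",p:true,d:"[\x22ab\x22,[[\x22ab\\u003Cb\\u003Ec\\u003C\\/b\\u003E\x22,0,[]]]]"}
/*""*/
{e:"X",c:-1,u:"https://www.google.com/searchdata?hl\x3den",p:true,d:"{\x22snp\x22:1}"}
/*""*/`;

// Hypothetical helper: split on the separator and decode each chunk.
function parseChunks(text) {
  return text
    .split('/*""*/')
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => {
      // First level: evaluate the object initializer.
      // This decodes the \x3d, \x26 and \x22 escapes in its string values.
      const obj = new Function("return (" + chunk + ")")();
      // Second level: d is itself JSON, which decodes \u003C, \u003E and \/.
      return { url: obj.u, data: JSON.parse(obj.d) };
    });
}

const parsedChunks = parseChunks(rawResponse);
console.log(parsedChunks[0].url);           // URL with = and & restored
console.log(parsedChunks[0].data[1][0][0]); // first suggestion, with <b> markup
console.log(parsedChunks[1].data);          // payload of the second chunk
```

This also shows what the sequences in the response are: `\x3d`, `\x26` and `\x22` are JavaScript hexadecimal escapes for `=`, `&` and `"`, while `\u003C` and `\u003E` are Unicode escapes for `<` and `>` inside the embedded JSON.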

A third-party website can't read the contents of a script it includes; it can only run it. If Google returned plain JSON directly, a third-party site could override the Object/Array constructors and get access to your data.

See

http://www.lazycoder.com/weblog/2007/04/02/ajax-library-security-advisory/

http://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-json-vulnerability.aspx

http://jeremiahgrossman.blogspot.fi/2006/01/advanced-web-attack-techniques-using.html

Esailija