[Solved] How can PHP scrape data that a web page displays dynamically (via Ajax)?

Last edited by qq37431300 on 2013-12-17 09:03:47

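For example, a Taobao item page:
http://item.taobao.com/item.htm?id=36221049162

Price: ¥596.00
Promotion: year-round lowest price ¥298.00
The price is present in the page's HTML source, but the promotion price is not. How can I scrape it?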

Solved: just leave off the shop address at the end (the value of the ref parameter).
http://detailskip.taobao.com/json/sib.htm?itemId=36221049162&sellerId=110811289&p=1&rcid=16&sts=504983568,1170940490216898572,144678138062864512,36028801320484867&chnl=pc&price=59600&shopId=&vd=1&skil=false&prior=1&ref=
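
The fix above amounts to emptying the value of the ref query parameter. A minimal sketch of doing that in PHP follows; the helper name strip_ref_value is made up for illustration, and the sample URL is a shortened form of the one in this thread. It edits the string with a regular expression rather than rebuilding the query, so the other parameters (such as the comma-separated sts value) are left exactly as they were.

```php
<?php
// Hypothetical helper (not from the thread): blank out the value of the
// `ref` query parameter and leave every other parameter untouched.
function strip_ref_value(string $url): string
{
    // Match "?ref=" or "&ref=" plus its value (everything up to the next
    // "&" or "#"), and keep only the "ref=" part.
    return preg_replace('/([?&]ref=)[^&#]*/', '$1', $url);
}

// Shortened version of the thread's URL, for illustration only.
$url = 'http://detailskip.taobao.com/json/sib.htm?itemId=36221049162&prior=1'
     . '&ref=http://shop36048351.taobao.com';

echo strip_ref_value($url), "\n";
// http://detailskip.taobao.com/json/sib.htm?itemId=36221049162&prior=1&ref=
```

Whether an empty ref is enough depends on the server; later replies in this thread report that some items still need extra request headers.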


Replies (solutions)

It seems to be generated by this Ajax request:
http://detailskip.taobao.com/json/sib.htm?itemId=36221049162&sellerId=110811289&p=1&rcid=16&sts=504983568,1170940490216898572,144678138062864512,36028801320484867&chnl=pc&price=59600&shopId=&vd=1&skil=false&prior=1&ref=http://shop36048351.taobao.com

But when I open that address directly, no price is shown. Are the parameters wrong, or does Taobao block direct access?
And how do I get the JSON data it returns?

Solved: just leave off the shop address at the end (the value of the ref parameter).
http://detailskip.taobao.com/json/sib.htm?itemId=36221049162&sellerId=110811289&p=1&rcid=16&sts=504983568,1170940490216898572,144678138062864512,36028801320484867&chnl=pc&price=59600&shopId=&vd=1&skil=false&prior=1&ref=

Congratulations!

OP, I've run into the same problem, but removing the shop address at the end as you described doesn't work for me. What could be the reason? Here is the URL: http://detailskip.taobao.com/json/sib.htm?itemId=36386887896&sellerId=196755313&p=1&rcid=16&sts=404230144,1170940438677291084,144185556853620864,70373045502979&chnl=pc&price=13900&shopId=&vd=1&skil=false&pf=1&al=false&ap=0&ss=0&free=1&st=1&ct=1&prior=1&ref=

Right, that one doesn't work. You need to fetch it with cURL.

function curl_taobao_detail($url) { // fetch the URL with browser-like headers and extract the price
    $ch = curl_init(); // start a cURL session
    $header = array(); // HTTP request headers
    $header[] = 'Host: detail.taobao.com'; // as posted; note that the URLs in this thread use detailskip.taobao.com
    $header[] = 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0';
    $header[] = 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
    $header[] = 'Accept-Language: en-US,en;q=0.5';
    $header[] = 'Referer: http://www.taobao.com';
    $header[] = 'Cookie: mall_fp_ab=2012b; cna=bBiqCsrWeRUCAXbGun/HhKLU; cq=ccp=1; otherx=e=1&p=*&s=0&c=0&f=0&g=0&t=0; x=__ll=-1&_ato=0; t=7009aef24f80bf19638795f67e442046; tracknick=hellobiaobiao; _tb_token_=KYsNvQICZGpL; cookie2=2d70cfb31e8da274de2f3eaed2d1612e; pnm_cku822=008uZ+ZXOTBiNH0MRQyyh8Jj9nPuf+Zr9lfeqJ6|uqxpf0k/uS85fwk/Oa/JX4c=|u51Y4MWiOxFIIdhSd7KXsYeiZ0KFHPWvJQDYAA==|vIpP92FnosSSV0FnYUeClAKUgkdRd/H3MrSytHFnQVcBxLLEonqi|vauNSG17TTu9Kz17DTs9q81b/fseCH54fmheCF54DkjOSN7IvvjdBQ==|vqg++0N1sCYgpmMl4MUARoOVsGiw|v/n/+Tw6/9kcer+p7+ksGkwq8g==';
    $header[] = 'Connection: keep-alive';

    curl_setopt($ch, CURLOPT_URL, $url); // address to request
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header); // send the headers above
    curl_setopt($ch, CURLOPT_ENCODING, 'deflate'); // sends Accept-Encoding: deflate and decodes the response
    curl_setopt($ch, CURLOPT_HEADER, 0); // do not include response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body as a string instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 20); // give up after 20 seconds
    $content = curl_exec($ch); // run the request
    $curl_errno = curl_errno($ch); // error code and message, kept for debugging
    $curl_error = curl_error($ch);
    curl_close($ch); // close the cURL session
    if ($content === false) {
        return '??'; // request failed
    }
    preg_match_all('/price:"(.*)\.00"/iU', $content, $goods_price); // extract the price
    return !empty($goods_price[1][0]) ? $goods_price[1][0] : '??'; // '??' when nothing matched
}


OP, I used the code you gave me with this as the url argument: http://detailskip.taobao.com/json/sib.htm?itemId=35277493308&sellerId=905487172&p=1&rcid=16&sts=337186818,1170936092103278604,72127962815692928,13510803217056771&chnl=pc&price=7900&shopId=&vd=1&skil=false&pf=1&al=false&ap=1&ss=0&free=0&st=1&ct=1&prior=1 and it still doesn't work. Did I get something wrong?



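Your price has a decimal part, so the pattern that expects ".00" never matches. Adjust it yourself. Change
preg_match_all('/price:"(.*)\.00"/iU', $content, $goods_price); // extract the price
to
preg_match_all('/price:"(.*)",/iU', $content, $goods_price); // extract the price
and you get
17.90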
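
To avoid switching patterns per item, one option is a single pattern that accepts prices with or without a decimal part. The sketch below is only an illustration: the helper name extract_price and the sample bodies are invented, and it assumes the response contains a fragment shaped like price:"17.90", which is the shape the patterns in this thread target. The real response may differ.

```php
<?php
// Hypothetical helper: return the first price found in a response body,
// or null when there is none.
function extract_price(string $content): ?string
{
    // \d+ matches the integer part; (?:\.\d+)? matches an optional decimal part.
    if (preg_match('/price:"(\d+(?:\.\d+)?)"/i', $content, $m) === 1) {
        return $m[1];
    }
    return null; // no price found
}

// Made-up sample bodies, for illustration only.
var_dump(extract_price('{price:"17.90",stock:3}'));  // string(5) "17.90"
var_dump(extract_price('{price:"298.00",stock:3}')); // string(6) "298.00"
var_dump(extract_price('no price here'));            // NULL
```

Unlike the original pattern, this keeps the ".00" suffix for whole-number prices, and returning null instead of '??' lets the caller tell "no match" apart from a real value.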





OP, could you send this PHP file to my email? Thanks! I'd like to study it closely: 310976780@qq.com