WebMagic official site: http://webmagic.io/
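Note:
1. If you call System.out.println("str") inside a custom PageProcessor and start the crawler with Spider.create(new myPageProcessor()).start(), nothing is printed to the console. To see the extracted results on the console, add a ConsolePipeline to the spider:
spider.addPipeline(new ConsolePipeline()) // print results to the console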
POST requests with WebMagic
Request parameters sent as a form
@Test
public void webMagicTest1() {
    String str = "一人之下";

    // Form fields to send in the POST body
    Map<String, Object> map = new HashMap<String, Object>();
    map.put("keyword", str);

    // Build a POST request whose body is form-encoded as UTF-8
    Request request = new Request("https://www.80s.tw/search");
    request.setMethod(HttpConstant.Method.POST);
    request.setRequestBody(HttpRequestBody.form(map, "utf-8"));

    // Write results to JSON files and also print them to the console
    Spider.create(new webMagic1())
            .addRequest(request)
            .addPipeline(new JsonFilePipeline("D:\\webmagic\\"))
            .addPipeline(new ConsolePipeline())
            .run();
}
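To make it clearer what a form body looks like on the wire, here is a minimal sketch that uses only the JDK. It assumes the body follows the standard application/x-www-form-urlencoded convention (key=value pairs joined with & and percent-encoded as UTF-8). FormBodySketch and its encode method are hypothetical names introduced for illustration; they are not part of WebMagic, and this is not a description of how HttpRequestBody.form is implemented internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of WebMagic.
class FormBodySketch {

    // Joins the entries as key=value pairs separated by '&',
    // percent-encoding keys and values as UTF-8.
    static String encode(Map<String, Object> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(urlEncode(entry.getKey()))
                .append('=')
                .append(urlEncode(String.valueOf(entry.getValue())));
        }
        return body.toString();
    }

    private static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("keyword", "一人之下");
        // Prints: keyword=%E4%B8%80%E4%BA%BA%E4%B9%8B%E4%B8%8B
        System.out.println(encode(params));
    }
}
```

Printing the encoded string like this can help when comparing against what the browser's developer tools show for the same search form.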
Request parameters sent as JSON
@Test
public void webMagicTest1() {
    String name = "一人之下";

    // JSON object to send in the POST body
    JSONObject json = new JSONObject();
    json.put("keyword", name);

    // Build a POST request whose body is the JSON string, encoded as UTF-8
    Request request = new Request("https://www.80s.tw/search");
    request.setMethod(HttpConstant.Method.POST);
    request.setRequestBody(HttpRequestBody.json(json.toString(), "utf-8"));

    // Write results to JSON files and also print them to the console
    Spider.create(new webMagic1())
            .addRequest(request)
            .addPipeline(new JsonFilePipeline("D:\\webmagic\\"))
            .addPipeline(new ConsolePipeline())
            .run();
}
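For comparison with the form variant, the sketch below shows the kind of string a JSON body carries for the same parameter. It is a JDK-only illustration: JsonBodySketch, escape and singleField are hypothetical names, the escaping covers only the basic cases, and in a real project the JSON library already used above (JSONObject) is the better choice.

```java
// Hypothetical helper for illustration only; not part of WebMagic
// or of any JSON library.
class JsonBodySketch {

    // Escapes characters that may not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use a 4-digit hex escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Builds a JSON object with exactly one string field.
    static String singleField(String key, String value) {
        return "{\"" + escape(key) + "\":\"" + escape(value) + "\"}";
    }

    public static void main(String[] args) {
        // Prints: {"keyword":"一人之下"}
        System.out.println(singleField("keyword", "一人之下"));
    }
}
```

Unlike the form body, the JSON body is not percent-encoded; the non-ASCII characters are sent as UTF-8 bytes, which is why the charset argument still matters.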