A super-simple crawler written by someone on their first day with Erlang
-module(crawler).
-export([crawl/1, crawl_exec/1, fetch/2, data_receive/1]).

%% This is my first time writing Erlang,
%% so for now I can only write simple code...

%% Spawn one fetcher process per URL, then wait for that many replies.
crawl(List) ->
    crawl_exec(List),
    data_receive(length(List)).

crawl_exec([]) ->
    ok;
crawl_exec([Url|List]) ->
    spawn(crawler, fetch, [self(), Url]),
    crawl_exec(List).

%% Fetch Url and send {Status, Url} back to Pid.
%% Note: in later OTP releases this inets client module is named httpc.
fetch(Pid, Url) ->
    {ok, {{_, Status, _}, _, _}} = http:request(Url),
    Pid ! {Status, Url}.

%% Print N replies; give up if none arrives within 3 seconds.
data_receive(0) ->
    complete;
data_receive(N) ->
    receive
        {Status, Url} ->
            io:format("~b ~s~n", [Status, Url]),
            data_receive(N - 1)
    after 3000 ->
        timeout
    end.
1> Urls = ["http://www.erlang.org/", "http://en.wikipedia.org/wiki/Erlang_programming_language"].
["http://www.erlang.org/", "http://en.wikipedia.org/wiki/Erlang_programming_language"]
2> c(crawler).
{ok,crawler}
3> crawler:crawl(Urls).

=INFO REPORT==== 20-May-2007::02:16:50 ===
The inets application was not started. Has now been started as a temporary application.

=INFO REPORT==== 20-May-2007::02:16:50 ===
The inets application was not started. Has now been started as a temporary application.
200 http://en.wikipedia.org/wiki/Erlang_programming_language
200 http://www.erlang.org/
complete
For now, I have more or less figured out the basics of the parts that matter most: spawning processes and passing messages between them.
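
To pin down what "spawning processes and passing messages" amounts to, here is a minimal sketch of the same pattern with the HTTP call taken out, so it runs without inets or a network. The module name `spawn_sketch`, the functions `worker/2` and `collect/2`, and the `{result, Item, Len}` message shape are all made up for illustration; they are not part of the crawler above.

```erlang
-module(spawn_sketch).
-export([run/1, worker/2]).

%% Spawn one worker per item, then collect exactly that many replies.
run(Items) ->
    Parent = self(),
    [spawn(?MODULE, worker, [Parent, Item]) || Item <- Items],
    collect(length(Items), []).

%% Hypothetical stand-in for fetch/2: no HTTP request is made,
%% the worker just reports the length of its item to the parent.
worker(Parent, Item) ->
    Parent ! {result, Item, length(Item)}.

%% Receive N replies; give up if none arrives within 3 seconds.
%% Replies arrive in no particular order, so the result is sorted.
collect(0, Acc) ->
    {complete, lists:sort(Acc)};
collect(N, Acc) ->
    receive
        {result, Item, Len} ->
            collect(N - 1, [{Item, Len} | Acc])
    after 3000 ->
        {timeout, lists:sort(Acc)}
    end.
```

Calling `spawn_sketch:run(["a", "bcd"])` should return `{complete,[{"a",1},{"bcd",3}]}`. Unlike `data_receive/1`, this version returns the collected replies instead of printing them, which makes it easier to see what was received before a timeout.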