{"id":402,"date":"2013-03-12T23:51:00","date_gmt":"2013-03-12T14:51:00","guid":{"rendered":""},"modified":"2021-06-05T13:06:11","modified_gmt":"2021-06-05T04:06:11","slug":"python_13","status":"publish","type":"post","link":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/2013\/03\/python_13.html","title":{"rendered":"How to count the occurrences of each keyword in a text with Python"},"content":{"rendered":"<p>When analyzing things like keyword frequency, you often want to know how many times a particular word appears in a text.<\/p>\n<p>Python offers three main ways to count the occurrences of each word.<\/p>\n<ol>\n<li>Checking each word one by one in a for loop<\/li>\n<li>Using the dictionary's get method in a for loop<\/li>\n<li>Using nltk<\/li>\n<\/ol>\n<p>I wrote all three. The sample text is quoted from a top news story in the Wall Street Journal.<\/p>\n<div style=\"background-color: lightgray; padding: 10px;\">\n<pre><code>def version1(line):\r\n    # Check each word and update its count by hand.\r\n    wordcount = {}\r\n    for word in line.split():\r\n        if word in wordcount:\r\n            wordcount[word] += 1\r\n        else:\r\n            wordcount[word] = 1\r\n    print(wordcount)\r\n\r\ndef version2(line):\r\n    # dict.get returns 0 for a word that has not been seen yet.\r\n    wordcount = {}\r\n    for word in line.split():\r\n        wordcount[word] = wordcount.get(word, 0) + 1\r\n    print(wordcount)\r\n\r\ndef version3(line):\r\n    # nltk is a third-party package and must be installed first.\r\n    import nltk\r\n    tokens = nltk.word_tokenize(line)\r\n    text = nltk.Text(tokens)\r\n    wordcount = nltk.FreqDist(text)\r\n    # most_common() returns (word, count) pairs, most frequent first.\r\n    print(wordcount.most_common())\r\n\r\nif __name__ == '__main__':\r\n    line = \"These changes, almost all of which the White House and Democrats have said they oppose, would combine with January's tax increases to eliminate the government's budget deficit in 2023, a top GOP goal, says Mr. Ryan, chairman of the House Budget Committee. Prior House GOP budget resolutions called for changes that would have taken several decades to eliminate the deficit.\"\r\n    version1(line)\r\n    version2(line)\r\n    version3(line)<\/code><\/pre>\n<\/div>\n<p>The easiest to follow is the first method (version1): it takes the words one at a time, adds 1 if the word is already in the dictionary, and assigns 1 if it is not.<\/p>\n<p>The second method (version2) is a little more concise and uses the dictionary's get method. The first argument is the key, and the second is the default value returned when the key does not exist.<\/p>\n<p>If the key exists, 1 is added to its current count; if it does not, get returns 0, so the word is stored with a count of 1.<\/p>\n<p>Even more convenient is the third method (version3), which uses nltk. nltk is not part of the standard library, so it has to be installed separately, and word_tokenize also needs NLTK's tokenizer data, which is downloaded separately.<\/p>\n<p>FreqDist can list words in descending order of frequency through most_common(), so to get only the top 10 you just pass 10 as shown below. NLTK is fun.<\/p>\n<div style=\"background-color: lightgray; padding: 10px;\">\n<pre><code>    print(wordcount.most_common(10))<\/code><\/pre>\n<\/div>\n<p>This is handy when you want to check keyword counts on a competitor's site or on the sites linking back to it. With NLTK you can also easily analyze co-occurring words, which matter in recent SEO.<\/p>\n<p>For languages like Japanese that do not separate words with spaces, you also need the extra step of splitting the text into words with a morphological analyzer such as MeCab.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When analyzing things like keyword frequency, you often want to know how many times a particular word appears in a text. Pyth &#8230; <\/p>\n","protected":false},"author":1,"featured_media":598,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[75],"tags":[33,26],"class_list":["post-402","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","tag-python","tag-seo"],"_links":{"self":[{"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/posts\/402","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/comments?post=402"}],"version-history":[{"count":0,"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/posts\/402\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/media\/598"}],"wp:attachment":[{"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/media?parent=402"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/categories?post=402"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sigmadesign.co.jp\/minomonchan\/wp-json\/wp\/v2\/tags?post=402"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}