MapReduce Word Count in Python
2018-07-20 Source: open-open
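Generate the test input, word.txt (100,000 random lowercase words of 1 to 5 letters):

#!/usr/bin/env python
import random

# 'abc..z'
alphaStr = "".join(map(chr, range(97, 123)))

maxIter = 100000
with open("word.txt", "w") as fp:
    for i in range(maxIter):
        word = ""
        # Avoid naming this variable "len", which would shadow the built-in.
        length = random.randint(1, 5)
        for j in range(length):
            word += alphaStr[random.randint(0, 25)]
        fp.write(word + '\n')

Run the job locally by piping the file through the mapper and the reducer:

cat word.txt | ./wordcount_mapper.py | ./wordcount_reducer.py

Word count reducer in Python:

#!/usr/bin/env python
# filename: wordcount_reducer.py
from operator import itemgetter
import sys

wordcount = {}
for line in sys.stdin:
    try:
        # Each input line is expected to be "word<TAB>count".
        word, count = line.strip().split('\t', 1)
        count = int(count)
        wordcount[word] = wordcount.get(word, 0) + count
    except ValueError:
        # Skip lines with no tab or with a non-integer count.
        pass

# Sort by word and print one "word<TAB>total" line per word.
sorted_wordcount = sorted(wordcount.items(), key=itemgetter(0))
for word, count in sorted_wordcount:
    print("%s\t%s" % (word, count))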
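The pipeline calls wordcount_mapper.py, but the post does not include that script. The following is only a minimal sketch of what such a mapper might look like, assuming it emits one "word<TAB>1" record for every whitespace-separated token, which is the input format the reducer parses. The function name map_words is hypothetical and not taken from the original.

```python
#!/usr/bin/env python
# filename: wordcount_mapper.py
# Hypothetical mapper: the original post does not show this script.
import sys


def map_words(lines):
    """Yield one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        # split() with no arguments drops the newline and skips blank lines.
        for word in line.split():
            yield "%s\t1" % word


if __name__ == "__main__":
    for record in map_words(sys.stdin):
        print(record)
```

Both scripts need to be executable (for example, chmod +x wordcount_mapper.py wordcount_reducer.py) before the cat pipeline can invoke them as ./wordcount_mapper.py and ./wordcount_reducer.py.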