Python 3: using a word counter to list the lines, and how many times each word appears, for words from an array in a file
So the issue is this: I've created an array in my script called 'ga' that stores words and may hold 100+ of them. I'm trying to call that array, search for its words in a text document, and output how many times each word is found. In the first part of the code, 'def readfile', I'm opening the file, cleaning it, and displaying the lines these words appear in.
The problem is that I can't seem to find a way to display the lines for each word and also output how many times each one is hit. Here is the code.
    import re
    from collections import Counter
    from categories.goingace import ga

    path = "chatlogs/chat1.txt"
    file = path
    lex = Counter(ga)
    count = {}

    def readfile():
        with open(file) as file_read:
            content = file_read.readlines()
            for line in content:
                if any(word in line for word in lex):
                    cleanse = re.sub('<.*?>', '', line)
                    print(cleanse)
        file_read.close()

    def wordcount():
        with open(file) as f:
            lex = Counter(f.read().split())
            for item in lex.items():
                print("{}\t{}".format(*item))
        f.close()

    #readfile()
    wordcount()
The original input is this:
<200> <ilovethaocean> <08/22/06 12:15:36 am> hi asl?
<210> <a_latino_man559> <08/22/06 12:15:53 am> 32 m fresno
<210> <a_latino_man559> <08/22/06 12:15:53 am> u?
<200> <ilovethaocean> <08/22/06 12:16:12 am> "13/f/ca, how r u?"
<200> <a_latino_man559> <08/22/06 12:16:18 am> 13?
I use this to hide everything in the brackets:
    cleanse = re.sub('<.*?>', '', line)
    print(cleanse)
Which outputs this:
hi asl?
32 m fresno
u?
"13/f/ca, how r u?"
13?
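For reference, the cleaning step can be tried on its own against the sample lines above. This is only a minimal sketch: the `cleanse` helper and the `sample_lines` list are names made up for illustration, and the `.strip()` call is an assumption added here to drop the spaces left behind once the tags are removed (the original snippet does not strip).

```python
import re

# Sample chat lines copied from the question.
sample_lines = [
    '<200> <ilovethaocean> <08/22/06 12:15:36 am> hi asl?',
    '<210> <a_latino_man559> <08/22/06 12:15:53 am> 32 m fresno',
    '<210> <a_latino_man559> <08/22/06 12:15:53 am> u?',
    '<200> <ilovethaocean> <08/22/06 12:16:12 am> "13/f/ca, how r u?"',
    '<200> <a_latino_man559> <08/22/06 12:16:18 am> 13?',
]

def cleanse(line):
    """Remove every <...> tag, then trim the leftover whitespace."""
    # '.*?' is non-greedy, so each tag is matched and removed separately.
    return re.sub(r'<.*?>', '', line).strip()

for line in sample_lines:
    print(cleanse(line))
```

Cleaning before searching also means a word such as 13 cannot accidentally match inside a timestamp or a user name.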
Along with that, let's say for example that the ga array contains (hi, u, 13). The perfect output I'm aiming for is this:
hi appeared 1 time
line_num hi asl?
u appeared 2 times
line_num u?
line_num 13/f/ca, how r u?
etc.
Here's an approach, shown with a simplified example:
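    from collections import defaultdict

    # Maps each word to a list of (line_number, line) pairs where it was found.
    occurrences = defaultdict(list)
    words = ['cat', 'dog', 'bird', 'person']

    # path_to_your_file is a placeholder: set it to the path of your text file.
    with open(path_to_your_file) as f:
        for i, line in enumerate(f.readlines(), start=1):
            for word in words:
                if word in line:
                    occurrences[word] += [(i, line)]

    for (word, matches) in occurrences.items():
        # A word can appear more than once on a line, so count per line and sum.
        total_count = sum(line.count(word) for _, line in matches)
        print('%s appeared %d time(s). line(s):' % (word, total_count))
        print('\n'.join(['\t %d) %s' % (line_num, line.strip()) for line_num, line in matches]))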
Given a text file with the following contents:
cat, rat, dog, cat
bird, person
animal
insect
whatever
bird
etc.
The script prints the following (the order of the words may vary with the Python version):
bird appeared 2 time(s). line(s):
     2) bird, person
     6) bird
person appeared 1 time(s). line(s):
     2) bird, person
dog appeared 1 time(s). line(s):
     1) cat, rat, dog, cat
cat appeared 2 time(s). line(s):
     1) cat, rat, dog, cat
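Applied to the chat log from the question, the same idea might look like the sketch below. It is not the only way to do it, and a few things here are assumptions rather than part of the answer above: the function name `find_words` and the `sample_lines` list are hypothetical, the tags are stripped before searching, and matching uses regex word boundaries (`\b`) instead of a plain substring test, so that a short word like u is not counted inside longer words. Matching is case-sensitive, and `\b` only behaves as expected for words that start and end with a letter, digit, or underscore.

```python
import re
from collections import defaultdict

TAG = re.compile(r'<.*?>')  # same tag pattern as in the question

def find_words(lines, words):
    """Return {word: (total_count, [(line_num, cleaned_line), ...])}."""
    # One whole-word pattern per search word; re.escape guards special chars.
    patterns = {w: re.compile(r'\b' + re.escape(w) + r'\b') for w in words}
    totals = defaultdict(int)
    hits = defaultdict(list)
    for line_num, raw in enumerate(lines, start=1):
        text = TAG.sub('', raw).strip()  # drop <...> tags first
        for word, pattern in patterns.items():
            n = len(pattern.findall(text))
            if n:
                totals[word] += n
                hits[word].append((line_num, text))
    # Only words that were found at least once are included.
    return {w: (totals[w], hits[w]) for w in words if w in hits}

# Sample chat lines copied from the question.
sample_lines = [
    '<200> <ilovethaocean> <08/22/06 12:15:36 am> hi asl?',
    '<210> <a_latino_man559> <08/22/06 12:15:53 am> 32 m fresno',
    '<210> <a_latino_man559> <08/22/06 12:15:53 am> u?',
    '<200> <ilovethaocean> <08/22/06 12:16:12 am> "13/f/ca, how r u?"',
    '<200> <a_latino_man559> <08/22/06 12:16:18 am> 13?',
]

for word, (total, matches) in find_words(sample_lines, ['hi', 'u', '13']).items():
    print('%s appeared %d time(s). line(s):' % (word, total))
    for line_num, text in matches:
        print('\t %d) %s' % (line_num, text))
```

To run it against a real file, pass the open file object (or `f.readlines()`) as `lines` and the ga list as `words`.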