Python 3: using a word counter to list the lines and how many times each word from an array appears in a file


So the issue is this: I've created an array in a script called 'ga' to store words; it may hold 100+ words. I'm trying to call that array, search for its words in a txt doc, and output how many times each word is found. In the first part of the code, 'def readfile', I'm opening the file, cleaning it, and displaying the lines these words are in.

The problem is that I can't seem to find a way to display the lines for each word and also output how many times each one is hit. Here is the code:

    import re
    from collections import Counter
    from categories.goingace import ga

    path = "chatlogs/chat1.txt"
    file = path
    lex = Counter(ga)
    count = {}

    def readfile():
        with open(file) as file_read:
            content = file_read.readlines()
            for line in content:
                if any(word in line for word in lex):
                    cleanse = re.sub('<.*?>', '', line)
                    print(cleanse)
        file_read.close()

    def wordcount():
        with open(file) as f:
            lex = Counter(f.read().split())
        for item in lex.items():
            print("{}\t{}".format(*item))
        f.close()

    #readfile()
    wordcount()

The original input looks like this:

    <200>   <ilovethaocean> <08/22/06 12:15:36 am>  hi asl?
    <210>   <a_latino_man559>   <08/22/06 12:15:53 am>  32 m fresno
    <210>   <a_latino_man559>   <08/22/06 12:15:53 am>  u?
    <200>   <ilovethaocean> <08/22/06 12:16:12 am>  "13/f/ca, how r u?"
    <200>   <a_latino_man559>   <08/22/06 12:16:18 am>  13?

I use this to hide everything in angle brackets:

    cleanse = re.sub('<.*?>', '', line)
    print(cleanse)

which outputs this:

hi asl?

32 m fresno

u?

"13/f/ca, how r u?"

13?
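A minimal sketch of what that substitution does to a single raw line. The sample strings and the names `raw`, `tricky`, `lazy` and `greedy` are made up for illustration. The `.strip()` call is an addition that is not in the code above: each line read from the file still ends with a newline and `print` adds another, which would explain the blank lines in the output shown. The second half shows why the non-greedy `.*?` matters when a message itself contains a `>`.

```python
import re

# One raw line in the same shape as the chat log above.
raw = '<200>   <ilovethaocean> <08/22/06 12:15:36 am>  hi asl?\n'

# Remove every <...> group, then trim leftover separators and the newline.
cleaned = re.sub(r'<.*?>', '', raw).strip()
print(cleaned)

# Hypothetical line whose message contains a '>' character.
tricky = '<200> <user> <time> 1 > 0'
lazy = re.sub(r'<.*?>', '', tricky).strip()    # each tag removed separately
greedy = re.sub(r'<.*>', '', tricky).strip()   # swallows up to the last '>'
print(lazy)
print(greedy)
```

With the greedy pattern, part of the message is lost along with the tags, so the non-greedy form used in the question is the safer choice.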

As an example, let's say the ga array contains (hi, u, 13). The ideal output would be something like this:

hi appeared 1 time
line_num hi asl?

u appeared 2 time
line_num u?
line_num 13/f/ca, how r u?

etc.

Here's an approach, with a simplified example:

    from collections import defaultdict

    # Maps each word to the list of (line_number, line) pairs it was found in.
    occurrences = defaultdict(list)
    words = ['cat', 'dog', 'bird', 'person']

    with open(path_to_your_file) as f:
        for i, line in enumerate(f.readlines(), start=1):
            for word in words:
                if word in line:
                    occurrences[word] += [(i, line)]

    for (word, matches) in occurrences.items():
        # Count every occurrence, including repeats within the same line.
        total_count = sum(line.count(word) for _, line in matches)
        print('%s appeared %d time(s). line(s):' % (word, total_count))
        print('\n'.join(['\t %d) %s' % (line_num, line.strip()) for line_num, line in matches]))

Given a text file with the following contents:

    cat, rat, dog, cat
    bird, person
    animal
    insect
    whatever
    bird
    etc.

The script prints (the order of the words may vary):

    bird appeared 2 time(s). line(s):
         2) bird, person
         6) bird
    person appeared 1 time(s). line(s):
         2) bird, person
    dog appeared 1 time(s). line(s):
         1) cat, rat, dog, cat
    cat appeared 2 time(s). line(s):
         1) cat, rat, dog, cat
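The same idea can be applied to the chat log from the question. What follows is only a sketch under a few assumptions: the helper names `find_words` and `report`, the `TAG` pattern constant and the in-memory `sample` list are invented for illustration, and `ga` is hard-coded instead of being imported. It also swaps the plain `word in line` substring test for a whole-word regular expression, since a short entry such as `u` would otherwise match inside any word containing that letter.

```python
import re
from collections import defaultdict

TAG = re.compile(r'<.*?>')  # same non-greedy pattern as in the question


def find_words(lines, words):
    """Map each word to (line_num, hits, cleaned_line) for lines containing it."""
    # \b restricts matches to whole words; re.escape guards special characters.
    patterns = {w: re.compile(r'\b%s\b' % re.escape(w), re.IGNORECASE)
                for w in words}
    occurrences = defaultdict(list)
    for line_num, raw in enumerate(lines, start=1):
        cleaned = TAG.sub('', raw).strip()
        for word, pattern in patterns.items():
            hits = len(pattern.findall(cleaned))
            if hits:
                occurrences[word].append((line_num, hits, cleaned))
    return occurrences


def report(occurrences):
    """Format the matches in the style of the answer's output."""
    out = []
    for word, matches in occurrences.items():
        total = sum(hits for _, hits, _ in matches)
        out.append('%s appeared %d time(s). line(s):' % (word, total))
        out.extend('\t%d) %s' % (num, text) for num, _, text in matches)
    return '\n'.join(out)


# Stand-in for the lines of chatlogs/chat1.txt (assumed layout).
sample = [
    '<200>   <ilovethaocean> <08/22/06 12:15:36 am>  hi asl?',
    '<210>   <a_latino_man559>   <08/22/06 12:15:53 am>  32 m fresno',
    '<210>   <a_latino_man559>   <08/22/06 12:15:53 am>  u?',
    '<200>   <ilovethaocean> <08/22/06 12:16:12 am>  "13/f/ca, how r u?"',
    '<200>   <a_latino_man559>   <08/22/06 12:16:18 am>  13?',
]
ga = ['hi', 'u', '13']  # hard-coded here; the question imports it

result = find_words(sample, ga)
print(report(result))
```

On this sample, hi is reported once (line 1), u twice (lines 3 and 4) and 13 twice (lines 4 and 5). To run it against the real file, the `sample` list could be replaced by the open file object, for example `with open(path) as f: result = find_words(f, ga)`.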
