python - Completing a function to add values depending on specific "Regions" (more info provided)

I have 2 files: one containing 200 tweets, the other containing keywords and their values. A typical tweet looks like this (I provided my code below):

    [41.923916200000001, -88.777469199999999] 6 2011-08-28 19:24:18 life moviee.

(Only the numbers in the brackets and the words after the time are relevant.)

and the keywords file looks like this:

    love,10
    like,5
    best,10
    hate,1

I use the 2 numbers at the beginning of each tweet to determine the region the tweet was made in (shown below in the code). For each individual tweet (each line in the file), I add up the values of the keywords it contains and divide that total by the number of keywords found in the tweet, which gives me a score for that tweet.

My question is: how would I be able to total the scores of the tweets in a region and divide by the number of tweets in that region? Below, I put happynesstweetscore, which is how I calculated the score for the individual tweets in the file (each line) that contain keywords. For this part, I'm not sure how to add the values depending on the region, and then divide them by the number of tweets in that region. Should I add them to a list depending on the region and then sum it? I don't know. I started with this:

    def score(tweet):
        total_value = 0
        total_count = 0
        for word in tweet:
            if word in sentiments:
                total_value += sentiments[word]
                total_count += 1
        return total_value, total_count

But I don't know how to use it to total the scores of the tweets in each region individually and divide by the number of tweets in that region.

I divided the tweets into 4 regions by (latitude, longitude), using these values as the corners of rectangles, the way it is done at the bottom of the code:

    from collections import Counter

    p1 = (49.189787, -67.444574)
    p2 = (24.660845, -67.444574)
    p3 = (49.189787, -87.518395)
    p4 = (24.660845, -87.518395)
    p5 = (49.189787, -101.998892)
    p6 = (24.660845, -101.998892)
    p7 = (49.189787, -115.236428)
    p8 = (24.660845, -115.236428)
    p9 = (49.189787, -125.242264)
    p10 = (24.660845, -125.242264)

    try:
        keyw_path = input("enter file named keywords: ")
        keyfile = open(keyw_path, "r")
    except IOError:
        print("error: file not found.")
        exit()

    # read the keywords list
    keywords = {}
    wordfile = open('keywords.txt', 'r')
    for line in wordfile.readlines():
        word = line.replace('\n', '')
        if not(word in keywords.keys()):  # checks the word doesn't already exist
            keywords[word] = 0  # adds the word to the db
    wordfile.close()

    # read the file name from the user and open the file
    try:
        tweet_path = input("enter file named tweets: ")
        tweetfile = open(tweet_path, "r")
    except IOError:
        print("error: file not found.")
        exit()

    # calculating sentiment values
    with open('keywords.txt') as f:
        sentiments = {word: int(value) for word, value in (line.split(",") for line in f)}

    with open('tweets.txt') as f:
        for line in f:
            values = Counter(word for word in line.split() if word in sentiments)
            if not values:
                continue

    keyw = ["love", "like", "best", "hate", "lol", "better", "worst", "good",
            "happy", "haha", "please", "great", "bad", "save", "saved", "pretty",
            "greatest", 'excited', 'tired', 'thanks', 'amazing', 'glad', 'ruined',
            'negative', 'loving', 'sorry', 'hurt', 'alone', 'sad', 'positive',
            'regrets', 'god']

    with open('tweets.txt') as oldfile, open('newfile.txt', 'w') as newfile:
        for line in oldfile:
            if any(word in line for word in keyw):
                newfile.write(line)

    def score(tweet):
        total = 0
        for word in tweet:
            if word in sentiments:
                total += 1
        return total

    def total(score):
        sum = 0
        for number in score:
            if number in values:
                sum += 1

    # classifying regions
    class Region:
        def __init__(self, lat_range, long_range):
            self.lat_range = lat_range
            self.long_range = long_range

        def contains(self, lat, long):
            return self.lat_range[0] <= lat and lat < self.lat_range[1] and\
                   self.long_range[0] <= long and long < self.long_range[1]

    eastern = Region((24.660845, 49.189787), (-87.518395, -67.444574))
    central = Region((24.660845, 49.189787), (-101.998892, -87.518395))
    mountain = Region((24.660845, 49.189787), (-115.236428, -101.998892))
    pacific = Region((24.660845, 49.189787), (-125.242264, -115.236428))

    eastscore = 0
    centralscore = 0
    pacificscore = 0
    mountainscore = 0
    happyscoree = 0

    for line in open('newfile.txt'):
        line = line.split(" ")
        lat = float(line[0][1:-1])  # stripping the [ and the ,
        long = float(line[1][:-1])  # stripping the ]
        if eastern.contains(lat, long):
            eastscore += score(line)
        elif central.contains(lat, long):
            centralscore += score(line)
        elif mountain.contains(lat, long):
            mountainscore += score(line)
        elif pacific.contains(lat, long):
            pacificscore += score(line)
        else:
            continue

Let's say, as you said, you have a file containing data like:

    love,10
    movie,5

First of all, create a dictionary from that file.

    kw_to_score = {}
    kw_file = 'keywords.txt'
    with open(kw_file, 'r') as kwf:
        for line in kwf.readlines():
            word, score = line.split(',')
            kw_to_score[word] = int(score)
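A slightly more defensive variant of that loader could look like the sketch below. It strips whitespace and skips blank lines or lines without a comma. The name `load_keywords` is hypothetical (it is not part of the code above), and the function takes any iterable of lines so you can try it without a file on disk.

```python
def load_keywords(lines):
    """Build a {keyword: value} dict from lines shaped like 'love,10'."""
    kw_to_score = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, sep, value = line.partition(',')
        if not sep:
            continue  # skip lines that have no comma
        kw_to_score[word.strip()] = int(value)
    return kw_to_score


# Sample lines taken from the keywords shown in the question.
sample = ["love,10\n", "like,5\n", "\n", "hate,1\n"]
print(load_keywords(sample))
```

With a real file you could pass the open file object directly, since iterating over it yields lines.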

Once that is done, you need to create the score function:

    def score(tweet, keywords):
        score = 0
        count = 0
        for word in tweet.split():  # split the words by spaces
            if word in keywords:
                score += keywords[word]
                count += 1
        return score, count
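To see what that function returns, here is a small usage sketch. The tweet text is made up for illustration, the keyword values are the ones from the question, and the function is repeated (with a renamed local variable) only so the snippet runs by itself.

```python
def score(tweet, keywords):
    total = 0
    count = 0
    for word in tweet.split():  # split the words by spaces
        if word in keywords:
            total += keywords[word]
            count += 1
    return total, count


keywords = {"love": 10, "like": 5, "best": 10, "hate": 1}
tweet_text = "i love this movie and i like the cast"  # made-up example

total, count = score(tweet_text, keywords)
# Per-tweet score: sum of keyword values divided by number of keywords found.
tweet_score = total / count if count else None
print(total, count, tweet_score)  # 15 2 7.5
```

Keep in mind the match is exact, so a token like "love!" would not count as "love" unless you strip punctuation first.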

After that, you can continue:

    class Region:
        def __init__(self, lat_range, long_range):
            self.lat_range = lat_range
            self.long_range = long_range
            self.score = 0     # add new field
            self.quantity = 0  # add new field (number of keywords matched)

        def contains(self, lat, long):
            return self.lat_range[0] <= lat and lat < self.lat_range[1] and\
                   self.long_range[0] <= long and long < self.long_range[1]

    eastern = Region((24.660845, 49.189787), (-87.518395, -67.444574))
    central = Region((24.660845, 49.189787), (-101.998892, -87.518395))
    mountain = Region((24.660845, 49.189787), (-115.236428, -101.998892))
    pacific = Region((24.660845, 49.189787), (-125.242264, -115.236428))

    for line in open('newfile.txt'):
        parts = line.split(" ")
        lat = float(parts[0][1:-1])  # stripping the [ and the ,
        long = float(parts[1][:-1])  # stripping the ]
        for region in (eastern, central, mountain, pacific):
            if region.contains(lat, long):
                # pass the raw line (score() splits it itself) and the
                # dict of keywords mapped to their scores
                region_score, count = score(line, kw_to_score)
                region.score += region_score
                region.quantity += count
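Slicing the brackets off by position works for the sample line, but it is easy to get wrong. As an alternative sketch, a regular expression can pull out the two coordinates and the tweet text in one step. The pattern assumes every line has the shape of the sample tweet in the question (bracketed coordinates, one extra field, a date, a time, then the text), and `parse_tweet` is a hypothetical helper name.

```python
import re

# [lat, long] <field> <date> <time> <text>, as in the sample tweet line.
TWEET_RE = re.compile(
    r'^\[\s*(-?\d+(?:\.\d+)?),\s*(-?\d+(?:\.\d+)?)\]'  # [lat, long]
    r'\s+\S+\s+\S+\s+\S+'                              # field, date, time
    r'\s*(.*)$'                                        # tweet text
)


def parse_tweet(line):
    """Return (lat, long, text), or None if the line has another shape."""
    match = TWEET_RE.match(line.strip())
    if match is None:
        return None
    lat, long_, text = match.groups()
    return float(lat), float(long_), text


sample = "[41.923916200000001, -88.777469199999999] 6 2011-08-28 19:24:18 life moviee."
print(parse_tweet(sample))
```

Returning `None` for lines that do not match lets the caller skip them instead of crashing on a `ValueError`.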

Then you just need to do:

    print(eastern.score / eastern.quantity)  # gives the average
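Two caveats: that division raises `ZeroDivisionError` when a region matched no keywords, and `quantity` counts keywords rather than tweets. If you want what the question describes (the sum of the per-tweet scores divided by the number of tweets in the region), something like the following sketch may be closer. The `add` and `average` methods, the field names `total` and `tweets`, and the sample tweets are all assumptions made for illustration.

```python
class Region:
    def __init__(self, lat_range, long_range):
        self.lat_range = lat_range
        self.long_range = long_range
        self.total = 0.0  # sum of per-tweet scores
        self.tweets = 0   # number of scored tweets in this region

    def contains(self, lat, long_):
        return (self.lat_range[0] <= lat < self.lat_range[1]
                and self.long_range[0] <= long_ < self.long_range[1])

    def add(self, tweet_score):
        self.total += tweet_score
        self.tweets += 1

    def average(self):
        # None instead of ZeroDivisionError for a region with no tweets
        return self.total / self.tweets if self.tweets else None


keywords = {"love": 10, "like": 5, "hate": 1}
central = Region((24.660845, 49.189787), (-101.998892, -87.518395))

# Made-up (lat, long, text) samples.
tweets = [
    (41.92, -88.77, "love this like crazy"),  # score 15 / 2 = 7.5
    (40.00, -90.00, "i hate mondays"),        # score 1 / 1 = 1.0
    (40.00, -90.00, "nothing relevant"),      # no keywords, skipped
    (30.00, -70.00, "love"),                  # outside the central region
]

for lat, long_, text in tweets:
    values = [keywords[w] for w in text.split() if w in keywords]
    if values and central.contains(lat, long_):
        central.add(sum(values) / len(values))

print(central.average())  # 4.25
```

Tweets without any keyword are skipped in this sketch, which matches filtering them out before scoring. Count them in `tweets` instead if they should pull the average down.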
