twitter - How to write multiple txt files in Python?

- March 15, 2015

I am preprocessing tweets in Python. The unprocessed tweets are in a folder, one tweet per file, named 1.txt, 2.txt, ..., 10000.txt. I want to preprocess them and write them to new files, also named 1.txt, 2.txt, ..., 10000.txt. My code is as follows:

    for filename in glob.glob(os.path.join(path, '*.txt')):
        with open(filename) as file:
            tweet=file.read()

        def processtweet(tweet):
            tweet = tweet.lower()
            tweet = re.sub('((www\.[^\s]+)|(https?://[^\s]+))','url',tweet)
            tweet = re.sub('@[^\s]+','user',tweet)
            tweet = re.sub('[\s]+', ' ', tweet)
            tweet = re.sub(r'#([^\s]+)', r'\1', tweet)
            tweet = tweet.translate(None, string.punctuation)
            tweet = tweet.strip('\'"')
            return tweet

        fp = open(filename)
        line = fp.readline()

        count = 0
        processedtweet = processtweet(line)
        line = fp.readline()
        count += 1
        name = str(count) + ".txt"
        file = open(name, "w")
        file.write(processedtweet)
        file.close()

But this code only gives me one new file, named 1.txt, which is already preprocessed. How can I write the other 9999 files? Is there a mistake in my code?

Your count is reset to 0 on every iteration by the line count = 0, so every time you write a file, you write to "1.txt". Why are you trying to reconstruct the filename instead of using the existing filename of the tweet you are preprocessing? Also, you should move the function definition outside of the loop:

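    import glob
    import os
    import re
    import string

    def processtweet(tweet):
        tweet = tweet.lower()
        tweet = re.sub(r'((www\.[^\s]+)|(https?://[^\s]+))', 'url', tweet)
        tweet = re.sub(r'@[^\s]+', 'user', tweet)
        tweet = re.sub(r'[\s]+', ' ', tweet)
        tweet = re.sub(r'#([^\s]+)', r'\1', tweet)
        # Python 2 form; in Python 3 use
        # tweet.translate(str.maketrans('', '', string.punctuation))
        tweet = tweet.translate(None, string.punctuation)
        tweet = tweet.strip('\'"')
        return tweet

    # path is the folder holding the tweets, as in the question
    for filename in glob.glob(os.path.join(path, '*.txt')):
        # read the whole tweet, then close the file before reopening it
        with open(filename) as file:
            tweet = file.read()

        processedtweet = processtweet(tweet)

        # reuse the existing filename; this overwrites the original file
        with open(filename, "w") as file:
            file.write(processedtweet)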
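If you would rather keep the raw tweets and write the processed ones somewhere else, the same idea can be sketched for Python 3 as below. This is only a sketch under a few assumptions that are not in the question: the names process_tweet and process_folder and the separate output folder are made up for illustration, the files are assumed to be UTF-8, and str.maketrans stands in for the Python 2 translate(None, ...) call.

```python
import glob
import os
import re
import string


def process_tweet(tweet):
    """Lowercase a tweet, mask URLs and mentions, and drop punctuation."""
    tweet = tweet.lower()
    tweet = re.sub(r'((www\.[^\s]+)|(https?://[^\s]+))', 'url', tweet)
    tweet = re.sub(r'@[^\s]+', 'user', tweet)
    tweet = re.sub(r'[\s]+', ' ', tweet)
    tweet = re.sub(r'#([^\s]+)', r'\1', tweet)
    # Python 3 equivalent of the Python 2 call translate(None, string.punctuation)
    tweet = tweet.translate(str.maketrans('', '', string.punctuation))
    return tweet.strip('\'"')


def process_folder(src_dir, dst_dir):
    """Write a processed copy of each .txt file in src_dir into dst_dir.

    Hypothetical helper: the output keeps the input file name, so no
    counter is needed. Returns the list of paths that were written.
    """
    os.makedirs(dst_dir, exist_ok=True)
    written = []
    for src_path in sorted(glob.glob(os.path.join(src_dir, '*.txt'))):
        with open(src_path, encoding='utf-8') as src:
            tweet = src.read()
        # 1.txt in src_dir becomes 1.txt in dst_dir
        dst_path = os.path.join(dst_dir, os.path.basename(src_path))
        with open(dst_path, 'w', encoding='utf-8') as dst:
            dst.write(process_tweet(tweet))
        written.append(dst_path)
    return written
```

Calling something like process_folder('tweets', 'tweets_clean') (both folder names are placeholders) would leave the originals untouched and produce one processed file per input file, with the same names.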
