I have a large .csv file (1065 rows x 1 column). Each row contains a sentence. I want to pick out several important words, taken from a word list (another .csv file), in each row, and build term-frequency data for each row.
I have put something together below, and it does the job, but it could probably be done more efficiently.
Example input file (the line breaks were lost in posting; the three rows below reproduce the counts shown in the output):

bla bla bla. bla! bla bla apple!, :banana. apple!!!
banana bla bla, apple , banana
peach 12345 bla bla peach , banana, peach, banana! :apple
Code (the original was Python 2 and got flattened onto one line; restored here as runnable Python 3, with the truncated variable name at the end completed):

# inputs
list_words = ['apple', 'banana', 'peach']
filename = 'example.txt'

# characters to strip from each line before tokenizing
rm = ",:;?/-!."

# read the file and count the target words line by line
with open(filename, 'r') as fin:
    for count_line, line in enumerate(fin, 1):
        clean_line = ''.join(ch for ch in line if ch not in rm)
        # counts of each target word on this line
        words_frequency = {key: 0 for key in list_words}
        for w in clean_line.split():
            if w in list_words:
                words_frequency[w] += 1
        print('line', count_line, ':', words_frequency)
Output:

line 1 : {'apple': 2, 'peach': 0, 'banana': 1}
line 2 : {'apple': 1, 'peach': 0, 'banana': 2}
line 3 : {'apple': 1, 'peach': 3, 'banana': 2}
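Since the question asks whether this can be done more efficiently: below is a minimal sketch using `re` and `collections.Counter` instead of character-by-character filtering. A set gives O(1) membership tests (the original `w in list_words` scans a list per token), and a regex tokenizer ignores all punctuation rather than a hand-picked blacklist. The helper name `term_frequency`, the output filename `term_frequency.csv`, and the regex choice are my own assumptions, not part of the question.

```python
import csv
import re
from collections import Counter

list_words = ['apple', 'banana', 'peach']

def term_frequency(line, vocab=frozenset(list_words)):
    # split on runs of lowercase letters/digits, so any punctuation is ignored
    tokens = re.findall(r'[a-z0-9]+', line.lower())
    counts = Counter(t for t in tokens if t in vocab)
    # include zero counts so every row has the same columns
    return {w: counts[w] for w in list_words}

# sample rows from the question; for the real data, read them from the
# input .csv instead (e.g. with csv.reader)
lines = [
    "bla bla bla. bla! bla bla apple!, :banana. apple!!!",
    "banana bla bla, apple , banana",
    "peach 12345 bla bla peach , banana, peach, banana! :apple",
]

# write one term-frequency row per input line
# ('term_frequency.csv' is a hypothetical output name)
with open('term_frequency.csv', 'w', newline='') as fout:
    writer = csv.DictWriter(fout, fieldnames=['line'] + list_words)
    writer.writeheader()
    for i, line in enumerate(lines, 1):
        writer.writerow({'line': i, **term_frequency(line)})
```

Writing the per-row counts with `csv.DictWriter` gives a proper term-frequency table (one column per word) that can be loaded straight into a spreadsheet or pandas.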