I'm new to Python and new to programming. I have been attempting to parse .txt files into Excel, and I have had success with a number of them that are easy to split into lines and code around.
However, I have a bunch of files that have the information but no reasonable line breaks. The data looks like this:
company1 name _______ 123 company2 name 456 company3 name 789
with no indicators between names and numbers. Sometimes there are underscores in between, sometimes whitespace, and sometimes there's a line break in between. If I could separate it so that lines ended after each full number, the code I've written would do the rest. Ideally, I'd have a string that looks like:
company1 name ______ 123
company2 name 456
company3 name 789
with any line breaks in the original string parsed out.
I hope someone can help!
You should use a regular expression. It looks for patterns in the text and allows you to replace each match with the matched text followed by a newline.
For example:
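    import re

    line = 'company1 name _______ 123 company2 name 456 company3 name 789'
    # Match a whitespace-preceded run of digits (plus trailing whitespace)
    # and put a newline after it.
    output = re.sub(r'(\s\d+\s*)', r'\1\n', line)
    print(output)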
which returns
company1 name _______ 123
company2 name 456
company3 name 789
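The question also mentions stray line breaks inside the data. The substitution above keeps them, and it leaves a trailing space before each inserted newline. A minimal sketch of one way to handle both, assuming every record ends with a whitespace-delimited number (the function name `split_records` and the sample string `raw` are made up for illustration):

```python
import re

def split_records(text):
    """Return text with one record per line, each ending in its number."""
    # Collapse every run of whitespace, including stray line breaks,
    # into a single space so the input becomes one long line.
    flat = " ".join(text.split())
    # Replace the whitespace after each standalone number with a newline.
    # The lookbehind keeps digits inside words such as "company1" from matching.
    broken = re.sub(r"(?<=\s)(\d+)(?:\s+|$)", r"\1\n", flat)
    # Drop the newline added after the final number.
    return broken.rstrip("\n")

# Hypothetical input with line breaks in awkward places.
raw = "company1 name _______ 123 company2\nname 456 company3 name\n789"
result = split_records(raw)
print(result)
```

Calling `result.splitlines()` then gives a list with one entry per company, which can be fed to whatever line-based parsing code already exists.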