regex - What is the RegExp Pattern to Extract Bullet Points Between Two Group Words using VBA in Word?


I can't seem to figure out the RegExp to extract the bullet points between two groups of words in a Word document.

For example:

risk assessment:

  • test 1
  • test 2
  • test 3

internal audit

In this case I want to extract the bullet points between "risk assessment" and "internal audit", one bullet at a time, and assign each bullet to an Excel cell. As shown in the code below, I have it pretty much done, except I can't figure out the correct regex pattern. Any help would be great. Thanks in advance!

    ' Early binding: needs references to the Microsoft Word Object Library
    ' and Microsoft VBScript Regular Expressions 5.5.
    Sub PopulateExcelTable()
        Dim fd As Office.FileDialog

        Set fd = Application.FileDialog(msoFileDialogFilePicker)
        With fd
            .AllowMultiSelect = False
            .Title = "Please select the file."
            .Filters.Clear
            .Filters.Add "Word 2007-2013", "*.docx"
            If .Show = True Then
                txtFileName = .SelectedItems(1)
            End If
        End With

        Dim WordApp As Word.Application
        Set WordApp = CreateObject("Word.Application")
        Dim WordDoc As Word.Document
        Set WordDoc = WordApp.Documents.Open(txtFileName)

        Dim str As String: str = WordDoc.Content.Text ' Assign entire document content to string

        Dim rex As New RegExp
        rex.Pattern = "\b[^risk assessment\s].*[^internal audit\s]"

        Dim i As Long: i = 1
        rex.Global = True
        For Each mtch In rex.Execute(str)
            Debug.Print mtch
            Range("A" & i).Value = mtch
            i = i + 1
        Next mtch

        WordDoc.Close
        WordApp.Quit
    End Sub

This is a long way around the problem, but it works.

Steps I'm taking:

  1. Find the bullet list items using the keywords before and after the list in the RegExp.
  2. (Group) the RegExp pattern so that you can extract everything in between the words.
  3. Store the listed items group in a string.
  4. Split the string by the new line character into a new array.
  5. Output each array item to Excel.
  6. Loop again, since there may be more than one list in the document.

Note: I don't see any code that links to the Excel workbook. I'll assume that part is working.


    Dim rex As New RegExp
    rex.Pattern = "(\brisk assessment\s)(.*)(internal\saudit\s)"
    rex.Global = True
    rex.MultiLine = True
    rex.IgnoreCase = True

    Dim lineArray() As String
    Dim myMatches As Object
    Set myMatches = rex.Execute(str)

    ' i is the row counter declared in the question's code
    For Each mtch In myMatches
        'Debug.Print mtch.SubMatches(1)
        ' SubMatches(1) is the middle group: the text between the two phrases.
        ' Word ends each paragraph with vbCr (Chr(13)), so split on that.
        lineArray = Split(mtch.SubMatches(1), vbCr)
        For x = LBound(lineArray) To UBound(lineArray)
            'Debug.Print lineArray(x)
            Range("A" & i).Value = lineArray(x)
            i = i + 1
        Next
    Next mtch
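
A variation on the same idea, offered only as a sketch: the function name ExtractItemsBetween and its parameters are hypothetical, not part of the code above. It assumes the two phrases contain no regex metacharacters, allows an optional colon after the first phrase (as in "risk assessment:"), and uses a lazy [\s\S]*? so that each start phrase pairs with the nearest end phrase when the document contains more than one list.

```vba
' Hypothetical helper: returns every non-empty paragraph found between
' startPhrase and endPhrase. Assumes neither phrase contains regex metacharacters.
Function ExtractItemsBetween(ByVal docText As String, _
                             ByVal startPhrase As String, _
                             ByVal endPhrase As String) As Collection
    Dim result As New Collection
    Dim rex As Object
    Set rex = CreateObject("VBScript.RegExp")   ' late binding, no reference needed

    ' [\s\S]*? matches any character, line breaks included, as few times as
    ' possible, so the capture stops at the first end phrase it reaches.
    rex.Pattern = startPhrase & ":?\s*([\s\S]*?)" & endPhrase
    rex.Global = True
    rex.IgnoreCase = True

    Dim mtch As Object
    Dim parts() As String
    Dim x As Long
    Dim entry As String

    For Each mtch In rex.Execute(docText)
        ' Treat line feeds like paragraph marks, then split into paragraphs.
        parts = Split(Replace(mtch.SubMatches(0), vbLf, vbCr), vbCr)
        For x = LBound(parts) To UBound(parts)
            entry = Trim$(parts(x))
            If Len(entry) > 0 Then result.Add entry   ' skip blank paragraphs
        Next x
    Next mtch

    Set ExtractItemsBetween = result
End Function

' Usage with a hard-coded sample instead of a real document.
Sub DemoExtract()
    Dim sample As String
    sample = "Risk Assessment:" & vbCr & "test 1" & vbCr & "test 2" & vbCr & _
             "test 3" & vbCr & "Internal Audit" & vbCr

    Dim bullet As Variant
    For Each bullet In ExtractItemsBetween(sample, "risk assessment", "internal audit")
        Debug.Print bullet
    Next bullet
End Sub
```

Returning a Collection keeps the extraction separate from the worksheet output, so the caller decides which cells to write to.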

My test page looks like this:

[image: screenshot of the test document, not preserved]

The results from the inner Debug.Print line are:

    item 1
    item 2
    item 3
