Extracting text between two delimiters in a file and writing to a given filename




I'd like to extract snippets from my code for documentation purposes. I can do this cheaply every time I compile the code, and it's an easy way to keep the code and the documentation (at least the snippets) up to date.



So I'd like to take a file source.cc with something like this in it:




// DOCSNIP: source_def.snip
[code]
// DOCSNIP



There may be more than one of these blocks in a file. The gist is that I'd like to delimit a region of code (I'm not married to the syntax), along with the name of a file to put it in, and write the content between the delimiters ("[code]" in this case) to that file (source_def.snip).



What would be the easiest way with standard tools (awk/sed/grep) to extract these blocks to their respective files?





Please add your desired output for that sample input to your question.
– Cyrus
Aug 20 at 17:23





Already there but updated.
– Sean McAllister
Aug 20 at 17:27





awk '/DOCSNIP/,/DOCKSNIP/' code.file | head -n -1 > new.txt ; s=$(grep -oP "(?<=DOCSNIP:\s)(.*)" new.txt) ; cat new.txt | tail -n -1 > "$s"
– Inder
Aug 20 at 18:02







3 Answers



awk to the rescue!




$ awk '/\/\/ DOCSNIP:/{f=$NF} f{print > f} /\/\/ DOCSNIP$/{f=""}' file

$ head sou*

// DOCSNIP: source_def.snip
[code]
// DOCSNIP



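This is not going to work if you have spaces in the filenames, because the filename is taken from $NF, the last whitespace-separated field on the line.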



If you don't want the delimiter lines, just reorder the statements


$ awk '/\/\/ DOCSNIP$/{f=""} f{print > f} /\/\/ DOCSNIP:/{f=$NF}' file



will only print what's in between.
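If filenames with spaces ever matter, one possible workaround is to strip the marker from the whole line instead of relying on $NF. This is only a sketch: the sample input, the scratch directory, and the variable names (name, out) are my own assumptions and not part of the answer above.

```shell
# Work in a scratch directory so nothing in the real tree is touched.
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical sample input with a space in the snippet filename.
cat > source.cc <<'EOF'
// DOCSNIP: my snippet.snip
[code]
// DOCSNIP
EOF

awk '
  /\/\/ DOCSNIP$/ { if (out != "") close(out); out = "" }  # end marker: stop writing
  out != ""       { print > out }                          # inside a block: copy the line
  /\/\/ DOCSNIP:/ {                                        # start marker
      name = $0
      sub(/^.*\/\/ DOCSNIP:[ \t]*/, "", name)              # drop everything up to the filename
      sub(/[ \t]+$/, "", name)                             # drop trailing blanks
      out = name
  }
' source.cc
```

The statement order is the same as in the "in between only" variant, so the delimiter lines are not written to the snippet file.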





care for explaining the down vote?
– karakfa
Aug 20 at 17:43





Idk why you're being downvoted, can we exclude the delimiters from the output? Otherwise this is exactly what I need.
– Sean McAllister
Aug 20 at 17:46





This will work until if/when you get to about 20 output files and then it'll start failing with "too many open files" errors unless you're using GNU awk. You should add a close(f) before either of the f=... assignments for it to work in all awks regardless of the number of output files.
– Ed Morton
Aug 21 at 12:20
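A minimal sketch of what that close(f) suggestion could look like, applied to the "in between only" variant. The sample input and the second snippet name (other.snip) are made up for illustration.

```shell
# Work in a scratch directory so nothing in the real tree is touched.
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical sample input with two snippet blocks.
cat > source.cc <<'EOF'
int before;
// DOCSNIP: source_def.snip
int answer = 42;
// DOCSNIP
int between;
// DOCSNIP: other.snip
void f();
// DOCSNIP
EOF

awk '
  /\/\/ DOCSNIP$/ { if (f != "") close(f); f = "" }  # end marker: close the file before forgetting its name
  f != ""         { print > f }                      # inside a block: write the line to the current file
  /\/\/ DOCSNIP:/ { f = $NF }                        # start marker: last field is the output filename
' source.cc
```

Since each file is closed as soon as its block ends, only one output file is open at a time, no matter how many blocks the source contains.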






Using AWK


awk '/\/\/ DOCSNIP:/{f=1;print $3;next} /\/\/ DOCSNIP/{f=0} f' file
source_def.snip
[code]



This prints everything from the first DOCSNIP to the second DOCSNIP and also outputs the filename.





Close, I'd like to actually extract the file name (source_def.snip) and write to that file, possibly in another directory.
– Sean McAllister
Aug 20 at 17:40
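For the "another directory" part, one way it might be done is to pass the target directory into awk with -v and prepend it to the extracted name. This is a sketch under assumptions: the directory doc/snippets, the variable names outdir and out, and the sample input are all hypothetical.

```shell
# Work in a scratch directory so nothing in the real tree is touched.
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical sample input.
cat > source.cc <<'EOF'
// DOCSNIP: source_def.snip
int answer = 42;
// DOCSNIP
EOF

# awk does not create directories, so make sure the target exists first.
mkdir -p doc/snippets

awk -v outdir=doc/snippets '
  /\/\/ DOCSNIP$/ { if (out != "") close(out); out = "" }  # end marker: stop writing
  out != ""       { print > out }                          # inside a block: copy the line
  /\/\/ DOCSNIP:/ { out = outdir "/" $3 }                  # start marker: third field is the filename
' source.cc
```

The path is built once and stored in a variable before being used as the redirection target, which avoids the precedence pitfalls of concatenating directly after ">".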





@SeanMcAllister see my update.
– Jotne
Aug 20 at 17:56



I like perl because there aren't different flavors of it. That said, I think I prefer awk for this one. Still, the perl version (same basic idea as the accepted answer):


perl -ne 'BEGIN{my $fh} close $fh if /\/\/ DOCSNIP[^:]/; print $fh "$_" if $fh!=0; open ($fh, ">>", "$1") or die if /\/\/ DOCSNIP:\s*(.+?)$/; ' main.cc



This supports spaces in the filenames, which I don't imagine is a feature you need :)



And a prep step that deletes the snip files and gives you the expected output:


perl -ne 'print if /\/\/ DOCSNIP:/../\/\/ DOCSNIP[^:]/; unlink "$1" if /\/\/ DOCSNIP:\s*(.+?)$/' main.cc





