Linux, programming, computers and life

October 16, 2007

using awk to truncate a file based on string

Filed under: CLI, programming, linux, awk

One more awk trick ;) . Let’s say you have a huge file. You only want to get the lines of the file which are either:

  1. before a certain string
  2. after a certain string
It’s easy to do using any text editor, but it’ll take time, and if the file is really huge you’ll have a hard time finding an editor that can open it. awk to the rescue!

The solutions:

  1. seq 1 30 | awk '/.*8.*/ { nextfile } { print $0 }'
  2. seq 1 30 | awk '{ if (pr == 1) print $0 } /.*8.*/ { pr = 1 }'
  3. seq 1 30 | awk '/.*8.*/ { pr = 1 } { if (pr == 1) print $0 }'
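In this example I use ‘<anything>8<anything>’ as the string which divides the file (plain /8/ would match the same lines). Solution (1) prints the lines before the string; (2) and (3) print the lines after it. The difference between (2) and (3) is whether the line containing the string is printed or not.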

Short explanation: awk is instructed to look for a regular expression (which is enclosed in ‘/’). In (1) it stops reading the file at the first matching line (nextfile skips the rest of the current file); until then, each line is echoed as is. In (2) and (3) it sets the pr variable to 1 (it’s unset by default, which counts as 0; the name is short for “print”), and lines are echoed only if pr is 1.
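
To make the rule order concrete, here is a minimal sketch that wraps the three variants in shell functions. The function names (lines_before, lines_after, lines_from) and the awk variable pat are made up for this illustration. awk runs its rules top to bottom for every line, so whether the flag is set after or before the print rule decides whether the matching line itself is printed. The first function uses exit rather than nextfile, which should behave the same for a single input and also works in awks that lack nextfile.

```shell
# Hypothetical helper names, not part of awk itself.
# Each reads stdin and takes the dividing regex as its first argument.

# Lines before the first match; exit stops reading the input there.
lines_before() {
    awk -v pat="$1" '$0 ~ pat { exit } { print }'
}

# Lines after the first match, without the matching line:
# the print rule runs before the flag is set.
lines_after() {
    awk -v pat="$1" 'pr { print } $0 ~ pat { pr = 1 }'
}

# Lines from the first match onward, matching line included:
# the flag is set before the print rule runs.
lines_from() {
    awk -v pat="$1" '$0 ~ pat { pr = 1 } pr { print }'
}

seq 1 10 | lines_before 8   # prints 1 through 7
seq 1 10 | lines_after 8    # prints 9 and 10
seq 1 10 | lines_from 8     # prints 8, 9 and 10
```

For a real file, replace the seq pipeline with a redirect, for example lines_before 'some string' < huge.log (huge.log being a placeholder name).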


1 Comment »


  1. Check out “man csplit” or “info coreutils csplit”.
    I downloaded the coreutils source code from gnu.org and found many utilities there that I’d never heard of and that seem very useful.

    Comment by Amit — October 24, 2007 @ 07:06
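
As a rough sketch of the csplit approach mentioned in the comment above, assuming GNU coreutils csplit is installed: csplit writes the lines before the first match to one file, and the matching line plus everything after it to a second file. The file name numbers.txt and the prefix part. are placeholders chosen for this example.

```shell
# Work in a scratch directory so the output files are easy to find.
dir=$(mktemp -d)
cd "$dir" || exit 1

seq 1 30 > numbers.txt

# -s        do not print the size of each piece
# -f part.  name the pieces part.00, part.01, ...
# '/8/'     split just before the first line matching the regex 8
csplit -s -f part. numbers.txt '/8/'

cat part.00   # lines 1 through 7
cat part.01   # lines 8 through 30
```

Unlike the awk one-liners, this keeps both halves on disk, which may or may not be what you want for a really huge file.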
