Linux, programming, computers and life

October 16, 2007

using awk to truncate a file based on a string

Filed under: CLI, programming, linux, awk

One more awk trick ;) Let’s say you have a huge file and you only want the lines of the file which are either:

  1. before a certain string
  2. after a certain string
It’s easy to do in any text editor, but it takes time, and if the file is really huge you’ll have a hard time finding an editor that can open it. awk to the rescue!

The solutions:

  1. seq 1 30 | awk '/.*8.*/ { nextfile } { print $0 }'
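  2. seq 1 30 | awk '{ if (pr == 1) print $0 } /.*8.*/ { pr = 1 }'
  3. seq 1 30 | awk '/.*8.*/ { pr = 1 } { if (pr == 1) print $0 }'
In this example I use ‘<anything>8<anything>’ as the string which divides the file (the regular expression could be shortened to /8/, since it matches anywhere in the line). The difference between (2) and (3) is whether the matching line itself is printed or not: in (2) the print rule runs before pr is set, so the matching line is skipped, while in (3) pr is set first, so the matching line is printed too.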
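To reuse the same idea with a different dividing pattern, the one-liners can be wrapped in small shell functions. This is only a sketch: the function names are made up for illustration, the pattern is passed in as an awk variable, and `exit` is swapped in for `nextfile` on the assumption that it is the more portable way to stop reading.

```shell
#!/bin/sh
# Hypothetical helper names; $1 is a regular expression, input comes from stdin.

# Print the lines before the first matching line (match excluded).
lines_before() {
    awk -v pat="$1" '$0 ~ pat { exit } { print }'
}

# Print the lines after the first matching line (match excluded).
lines_after() {
    awk -v pat="$1" 'found { print } $0 ~ pat { found = 1 }'
}

# Print from the first matching line to the end (match included).
lines_from() {
    awk -v pat="$1" '$0 ~ pat { found = 1 } found { print }'
}

seq 1 10 | lines_before 8   # prints 1 through 7
seq 1 10 | lines_after 8    # prints 9 and 10
seq 1 10 | lines_from 8     # prints 8, 9 and 10
```

The only difference between lines_after and lines_from is the order of the two rules, which decides whether the matching line is seen by the print rule.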

Short explanation: awk is instructed to look for a regular expression (which is enclosed in ‘/’). In (1) it stops parsing the file as soon as a line matches; before that, each line is echoed as is. In (2) and (3) it sets the pr variable to 1 (it’s 0 by default and is short for “print”), and lines are echoed only if pr is 1.
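Since the pattern between the slashes is a regular expression, a dividing string that contains characters like "." or "[" would need escaping. One possible way around that, sketched here with hypothetical helper names and a made-up demo file, is to use awk's index() function, which searches for a literal substring instead.

```shell
#!/bin/sh
# Hypothetical helpers: $1 is the literal marker string, $2 is the file.
# Note: awk -v still interprets backslash escapes in the marker.

# Print the lines before the first line containing the marker.
before_string() {
    awk -v s="$1" 'index($0, s) { exit } { print }' "$2"
}

# Print the lines after the first line containing the marker.
after_string() {
    awk -v s="$1" 'seen { print } index($0, s) { seen = 1 }' "$2"
}

# Small demo file standing in for the huge one.
demo=$(mktemp)
printf '%s\n' 'header' 'a.b' '--- CUT [here] ---' 'x*y' 'footer' > "$demo"

before_string 'CUT [here]' "$demo"   # prints: header, a.b
after_string  'CUT [here]' "$demo"   # prints: x*y, footer
rm -f "$demo"
```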
