Linux, programming, computers and life

August 20, 2007

arrays in awk

Filed under: CLI, programming, awk

awk is even better than I thought: it has arrays. Recently I needed to examine a log file containing interleaved output from several threads and find the last two lines written by each thread. I didn't know how to do it until a coworker told me about awk's arrays.
The script is extremely simple:

awk '{arr[$2, 1] = arr[$2, 0]; arr[$2, 0]=$0} END { for (s in arr) print s" : "arr[s] }' my_log_file.txt

Some explanations:

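  • {arr[$2, 1] = arr[$2, 0]; arr[$2, 0]=$0}
    - an operation performed on every line. Column 2 is the thread number. Index [threadNumber,0] holds the thread's last line and index [threadNumber,1] holds the line before it; each new line first moves the old [threadNumber,0] value into [threadNumber,1].
  • END { for (s in arr) print s" : "arr[s] }
    - the final operation, run after all input has been read: print every array element together with its index. awk stores the two subscripts as a single string joined by SUBSEP (a non-printing character by default), and the order of a for-in loop is unspecified.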
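
Here is a slightly longer sketch of the same idea, not a definitive version. It wraps the one-liner in a shell function, splits the combined index back into thread number and slot with split() and SUBSEP so the output is readable, skips empty slots, and sorts the result. The function name last_two_per_thread and the sample log lines are made up for illustration; the only assumption carried over from above is that column 2 holds the thread number.

```shell
#!/bin/sh
# Sketch: keep the last two lines seen for each thread.
# Assumes column 2 of every log line is the thread number.
# The function name is hypothetical, chosen for this example only.
last_two_per_thread() {
    awk '
    {
        arr[$2, 1] = arr[$2, 0]   # old latest line becomes "line before last"
        arr[$2, 0] = $0           # current line becomes the latest
    }
    END {
        for (s in arr) {
            split(s, key, SUBSEP)      # key[1] = thread, key[2] = slot (0 or 1)
            if (arr[s] != "")          # slot 1 is empty for single-line threads
                print key[1] " [" key[2] "] : " arr[s]
        }
    }' "$@" | LC_ALL=C sort            # for-in order is unspecified, so sort
}

# Made-up sample log, fed on standard input.
last_two_per_thread <<'EOF'
10:00:01 1 start
10:00:02 2 start
10:00:03 1 read
10:00:04 1 write
10:00:05 2 done
10:00:06 3 init
EOF
```

Slot [0] is the latest line of a thread and slot [1] is the one before it. The function also accepts file names, e.g. last_two_per_thread my_log_file.txt.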
