I have a filename in a format like:

system-source-yyyymmdd.dat

I'd like to be able to parse out the different bits of the filename using the "-" as a delimiter.

dreftymac
Nick Pierpoint

6 Answers

You can use the cut command to get at each of the 3 'fields', e.g.:

$ echo "system-source-yyyymmdd.dat" | cut -d'-' -f2
source

"-d" specifies the delimiter, "-f" specifies the number of the field you require

Bobby Jack
  • I'm curious why you added the # prompt. Normally, that prompt indicates the root or superuser. In general, I'd think stuff like trying out the **cut** command would be better done as a regular user. I'd have used the $ prompt. – Jon 'links in bio' Ericson Sep 09 '08 at 19:34
  • Oh, yeah - good point. I must admit, I was logged in as root at the time and simply went for it - a bad habit, I know. Having said that, I think echo and cut are two of the least harmful commands :) But, for the sake of completeness, I'll certainly update the example right away. Cheers. – Bobby Jack Sep 10 '08 at 17:22
  • How can I assign the output to a variable? If I do `var = "system-source-yyyymmdd.dat" | cut -d'-' -f2` it does not work. – EsseTi Oct 13 '14 at 11:52
  • Fantastic - an answer in 3 minutes - that's quicker than phoning a friend! - _originally added as an answer but that was deleted 6 years after I asked the question_ :) – Nick Pierpoint Aug 16 '16 at 16:46

A nice and elegant solution (in my mind :-) that uses only built-ins is to put the parts into an array:

var='system-source-yyyymmdd.dat'
parts=(${var//-/ })

Then, you can find the parts in the array...

echo ${parts[0]}  ==> system
echo ${parts[1]}  ==> source
echo ${parts[2]}  ==> yyyymmdd.dat

Caveat: this will not work if the filename contains "strange" characters such as spaces or, heaven forbid, quotes, backquotes...
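
If the filename may contain spaces or glob characters, one way around the caveat is to drop the array and use POSIX parameter expansion instead, which is a different technique from the one above but still uses only built-ins. A sketch, assuming exactly two "-" delimiters; the names `first`, `rest`, `second` and `third` are illustrative:

```shell
var='my system-source-yyyymmdd.dat'

first=${var%%-*}     # remove the longest suffix starting at a '-'
rest=${var#*-}       # remove the shortest prefix ending at a '-'
second=${rest%%-*}   # same trick on the remainder
third=${rest#*-}

echo "$first"
echo "$second"
echo "$third"
```

Because the right-hand side of an assignment is not word-split or glob-expanded, the space in the first field survives intact.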

Colas Nahaboo

Depending on your needs, awk is more flexible than cut. A first teaser:

$ echo "system-source-yyyymmdd.dat" \
    |awk -F- '{printf "System: %s\nSource: %s\nYear: %s\nMonth: %s\nDay: %s\n",
              $1,$2,substr($3,1,4),substr($3,5,2),substr($3,7,2)}'
System: system
Source: source
Year: yyyy
Month: mm
Day: dd

The problem is that describing awk as 'more flexible' is rather like calling the iPhone an enhanced cell phone ;-)

flight

Use the cut command.

e.g.

echo "system-source-yyyymmdd.dat" | cut -f1 -d'-'

will extract the first bit.

Change the value of the -f parameter to get the appropriate parts.
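
The -f parameter also accepts lists and ranges, which can be handy when more than one part is needed at once. A small sketch; the variable names are illustrative:

```shell
name="system-source-yyyymmdd.dat"

# Fields 1 and 3; cut rejoins the selected fields with the delimiter.
first_and_third=$(printf '%s\n' "$name" | cut -d'-' -f1,3)

# Field 2 through to the last field.
from_second=$(printf '%s\n' "$name" | cut -d'-' -f2-)

printf '%s\n' "$first_and_third" "$from_second"
```

This should print `system-yyyymmdd.dat` followed by `source-yyyymmdd.dat`.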

Here's a guide on the Cut command.

fedorqui
David

Another method is to use the shell's internal parsing tools, which avoids the cost of creating child processes:

oIFS=$IFS
IFS=-
file="system-source-yyyymmdd.dat"
set -- $file
IFS=$oIFS
echo "Source is $2"
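
Since set overwrites the script's own positional parameters, it may be worth confining the split to a function, where the positional parameters are local. A sketch under that assumption; `split_name` and the variables it sets are hypothetical names, not part of the answer above:

```shell
# split_name is a hypothetical helper used only for illustration.
split_name() {
    oIFS=$IFS       # assumes IFS is currently set (the default)
    IFS=-
    set -f          # turn off globbing so a '*' in the name is not expanded
    set -- $1       # unquoted on purpose: split on '-' into $1, $2, $3
    set +f
    IFS=$oIFS
    system=$1
    source_part=$2
    stamp=$3
}

split_name "system-source-yyyymmdd.dat"
echo "Source is $source_part"
```

The caller's own `$1`, `$2`, ... are untouched after the function returns, and the `--` guards against a first field that begins with a dash.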

Shannon Nelson

The simplest (and IMO best) way to do this is to use read:

$ IFS=-. read system source date ext << EOF
> foo-bar-yyyymmdd.dat
> EOF
$ echo $system
foo
$ echo $source $date $ext
bar yyyymmdd dat

There are many variations on that theme, many of which are shell dependent:

bash$ IFS=-. read system source date ext <<< foo-bar-yyyymmdd.dat

echo "$name" | { IFS=-. read system source date ext
   echo In all shells, the variables are set here...; }
echo but only in some shells do they retain their value here
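
For a script rather than an interactive session, the here-document form can take the filename from a variable, and it keeps the assignments in the current shell because no pipeline is involved. A sketch; `name` is an illustrative variable, and `-r` is added so that backslashes are read literally:

```shell
name="foo-bar-yyyymmdd.dat"

# IFS is changed only for the duration of the read command.
IFS=-. read -r system source date ext <<EOF
$name
EOF

echo "$system $source $date $ext"
```

This should print `foo bar yyyymmdd dat`, and the four variables remain set afterwards.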

William Pursell