How can I remove all characters in each line after the first space in a text file?

Question

I have a large log file from which I need to extract file names.

The file looks like this:

/path/to/loremIpsumDolor.sit /more/text/here/notAlways/theSame/here
/path/to/anotherFile.ext /more/text/here/differentText/here
.... about 10 million times

I need to extract the file names like this:

loremIpsumDolor.sit
anotherFile.ext

I figure my first step is to find/replace every /path/to/ with ''. But I'm stuck on how to remove all characters after the space.

Can you help?

score 6 · Accepted Answer · answered Nov 15 '12 at 19:42

sed 's/ .*//' file

It doesn't take any more than that. The transformed output is written to standard output, of course.
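The question also wants the leading /path/to/ removed. As a minimal sketch (not part of the original answer), a second `sed` expression can delete everything up to and including the last slash. The temporary file and the `names` variable are only there to make the example self-contained, and the sketch assumes that the path never contains a space.

```shell
# Sample input mirroring the two lines from the question (hypothetical temp file).
tmpfile=$(mktemp)
printf '%s\n' \
  '/path/to/loremIpsumDolor.sit /more/text/here/notAlways/theSame/here' \
  '/path/to/anotherFile.ext /more/text/here/differentText/here' > "$tmpfile"

# First expression: delete from the first space to the end of the line.
# Second expression: delete everything up to and including the last slash.
names=$(sed -e 's/ .*//' -e 's|.*/||' "$tmpfile")
printf '%s\n' "$names"

rm -f "$tmpfile"
```

Using `|` as the delimiter in the second expression avoids having to escape the slash.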

Jonathan Leffler

umm... regex for stripping after the first space? Wouldn't expect that from you ;-) – Michael Krelin - hacker Nov 15 '12 at 19:44
Brute force `sed` action; I like it. It is a shame that Windows does not provide powerful text manipulation tools such as sed, grep, awk, etc. by default. These are the bread-n-butter tools for a sys admin (IMHO). – Will Nov 15 '12 at 19:51
I dislike 'cut' because the standard ([POSIX](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html)) versions of it don't handle one-or-more separators between fields, and I can't rely on a non-standard extension for that being available everywhere. Granted, not an issue with this particular task, but if you don't use a tool because it doesn't always work, you don't use it. I find `sed` easier to use, but there are multiple tools for the job (`awk`, `perl`, `python` could all be used very easily, but they're more complex than necessary). – Jonathan Leffler Nov 15 '12 at 19:52
@JonathanLeffler, I find `sed` more complex *for this particular task*. That's why I didn't expect that from you. (and no tool *always works*). That said, expected or not, I do not find anything severely wrong with this solution ;-) – Michael Krelin - hacker Nov 15 '12 at 20:38
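The separator point raised in these comments can be seen with a small experiment. This is an illustrative sketch on a made-up line (the variable names are hypothetical): `cut` treats every single space as a delimiter, while `awk` by default treats a run of whitespace as one separator.

```shell
# Made-up line with three spaces between the first two words.
line='first   second third'

# cut counts each space as a delimiter, so field 2 is the empty
# string between the first and second space.
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f2)

# awk's default field splitting collapses the run of spaces,
# so field 2 is the second word.
awk_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_field" "$awk_field"
```

For the input in the question, which has exactly one space after the path, both tools give the same first field.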

Michael Krelin - hacker · Answer 2 · 2012-11-15T20:40:04.463

2

Pass it to cut:

cut -d ' ' -f1 yourfile

edited Nov 15 '12 at 20:40

answered Nov 15 '12 at 19:43

Michael Krelin - hacker

You don't need the input redirection, though it does no harm here while there's only a single file to process. – Jonathan Leffler Nov 15 '12 at 20:14
@JonathanLeffler, true. Even thought of it after submitting. I'll edit it out. – Michael Krelin - hacker Nov 15 '12 at 20:39

score 2 · Answer 3 · answered Nov 15 '12 at 19:47

In theory, you could also use awk to grab the filename from each line as:

awk '{ print $1 }' input_file.log

That, of course, assumes that there are no spaces in any of the filenames. awk defaults to looking for whitespace as the field delimiters, so the above snippet would take the first "field" from your log file (your filename) for each line, and output it.
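To get the bare file name rather than the full path, the first field can be split on slashes inside `awk`. This is a sketch under the same no-spaces assumption; the temporary file, the `parts` array, and the `names` variable are illustrative names rather than anything from the answer.

```shell
# Sample input mirroring the two lines from the question (hypothetical temp file).
tmpfile=$(mktemp)
printf '%s\n' \
  '/path/to/loremIpsumDolor.sit /more/text/here/notAlways/theSame/here' \
  '/path/to/anotherFile.ext /more/text/here/differentText/here' > "$tmpfile"

# split() breaks the first field on "/" and returns the number of pieces;
# the last piece is the file name.
names=$(awk '{ n = split($1, parts, "/"); print parts[n] }' "$tmpfile")
printf '%s\n' "$names"

rm -f "$tmpfile"
```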

Will

Ah, but in my real log file there is actually text and spaces before the path. But I like this direction. In reality it's more like `textHere thenSpaces /path/to/file.ext /more/text/here`. I didn't mention it because I figured I'd have to sed find/replace the first part anyway (since it's always the same). – Ryan Nov 15 '12 at 19:57
@Ryan: no sweat; you would just use `print $2` instead, since it would then be the second field. `awk` is a handy tool for things just such as this, and it is worth getting reasonably good at using it. – Will Nov 15 '12 at 19:59
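A sketch of the second-field variant, assuming the real line has one word, then a run of spaces, then the path (the sample line below is a guess at that layout, not taken from the actual log). The column number is passed in as an `awk` variable named `col`, which is a hypothetical name chosen for the example.

```shell
# Guessed layout: a fixed word, several spaces, the path, then more text.
line='textHere     /path/to/file.ext /more/text/here'

# col selects which whitespace-separated field holds the path;
# split() then keeps only the part after the last slash.
name=$(printf '%s\n' "$line" |
  awk -v col=2 '{ n = split($col, parts, "/"); print parts[n] }')

printf '%s\n' "$name"
```

Because `awk` collapses the run of spaces, the path is field 2 no matter how many spaces follow the first word.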

score 0 · Answer 4 · answered Nov 15 '12 at 22:29

a bash-only solution:

while read -r path otherstuff; do
    # ${path##*/} strips the longest prefix ending in "/", leaving the file name
    printf '%s\n' "${path##*/}"
done < filename

glenn jackman
