I'm trying to get the size (in KB) of a file. I'm doing it as follows:

textA=$(du a)
sizeA=$(expr match "$textA" '\(^[^\s]*\)')
textB=$(du b)
sizeB=$(expr match "$textB" '\(^[^\s]*\)')

echo $textA
echo $sizeA
echo $textB
echo $sizeB

[[ $sizeA == $sizeB ]] && echo "eq"

But this only prints textA and textB to the console. Both look like this:

30745 a

Can someone please explain why the regex is not matching? I've tested the regex against the text on several sites, just to make sure, and it appears to capture the correct text.

I've also tried changing it to:

'^\([^\s]*\)'

But that way it captures the whole text. Any thoughts?

Adrian Frühwirth
gcandal
  • It seems that `expr` isn't aware of character classes that follow the syntax of `\s`. For example, the expression `sizeA=$(expr match $textA '\(^[[:digit:]]*\)')` works for me... – user1146332 Aug 02 '13 at 09:08

4 Answers

Not a direct answer, but I would do it like this:

 sizeA=$(du a | awk '{print $1}')
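As a rough sketch of how that could replace the whole comparison from the question: the two fake_du_* functions below are hypothetical stand-ins for du a and du b, and the sizes are made-up sample values, so the snippet runs without the real files.

```shell
# Hypothetical stand-ins for "du a" and "du b"; the numbers are sample values.
fake_du_a() { printf '30745\ta\n'; }
fake_du_b() { printf '30745\tb\n'; }

# awk splits each line on runs of blanks and tabs, so $1 is the size column.
sizeA=$(fake_du_a | awk '{print $1}')
sizeB=$(fake_du_b | awk '{print $1}')

# Compare the two sizes numerically with the POSIX test command.
if [ "$sizeA" -eq "$sizeB" ]; then
    result="eq"
else
    result="ne"
fi
echo "$sizeA $sizeB $result"
```

With the real files, the two stand-ins would simply be replaced by du a and du b.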
piokuc

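My expr match does not understand \s or other Perl-style shorthands. expr uses POSIX basic regular expressions, and inside a bracket expression such as [^\s] the backslash is just a literal character. Try '\([0-9]*\)' instead.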
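A small sketch of the difference, using a made-up sample line instead of real du output. It uses the expr STRING : REGEX form, which is the spelling POSIX specifies; the [[:space:]] variant is my own addition and assumes an expr that supports POSIX character classes.

```shell
textA='30745 a'   # sample line; real du output may separate the fields with a tab

# In a POSIX bracket expression, [^\s] means "any character except a backslash
# or the letter s", so a POSIX expr will typically match past the space here.
all=$(expr "$textA" : '\([^\s]*\)')

# Matching digits only stops at the first non-digit character.
digits=$(expr "$textA" : '\([0-9]*\)')

# The POSIX character class [[:space:]] is the portable stand-in for \s.
first_word=$(expr "$textA" : '\([^[:space:]]*\)')

echo "digits=$digits first_word=$first_word"
```

The colon form is always anchored at the start of the string, so no leading ^ is needed.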

But as others have already mentioned, using a regexp to get "the first word" is a bit of overkill. I'd use du s | { read a b; echo $a; }, but you could also use the awk version or a solution using cut.
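To illustrate the read variant, here is a minimal sketch that feeds a made-up sample line through the same construct, so it can be tried without any particular file on disk:

```shell
line=$(printf '30745\ta')   # sample du output line; the size is a made-up value

# read splits on IFS (space, tab, newline): the first word goes to a, the rest to b.
# The braces keep read and echo in the same pipeline component, so $a is still
# set when echo runs.
size=$(printf '%s\n' "$line" | { read -r a b; echo "$a"; })
echo "$size"
```

Because read splits on both spaces and tabs, this does not depend on which separator a given du implementation prints.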

Alfe

Do not parse the output of du. If it is available, you can use stat instead, e.g. to get the size of a file in bytes:

sizeA=$(stat -c%s "${fileA}")
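The -c%s option above is the GNU coreutils spelling; other stat implementations use different flags or may be missing entirely. A hedged sketch of a wrapper that falls back when that call fails (file_size is a hypothetical helper name, and the wc -c fallback is borrowed from another answer in this thread):

```shell
# Hypothetical helper: print the size of a file in bytes.
# Tries GNU-style stat first; if that fails, falls back to wc -c,
# stripping the leading spaces that some wc implementations print.
file_size() {
    stat -c %s "$1" 2>/dev/null || wc -c < "$1" | tr -d ' '
}

# Usage with a temporary 5-byte file.
tmpfile=$(mktemp)
printf 'hello' > "$tmpfile"
sizeA=$(file_size "$tmpfile")
echo "$sizeA"
rm -f "$tmpfile"
```

Note that this reports the byte length of a regular file, not the disk usage that du reports, so the two numbers are not interchangeable.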
Alfe
Adrian Frühwirth
  • Did the OP state that he was about to use `du` on plain files only? Maybe he wants to compare directory sizes. – Alfe Aug 02 '13 at 09:11
  • Can't use stat on this machine (running Solaris). – gcandal Aug 02 '13 at 09:11
  • @Alfe: Run `du` without any switches on a directory and you will understand why it was implied that a single file was meant when the OP wrote "a" file based on the code snippet. – Adrian Frühwirth Aug 02 '13 at 09:25
  • @GCC404 OK, then I understand that's not an option unless you have the Solaris `coreutils` package installed (which provides `stat`) or it's an option to install it. I took the liberty to add `solaris` to the question's tags :-) – Adrian Frühwirth Aug 02 '13 at 09:31
  • (I made a minor edit in order to be able to remove my downvote, so don't mind that.) @Adrian: I see your reasoning behind the single file assumption. But directories without subdirectories give single-line outputs just as well, and in case hierarchical directories (with subdirs) are used as input, it's a valid assumption as well that the user then will add the proper option to `du` to suppress subdir output. – Alfe Aug 02 '13 at 11:19
  • Isn't just your answer using `stat` from the `solaris` package? Does your answer give you a reason to add a tag to the question? That seems not proper to me. OP might be working on Solaris currently but that does not mean his program should only work there. – Alfe Aug 02 '13 at 11:20
  • @Alfe I understood the reason behind your edit, no worries! No, not at all. My answer is "use `stat` on systems where it is available" (mainly Linux, plus others where it can be installed) if portability is not an issue. `stat` is not a `Solaris` tool. There is no non-ugly way of portably determining file size across different *nix flavours in shell scripting. Check [POSIX](http://pubs.opengroup.org/onlinepubs/009604499/utilities/du.html), you cannot even rely on having `du` available. ... – Adrian Frühwirth Aug 02 '13 at 11:41
  • `stat` is a good option on systems that support it since you don't have to resort to parsing some other tool's output (another option is e.g. to use `GNU find`'s `-printf` option). If portability is absolutely required, it's the OP's responsibility to mention this as part of the question but I did give the hint that it is not since the system had not been mentioned. This is exactly what the tags are for and that's why I added it to reduce possible noise (someone might post another answer that does not work on Solaris otherwise). I added the tag based on the OP's comment, not my answer. – Adrian Frühwirth Aug 02 '13 at 11:42
  • Yeah, on the other hand, now someone might post an answer which works specifically on Solaris (and nowhere else) which also might not have been intended. Conclusion: Changing specs from the receiver side is a dangerous business ;-) – Alfe Aug 02 '13 at 11:44
  • @Alfe And that's why I mentioned my edit to the OP and didn't do it silently. What we both do here is absolute guesswork, but since Solaris is obviously a requirement, it's IMHO better to get answers that work on the one system mentioned than lots of answers that might work on many *but* the one system where we know it has to work. It's not like my edit is "set in stone", he can always roll it back ;-) – Adrian Frühwirth Aug 02 '13 at 11:51
size=$(wc -c < file)

If you want to use du, I would use the bash builtin read:

read size filename < <(du file)

Note that you can't say du file | read size filename because in bash, components of a pipeline are executed in subshells, so the variables will disappear when the subshell exits.
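A minimal sketch of that pitfall, using a made-up sample line in place of du file. Process substitution is a bash feature, so this sketch uses a here-document instead, which also keeps read in the current shell and should work in a plain POSIX sh as well (an assumption worth checking on the target system):

```shell
sample=$(printf '30745\ta')   # stand-in for the output of "du file"; made-up size

# Piping into read: in bash with default settings, read runs in a subshell,
# so size_piped is still empty afterwards. (Some other shells, such as ksh,
# run the last pipeline component in the current shell and behave differently.)
size_piped=
printf '%s\n' "$sample" | read -r size_piped ignored

# Here-document: read runs in the current shell, so the variables survive.
read -r size filename <<EOF
$sample
EOF

echo "piped: '$size_piped'  here-document: '$size' '$filename'"
```

With real data, the line inside the here-document would be $(du file) instead of $sample.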

glenn jackman