Re: For Loops and Space in Names, take 2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2008-12-11 at 10:02 +1100, Cameron Simpson wrote:

> Because the $1 is unquoted here you probably don't need the "+" in the
> regexps. But you probably should quote $1 because of this:
> 
>   [/Users/cameron]fleet*> x='t*'
>   [/Users/cameron]fleet*> echo "$x"
>   t*
>   [/Users/cameron]fleet*> echo $x
>   test.txt them tmp

Done.  But I do need the + in this one: "s/[_]+/_/g"
Also, need to reorder the expressions. I ran some tests and some thing
like "a    +++,,    b" would translate to "a__b" rather than "a_b"

Also, I now use set -f in the fixname function (to facilitate handling
original names with a * in them), but I quoted it anyway. 

> i.e. if $x is not quoted then glob characters in a filename will get expanded
> (supposing you have a name like "t*", and some other files starting with "t").
> 
> |         sed -r \
> |             -e "s/[ ]+/_/g" \
> |             -e "s/[_]+/_/g" \
> |             -e "s/'//g" \
> |             -e "s/[+]//g" \
> |             -e "s/,//g" \
> |             -e "s/_-_/-/g" \
> |             -e "s/\&/and/g"`;
> |     # only do rename if the name is changed
> |     if [ \"$1\" != \"${newname}\" ]; then
> 
> You don't need to backslash the "s in this comparison.

Not sure why I did this. Hmmmm.

> 
> |         echo mv \"$1\" \"${newname}\"
> 
> This is actually dangerous. By leaving the (expanded) $1 in double
> quotes, it is subject to parameter and command substitution in the
> shell that runs the commands you generate. For example, supposing I ship
> you a file whole name contains:
> 
>   `rm -rf $HOME`
> 
> That _will_ get through your script, and _will_ run the rm command in
> the subshell. In both the $1 and $newname parts.

I hadn't thought of that.  It definitaly does what you say if I make a
file named `ls -l`

I want to remove ` anyway, now that I think about it, so added an s
expression for that.  And $, *, and %.

> 
> But it gets better. Even though you remove all single quotes form
> $newname, and are thus safe to emit:
> 
>   mv ..... '$newname'
> 
> the $1 is still unsafe, and cannot easily be made safe. (Actually,
> bash's "printf %q" formatter may do the trick for you, and since you're
> already in nonportable GNU sed land, you may as well step right in with
> nonportable bash too:-)

I'm not sure what %q does.  It's not in the info or man documentation.
I now use single quotes to avoid expansion.  That seems to work on a few
test cases.


> Personally, I would change the script mode. Instead of trying to emit
> safe shell syntax for piping to "sh", I would instead give it two modes
> of operation. Default is to recite commands and a -x option would make
> it actually move files around.
> 
> Like this:
> 
>   setx()
>   { ( set -x
>       "$@"
>     )
>   }
>   trace=echo
> 
>   # handle leading -x option
>   if [ "x$1" = x-x ]
>   then
>     shift
>     trace=setx
>   fi
> 
> and then down where you emit the mv commands go:
> 
>   $trace mv "$1" "$newname"
> 
> which should be robust against weird names in $1 or $newname.

Done, or at least something similar

> 
> BTW, have you worried about having mv move a "bad" filename onto an
> already existing "good" name? A "mv -i" might avoid a little pain.

Done, with an if [ -a ....

> 
> |     else
> |         echo "# \"$1\" and \"${newname}\" are the same file"
> |     fi
> | }
> | 
> | if [ "$#" == "0" ]; then
> 
> No quotes needed here, and "==" is not a test notation. You want "="
> (string comparison) or "-eq" (numeric comparison).

You're right, Done.

> 
> |     for i in *; do 
> |         fixname "$i"
> |     done
> | else
> |     while [ "z$1" != "z" ]; do
> |         fixname "$1"
> |         shift
> |     done
> 
> This loop is better written:
> 
>   for i; do
>     fixname "$i"
>   done
> 

That's cool.  I never knew this.  Thanks.  Done

> otherwise running your script like this:
> 
>   your-script a b c "" d e f
> 
> will only operate on the first 3 names. Besides, the "for i; do" notation
> is shorter and clearer (once you know that leaving off the "in" clause
> iterates over the command line arguments). And all to frequently you
> see this in scripts:
> 
>   for i in $*; do

I have done this a lot.  I'm much better now.

> 
> which has the usual "spaces in words" trouble we're all working around
> here.
> 
> Personally, I strive to be robust against weird names in my code, to
> avoid having to play rename games like yours. I do have renaming scripts
> like your for much the same reason you do, though.
> 
> Cheers,
> -- 
> Cameron Simpson <cs@xxxxxxxxxx> DoD#743
> http://www.cskk.ezoshosting.com/cs/
> 
> Lady, have you ever seen a cat skeleton in a tree?      - Kevin Dunn
> 

WOW! This is great stuff and I'm sure not going to let it go to waste.
Thank you Cameron, for all the input.  I'm glad I put this up here.

I wrote this 8 years ago and once it worked, I've not looked at it again
until today. It just did the job. Fortunately for me, none of the bad
things you identified has ever happened to me and I've run it many many
times.  I think the fixes are well worth the time this evening.

I believe all that Cameron suggested is in the following new version,
but anyone beat it up if you see something wrong (it can only get
better).


#!/bin/bash
# 
# fixmp3names
# Copyright (c) 2000, 2008 Robert Wuest
# Permission is granted to use, modify and, distribute according
# to the terms of the GPL
#
# With thanks to Cameron Simpson from the Fedora users list!
#
# This script fixes a lot of the anomalies in file names
# usually weird stuff from mp3 files, but it's really generic 
#
# if called with no args it processes all files in the current directory
# or works on the names passed on the command line
# probably doesn't work across directories
#
# default just writes the mv commands to standard out
# if you want it to actually do something, pipe the output to sh, as in:
#   fixmp3names | sh
# or use -x as:
#   fixnames -x
#
# Version 2

IFS=$'\n'


oper=f_echo

function f_mv()
{   ( set -x
    mv $1 $2
    )
}

function f_echo()
{
    echo mv \'$1\' \'$2\'
}

function fixname() 
{   ( set -f
    newname=`echo "$1" | \
        sed -r \
            -e "s/\\\\$//g" \
            -e "s/\\o140//g" \
            -e "s/,//g" \
            -e "s/[+]//g" \
            -e "s/'//g" \
            -e "s/%//g" \
            -e "s/\\*//g" \
            -e "s/[ ]/_/g" \
            -e "s/[_]+/_/g" \
            -e "s/_-_/-/g" \
            -e "s/\&/and/g"`;

    # only do rename if the name is changed
    if [ $1 != ${newname} ]; then
        # leave, with message, if destination name exists
        if [ -a ${newname} ]; then
            echo "# '${newname}' exists, skipping '$1'"
            return
        fi
        ${oper} $1 ${newname}
    else
        echo "# '$1' and '${newname}' are the same file"
    fi
    )
}

# handle leading -x option
if [ "x$1" = x-x ]; then
    shift
    oper=f_mv
fi

if [ $# = 0 ]; then
    for i in *; do 
        fixname "$i"
    done
else
    for i; do
        fixname "$i"
    done
fi

exit


-- 
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines

[Index of Archives]     [Current Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [Yosemite Photos]     [KDE Users]     [Fedora Tools]     [Fedora Docs]

  Powered by Linux