Re: perl and UTF-8


 



Anand Buddhdev wrote:

> [arb@home arb]$ time grep zymology docs/sowpods.txt
> enzymology
> zymology
>  
> real    0m0.267s
> user    0m0.260s
> sys     0m0.000s
> 
> [arb@home arb]$ export LANG=C
> [arb@home arb]$ time grep zymology docs/sowpods.txt
> enzymology
> zymology
>  
> real    0m0.012s
> user    0m0.000s
> sys     0m0.000s
> 
> Grep is clearly still much slower in UTF8.
> 

I'm not sure that's quite a fair test, as the file will already be in the
page cache by the time of the second run.
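
A fairer comparison might look something like the sketch below: read the
file once first so both runs hit the cache, then run the same grep under
each locale. The helper name grep_in_locale, the temporary word list (a
stand-in for docs/sowpods.txt) and the locale name en_US.UTF-8 are my own
assumptions, so substitute whatever your system actually uses.

```shell
#!/bin/sh
# Sketch only: the helper name, sample words and locale names are assumptions.

# Run grep for a pattern in a file with LC_ALL forced to the given locale.
# The assignment prefix applies to this one grep invocation only.
grep_in_locale() {
    LC_ALL="$1" grep -- "$2" "$3"
}

# Small stand-in for docs/sowpods.txt so the sketch runs anywhere.
wordlist=$(mktemp)
trap 'rm -f "$wordlist"' EXIT
printf '%s\n' enzymology zebra zymology > "$wordlist"

# Read the file once so that both runs below are served from the cache.
cat "$wordlist" > /dev/null

# Same search under a UTF-8 locale (assumed to be installed) and under C.
for loc in en_US.UTF-8 C; do
    echo "LC_ALL=$loc"
    grep_in_locale "$loc" zymology "$wordlist"
done
```

To get timings, prefix each grep_in_locale call with time in an
interactive shell, and repeat each run a few times before comparing.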

However, you are right: grep is slower under UTF-8. That's not surprising,
as any file it encounters could contain multibyte characters, so there is
extra overhead even if the file contains only plain ASCII.

rc.sysinit on FC2 still contains many lines with
LC_ALL=C grep
for just that reason!
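
As a rough illustration of that idiom (the sample words are made up, not
taken from rc.sysinit), the assignment prefix overrides the locale for the
single grep only and leaves the rest of the script's environment alone:

```shell
#!/bin/sh
# Sketch only: sample input is hypothetical, not copied from rc.sysinit.

# Remember whether LC_ALL is set in the shell before the call.
before=${LC_ALL-unset}

# LC_ALL=C applies to this one grep; -c counts the matching lines.
matches=$(printf '%s\n' enzymology zebra zymology | LC_ALL=C grep -c zymology)
echo "matching lines: $matches"

# The shell's own LC_ALL is unchanged afterwards.
after=${LC_ALL-unset}
echo "LC_ALL before: $before, after: $after"
```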


Jonathan



