Re: [Patch] Support UTF-8 scripts

D. Hazelton wrote:
> This is news to me. The last time I handed execve() a script as a 
> parameter I had errors returned from execve() -- I must admit that
> this was not on my current system and I had assumed that the behavior 
> would be consistent.

The kernel checks that the file starts with #!<path> and that <path>
names an existing executable. If either check fails, execve fails.
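
As a rough illustration of that check (a user-space sketch in Python,
not the kernel's code; the names parse_shebang and can_exec_script are
made up for this example, and details such as the kernel's line-length
limit are ignored):

```python
import os
from typing import Optional, Tuple


def parse_shebang(head: bytes) -> Optional[Tuple[str, Optional[str]]]:
    """Return (interpreter, optional_argument) if head starts with '#!'.

    Simplified model: everything after the interpreter path is kept as
    ONE argument string, which is how Linux treats the #! line.
    """
    if not head.startswith(b"#!"):
        return None
    # Only the first line is relevant; drop surrounding blanks and tabs.
    line = head[2:].split(b"\n", 1)[0].strip(b" \t")
    if not line:
        return None
    parts = line.split(None, 1)
    interpreter = parts[0].decode("utf-8", "replace")
    argument = parts[1].decode("utf-8", "replace") if len(parts) > 1 else None
    return interpreter, argument


def can_exec_script(path: str) -> bool:
    """Approximate the two checks: '#!<path>' present, <path> executable."""
    with open(path, "rb") as f:
        parsed = parse_shebang(f.read(256))
    if parsed is None:
        return False
    interpreter, _ = parsed
    return os.path.isfile(interpreter) and os.access(interpreter, os.X_OK)
```

Note that a file starting with the UTF-8 signature (EF BB BF) followed
by #! does not pass the first check in this model, which is the case
the patch is about.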

> You are correct. It is fairly trivial. However my point still is valid 
> that the Kernel has the whole binfmt_misc system -- I will admit that 
> I have recently been shown numbers that show a noticeable difference 
> in the speed of a binary executed using the binfmt_misc system and 
> the binfmt_script system, but the fact remains that offering handling 
> for UTF8 and ASCII scripts directly in the kernel will likely lead to 
> at least one more patch in which the full Unicode standard is
> implemented.

The problem with the binfmt_misc approach is that you need *another*
execve call: with binfmt_misc, you register <utf8sig>#!, and a
generic binary. Then, this generic binary will interpret the #!
signature *again*, and invoke the proper interpreter. This will
interpret the first line *yet again* (finding that it is a comment),
and continue processing the file.
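
To make the extra step concrete, here is a minimal sketch of what such
a generic binary might do, written in Python for brevity. Everything
named here is an assumption for illustration: the rule name
"utf8script", the helper path /usr/local/bin/utf8-script-helper, and
the functions interpreter_argv and main. The registration string
follows the documented binfmt_misc format
:name:type:offset:magic:mask:interpreter:flags.

```python
import os
from typing import List, Optional

UTF8_SIG = b"\xef\xbb\xbf"

# Hypothetical rule that would be written to
# /proc/sys/fs/binfmt_misc/register (name and helper path are invented).
REGISTRATION = r":utf8script:M::\xef\xbb\xbf#!::/usr/local/bin/utf8-script-helper:"


def interpreter_argv(script_path: str, head: bytes) -> Optional[List[str]]:
    """Build argv for the real interpreter from '<utf8sig>#!...' or None."""
    if not head.startswith(UTF8_SIG + b"#!"):
        return None
    # The helper parses the #! line a second time, after the kernel
    # already matched the magic bytes.
    line = head[len(UTF8_SIG) + 2:].split(b"\n", 1)[0].strip(b" \t")
    if not line:
        return None
    parts = line.decode("utf-8", "replace").split(None, 1)
    return parts + [script_path]


def main(argv: List[str]) -> int:
    """Entry point of the hypothetical helper: argv[1] is the script path."""
    script = argv[1]
    with open(script, "rb") as f:
        new_argv = interpreter_argv(script, f.read(256))
    if new_argv is None:
        return 1  # not a UTF-8 script with a #! line
    # This is the *second* execve; it does not return on success.
    os.execv(new_argv[0], new_argv + argv[2:])
    return 0
```

A real helper would end with sys.exit(main(sys.argv)); that call is
left out here so the sketch can be imported without side effects.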

However, this is not the real problem. The real problem is that
the specific binfmt_misc "backend" would not be universally
available, and then the same script would start on some systems,
and break on others. This may be acceptable for large or specific
applications (e.g. you have to set up the ibcs2 module to run
SCO applications); it is not for scripts.

Now, the "universally available" part would not apply right now,
as only the most recent kernels would provide the feature. However,
within a few years, the feature would be part of "Linux" - then
people can start using it extensively.

> That, and my point remains that the kernel should know absolutely 
> nothing about how to execute a text file - the kernel should return 
> an error to the extent of "I don't know what to do with this file" to 
> the shell that tries to execute it, and the shell can then check for 
> the sh_bang. I do admit that this change would break a lot of 
> existing code, so I'll leave the argument to the experts.

The point is that it is not necessarily the shell which starts
programs - the shell is but one creator of new processes. It is
very common today that, say, httpd starts new programs - this
mechanism is called CGI. Your approach was in use until 1985 or
so, when Unix implementations started to support #! natively.
This was done both for convenience and for performance: if
programs always had to use system(3) to start new processes,
there would always be an extra shell that execs the eventual
interpreter.
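
For comparison, the caller-side fallback would look roughly like the
following sketch (Python, with the invented names fallback_argv and
run; it assumes execve reports ENOEXEC for a text file it does not
recognize and that /bin/sh is the fallback interpreter). Every program
that starts other programs, not just the shell, would have to carry
this logic:

```python
import errno
import subprocess
from typing import List, Optional


def fallback_argv(err: Optional[int], argv: List[str],
                  shell: str = "/bin/sh") -> Optional[List[str]]:
    """Return the argv to retry with if the kernel rejected the file."""
    if err == errno.ENOEXEC:
        # "I don't know what to do with this file": let the shell read it.
        return [shell] + argv
    return None


def run(argv: List[str]) -> subprocess.CompletedProcess:
    """Try a direct exec first; on ENOEXEC, retry through the shell."""
    try:
        return subprocess.run(argv)
    except OSError as exc:
        retry = fallback_argv(exc.errno, argv)
        if retry is None:
            raise
        # The extra process start that native #! support avoids.
        return subprocess.run(retry)
```

This only hands the file to /bin/sh; choosing a different interpreter
would additionally require parsing the first line in each caller.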

I'm not sure, but I believe that most current shells have "forgotten"
how to do the #! magic, since, by now, "traditionally" this is
a kernel responsibility.

Regards,
Martin

