Re: [mp2] "Unrecognized character" error when running scripts in utf-8


From: Stas Bekman
Subject: Re: [mp2] "Unrecognized character" error when running scripts in utf-8
Date: 19:34 on 07 Nov 2004

Markus Wichitill wrote:
> Stas Bekman wrote:
> 
>>> It seems as if mod_perl doesn't recognize the format of the script file
>>> correctly. Any tips why this may occur? Thanks a bundle in advance!
>>
>>
>> Wow! That's interesting.
>>
>> Please take a look at the code in function 
>> convert_script_to_compiled_handler at 
>> ModPerl-Registry/lib/ModPerl/RegistryCooker.pm, and suggest a fix.
> 
> 
> The reason for the failure is pretty clear, the BOM ends up somewhere in 
> the middle of the string that is eval'ed to generate the package. On 
> Linux, I was able to fix that by removing the BOM in 
> RegistryCooker::read_script():
> 
>     # remove BOM (UTF-32 forms first, since \xff\xfe is a prefix
>     # of the UTF-32LE BOM)
>     ${$self->{CODE}} =~ s/^(?:
>         \x00\x00\xfe\xff |
>         \xff\xfe\x00\x00 |
>         \xef\xbb\xbf     |
>         \xfe\xff         |
>         \xff\xfe
>     )//x;
> 
> For some reason that I haven't figured out yet, this doesn't work on 
> Windows. With the ^ anchor in place, the BOM isn't removed; with the 
> ^ anchor gone, the script doesn't seem to receive its request object 
> in @_.

It's a bad idea to add an s/// to the core code, since it slows things 
down, and users who don't have this problem shouldn't pay that penalty. 
So this is better done in a subclass of RegistryCooker.
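
A minimal sketch of such a subclass, under some assumptions: the package
name My::RegistryStripBOM is hypothetical, and it assumes that
ModPerl::Registry calls read_script() as a method and leaves a scalar
reference in $self->{CODE}, as the snippet quoted above suggests. It has
not been tested against any particular mod_perl release.

```perl
package My::RegistryStripBOM;    # hypothetical package name

use strict;
use warnings;

# Assumption: read_script() is dispatched as a method, so overriding it
# in a subclass of ModPerl::Registry takes effect.
use base qw(ModPerl::Registry);

sub read_script {
    my $self = shift;

    # Let the stock implementation load the script into $self->{CODE}.
    my $rc = $self->SUPER::read_script(@_);

    # Strip a leading BOM, if any. The UTF-32 alternatives come first
    # because \xff\xfe is a prefix of the UTF-32LE BOM.
    if (ref($self->{CODE}) eq 'SCALAR') {
        ${ $self->{CODE} } =~ s/\A(?:
            \x00\x00\xfe\xff |
            \xff\xfe\x00\x00 |
            \xef\xbb\xbf     |
            \xfe\xff         |
            \xff\xfe
        )//x;
    }

    # Pass the original status code through unchanged.
    return $rc;
}

1;
```

Only sites that need the workaround would then point their handler at
this class instead of ModPerl::Registry, so everyone else pays nothing
for the extra substitution.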

But first I would like to know why the script doesn't fail when run 
outside mod_perl. Compared to require, we do nothing special here other 
than read+eval, so it looks like a bug in perl. Maybe someone could try 
to reproduce the problem outside the registry by copying the relevant 
parts of the code (read+eval) and, if it is still reproducible, report 
it to p5p.
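
A rough standalone test along those lines might look like the sketch
below. It is not the registry code itself: the wrapper string and the
Repro::* package names are made up for illustration, and it only mimics
the general read+eval shape (script text interpolated into the middle of
a package/sub wrapper). It covers the UTF-8 BOM only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny script that starts with a UTF-8 BOM (bytes EF BB BF).
my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
binmode $fh;
print {$fh} "\xef\xbb\xbf", qq{"ran ok";\n};
close $fh or die "close: $!";

# 1. Let perl load the file itself, much as require would.
my $from_file = do $file;
print "do FILE:           ",
    (defined $from_file ? $from_file : "failed: $@"), "\n";

# 2. Read the raw bytes and eval them inside a wrapper, so the BOM
#    ends up in the middle of the evaluated string.
my $code = do {
    open my $in, '<:raw', $file or die "open: $!";
    local $/;
    <$in>;
};
my $wrapped = "package Repro::Wrapped;\nsub handler {\n$code\n}\n1;";
my $ok  = eval $wrapped;
my $err = $@;
print "read+eval:         ", ($ok ? "compiled" : "failed: $err"), "\n";

# 3. Same wrapper with the BOM stripped first, for comparison.
(my $stripped = $code) =~ s/\A\xef\xbb\xbf//;
my $wrapped2 = "package Repro::Stripped;\nsub handler {\n$stripped\n}\n1;";
my $ok2 = eval $wrapped2;
print "read+eval, no BOM: ",
    ($ok2 ? Repro::Stripped::handler() : "failed: $@"), "\n";
```

If step 2 fails while steps 1 and 3 succeed, the difference comes down
to where the BOM sits in the text that perl parses, which would be the
useful detail to include in any report.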


        -- 
        __________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@xxxxxx.xxx http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

