Re: [mp2] "Unrecognized character" error when running scripts in utf-8


From: Markus Wichitill
Subject: Re: [mp2] "Unrecognized character" error when running scripts in utf-8
Date: 19:26 on 07 Nov 2004
Stas Bekman wrote:
>> It seems as if mod_perl doesn't recognize the format of the script file
>> correctly. Any tips why this may occur? Thanks a bundle in advance!
> 
> Wow! That's interesting.
> 
> Please take a look at the code in function 
> convert_script_to_compiled_handler at 
> ModPerl-Registry/lib/ModPerl/RegistryCooker.pm, and suggest a fix.

The reason for the failure is pretty clear: the BOM ends up somewhere in the 
middle of the string that is eval'ed to generate the package. On Linux, I 
was able to fix that by removing the BOM in RegistryCooker::read_script():

     # remove BOM (four-byte UTF-32 marks first, so that \xff\xfe
     # cannot match the front half of \xff\xfe\x00\x00)
     ${$self->{CODE}} =~ s/^(?:
         \x00\x00\xfe\xff |
         \xff\xfe\x00\x00 |
         \xef\xbb\xbf     |
         \xfe\xff         |
         \xff\xfe
     )//x;
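
To try the substitution outside mod_perl, here is a minimal standalone sketch. The helper name strip_bom and the sample script string are hypothetical and not part of RegistryCooker; the sketch assumes the code is held as a byte string. The four-byte UTF-32 marks are listed before the two-byte UTF-16 ones so that the UTF-32LE mark is removed whole rather than half-matched as UTF-16LE.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RegistryCooker): strip a leading BOM
# from the byte string referenced by $code_ref, in place.
# Returns 1 if a BOM was removed, 0 otherwise.
sub strip_bom {
    my ($code_ref) = @_;
    # Four-byte UTF-32 marks come first so that \xff\xfe (UTF-16LE)
    # cannot match the front half of \xff\xfe\x00\x00 (UTF-32LE).
    return $$code_ref =~ s/\A(?:
        \x00\x00\xfe\xff |   # UTF-32BE
        \xff\xfe\x00\x00 |   # UTF-32LE
        \xef\xbb\xbf     |   # UTF-8
        \xfe\xff         |   # UTF-16BE
        \xff\xfe             # UTF-16LE
    )//x ? 1 : 0;
}

# Sample input: a one-line script saved as UTF-8 with a BOM.
my $script  = "\xef\xbb\xbfprint qq{hello\\n};\n";
my $removed = strip_bom(\$script);
print $removed ? "BOM removed\n" : "no BOM found\n";
```

The sketch uses \A rather than ^ only to make the "start of string" intent explicit; without the /m modifier the two behave the same.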

For some reason that I haven't figured out yet, this doesn't work on 
Windows. With the ^ anchor in place, the BOM isn't removed; with the ^ 
anchor gone, the script doesn't seem to receive its request object in @_.
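
One way to narrow this down might be to dump the leading bytes of the buffer just before the substitution, to see whether the BOM really is the first thing in it on Windows. This is only a diagnostic sketch: the helper name leading_bytes_hex is hypothetical, and the sample string stands in for whatever read_script() actually holds.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $n characters of the referenced
# string as space-separated hex values (default: the first 8).
sub leading_bytes_hex {
    my ($code_ref, $n) = @_;
    $n = 8 unless defined $n;
    my $head = substr($$code_ref, 0, $n);
    return join ' ', map { sprintf('%02x', ord($_)) } split(//, $head);
}

# Sample input standing in for the script buffer: UTF-8 BOM, then code.
my $code = "\xef\xbb\xbfprint 1;\n";
print leading_bytes_hex(\$code, 4), "\n";   # prints "ef bb bf 70"
```

If the output does not start with one of the BOM byte sequences, something precedes or replaces the BOM in the buffer, which would explain why the anchored pattern fails to match.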

-- 
Report problems: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
List etiquette: http://perl.apache.org/maillist/email-etiquette.html
