DevHeads.net

moving from mod_php to mod_fcgid : rewrite problem

Hello
I'm new to the Apache mailing list, so sorry if I'm not 100% clear, and
sorry for this long description.

I have developed a website with PHP/MySQL:
http://www.perspectives-musicales.org and placed it on a good hosting
service (web4all.fr).
To improve search engine ranking, I decided to route all URLs through
/index.php/... and rewrite them so that index.php does not appear in the
URL (a sort of MVC technique combined with SEO).

Example: the catalog is at the URL
http://www.perspectives-musicales.org/en/all-albums
This should be transparently mapped to
http://www.perspectives-musicales.org/index.php/en/all-albums thanks to
the rewrite rule:

RewriteRule ^en/(.*) ./index.php/en/$1

My application then uses $_SERVER["PATH_INFO"] (and not
$_SERVER["QUERY_STRING"]) to retrieve the URL information. This worked
perfectly until last month, when web4all.fr changed the whole system
and separated Apache from PHP, using FastCGI instead of mod_php.

The system is supposed to be more reliable and more efficient this way,
and apparently it is. But the rewrite rule no longer works. So I
investigated and ran some tests:

I have a small test.php that displays PATH_INFO and QUERY_STRING.
You can currently test it here:

http://perspectives-musicales.org/test1/a/b/c
http://perspectives-musicales.org/test2/a/b/c
http://perspectives-musicales.org/test3/a/b/c
http://perspectives-musicales.org/test4/a/b/c

and I set the following rules :

RewriteRule ^test1/(.*) ./test.php/$1
RewriteRule ^test2/(.*) ./test.php?$1
RewriteRule ^test3/(.*) ./test.php?/$1
RewriteRule ^test4/(.*) http://www.perspectives-musicales.org/test.php/$1

None of these four rewrite rules is satisfactory. Here is why:

- test1: the server answers 404 "No input file specified". I think (but
am not sure) that Apache believes test.php is a folder, cannot find it,
and so answers 404.

- test2: the rewrite rule works, but of course the URL information is
no longer in PATH_INFO; it is in QUERY_STRING, as shown in the page
content.

- test3: same as test2.

- test4: almost good. I get the URL info in PATH_INFO, but Apache first
issues a 302 redirect and then changes the URL to
http://www.perspectives-musicales.org/test.php/a/b/c, which loses all
search engine benefit (and also any POST variables).

My host searched several forums, including this one, and could not find
an answer. It seems to be an Apache bug, but I'm not sure, and I have no
bug number to give anyway. If it is a bug, I think test1 demonstrates it.

So here is my question: is there any way to make this rewrite rule work
in FastCGI mode, keeping the information in PATH_INFO without a 302
redirect, and if so, what is the syntax? The Apache version is 2.2.23 and
mod_fcgid is version 2.3.7, with the PHP setting cgi.fix_pathinfo=1.

If there is a way, thank you for your help; I'd be glad to test it. If
not, could you explain why, and how to solve the problem? As a workaround
we used the test4 syntax across the whole site to make it work, but it is
bad for search engines and creates problems in the back office (because
certain back-office functions use POST variables).

I know I can change my code to use QUERY_STRING everywhere instead of
PATH_INFO, but if I can avoid changing and retesting all my websites, it
would be really great.

Thanks a lot for your answer.

Comments

Re: moving from mod_php to mod_fcgid : rewrite pro

By Yehuda Katz at 02/07/2013 - 11:50

On Tue, Feb 5, 2013 at 3:32 PM, Riccardo Cohen
<r.cohen@realty-property.com> wrote:

Probably not an HTTPD bug. More likely a problem with the PHP fcgi
configuration.

- Y

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/07/2013 - 12:00

Thanks for your answer Yehuda

Your rewrite rule will behave like my test2/test3, converting the path
info to a query string, and forcing me to change and double-check my
whole website!

Thanks anyway.

On 07/02/13 16:50, Yehuda Katz wrote:

Re: moving from mod_php to mod_fcgid : rewrite pro

By Hendrik Schmieder at 02/08/2013 - 03:47

Riccardo Cohen wrote:
You have to check your PHP code anyway, since the content of $_SERVER
under mod_php differs from the content of $_SERVER under mod_fcgid.

Hendrik

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/07/2013 - 06:17

Sorry to insist, but I'm really blocked and I really need help.
Here is a short summary for those who don't want to read everything:

I want to rewrite from:

http://www.perspectives-musicales.org/en/all-albums
to
http://www.perspectives-musicales.org/index.php/en/all-albums

My rewrite rule is:

RewriteRule ^en/(.*) ./index.php/en/$1

This works when Apache is running with mod_php, but not when running
mod_fcgid (PHP as CGI). In CGI mode I get a 404 error.

The Apache version is 2.2.23 and mod_fcgid is version 2.3.7, with the
PHP setting cgi.fix_pathinfo=1.

Thanks for your help.

On 05/02/13 21:32, Riccardo Cohen wrote:

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/12/2013 - 03:16

Hello
I received some clues from members of this list; thanks for that.
Unfortunately, my problem is not solved.

It's not that I want others to focus on me, but I'm quite sure that
there is a real problem (if not, why would it work perfectly with
mod_php?). I could not find any solution by searching the web (even with
the help of the host's technical team), and I would like confirmation
that 1) it's not a misunderstanding on my part, and 2) there is no
workaround for it.

So I'd be very pleased to hear from a qualified developer before I
spend two days modifying and retesting my whole application.

Thanks in advance.

On 07/02/13 11:17, Riccardo Cohen wrote:

Re: moving from mod_php to mod_fcgid : rewrite pro

By Ben Johnson at 02/12/2013 - 09:53

On 2/12/2013 2:16 AM, Riccardo Cohen wrote:
I doubt it is a problem with the software. mod_rewrite has been put
through the paces over the years and I'd be shocked if a bug were
uncovered given your rule's relative simplicity.

Before digesting your post in its entirety, I have a couple of questions
first.

1.) Where have you defined the rewrite rule? In a .htaccess file?

2.) Have you defined a RewriteBase? If so, what is it?

3.) Have you reviewed Apache's access log at all?

4.) Have you increased RewriteLogLevel to, say, 4, to see exactly what
the mod_rewrite engine is doing?

-Ben

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/12/2013 - 11:59

Thanks Ben, here are the answers:

1.) In .htaccess.

2.) No change with or without a RewriteBase.

3.) I'll have a look now.

4.) I'll try that. Is it possible to set it in .htaccess, or must I
change the global Apache configuration? (I only have access to my
.htaccess on this hosting.)

Thanks

On 12/02/13 14:53, Ben Johnson wrote:

Re: moving from mod_php to mod_fcgid : rewrite pro

By Ben Johnson at 02/12/2013 - 14:40

On 2/12/2013 10:59 AM, Riccardo Cohen wrote:
Unfortunately, RewriteLogLevel can be set in the "server config" and
"virtual host" contexts only. (You can make this type of determination
in the future by visiting the manual page and looking for the "context"
value:
http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#rewriteloglevel .)

This is one of many reasons for which hosting on a VPS over which you
have complete control is beneficial.

In any case, we'll have to proceed without access to the rewrite log.

Is there a specific reason for which you're using "./index.php" in the
right-hand side of the rule? I'm referring to the period ("."), in
particular. This may well be the source of the problem. It could be that
mod_php interprets that relative path (./index.php) "correctly", whereas
mod_fcgid does not.

Try this:

RewriteRule ^en/(.*) index.php/en/$1

-Ben

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/12/2013 - 16:19

Hi Ben
I tried without the dot: RewriteRule ^en/(.*) index.php/en/$1 but it
also gave a 404 error.

These are all my tests (available at
http://www.perspectives-musicales.org/test1/a/b/c etc.):

RewriteRule ^test1/(.*) ./test.php/$1
# = error 404

RewriteRule ^test2/(.*) ./test.php?$1
# = parameters are in query_string instead of path_info

RewriteRule ^test3/(.*) ./test.php?/$1
# = parameters are in query_string instead of path_info

RewriteRule ^test4/(.*) http://www.perspectives-musicales.org/test.php/$1
# = redirection 302

RewriteRule ^test5/(.*) test.php/$1
# = error 404

RewriteRule ^test6/(.*) /test.php/$1
# = error 404

I could not find the apache error log, so I'll ask my hosting support
team and get back to you

Thanks for your help.

On 12/02/13 19:40, Ben Johnson wrote:

Re: moving from mod_php to mod_fcgid : rewrite pro

By Ben Johnson at 02/13/2013 - 12:48

On 2/12/2013 3:19 PM, Riccardo Cohen wrote:
It would be helpful to know what, exactly, appears in Apache's access
log (and/or error log, if you can manage to find that, too) in each of
these test cases.

I hit this URL and from what I can tell, the 404 response header is
coming from PHP, not Apache. The output is "No input file specified."
This doesn't look like a "stock" Apache 404 response. Did you build
logic into test.php that emits a 404 response header and this message
when some parameter is absent from the URL?

Why is this a problem?

It should be stated that mod_php and mod_fcgid populate these values in
different ways. From what I understand, PATH_INFO is less reliable and
less well-implemented than QUERY_STRING. Fundamentally, this is why you
are observing different behavior/values here after moving from mod_php
to mod_fcgid.

Same as above.

I don't see a 302 response for this one. I see the same 404 and message
as above. Maybe you changed something after sending this message.

Same as the others with 404 responses.

You're welcome. I'll wait to hear back before offering additional
information.

-Ben

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/13/2013 - 17:14

Hi Ben

I asked for the Apache error log and found no relevant error in it.
The only entry comes from a request made before I added the new
.htaccess; there is nothing else:

[Tue Feb 12 21:04:17 2013] [error] [client 90.24.101.9] File does not
exist: /datas/vol1/w4a125552/var/www/perspectives-musicales.org/test6

The access log shows all requests normally, with no particular message:

90.24.101.9 - - [12/Feb/2013:21:04:46 +0100] "GET /test1/a/b/c HTTP/1.1"
404 45 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
Firefox/18.0" "20130212210446"

90.24.101.9 - - [12/Feb/2013:21:04:51 +0100] "GET /test2/a/b/c HTTP/1.1"
200 52 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
Firefox/18.0" "20130212210451"

90.24.101.9 - - [12/Feb/2013:21:04:56 +0100] "GET /test4/a/b/c HTTP/1.1"
302 206 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
Firefox/18.0" "20130212210456"

90.24.101.9 - - [12/Feb/2013:21:03:28 +0100] "GET /test5/a/b/c HTTP/1.1"
404 45 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
Firefox/18.0" "20130212210328"

90.24.101.9 - - [12/Feb/2013:21:04:17 +0100] "GET /test6/a/b/c HTTP/1.1"
404 45 "-" "Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101
Firefox/18.0" "20130212210417"

test.php is only this :

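ok test
<br>
<?php
// Either variable may be unset depending on the SAPI, so default to "".
$info = isset($_SERVER["PATH_INFO"]) ? $_SERVER["PATH_INFO"] : "";
echo "INFO=" . $info . "<br>";
$query = isset($_SERVER["QUERY_STRING"]) ? $_SERVER["QUERY_STRING"] : "";
echo "query=" . $query . "<br>";
?>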

maybe the error comes from mod_fcgid itself ?
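
For anyone reproducing these tests, a slightly broader diagnostic than test.php may help show what PHP actually received from the web server. The following is only a sketch and not the script used in this thread: describe_request() is a hypothetical helper, and which of these $_SERVER keys are populated depends on the server and PHP configuration. It can only report anything in the cases where PHP really runs the script (test2 to test4 here), not in the 404 cases.

```php
<?php
/**
 * Build one "KEY=value" line per request-related server variable.
 * Hypothetical diagnostic helper; keys that the SAPI did not set are
 * reported as "(unset)" instead of raising a notice.
 */
function describe_request(array $server)
{
    $keys = array(
        'REQUEST_URI',
        'SCRIPT_NAME',
        'SCRIPT_FILENAME',
        'PATH_INFO',
        'ORIG_PATH_INFO',
        'PATH_TRANSLATED',
        'QUERY_STRING',
        'REDIRECT_URL',
    );

    $lines = array();
    foreach ($keys as $key) {
        $value = isset($server[$key]) ? $server[$key] : '(unset)';
        $lines[] = $key . '=' . $value;
    }
    return $lines;
}

// Plain text output avoids having to HTML-escape the values.
if (!headers_sent()) {
    header('Content-Type: text/plain');
}
echo implode("\n", describe_request($_SERVER)), "\n";
```

Comparing the output of the same URL under mod_php and under mod_fcgid should show which variables differ between the two setups.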

My whole web application is developed with URLs like

http://www.perspectives-musicales.org/en/all-associations

for search engine optimization, where "en" and "all-associations" are
not pages or directories, but program arguments (replacing
"?lang=en&command=all-associations", which is poor for SEO).

So, as explained in my first email, all arguments to my application
controller are in $_SERVER["PATH_INFO"] (and not
$_SERVER["QUERY_STRING"]). And that worked like a charm with
mod_php... Changing my whole application to read the data from
QUERY_STRING is not very complicated if I wrote a good program ( :) ),
but it will need a lot of checks.

Actually, at the point where I am now, I've already spent some time on it...

I'm not sure that this is a problem with the PATH_INFO variable, since
the error occurs even before PHP has any chance to start executing
(test.php is not executed at all in test1).

I use the Firefox Live HTTP Headers add-on, and it shows a status code
302 ("HTTP/1.1 302 Found"); then the browser redirects to the page as if
it were another website.

I still think that [Apache or mod_fcgid] cannot execute test.php in
test1 simply because it thinks it is a directory and cannot find it.

Re: moving from mod_php to mod_fcgid : rewrite pro

By Ben Johnson at 02/13/2013 - 23:15

On 2/13/2013 4:14 PM, Riccardo Cohen wrote:
Very good. No problems there.

This seems to imply that Apache is not generating the 404 errors; if it
were, one would expect access log entries to that effect.

Quite possibly. In fact, a search for "mod_fcgid No input file
specified" yields the following article:

http://isp-control.net/forum/printthread.php?tid=12653

Of particular import is the suggestion, "Okay, this may be caused by
either (1) apache sending an incorrect path to the php file to php5-cgi;
or (2) something (permissions?) that prevents php5-cgi from running the
script."

Do other PHP scripts function as expected when executed via mod_fcgid?
Or do they all return the error string, "No input file specified" and a
404 response?

Right; I built a PHP framework that uses so-called "clean-URLs", and am
well-versed in the theory behind this approach, as well as its
execution. The rationale seems sound.

My PHP framework functions the same way via mod_php as it does with
mod_fcgid and mod_fastcgi. I achieved this by using a well-known
technique to rewrite the URLs (I place these directives into the
site-root's .htaccess file):

<IfModule mod_rewrite.c>
RewriteEngine on
Options All

# Modify the RewriteBase if you are using a subdirectory and the
# rewrite rules are not working properly:
# WARNING: Do not include a trailing slash on this directive if you
# include a path other than /!
#RewriteBase /

# Rewrite URIs of the form 'index.php?q=x' (except for real
# files/directories):
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>

(WordPress, Joomla, and many other frameworks do something similar.)

Then, in PHP, $_GET['q'] will always contain the "clean URL" (unless, of
course, the 'q' value is overwritten, e.g., the URL contains
"?q=something-else"). For this reason, you may wish to use something
other than "q" in the RewriteRule value. You can then parse the
clean URL to obtain its individual segments and do with them as you
will. While oversimplified, an example is to call explode('/',
trim($_GET['q'], '/')) in PHP. This will return an array that contains
the various "path segments". The URL
http://www.perspectives-musicales.org/en/all-associations would return

array (size=2)
0 => string 'en' (length=2)
1 => string 'all-associations' (length=16)
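
As a rough illustration of the explode()/trim() approach just described, the sketch below wraps it in a small function. The name route_segments() is hypothetical, and the handling of empty input and doubled slashes is an added assumption rather than something taken from the framework mentioned above.

```php
<?php
/**
 * Split a clean-URL value (e.g. the 'q' parameter set by the
 * RewriteRule) into its path segments. Hypothetical helper.
 */
function route_segments($q)
{
    // Drop leading/trailing slashes so "/en/all-associations/" and
    // "en/all-associations" give the same result.
    $trimmed = trim((string) $q, '/');
    if ($trimmed === '') {
        return array();
    }
    // Split on "/" and discard empty pieces left by doubled slashes.
    return array_values(array_filter(explode('/', $trimmed), 'strlen'));
}

// 'q' is the parameter name used in the RewriteRule; it is absent when
// the site root is requested, hence the isset() guard.
$q = isset($_GET['q']) ? $_GET['q'] : '';
$segments = route_segments($q);
```

With this in place, the controller can read $segments[0] as the language and $segments[1] as the command, after checking that those indexes exist.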

Granted, undertaking this approach would mean rewriting certain aspects
of your application, but chances are that you'll thank yourself later.
You'll have a much more portable application that is
scripting-language-agnostic, with respect to URL structure. (Switching
to another scripting language requires a simple change to your
RewriteRule only.)
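
If the application reads PATH_INFO in only a few places, one possible way to limit the rewrite is a small shim that returns the path from whichever source is populated. This is a sketch under assumptions: request_path() is a hypothetical helper, 'q' is the parameter name from the RewriteRule above, and it presumes the rest of the application expects a value shaped like PATH_INFO (with a leading slash).

```php
<?php
/**
 * Return the application path (e.g. "/en/all-albums") from PATH_INFO
 * when the server provides it, otherwise from the rewritten query
 * parameter. Hypothetical helper.
 */
function request_path(array $server, array $get, $param = 'q')
{
    // mod_php style: /index.php/en/all-albums gives PATH_INFO "/en/all-albums".
    if (isset($server['PATH_INFO']) && $server['PATH_INFO'] !== '') {
        return $server['PATH_INFO'];
    }
    // Query-string style: index.php?q=en/all-albums.
    if (isset($get[$param]) && is_string($get[$param]) && $get[$param] !== '') {
        return '/' . ltrim($get[$param], '/');
    }
    // Nothing supplied: treat the request as the site root.
    return '/';
}

$path = request_path($_SERVER, $_GET);
```

The controller would then use $path wherever it previously read $_SERVER["PATH_INFO"] directly, so the same code runs under either rewrite style.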

This may be for the reasons outlined in the article that I cited above.
If you'd like to post your CGI wrapper script, I'd be happy to take a
look. Alas, you may lack access to this script, in which case, it's a
moot point. Although, I must say, it seems unlikely that your host would
have misconfigured the wrapper script. (Then again, we've all seen worse.)

You're right; I checked this again, and I do see the 302 redirect. I
think it was a matter of enabling the "Persist" feature in Firebug.
(Otherwise, the "Net" panel is refreshed after the redirect is sent.)
Thanks for double-checking your work here!

That may very well be. And the solution I offered above should address
that shortcoming.

I can't tell you exactly why it doesn't work (only a VPS with shell
access would make that possible), but I can tell you what *does* work.

I'm happy to answer any questions.

Good luck!

-Ben

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/14/2013 - 05:47

Hi Ben

Actually, my website redirects all URLs to /index.php, so there is only
one PHP file (which loads many others). This way I can implement a
request controller easily.

Yes, I understand now that QUERY_STRING is more portable than PATH_INFO.
Thanks for the tip.

See the answer from Benoit Georgelin; he kindly added the script to this
thread. I hope this is the beginning of a solution. I know absolutely
nothing about these wrapper scripts.

Re: moving from mod_php to mod_fcgid : rewrite pro

By Ben Johnson at 02/14/2013 - 12:02

On 2/14/2013 4:47 AM, Riccardo Cohen wrote:
Okay, so if index.php can be accessed and doesn't return a 404, the
problem is not with the CGI setup, as a whole.

You're welcome.

I have responded to his message.

As I told him, whatever the source of this problem, it seems to be with
PHP and not Apache.

Here are two PHP bug reports that appear to be relevant, both still open:

Bug #51983 [fpm sapi] pm.status_path not working when cgi.fix_pathinfo=1
https://bugs.php.net/bug.php?id=51983

Bug #55208 setting correct SCRIPT_NAME vs PHP_SELF is impossible in
certain circumstances
https://bugs.php.net/bug.php?id=55208

Basically, the consensus is that the cgi.fix_pathinfo directive is a mess.

I wouldn't hold your breath for a resolution. If I were you, I would
take my good advice, use the Query String Append [QSA] flag, and be done
with it for good.

In any case, this discussion should probably be moved to a PHP forum or
mailing list at this point.

Thanks,

-Ben

Re: moving from mod_php to mod_fcgid : rewrite pro

By Riccardo Cohen at 02/15/2013 - 12:21

Hi Ben

I'll follow your advice. I'm really grateful for your help, and wish to
thank you very very much.

Bye

On 14/02/13 17:02, Ben Johnson wrote: