There are several reasons why you might need to reverse proxy a web site: security, high availability, and so on.
My favorite software for implementing a reverse proxy is (of course) Apache.
Two modules that make this possible are mod_proxy and mod_rewrite; you can often combine them for the most flexible results.
mod_proxy lets you publish internal HTTP, FTP and AJP (that is, Tomcat webapps) sites, while mod_rewrite can be a godsend for rewriting URLs, for example to provide SEO-friendly URLs or to redirect users away from moved applications.
mod_rewrite also has proxying capabilities (its [P] flag hands the request over to mod_proxy), but recent versions of mod_proxy offer better features, such as load balancing through mod_proxy_balancer and caching in combination with mod_cache.
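To make the division of labour concrete, here is a minimal sketch of a virtual host fragment that publishes an internal application with mod_proxy and uses mod_rewrite only for a redirect. The hostname internal.example.com, the port and the /app/ and /oldapp/ paths are placeholders I made up for illustration; adapt them to your own setup.

```apache
# Assumes mod_proxy, mod_proxy_http and mod_rewrite are loaded.
# Hostname, port and paths below are hypothetical.

# A reverse proxy should not also act as an open forward proxy.
ProxyRequests Off

# Publish the internal application under /app/.
ProxyPass        /app/ http://internal.example.com:8080/app/
# Fix up Location headers in redirects sent back by the backend.
ProxyPassReverse /app/ http://internal.example.com:8080/app/

# Send users of a moved application to its new path.
RewriteEngine On
RewriteRule ^/oldapp/(.*)$ /app/$1 [R=301,L]
```

Note that ProxyPassReverse only touches response headers, not the page body.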
None of these modules can change the content of the proxied pages, yet sometimes that is exactly what you need: for example, a proxied application may contain hardcoded absolute URLs that differ from those of your reverse proxy.
The most widely used module for changing strings in the output of a reverse proxy is mod_proxy_html.
It lets you intercept the HTML produced by the proxied application and substitute or add strings in it, and it can be used together with mod_proxy to provide a more functional reverse proxy.
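As a rough sketch of how mod_proxy_html is typically wired in, assuming a backend at the made-up address internal.example.com published under /app/, the configuration might look like this. Depending on the module version you may also need to include the ProxyHTMLLinks definitions from the proxy_html.conf file that ships with it.

```apache
# Assumes mod_proxy, mod_proxy_http, mod_proxy_html and mod_headers are loaded.
# Hostname and paths are hypothetical.
ProxyRequests Off
ProxyPass        /app/ http://internal.example.com/
ProxyPassReverse /app/ http://internal.example.com/

<Location /app/>
    # Run responses through the mod_proxy_html output filter.
    SetOutputFilter proxy-html
    # Map absolute backend URLs found in links to the public path.
    ProxyHTMLURLMap http://internal.example.com/ /app/
    # Map root-relative links as well.
    ProxyHTMLURLMap / /app/
    # Ask the backend for uncompressed pages so the filter can parse them.
    RequestHeader unset Accept-Encoding
</Location>
```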
Sometimes even the power of mod_proxy_html is insufficient, typically because it fails to understand the HTML markup or scripting produced by the proxied application.
For these cases I found another module, which is currently in trunk and will be part of the next Apache release: mod_sed.
Beware that mod_sed is still in its infancy (I would not recommend it for a big web site of a major Italian financial institution), but I found it very handy when other solutions couldn't globally replace some patterns.
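For the record, a global replacement with mod_sed can be sketched as follows. This is an illustration and not the configuration I actually used: the hostnames are invented, and SetOutputFilter applies the filter to every response under the location, so in practice you would want to narrow it down to text content.

```apache
# Assumes mod_proxy, mod_proxy_http, mod_sed and mod_headers are loaded.
# Hostnames and paths are hypothetical.
ProxyPass        /app/ http://internal.example.com/
ProxyPassReverse /app/ http://internal.example.com/

<Location /app/>
    # Pass the response body through mod_sed.
    SetOutputFilter Sed
    # Replace every occurrence of the internal hostname with the public one.
    OutputSed "s/internal\.example\.com/www.example.com/g"
    # Ask the backend for uncompressed content so sed sees plain text.
    RequestHeader unset Accept-Encoding
</Location>
```

Unlike mod_proxy_html, this works on plain lines of text, so it also catches strings inside scripts or malformed markup.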
One final suggestion is mod_security.
This great piece of software is an application-level firewall: it (tries to) filter out any malicious request before the request is forwarded to the proxied application.
Sometimes you just do not feel very comfortable about the security of some major PHP application, or you do not blindly trust the latest Microsoft technology du jour; this little friend could be the extra layer of security you were looking for…
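To give an idea of what this looks like, here is a deliberately tiny sketch for ModSecurity 2.x placed in front of a proxied application. The rule, its id and the backend hostname are made up for illustration; a real deployment would rely on a maintained rule set rather than a single hand-written rule.

```apache
# Assumes ModSecurity 2.x, mod_proxy and mod_proxy_http are loaded.
# The rule id, message and backend hostname are hypothetical.

# Turn on rule processing.
SecRuleEngine On
# Inspect request bodies (POST data) as well.
SecRequestBodyAccess On

# Deny requests whose parameters contain "<script" (case-insensitive).
SecRule ARGS "@contains <script" \
    "id:100001,phase:2,t:lowercase,deny,status:403,msg:'Possible XSS attempt'"

# Requests that pass the rules are forwarded to the backend.
ProxyPass        /app/ http://internal.example.com/
ProxyPassReverse /app/ http://internal.example.com/
```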
That’s it for now. Please let me know what you think, or tell me about your own reverse proxying experiences.
PS: “Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo.”
– Brian Moore
bem@news.cmc.net (quoted from mod_rewrite docs)
Tags: apache, CPE1704TKS, linux, mod_sed, reverse proxy
September 22, 2009 at 1:15 am |
mod_sed is more mature than you give it credit for. I wouldn’t hesitate to use it as a general-purpose text filter. See http://bahumbug.wordpress.com/2008/04/28/sed-in-apache/ .
(speaking as an apache core dev, and author of mod_proxy_html)
September 22, 2009 at 2:08 pm |
wow a post from a core apache dev, I’m honored!
Anyway, I can confirm it works and I have personally used it in some projects but, as the link you quote says, “it’s bleeding-edge, with all that implies”. But if you advise it for production, I’ll tell my friend working in a “big Italian financial institution” 8)