http://www.websiteoptimization.com/speed/tweak/cache/
http://metaskills.net/2006/02/19/how-to-control-browser-caching-with-apache-2/
https://www.humboldt.co.uk/the-mystery-of-proxypassreverse/
https://httpd.apache.org/docs/current/mod/mod_proxy.html
ProxyPass
ProxyPassReverse
Adjusts the path in Set-Cookie HTTP headers on redirect responses when acting as a reverse proxy in order to avoid the client becoming aware that resources are mirrored.
ProxyPass / balancer://mycluster/rolling stickysession=JSESSIONID|jsessionid nofailover=Off
ProxyPassReverse /rolling /
ProxyPassReverseCookieDomain / rolling.com
ProxyPassReverseCookiePath /rolling /
A traditional HTTP proxy, also called a forward proxy, accepts requests from clients (usually web browsers), contacts the remote server, and returns the responses.
A reverse proxy is a web server that is placed in front of other servers, providing a unified frontend and acting as a gateway. As far as the web browsers are concerned, the reverse proxy is the “real” server, as that is the only one they interact with. The reverse proxy relays requests as necessary to the backend servers.
ProxyPass /crm http://crm.example.com/
ProxyPass /bugzilla http://backend.example.com/bugzilla
A reverse proxy can provide a unified frontend to a number of backend resources, associating certain URLs on the frontend machine to specific backend web servers.
ProxyPass path url
This command runs on an ordinary server and translates requests for a named directory and below to a demand to a proxy server. So, on our ordinary Butterthlies site, we might want to pass requests to /secrets onto a proxy server darkstar.com:
ProxyPass /secrets http://darkstar.com
ProxyPassReverse path url
A reverse proxy is a way to masquerade one server as another — perhaps because the "real" server is behind a firewall or because you want part of a web site to be served by a different machine but not to look that way. It can also be used to share loads between several servers — the frontend server simply accepts requests and forwards them to one of several backend servers.
Mac Apache httpd conf:
Hiding the Backend Servers
ProxyPass /crm http://crm.example.com
ProxyPassReverse /crm http://crm.example.com
ProxyErrorOverride On
Sometimes, however, the backend server will issue redirects or error pages that contain references to itself, for example in the Location: header.
The ProxyPassReverse directive will intercept these headers and rewrite them so that they include a reference to the reverse proxy (www.example.com) instead. The ProxyPassReverseCookiePath and ProxyPassReverseCookieDomain directives operate similarly, but on the path and domain strings in Set-Cookie: headers.
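As a rough sketch of how the three rewriting directives fit together, the fragment below proxies /crm/ to a backend application that lives under a different path. The backend path /app/ and the cookie domain and path the backend is assumed to set are illustrative assumptions, not taken from the configuration above.

```apache
# Assumption: the backend serves the application under /app/ and sets
# cookies with Domain=crm.example.com and Path=/app/.
ProxyPass        "/crm/" "http://crm.example.com/app/"

# Rewrite redirect headers (e.g. Location:) that point at the backend.
ProxyPassReverse "/crm/" "http://crm.example.com/app/"

# Rewrite Domain= in Set-Cookie: internal-domain -> public-domain.
ProxyPassReverseCookieDomain "crm.example.com" "www.example.com"

# Rewrite Path= in Set-Cookie: internal-path -> public-path.
ProxyPassReverseCookiePath "/app/" "/crm/"
```

Both cookie directives take the internal (backend) value first and the public value second.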
http://northernmost.org/blog/mod_log_forensic-howto/
mod_log_forensic howto
LoadModule log_forensic_module modules/mod_log_forensic.so
LoadModule unique_id_module modules/mod_unique_id.so
ForensicLog logs/forensic_log
http://feitianbenyue.iteye.com/blog/2056357
http://www.akadia.com/services/apache_redirect.html
Proxy Module
The Forward Proxy vs The Reverse Proxy
While a forward proxy proxies on behalf of clients (or requesting hosts), a reverse proxy proxies on behalf of servers. A reverse proxy accepts requests from external clients on behalf of servers stationed behind it.
To the client in our example, it is the reverse proxy that is providing file transfer services. The client is oblivious to the file transfer servers behind the proxy, which are actually providing those services. In effect, whereas a forward proxy hides the identities of clients, a reverse proxy hides the identities of servers.
Common uses for a reverse proxy server include:
Load balancing
Web acceleration – Reverse proxies can compress inbound and outbound data, as well as cache commonly requested content, both of which speed up the flow of traffic between clients and servers. They can also perform additional tasks such as SSL encryption to take load off of your web servers, thereby boosting their performance.
Security and anonymity
http://blogs.citrix.com/2010/10/04/reverse-vs-forward-proxy/
http://stackoverflow.com/questions/4119213/how-to-set-level-logging-to-debug-in-tomcat
https://tomcat.apache.org/tomcat-8.0-doc/config/valve.html
server.xml:
https://tomcat.apache.org/tomcat-7.0-doc/config/valve.html#Access_Log_Valve
http://www.symantec.com/connect/blogs/enhancing-tomcat-logging-improved-forensics
http://www.techstacks.com/howto/configure-access-logging-in-tomcat.html
Combined Log Format: %{X-Forwarded-For}i %l %u %t %r %s %b %{User-Agent}i %{Referer}i
http://serverfault.com/questions/391457/how-does-apache-merge-multiple-matching-location-sections
<Location ~ "/solr">
Options FollowSymLinks
Order allow,deny
Allow from all
ProxyPass http://ip:port/solr
ProxyPassReverse http://localhost/solr
</Location>
https://community.akamai.com/community/web-performance/blog/2015/09/16/is-caching-on-akamai-impacted-by-vary-header
http://my.globaldots.com/knowledgebase.php?action=displayarticle&id=32
https://www.fastly.com/blog/best-practices-for-using-the-vary-header
http://metaskills.net/2006/02/19/how-to-control-browser-caching-with-apache-2/
<IfModule mod_expires.c>
ExpiresActive On
ExpiresDefault "access plus 1 seconds"
ExpiresByType text/html "access plus 1 seconds"
ExpiresByType image/gif "access plus 120 minutes"
ExpiresByType image/jpeg "access plus 120 minutes"
ExpiresByType image/png "access plus 120 minutes"
ExpiresByType text/css "access plus 60 minutes"
ExpiresByType text/javascript "access plus 60 minutes"
ExpiresByType application/x-javascript "access plus 60 minutes"
ExpiresByType text/xml "access plus 60 minutes"
</IfModule>
http://www.askapache.com/htaccess/apache-speed-cache-control/
This code uses the FilesMatch directive and the Header directive to add Cache-Control Headers to certain files.
# 480 weeks
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
Header set Cache-Control "max-age=290304000, public"
</FilesMatch>
# 2 DAYS
<FilesMatch "\.(xml|txt)$">
Header set Cache-Control "max-age=172800, public, must-revalidate"
</FilesMatch>
# 2 HOURS
<FilesMatch "\.(html|htm)$">
Header set Cache-Control "max-age=7200, must-revalidate"
</FilesMatch>
If you are using far Future Expires Headers and Cache-Control (recommended), you can do this for these files.
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
</FilesMatch>
https://www.humboldt.co.uk/the-mystery-of-proxypassreverse/
The mod_proxy_ajp module for Apache has many advantages over mod_jk for connecting a Tomcat server to an Apache front. For me, the crucial advantage was the ProxyPassReverseCookiePath directive, which allows me to map the session cookies of a Tomcat web application (other than the root application) into the root of a virtual host.
Unfortunately, many tutorials contain misleading advice, and recommend this pattern for the ProxyPassReverse, which will break if the web application issues a redirect:
ProxyPass /jspdir ajp://localhost:8009/jspdir
ProxyPassReverse /jspdir ajp://localhost:8009/jspdir
The purpose of ProxyPassReverse is to rewrite the headers of HTTP redirect responses by a simple string substitution. Unfortunately, when the web application sends a redirect, it will send a redirect to an http: URL, not an ajp: URL. This will not match the argument of ProxyPassReverse, so the header will be passed through unchanged.
The working form looks like this, in a more complete example:
<VirtualHost *:80>
ServerName www.example.com
...
ProxyRequests Off
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPass / ajp://localhost:8009/jspdir/
ProxyPassReverse / http://www.example.com/jspdir/
ProxyPassReverseCookiePath /jspdir /
...
</VirtualHost>
https://httpd.apache.org/docs/current/mod/mod_proxy.html
ProxyPass
This directive allows remote servers to be mapped into the space of the local server. The local server does not act as a proxy in the conventional sense but appears to be a mirror of the remote server. The local server is often called a reverse proxy or gateway. The path is the name of a local virtual path; url is a partial URL for the remote server and cannot include a query string.
For example, a ProxyPass that maps the path /mirror/foo/ to the url http://backend.example.com/ will cause a local request for http://example.com/mirror/foo/bar to be internally converted into a proxy request to http://backend.example.com/bar.
The ! directive is useful in situations where you don't want to reverse-proxy a subdirectory. For example, excluding /mirror/foo/i with ! will proxy all requests to /mirror/foo to backend.example.com except requests made to /mirror/foo/i.
ProxyPassReverse
Adjusts the URL in HTTP response headers sent from a reverse proxied server.
path is the name of a local virtual path; url is a partial URL for the remote server. These parameters are used the same way as for the ProxyPass directive.
For example, suppose the local server has address http://example.com/; then a ProxyPass and ProxyPassReverse pair mapping /mirror/foo/ to http://backend.example.com/ will not only cause a local request for http://example.com/mirror/foo/bar to be internally converted into a proxy request to http://backend.example.com/bar (the functionality which ProxyPass provides here). It also takes care of redirects which the server backend.example.com sends when redirecting http://backend.example.com/bar to http://backend.example.com/quux. Apache httpd adjusts this to http://example.com/mirror/foo/quux before forwarding the HTTP redirect response to the client. Note that the hostname used for constructing the URL is chosen with respect to the setting of the UseCanonicalName directive.
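The mapping, the exclusion, and the redirect rewriting described in this excerpt can be sketched together as follows. The paths and backend host are the ones used in the prose; the directive lines are a reconstruction and should be checked against the mod_proxy documentation before use.

```apache
# Exclusions must be listed before the broader mapping they carve out of.
ProxyPass        "/mirror/foo/i" "!"

# Map the local virtual path onto the backend server.
ProxyPass        "/mirror/foo/"  "http://backend.example.com/"

# Rewrite backend redirects so they point back at the proxy's public URL.
ProxyPassReverse "/mirror/foo/"  "http://backend.example.com/"
```

Keeping the trailing slashes consistent on both the path and the url avoids doubled or missing slashes in the rewritten request.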
ProxyPassReverseCookiePath internal-path public-path
Useful in conjunction with ProxyPassReverse in situations where backend URL paths are mapped to public paths on the reverse proxy. This directive rewrites the path string in Set-Cookie headers. If the beginning of the cookie path matches internal-path, the cookie path will be replaced with public-path.
Mac Apache httpd conf:
/etc/apache2/
/var/log/apache2
sudo apachectl restart
The Apache JServ Protocol (AJP) is a binary protocol that can proxy inbound requests from a web server through to an application server that sits behind the web server.
http://feitianbenyue.iteye.com/blog/2056357
<Valve className="org.apache.catalina.valves.RemoteIpValve"
       remoteIpHeader="X-Forwarded-For"
       protocolHeader="X-Forwarded-Proto"
       protocolHeaderHttpsValue="https"/>
Enabling the Apache ProxyPreserveHost directive
The ProxyPreserveHost directive is used to instruct Apache mod_proxy, when acting as a reverse proxy, to preserve and retain the original Host: header from the client browser when constructing the proxied request to send to the target server.
The default setting for this configuration directive is Off, indicating to not preserve the Host: header and instead generate a Host: header based on the target server's hostname.
Because this is often not what is wanted, you should add the ProxyPreserveHost On directive to the Apache HTTPD configuration, either in httpd.conf or related/equivalent configuration files.
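A minimal sketch of where the directive might sit in a reverse proxy configuration. The backend host name and port are hypothetical placeholders.

```apache
# Pass the client's original Host: header through to the backend
# instead of generating one from the ProxyPass target.
ProxyPreserveHost On

# Assumption: backend.example.com:8080 stands in for the real target server.
ProxyPass        "/" "http://backend.example.com:8080/"
ProxyPassReverse "/" "http://backend.example.com:8080/"
```

With this setting the backend sees the public hostname, which matters when the application builds absolute URLs or selects a virtual host from the Host: header.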
http://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypassreverse
Note that this ProxyPassReverse directive can also be used in conjunction with the proxy feature (RewriteRule ... [P]) from mod_rewrite because it doesn't depend on a corresponding ProxyPass directive.
Proxy Module
ProxyPass
The directive ProxyPass allows remote servers to be mapped into the space of the local server; the local server does not act as a proxy in the conventional sense, but appears to be a mirror of the remote server.
Suppose the local server has address http://wibble.org/; then
ProxyPass /mirror/foo/ http://foo.com/
will cause a local request for the <http://wibble.org/mirror/foo/bar> to be internally converted into a proxy request to <http://foo.com/bar>.
ProxyPassReverse
The directive ProxyPassReverse lets Apache adjust the URL in the Location header on HTTP redirect responses. This is essential when Apache is used as a reverse proxy, because otherwise HTTP redirects issued by the backend servers behind the proxy would let clients bypass it.
Suppose the local server has address http://wibble.org/; then
ProxyPass /mirror/foo/ http://foo.com/
ProxyPassReverse /mirror/foo/ http://foo.com/
will not only cause a local request for <http://wibble.org/mirror/foo/bar> to be internally converted into a proxy request to <http://foo.com/bar> (the functionality ProxyPass provides here). It also takes care of redirects the server foo.com sends: when foo.com redirects http://foo.com/bar to http://foo.com/quux, Apache adjusts this to http://wibble.org/mirror/foo/quux before forwarding the HTTP redirect response to the client.
Configure tomcat with httpd
mod_proxy_ajp
- Include two directives in your httpd.conf file for each web application that you wish to forward to Tomcat. For example, to forward an application at context path /myapp:
ProxyPass /myapp http://localhost:8081/myapp
ProxyPassReverse /myapp http://localhost:8081/myapp
which tells Apache to forward URLs of the form http://localhost/myapp/* to the Tomcat connector listening on port 8081.
- Configure your copy of Tomcat to include a special <Connector> element, with appropriate proxy settings, for example:
<Connector port="8081" ... proxyName="www.mycompany.com" proxyPort="80"/>
which will cause servlets inside this web application to think that all proxied requests were directed to www.mycompany.com on port 80.
AJP
- Well supported
- More compact a protocol
- Easily configured in Apache and Tomcat
AJP looks slightly ahead for two reasons:
- You’re not opening up HTTP on Tomcat (in fact you should close it for security)
- As it’s a more compact protocol, there’s less traffic between the front server and the back server
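# within the VirtualHost section; use one of the two alternatives below, not both
# Option 1: HTTP, assuming tomcat's HTTP connector is on port 8080
ProxyPass / http://localhost:8080/
ProxyPassReverse / http://localhost:8080/
# Option 2: AJP, assuming tomcat's AJP connector is on port 8009
ProxyPass / ajp://localhost:8009/
ProxyPassReverse / ajp://localhost:8009/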
The term Virtual Host refers to the practice of running more than one web site (such as company1.example.com and company2.example.com) on a single machine. Virtual hosts can be "IP-based", meaning that you have a different IP address for every web site, or "name-based", meaning that you have multiple names running on each IP address. The fact that they are running on the same physical server is not apparent to the end user.
IP-based virtual hosts use the IP address of the connection to determine the correct virtual host to serve. Therefore you need to have a separate IP address for each host. With name-based virtual hosting, the server relies on the client to report the hostname as part of the HTTP headers. Using this technique, many different hosts can share the same IP address.
Name-based virtual hosting is usually simpler, since you need only configure your DNS server to map each hostname to the correct IP address and then configure the Apache HTTP Server to recognize the different hostnames. Name-based virtual hosting also eases the demand for scarce IP addresses. Therefore you should use name-based virtual hosting unless you are using equipment that explicitly demands IP-based hosting.
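A minimal sketch of name-based virtual hosting for the two example sites, assuming Apache 2.4 (2.2 additionally needed a NameVirtualHost line). The DocumentRoot paths are hypothetical.

```apache
# Both hosts share the same address and port; Apache picks the block
# whose ServerName matches the Host: header sent by the client.
<VirtualHost *:80>
    ServerName company1.example.com
    # Assumption: illustrative content directory
    DocumentRoot "/var/www/company1"
</VirtualHost>

<VirtualHost *:80>
    ServerName company2.example.com
    # Assumption: illustrative content directory
    DocumentRoot "/var/www/company2"
</VirtualHost>
```

The first VirtualHost listed for an address and port also acts as the default when no ServerName matches the request.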
http://www.jscape.com/blog/bid/87783/Forward-Proxy-vs-Reverse-Proxy
http://blogs.citrix.com/2010/10/04/reverse-vs-forward-proxy/
Reverse Proxies
A key component of Reverse Proxies is the ability to perform TCP Multiplexing. This means that incoming connections are terminated and pooled, and new connections are established on the back-end using a smaller number of server connections, which gives a TCP Multiplexing Ratio. A typical TCP Mux ratio is 10:1, that is, ten incoming connections to one back-end connection. Another benefit is that the back-end connections to the servers are kept open even when the incoming connections terminate, so they can be re-used when new incoming connections arrive. This reduces the time spent establishing server connections and so improves performance.
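Apache's mod_proxy does not advertise a multiplexing ratio, but its backend connection pool is loosely analogous: connections to the backend are kept and reused across client requests. The sketch below only illustrates the pool parameters; the backend host and the numbers are assumptions chosen for the example.

```apache
# max: upper limit on pooled connections to this backend
#      (applies per httpd child process)
# ttl: seconds an inactive pooled connection may live before being discarded
ProxyPass "/app/" "http://backend.example.com/app/" max=10 ttl=60
```

Keeping ttl below the backend's own keep-alive timeout helps avoid reusing a connection the backend has already closed.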
Reverse Proxies are good for:
- Application Delivery including:
- Load Balancing (TCP Multiplexing)
- SSL Offload/Acceleration (SSL Multiplexing)
- Caching
- Compression
- Content Switching/Redirection
- Application Firewall
- Server Obfuscation
- Authentication
- Single Sign On
Forward Proxies are good for:
- Content Filtering
- eMail security
- NAT’ing
- Compliance Reporting
Header append Vary "Accept-Encoding"
#or
Header set Vary "Accept-Encoding"
Header unset Vary
“Many HTTP caches decide that Vary: User-Agent is effectively Vary: * since the number of user-agents in the wild is so large. By asking to Vary on User-Agent you are asking your CDN to store many copies of your resource which is not very efficient for them, hence their turning off caching in this case.”
The HTTP Vary header is used by servers to indicate that the object being served will vary (the content will be different) based on some attribute of the incoming request, such as the requesting client's specified user-agent or language. The Akamai servers cannot cache different versions of the content based on the values of the Vary header. As a result, objects received with a Vary header that contains any value(s) other than "Accept-Encoding" will not be cached. To do so might result in some users receiving the incorrect version of the content (wrong language, etc.)
“Vary: User-Agent is broken for the Internet in general. ...the basic problem is that the user-agents vary so wildly that they are almost unique for every individual (not quite that bad but IE made it a mess by including the version numbers of .Net that are installed on users machines as part of the string). If you Vary on User-Agent then intermediate caches will pretty much end up never caching resources (like Akamai).”
Akamai will cache the content only if Vary header has value "Accept-Encoding".
Vary: Accept-Encoding
Any other values won't be cached. Examples:
Vary: Accept-Encoding,Referer
Vary: Accept-Encoding,User-Agent
Vary: User-Agent
http://my.globaldots.com/knowledgebase.php?action=displayarticle&id=32
An "Accept-Encoding" value to the Vary header is the exception to this rule only when it relates to serving the object compressed. Since compression does not change the content of the object (only its size for transfer), an object that varies only by its compression can be safely cached.
To summarize, Akamai servers will cache the object, based on the configuration settings, in either of the following cases:
- If "Vary: Accept-Encoding" is received and the content is served compressed ("Content-Encoding: gzip").
- If no Vary header at all is received.
If you accidentally send the Vary: User-Agent header and you are using a CDN, you can expect to experience the following nasty problems:
1. Your server load will increase dramatically, since requests will hit the origin instead of getting the content from the CDN.
2. Your users will experience slower response times because your origin will have to recompute the file's content on every request, and the geo-based content delivery will not be activated (the request will have to reach your origin server anyway).
3. Your CDN bill might explode (CDNs usually charge by bandwidth, and they will be serving a lot more content if they are not able to cache it).
The solution for this problem can be:
1. If you don't really need the Vary: User-Agent header, don't send it! (.NET guys - if you use Microsoft BundleConfig to bundle and minify your CSS/JS, you will get this header automatically. You will need to get the System.Web.Optimization source code, remove the header from there, and recompile the DLL).
2. You can try to tell the CDN to ignore the Vary: User-Agent header (in Akamai it is possible by configuration).
https://www.fastly.com/blog/best-practices-for-using-the-vary-header
So now there's an object in the cache that has a little flag on it that says "only to be used for requests that have no Accept-Encoding in the request."
Vary: *
Don't use this, period.
The HTTP RFC says that if a Vary header contains the special header name *, each request for said URL is supposed to be treated as a unique (and uncacheable) request.
This is much better indicated by using Cache-Control: private, which is clearer to anyone reading the response headers. It also signifies that the object shouldn't ever be stored by a shared cache, which is much more secure.
Vary: Cookie
Cookie is probably one of the most unique request headers, and is therefore very bad. Cookies often carry authentication details, in which case you're better off not trying to cache pages, but just passing them through.
However, sometimes cookies are used for A/B testing purposes, in which case it's a good idea to Vary on a custom header and leave the Cookie header intact. This avoids a lot of additional logic to make sure the Cookie header is left for URLs that need it (and are probably not cacheable).