
mysql transport failover

I would like to reduce the mysql transport retry time (or perhaps the
proxymap retry time?), is there a variable that I can tweak down to
reduce the time between retries of mysql transport connection losses?

I'm using mysql for transport_maps and virtual_mailbox_maps.

transport_maps = proxy:mysql:$maps_dir/mysql_transport.cf
virtual_mailbox_maps = mysql:$maps_dir/mysql_aliases.cf

These are configured to contact a local stunnel process which connects
to a mysql cluster over an encrypted connection. This works great,
except when the active node of the cluster crashes (and it seems to be
doing that more frequently lately). The cluster fails-over to the
standby, the connections are re-established and things return to
normal.

When the node fails, postfix naturally cannot communicate over the mysql
connection, until the cluster has failed over. This failover is fairly
fast, within seconds, but I think postfix, probably due to the use of
the proxy map, is not retrying very quickly. Is there a tunable
parameter that I can use to tweak this down to a shorter delay?

The errors that arrive are expected in this scenario; for example, here
is a subset:

Oct 27 13:24:23 mx1 postfix/smtpd[11045]: warning: mysql:/etc/postfix/checks/mysql_suspended.cf: table lookup problem
Oct 27 13:24:37 mx1 postfix/proxymap[14768]: warning: mysql query failed: Lost connection to MySQL server during query
Oct 27 13:24:37 mx1 postfix/trivial-rewrite[11124]: fatal: proxy:mysql:/etc/postfix/maps/mysql_aliases.cf(0,lock|fold_fix): table lookup problem
Oct 27 13:24:38 mx1 postfix/master[7511]: warning: process /usr/lib/postfix/trivial-rewrite pid 11124 exit status 1
Oct 27 13:24:38 mx1 postfix/smtpd[12834]: warning: problem talking to service rewrite: Connection reset by peer
Oct 28 09:01:57 mx1 postfix/smtpd[4945]: warning: problem talking to service rewrite: Success
Oct 28 09:01:57 mx1 postfix/smtpd[4948]: warning: problem talking to service rewrite: Connection reset by peer

Postmaster also gets quite a large number of bounces when this happens:

In: MAIL FROM:< ... at xxx dot net> SIZE=2158 BODY=8BITMIME
Out: 250 2.1.0 Ok
In: RCPT TO:< ... at xxx dot net> ORCPT=rfc822; ... at riseup dot net
Out: 451 4.3.0 < ... at xxx dot net>: Temporary lookup failure
In: DATA
Out: 554 5.5.1 Error: no valid recipients
In: RSET
Out: 250 2.0.0 Ok
In: QUIT
Out: 221 2.0.0 Bye

Presumably these are non-fatal because of the 451, only postmaster sees
them (not the sender), and the deliveries are simply retried. Is that correct?

Thanks for any advice. I haven't found anything specifically related
to this in
http://www.postfix.org/postconf.5.html#command_time_limit but I might
have missed something.

micah
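
On the 451 question in the post above: RFC 5321 groups SMTP reply codes by
their first digit, with 4yz meaning a transient failure and 5yz a permanent
one. The following is a minimal sketch that classifies the replies from the
session transcript. The function name classify_smtp_reply is made up for
illustration, and the transcript alone does not show whether a particular
sending client actually retries.

```python
def classify_smtp_reply(line: str) -> str:
    """Classify an SMTP reply line by the first digit of its code (RFC 5321)."""
    code = int(line.split()[0])
    first_digit = code // 100
    classes = {
        2: "success",
        3: "intermediate",
        4: "transient failure",   # the client may try again later
        5: "permanent failure",   # the command as sent will not succeed
    }
    return classes.get(first_digit, "unknown")


# Replies taken from the session transcript in the post.
for reply in (
    "250 2.1.0 Ok",
    "451 4.3.0 Temporary lookup failure",
    "554 5.5.1 Error: no valid recipients",
):
    print(reply, "->", classify_smtp_reply(reply))
```

In the transcript the recipient is refused with the transient 451. The 554
answers the DATA command that followed, at which point no recipient had been
accepted.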

Comments

Re: mysql transport failover

By Wietse Venema at 11/09/2009 - 18:06

Micah Anderson:

Connections to database servers should not be lost routinely.

If anything should retry the query, then it would be the mysql
client. The proxymap can't make such decisions (for example, it
makes no sense to retry after a read error from a local file).

And in fact, the mysql client does implement retry logic. It retries
if you have more than one mysql server configured. Perhaps you
can specify the same server multiple times.

Wietse

Re: mysql transport failover

By Micah Anderson at 11/10/2009 - 15:51

Excerpts from wietse's message of Mon Nov 09 17:06:11 -0500 2009:

Agreed. However, I seem to be hitting a kernel oops that crashes the
node, and so triggers a cluster fail-over, more often than I would
like.

Interesting idea. Do you mean specify it twice in the main.cf
configuration such as:

alias_maps = mysql:/etc/postfix/mysql_aliases.cf, mysql:/etc/postfix/mysql_aliases.cf
transport_maps = proxy:mysql:/etc/postfix/mysql_transport.cf, proxy:mysql:/etc/postfix/mysql_transport.cf

Or do you mean two times in the host field in those files, such as the
following in the mysql_transport.cf:

hosts = mysql-cluster1 mysql-cluster1
dbname = postfix
user = postfix
password = yeahrightidsendthis
query = SELECT storage_ip FROM mailboxes WHERE address = '%s'
result_format = smtp:[%s]

I am guessing the latter method is the one to use?

thanks,
micah

Re: mysql transport failover

By Wietse Venema at 11/10/2009 - 16:45

micah anderson:

Ugh. That means twice the work when a query is "not found".

This repeats the query only if the session breaks. However, the
hosts are tried without delay, so this is unlikely to be a solution
for kernel panics.

Consider configuring more than one mysql server.

Wietse
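
A toy model of the behaviour described in this reply may help. It is not the
Postfix mysql client code. The names lookup_with_failover and make_backend,
and the second host mysql-cluster2, are hypothetical. The sketch tries each
listed host once with no delay, in listed order for repeatability (the real
client picks hosts in random order according to mysql_table(5)).

```python
from typing import Callable, Iterable, Sequence


def lookup_with_failover(
    hosts: Sequence[str],
    query_host: Callable[[str, str], str],
    key: str,
) -> str:
    """Try each configured host once, back to back, without sleeping."""
    last_error: Exception = ConnectionError("no hosts configured")
    for host in hosts:
        try:
            return query_host(host, key)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and move to the next host
    raise ConnectionError(f"all hosts failed: {last_error}")


def make_backend(down: Iterable[str]) -> Callable[[str, str], str]:
    """Build a fake query function in which the hosts in `down` are unreachable."""
    unreachable = set(down)

    def query_host(host: str, key: str) -> str:
        if host in unreachable:
            raise ConnectionError(f"{host}: lost connection during query")
        return f"answer-from-{host}"

    return query_host


backend = make_backend(down={"mysql-cluster1"})

# Two different hosts: the second one answers while the first is down.
print(lookup_with_failover(["mysql-cluster1", "mysql-cluster2"], backend, "user@example.com"))

# The same host listed twice: both attempts hit the same outage.
try:
    lookup_with_failover(["mysql-cluster1", "mysql-cluster1"], backend, "user@example.com")
except ConnectionError as exc:
    print("lookup failed:", exc)
```

Because the attempts are immediate, repeating one hostname only helps if the
server is reachable again by the time of the second attempt. A fail-over that
takes a second or more is not covered.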

Re: mysql transport failover

By Micah Anderson at 11/10/2009 - 17:46

Excerpts from wietse's message of Tue Nov 10 15:45:38 -0500 2009:

Yeah, ugh, not a good solution there.

Actually that was the point of my original message. I set up a mysql
cluster, so there are multiple mysql servers in a cluster fail-over
scenario. The fail-over is pretty fast, maybe one or two seconds, but
that may not be fast enough for the above mechanism for repeating the
query when the session breaks... hence the question about
tunables.

micah

Re: mysql transport failover

By Wietse Venema at 11/10/2009 - 18:22

micah anderson:

I was talking about DIFFERENT hosts instead of repeating the same
hostname in the Postfix config file.

If all your mysql servers are behind a single point of failure
(stunnel or whatever) then you don't have redundancy, and tweaking
Postfix solves the wrong problem.

Wietse

Re: mysql transport failover

By Micah Anderson at 11/10/2009 - 19:06

Excerpts from wietse's message of Tue Nov 10 17:22:57 -0500 2009:

That makes sense.

However, the MYSQL PARAMETERS section of mysql_table(5) doesn't say
whether the 'hosts' parameter can take different ports. The
traditional way that is handled is with a colon delimiter (e.g.,
127.0.0.1:3307). Is this supported, or is the documentation just
lacking that information?

I ask because I can specify different hosts, but because I am trying
to tunnel the mysql connection over stunnel, each connection requires
a separate stunnel instance and although I can run two tunnels on the
same machine, that only works if they are on different ports. I
suspect that the mysql client support would allow the following to
be valid:

hosts = 127.0.0.1:3306 127.0.0.1:3307

In this case, there would be two distinct local stunnel processes,
each connecting to a different server. Ideally, I wouldn't have to
erect a teetering tower of disparate parts to encrypt the mysql
traffic, but the alternatives are rather bleak.

thanks,
micah

Re: mysql transport failover

By Wietse Venema at 11/10/2009 - 20:03

micah anderson:

The name "hosts" suggests the possibility of more than one...

    hosts  The hosts that Postfix will try to connect to and query from.
           Specify unix: for UNIX domain sockets, inet: for TCP connections
           (default). Example:
               hosts = host1.some.domain host2.some.domain
               hosts = unix:/file/name

           The hosts are tried in random order, [...]

That should work. The documentation fails to document the ":port"
syntax.

Wietse
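
To make the host syntax discussed in this exchange concrete, here is a
minimal sketch of how a hosts value with the unix: and inet: prefixes and
the undocumented ":port" suffix could be split up. It is an illustration
only, not the parser Postfix uses. HostSpec and parse_hosts are made-up
names, and IPv6 literals are deliberately not handled.

```python
from typing import List, NamedTuple, Optional


class HostSpec(NamedTuple):
    kind: str             # "unix" for a socket path, "inet" for TCP
    address: str          # socket path or host name / IP address
    port: Optional[int]   # None means "use the default port"


def parse_hosts(value: str) -> List[HostSpec]:
    """Split a whitespace-separated hosts value into HostSpec entries."""
    specs: List[HostSpec] = []
    for token in value.split():
        if token.startswith("unix:"):
            specs.append(HostSpec("unix", token[len("unix:"):], None))
            continue
        if token.startswith("inet:"):
            token = token[len("inet:"):]
        # Assumption: at most one colon remains, separating host and port.
        host, sep, port = token.partition(":")
        specs.append(HostSpec("inet", host, int(port) if sep else None))
    return specs


# Two local tunnel endpoints on different ports, as proposed in the thread.
for spec in parse_hosts("127.0.0.1:3306 127.0.0.1:3307"):
    print(spec)
```

With two entries like these, each port can front a separate stunnel process
that connects to a different mysql server, which is the arrangement the
thread arrives at.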