DevHeads.net

STARTTLS / DANE difficulties?

We are migrating our Postfix MX services and in the process have
disrupted a setup which has been very stable for the past couple of
years. One of the remaining items is this sort of message which only
started very recently:

Jul 10 11:55:29 mx31 postfix-p25/smtpd[70030]: connect from
hr1.samba.org[144.76.82.147]
Jul 10 11:55:30 mx31 postfix-p25/smtpd[70030]: warning: TLS library
problem: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad
certificate:/usr/src/crypto/openssl/ssl/s3_pkt.c:1493:SSL alert number
42:
Jul 10 11:55:30 mx31 postfix-p25/smtpd[70030]: lost connection after
STARTTLS from hr1.samba.org[144.76.82.147]
Jul 10 11:55:30 mx31 postfix-p25/smtpd[70030]: disconnect from
hr1.samba.org[144.76.82.147] ehlo=1 starttls=1 commands=2

I thought that these errors were the result of a misconfigured
certificate or private key for the postfix service. However, I have
examined these and they appear to be correct:

postconf -n | grep -i tls
smtp_tls_CAfile = /usr/local/etc/pki/tls/certs/ca-bundle.crt
smtp_tls_cert_file = /usr/local/etc/pki/tls/certs/ca.harte-lyne.mx31.crt
smtp_tls_ciphers = medium
smtp_tls_exclude_ciphers = MD5, aDSS, SRP, PSK, aECDH, aDH, SEED,
IDEA, RC2, RC5
smtp_tls_key_file = /usr/local/etc/pki/tls/private/ca.harte-lyne.mx31.key
smtp_tls_protocols = !SSLv2, !SSLv3
smtp_tls_security_level = dane
smtp_tls_session_cache_database = btree:/var/db/postfix/smtp_scache
smtp_tls_session_cache_timeout = 3600s
smtpd_starttls_timeout = ${stress?10}${stress:120}s
smtpd_tls_CAfile = /usr/local/etc/pki/tls/certs/ca-bundle.crt
smtpd_tls_ask_ccert = yes
smtpd_tls_auth_only = yes
smtpd_tls_cert_file = /usr/local/etc/pki/tls/certs/ca.harte-lyne.mx31.crt
smtpd_tls_ciphers = medium
smtpd_tls_dh1024_param_file = ${config_directory}/dh2048.pem
smtpd_tls_fingerprint_digest = sha256
smtpd_tls_key_file =
/usr/local/etc/pki/tls/private/ca.harte-lyne.mx31.key
smtpd_tls_protocols = !SSLv2, !SSLv3
smtpd_tls_received_header = yes
smtpd_tls_security_level = may
smtpd_tls_session_cache_database = btree:/var/db/postfix/smtpd_scache
smtpd_tls_session_cache_timeout = 3600s
tls_random_source = dev:/dev/urandom

# ll /usr/local/etc/pki/tls/private/
total 18
-rw------- 1 root wheel 3243 Jun 7 15:37 2016003E.key
lrwxr-xr-x 1 root wheel 12 Jul 10 12:19 ca.harte-lyne.mx31.key ->
2016003E.key

ll /usr/local/etc/pki/tls/certs
total 565
-rw-r--r-- 1 root wheel 10164 Jun 7 15:37 2016003E.pem
-rw-r--r-- 1 root wheel 822512 Jul 10 12:05 ca-bundle.crt
lrwxr-xr-x 1 root wheel 22 Jul 10 12:07 ca.harte-lyne.mx31.crt
-> ca.harte-lyne.mx31.pem
lrwxr-xr-x 1 root wheel 12 Jul 10 12:06 ca.harte-lyne.mx31.pem
-> 2016003E.pem

# openssl x509 -noout -text -in
/usr/local/etc/pki/tls/certs/ca.harte-lyne.mx31.crt
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 538312766 (0x2016003e)
Signature Algorithm: sha512WithRSAEncryption
Issuer: CN=CA_HLL_ISSUER_2016, OU=Networked Data Services,
O=Harte & Lyne Limited, L=Hamilton, ST=Ontario, C=CA,
DC=harte-lyne, DC=ca
Validity
Not Before: Jun 1 00:00:00 2018 GMT
Not After : Jun 30 23:59:59 2023 GMT
O=Harte & Lyne Limited, L=Hamilton, ST=Ontario, C=CA,
DC=hamilton, DC=harte-lyne, DC=ca
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
. . .

Can someone interpret for me what these messages are telling me? Is
samba.org misconfigured or me?

Comments

Re: STARTTLS / DANE difficulties?

By Viktor Dukhovni at 07/10/2018 - 12:30

What is the MX hostname associated with this Postfix instance? What
domains does it serve? That has bearing on the TLSA records seen
by the connecting SMTP client.

The client rejected the server's certificate chain. The details
are known only to the client.
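
As a rough aid for reading such log lines, here is a minimal Python sketch that pulls the alert number out of the "TLS library problem" warning and maps it to its name in the TLS specification. The helper name `describe_alert` and the subset of alerts in the table are illustrative choices, not part of Postfix or OpenSSL. The `ssl3_read_bytes` function name in the error suggests the alert was read from the peer, and the "sslv3" wording is a historical label rather than a sign that SSLv3 was negotiated.

```python
import re

# Subset of TLS alert descriptions (RFC 5246, section 7.2) that relate
# to handshake and certificate problems. Not an exhaustive list.
TLS_ALERTS = {
    40: "handshake_failure",
    42: "bad_certificate",
    43: "unsupported_certificate",
    44: "certificate_revoked",
    45: "certificate_expired",
    46: "certificate_unknown",
    48: "unknown_ca",
}

ALERT_RE = re.compile(r"SSL alert number (\d+)")


def describe_alert(log_line):
    """Return (number, name) for the TLS alert in a log line, or None."""
    match = ALERT_RE.search(log_line)
    if match is None:
        return None
    number = int(match.group(1))
    return number, TLS_ALERTS.get(number, "unknown")


# The warning from the original post, joined back into a single line.
line = ("warning: TLS library problem: error:14094412:SSL routines:"
        "ssl3_read_bytes:sslv3 alert bad certificate:"
        "/usr/src/crypto/openssl/ssl/s3_pkt.c:1493:SSL alert number 42:")
print(describe_alert(line))
```

The alert only says that the client objected to a certificate; the reason remains on the client side, as noted above.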

"Correct" is in the eye of the beholder. Did the certificate chain
match the associated DANE TLSA records? Might samba.org have reason
to expect to authenticate your server via WebPKI? You're using a
private CA...

Its current cert chain seems to match the TLSA records for the above
name, though two of the three TLSA records seem redundant:

mx31.harte-lyne.ca. IN A 216.185.71.31 ; AD=1 NoError
mx31.harte-lyne.ca. IN AAAA ? ; AD=1 NODATA
_25._tcp.mx31.harte-lyne.ca. IN CNAME _tlsa._dane.trust.harte-lyne.ca. ; AD=1 NoError
_tlsa._dane.trust.harte-lyne.ca. IN TLSA 2 0 2 67274b355428905895c6b581950e0ed4f7d043f31f7e7020b716b7faa06776b6aadd33e127624b6e8c75c520a01d9cad3bd29f18fa7dcb3d5fd3917510e6722a ; AD=1 NoError
_tlsa._dane.trust.harte-lyne.ca. IN TLSA 2 1 2 380259229e21a1946b38cfc594cbc993b61bc93762b7b6c6637b3eef9c5a2bb70c589b91beb73bd1304eac11b3917e33819e2b47d25d4966435a2a3e83c1f80f ; AD=1 NoError
_tlsa._dane.trust.harte-lyne.ca. IN TLSA 2 1 2 c26e0ec16a46a97386e8f31f8ecc971f2d73136aa377dfdaac2b2b00f7cab4bb29b17d913c82093b41fd0d9e40b66a68361c126f1f4017f9ce60eabc5adba90e ; AD=1 NoError
mx31.harte-lyne.ca[216.185.71.31]: pass: TLSA match: depth = 1, name = mx31.harte-lyne.ca
TLS = TLS12 with ECDHE-RSA-AES256GCM-SHA384
name = mx31.harte-lyne.ca
name = mx31
name = mx31.hamilton
name = mx31.hamilton.harte-lyne.ca
depth = 0
Issuer CommonName = CA_HLL_ISSUER_2016
Issuer Organization = Harte & Lyne Limited
notBefore = 2018-06-01T00:00:00Z
notAfter = 2023-06-30T23:59:59Z
Subject CommonName = mx31.harte-lyne.ca
Subject Organization = Harte & Lyne Limited
pkey sha256 [nomatch] <- 3 1 1 3fa3dae08e2fecff0611a75767ee0995a115e308a181ad79a6d163315742b270
cert sha512 [nomatch] <- 3 0 2 cc5bd085ba7e1c136539083bf32ad6512b6c0fe5a31a8f2f775b627ab1c6525d7464c751191a4e1747072f5bd63d364713e48a4636ca25e31532ca0657444c7f
pkey sha512 [nomatch] <- 3 1 2 39248e9342c4fc8fb67dac3f51e7a2d9e77d7a37df6fac0272006cc7d757e5346c9e11f93f7f8c34cacf95cd0e60d1ab5b3fc2b9881551fa9bc9a6fb6e3300a8
depth = 1
Issuer CommonName = CA_HLL_ROOT_2016
Issuer Organization = Harte & Lyne Limited
notBefore = 2016-11-01T00:00:00Z
notAfter = 2035-11-01T23:59:59Z
Subject CommonName = CA_HLL_ISSUER_2016
Subject Organization = Harte & Lyne Limited
pkey sha256 [nomatch] <- 2 1 1 9c19d0fed453f6c49cd9f569af9b5da75ef6d8baabd26308eee88adb2d06a3b5
cert sha512 [nomatch] <- 2 0 2 ab23a715c42f6cf8a2502b725969adedf1f6c6bedbb483fb49badc5470232297b34a3a7716b2dd7eb086bd6e462599db95f9af3415209eadea71450c72af942a
pkey sha512 [matched] <- 2 1 2 380259229e21a1946b38cfc594cbc993b61bc93762b7b6c6637b3eef9c5a2bb70c589b91beb73bd1304eac11b3917e33819e2b47d25d4966435a2a3e83c1f80f
depth = 2
Issuer CommonName = CA_HLL_ROOT_2016
Issuer Organization = Harte & Lyne Limited
notBefore = 2016-11-01T00:00:00Z
notAfter = 2036-10-31T23:59:59Z
Subject CommonName = CA_HLL_ROOT_2016
Subject Organization = Harte & Lyne Limited
pkey sha256 [nomatch] <- 2 1 1 4bd5dd98b37237136d1a5b2e45ee8ed1c9f2c2569b6dc94f0951da5af6d090c4
cert sha512 [nomatch] <- 2 0 2 4a4ea8374f20e46009b03bd19793598b5f4e0d38aeba39644f6b8659057ca16a4c5bfd7f3779ec83c1d26c732edbc9d41454f9866d25109bcde177eae58a4481
pkey sha512 [matched] <- 2 1 2 c26e0ec16a46a97386e8f31f8ecc971f2d73136aa377dfdaac2b2b00f7cab4bb29b17d913c82093b41fd0d9e40b66a68361c126f1f4017f9ce60eabc5adba90e

[ 4096-bit keys are IMHO overkill. ]
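
For readers unfamiliar with the three numeric fields in the TLSA records listed above, the following Python sketch decodes them into the acronyms defined in RFC 7218 and checks that the digest length fits the matching type. The function name `describe_tlsa` is a hypothetical helper; it does not query DNS or validate anything against a certificate.

```python
# Field names as registered in RFC 7218 for the TLSA parameters of RFC 6698.
USAGES = {0: "PKIX-TA", 1: "PKIX-EE", 2: "DANE-TA", 3: "DANE-EE"}
SELECTORS = {0: "Cert", 1: "SPKI"}
# Matching type -> (name, digest length in bytes; None means full data).
MATCHING = {0: ("Full", None), 1: ("SHA2-256", 32), 2: ("SHA2-512", 64)}


def describe_tlsa(rdata):
    """Decode 'usage selector mtype hexdata' into a readable label."""
    fields = rdata.split()
    usage, selector, mtype = (int(f) for f in fields[:3])
    # Zone files may split the hex data into several chunks.
    raw = bytes.fromhex("".join(fields[3:]))
    name, size = MATCHING[mtype]
    if size is not None and len(raw) != size:
        raise ValueError(f"{name} digest must be {size} bytes, got {len(raw)}")
    return f"{USAGES[usage]}({usage}) {SELECTORS[selector]}({selector}) {name}({mtype})"


# The "3 1 1" value shown for the depth 0 public key in the output above.
print(describe_tlsa(
    "3 1 1 3fa3dae08e2fecff0611a75767ee0995a115e308a181ad79a6d163315742b270"
))
```

Read this way, the published "2 1 2" records are trust-anchor records over the public keys of the issuing and root CAs, hashed with SHA-512.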

Re: STARTTLS / DANE difficulties?

By byrnejb at 07/10/2018 - 13:26

On Tue, July 10, 2018 13:30, Viktor Dukhovni wrote:
mx31.harte-lyne.ca - harte-lyne.ca / .harte-lyne.ca

https://dane-test.had.dnsops.gov/server/dane_check.cgi?host=harte-lyne.ca
reports that all declared servers, other than those currently
off-line, are error free.

Having recently replaced our entire PKI because of Mozilla determining
our root certificate had an inadequate key size (selected back in
2005) I decided overkill is not thorough enough, but perforce
suffices. That is also why we have two separate roots and certificate
chains, which will continue until the last of the original CA's
certificates are replaced or the services shutdown.

Re: STARTTLS / DANE difficulties?

By Viktor Dukhovni at 07/10/2018 - 19:35

If that's the only hostname resolving to that IP address, then its
DANE TLSA records do appear to be presently correct. Can't speak
about the past if the machine was undergoing maintenance.

The connecting client did not like one of the certificates in the
chain. Perhaps it expected to find a working WebPKI certificate
from one of the usual suspects ("browser bundle" public root CAs).

You should ask the postmaster of the sending domain. Is the problem
ongoing, or a transient glitch?

There are interoperability advantages to being in the middle of the
pack, some implementations might have restricted key sizes. The
most popular key size is RSA-2048. There isn't much evidence that
this is the issue, so use this suggestion as you see fit.

Re: STARTTLS / DANE difficulties?

By byrnejb at 07/11/2018 - 09:13

On Tue, July 10, 2018 20:35, Viktor Dukhovni wrote:
It is an ongoing problem with delivery to us of the samba-users
mailing list digest, of which I am a subscriber.

I am in communication with the person directly responsible for
implementing DANE at that site. They have just implemented DANE which
is when the problems first started.

As we use 'smtp_tls_security_level = dane' and as they are missing a
number of TLSA RRs their problem with us may be an incomplete
implementation. I have referred them to:
https://dane-test.had.dnsops.gov/server/dane_check.cgi?host=hr1.samba.org
We will see if any changes result.

Thank you for your help, as always.

Regards,

Re: STARTTLS / DANE difficulties?

By Viktor Dukhovni at 07/11/2018 - 10:12

Any logs they're willing to share would likely be enlightening.

Do you know which MTA they're using?

Your outbound use of DANE when sending email to them has no bearing
on difficulties with their outbound use of DANE when sending email to
you.

What does that mean???

Do they support certificate usage DANE-TA(2)? Perhaps their MTA
only supports DANE-EE(3) and chokes on DANE-TA(2). You could publish
both "3 1 1" and "2 1 1" TLSA records for each MX host, and see if
that resolves the issue.

; TLSA RRs matching the EE key, intermediate CA key and root CA key, respectively
; Just the EE and intermediate should be enough.
;
_25._tcp.mx32.harte-lyne.ca. IN TLSA 3 1 1 9d111285068fd3e814269b472b75e46a2700f8655989e8c0007e33881ad09733
_25._tcp.mx32.harte-lyne.ca. IN TLSA 2 1 1 9c19d0fed453f6c49cd9f569af9b5da75ef6d8baabd26308eee88adb2d06a3b5
_25._tcp.mx32.harte-lyne.ca. IN TLSA 2 1 1 4bd5dd98b37237136d1a5b2e45ee8ed1c9f2c2569b6dc94f0951da5af6d090c4

_25._tcp.inet08.hamilton.harte-lyne.ca. IN TLSA 3 1 1 478dbe42903020004738f55fc767c6c2ed5cf5b9e7d256b797bd305e84d03a55
_25._tcp.inet08.hamilton.harte-lyne.ca. IN TLSA 2 1 1 9c19d0fed453f6c49cd9f569af9b5da75ef6d8baabd26308eee88adb2d06a3b5
_25._tcp.inet08.hamilton.harte-lyne.ca. IN TLSA 2 1 1 4bd5dd98b37237136d1a5b2e45ee8ed1c9f2c2569b6dc94f0951da5af6d090c4

_25._tcp.mx31.harte-lyne.ca. IN TLSA 3 1 1 3fa3dae08e2fecff0611a75767ee0995a115e308a181ad79a6d163315742b270
_25._tcp.mx31.harte-lyne.ca. IN TLSA 2 1 1 9c19d0fed453f6c49cd9f569af9b5da75ef6d8baabd26308eee88adb2d06a3b5
_25._tcp.mx31.harte-lyne.ca. IN TLSA 2 1 1 4bd5dd98b37237136d1a5b2e45ee8ed1c9f2c2569b6dc94f0951da5af6d090c4

... remaining MX hosts ...

If it does, the Samba list should disable DANE support until their
implementation is less crippled. It needs to either not enforce
DANE for MX hosts with just DANE-TA(2) records, or properly support
DANE-TA(2) records.
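
A minimal Python sketch of how the digest part of such "3 1 1" and "2 1 1" records is formed, assuming you already have the DER-encoded bytes to hash (the full certificate for selector 0, the SubjectPublicKeyInfo for selector 1). Extracting those bytes from a PEM file is not shown; one common route is `openssl x509 -noout -pubkey` piped into `openssl pkey -pubin -outform DER`. The function name `tlsa_rdata` and the placeholder input are illustrative only.

```python
import hashlib


def tlsa_rdata(usage, selector, mtype, der_bytes):
    """Build TLSA record data from DER bytes selected by the caller.

    mtype 0 publishes the data itself, 1 its SHA-256 digest and
    2 its SHA-512 digest (RFC 6698, section 2.1.3).
    """
    if mtype == 0:
        data = der_bytes.hex()
    elif mtype == 1:
        data = hashlib.sha256(der_bytes).hexdigest()
    elif mtype == 2:
        data = hashlib.sha512(der_bytes).hexdigest()
    else:
        raise ValueError(f"unsupported matching type {mtype}")
    return f"{usage} {selector} {mtype} {data}"


# Placeholder bytes standing in for a real DER-encoded public key.
fake_spki = b"not a real SubjectPublicKeyInfo"
print(tlsa_rdata(3, 1, 1, fake_spki))  # EE key record
print(tlsa_rdata(2, 1, 1, fake_spki))  # trust-anchor key record
```

With real input, the first call would be fed the server's own public key and the second the issuing CA's public key, giving the pair of records suggested above.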

Re: STARTTLS / DANE difficulties?

By byrnejb at 07/11/2018 - 13:13

On Wed, July 11, 2018 11:12, Viktor Dukhovni wrote:
I will ask.

NMAP reports: Exim smtpd 4.91

When I run a DANE test against the domain that is failing to connect
this is among the results:

Test # Host IP Status Test Description (§ Section)

103 hr1.samba.org FAILED Service hostname must have matching TLSA record
Resolving TLSA records for hostname '_25._tcp.hr1.samba.org'

403 hr1.samba.org FAILED All IP addresses for a host that is TLSA
protected must TLSA verify
Validating TLSA records for 0 out of 1 IP addresses found for host
hr1.samba.org

I will attempt that as soon as I finish the movement of our MX
services off their current hosts and onto the new.

Ah. Well, I know how welcome the news that 'one is doing something so
wrong that one should just stop doing it' can be. I would rather
avoid the natural antagonism such advice is likely to engender.
Instead I have provided them a few clues as to where some obvious
problems lie and left it to their judgement as to how to proceed.
Eventually they will either sort out their troubles or arrive at the
same conclusion.

My concern in this is to assure myself that our services are running
correctly. If they are and the difficulties all lie with samba.org
then we can live without the mailing list digest for now.

Re: STARTTLS / DANE difficulties?

By byrnejb at 07/19/2018 - 08:14

On Wed, July 11, 2018 14:13, James B. Byrne wrote:
We are encountering errors with several domains similar to the one
reported by samba.org:

. . .
Jul 18 22:36:38 mx31 postgrey[85107]: action=pass, reason=triplet
found, client_name=mailroot5.namespro.ca, client_address=158.85.87.68,
sender= ... at everydayfreight dot com, recipient=exports@harte-lyne.ca
Jul 18 22:36:38 mx31 postfix-p25/smtpd[17802]: lost connection after
DATA (0 bytes) from mailroot5.namespro.ca[158.85.87.68]
Jul 18 22:36:38 mx31 postfix-p25/smtpd[17802]: disconnect from
mailroot5.namespro.ca[158.85.87.68] ehlo=2 starttls=1 mail=1 rcpt=2
data=0/1 commands=6/7
. . .

Jul 18 23:41:45 mx31 policyd-spf[81903]: prepend Received-SPF: Pass
(mailfrom) identity=mailfrom; client-ip=66.135.118.147;
helo=mail.rosedale.ca; envelope-from= ... at connectrans dot com;
receiver=<UNKNOWN>
Jul 18 23:41:45 mx31 postfix-p25/smtpd[97338]: NOQUEUE:
client=mail.rosedale.ca[66.135.118.147]
Jul 18 23:41:45 mx31 postfix-p25/smtpd[97338]: lost connection after
DATA (0 bytes) from mail.rosedale.ca[66.135.118.147]
. . .

This is causing us problems in our operational departments. Based on
the message traffic surrounding this issue I have changed the client
certificate request setting to 'no' to see if that improves delivery.

smtpd_tls_ask_ccert = no

Any insightful comments on this situation are welcomed.

SMTP client timeout issues - was: STARTTLS / DANE difficulties?

By byrnejb at 07/19/2018 - 09:22

After changing the client certificate request to 'no' we get a little
further in the negotiation but it still fails:

. . .
Jul 19 09:31:37 mx32 postgrey[29869]: action=pass, reason=triplet
found, client_name=mail.rosedale.ca, client_address=66.135.118.147,
sender= ... at connectrans dot com, recipient=exports@harte-lyne.ca
Jul 19 09:31:37 mx32 policyd-spf[15740]: prepend Received-SPF: Pass
(mailfrom) identity=mailfrom; client-ip=66.135.118.147;
helo=mail.rosedale.ca; envelope-from= ... at connectrans dot com;
receiver=<UNKNOWN>
Jul 19 09:31:37 mx32 postfix-p25/smtpd[44981]: NOQUEUE:
client=mail.rosedale.ca[66.135.118.147]
Jul 19 09:31:37 mx32 postfix-p25/smtpd[44981]: lost connection after
DATA (0 bytes) from mail.rosedale.ca[66.135.118.147]
Jul 19 09:31:37 mx32 postfix-p25/smtpd[44981]: disconnect from
mail.rosedale.ca[66.135.118.147] ehlo=1 mail=1 rcpt=1 data=0/1
commands=3/4
. . .

The internal mail server for this organisation is MS Exchange.
However, the MTA relay is a Barracuda firewall appliance:

PORT STATE SERVICE VERSION
25/tcp open smtp Barracuda Networks Spam Firewall smtpd
Service Info: CPE: cpe:/h:barracudanetworks:spam_%26_virus_firewall_600:-

They are reporting a timeout error when trying to transmit to our
Postfix-3.3.1 MX. All we see is the above in our maillog.

Their DSN says:

. . . conversation with mx32.harte-lyne.ca[216.185.71.32]:25 timed out
while sending MAIL FROM . . .

We have only seen this type of problem (client disconnect with 0 data
transferred) with a very few of our correspondents. As it coincides
with moving from Postfix-2.11 to 3.3 we are concerned that we have
introduced some sort of compatibility issue.

Re: SMTP client timeout issues - was: STARTTLS / DANE difficulti

By Viktor Dukhovni at 07/19/2018 - 09:42

No, you get 100% of the way through the TLS handshake, the problem
is with SMTP inside the now encrypted channel. Perhaps MTU issues
in the client->server direction.

This delivery attempt did not even do TLS at all.

The connection is lost, at the beginning of what would be the data
transfer phase. Perhaps a path MTU issue in the client -> server
direction.

What I find a bit surprising is the combination of "NOQUEUE" and
"rcpt=1", which would normally mean an accepted recipient and the
assignment of a queue-id. Do you have a pre-queue proxy-filter?
In that case the queue id is assigned only after the full message
body is received.

That's odd, they actually sent "MAIL FROM", "RCPT TO" and "DATA"
pipelined together. But somehow never saw the replies? Perhaps
buggy pipelining support?

The Postfix version is likely irrelevant. Get a PCAP file and
see what's happening at the TCP layer.

Re: SMTP client timeout issues - was: STARTTLS / DANE difficulti

By byrnejb at 07/19/2018 - 11:28

On Thu, July 19, 2018 10:42, Viktor Dukhovni wrote:
Yes we do:

smtp inet n - n - - smtpd
-o smtpd_tls_security_level=may
-o smtpd_proxy_filter=127.0.0.1:10024
-o smtpd_client_connection_count_limit=10
-o smtpd_proxy_options=speed_adjust
-o syslog_name=postfix-p25

Which is for amavisd.

/usr/local/etc/amavisd.conf:$inet_socket_port = 10024;

Our system load averages are not excessive:

MX host:
12:18PM up 8 days, 19 mins, 1 users, load averages: 0.39, 0.34, 0.29

Stand-alone Desktop workstation:
2:19PM up 13 days, 16:39, 1 users, load averages: 0.35, 0.29, 0.26

Both systems running FreeBSD-11.1

In the particular case under consideration I discovered this reference
to the Barracuda firewall:

What are potential causes of "Sender Timeout" on the Barracuda Spam
Firewall?
https://campus.barracuda.com/product/emailsecuritygateway/knowledgebase/50160000000GTxtAAG/what-are-potential-causes-of-sender-timeout-on-the-barracuda-spam-firewall/

Which contains the following notes:

. . .
Sender timeouts can be caused by any of the following: Firewall with
proxying or some type of packet filtering enabled for port 25 (sender
or receiver's firewall)
. . .
Any type of relay device between the firewall and Barracuda not
configured properly or with additional scanning enabled (receiver's
side)
. . .
Here are some references for additional information: ESMTP TLS
Configuration Note: If you use Transport Layer Security (TLS)
encryption for e-mail communication then the ESMTP inspection feature
(enabled by default) in the PIX drops the packets. In order to allow
the e-mails with TLS enabled, disable the ESMTP inspection feature as
this output shows. Refer to Cisco bug ID CSCtn08326 (registered
customers only) for more information.

pix(config)#policy-map global_policy
pix(config-pmap)#class inspection_default
pix(config-pmap-c)#no inspect esmtp
pix(config-pmap-c)#exit
pix(config-pmap)#exit
. . .

Re: STARTTLS / DANE difficulties?

By Viktor Dukhovni at 07/19/2018 - 09:19

This does not look *at all* similar to me. The client sent:

EHLO
STARTTLS + TLS complete handshake
EHLO (inside TLS encrypted stream)
MAIL FROM: (inside TLS encrypted stream)
RCPT TO: (inside TLS encrypted stream)
RCPT TO: (inside TLS encrypted stream)
DATA: (inside TLS encrypted stream)

Then connection was lost after "DATA". This is *not* a TLS handshake
failure. Looks rather more like an ordinary message transmission
failure, or perhaps data-stage greylisting, ...
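
The command sequence above can be read straight off the "disconnect from" summary that Postfix 3.x logs, where "data=0/1" means zero accepted out of one attempted. Below is a small Python sketch that parses such a summary and reports whether STARTTLS completed and which commands fell short; the function names are hypothetical and the regular expression assumes the summary format seen in this thread.

```python
import re

# Matches "ehlo=2" as well as "data=0/1" (accepted/attempted).
COUNT_RE = re.compile(r"\b([a-z]+)=(\d+)(?:/(\d+))?")


def parse_disconnect(line):
    """Map each command name to an (accepted, attempted) pair."""
    counts = {}
    for cmd, ok, total in COUNT_RE.findall(line):
        ok = int(ok)
        counts[cmd] = (ok, int(total) if total else ok)
    return counts


def summarize(line):
    """Return (used_tls, commands that were attempted but not accepted)."""
    counts = parse_disconnect(line)
    used_tls = counts.get("starttls", (0, 0))[0] > 0
    failed = [cmd for cmd, (ok, total) in counts.items()
              if cmd != "commands" and ok < total]
    return used_tls, failed


samba = ("disconnect from hr1.samba.org[144.76.82.147] "
         "ehlo=1 starttls=1 commands=2")
namespro = ("disconnect from mailroot5.namespro.ca[158.85.87.68] "
            "ehlo=2 starttls=1 mail=1 rcpt=2 data=0/1 commands=6/7")
print(summarize(samba))
print(summarize(namespro))
```

In the second line the second EHLO and the MAIL, RCPT and DATA commands all follow a completed STARTTLS, which is why the failure is placed after the handshake rather than in it.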

You really need to show more of the (non-verbose) logging for this
session and the below. You're cutting out critical context.

A good idea, but exceedingly unlikely to make any difference for
the cases above.

Re: STARTTLS / DANE difficulties?

By byrnejb at 07/19/2018 - 13:30

Jul 19 13:40:35 mx31 postfix-p25/smtpd[31356]: disconnect from
unknown[93.186.253.159] commands=0/0

Jul 19 13:40:39 mx31 postgrey[85107]: action=pass, reason=triplet
found, client_name=mail.rosedale.ca, client_address=66.135.118.147,
sender= ... at connectrans dot com, recipient=X@HARTE-LYNE.CA

Jul 19 13:40:39 mx31 policyd-spf[5684]: prepend Received-SPF: Pass
(mailfrom) identity=mailfrom; client-ip=66.135.118.147;
helo=mail.rosedale.ca; envelope-from= ... at connectrans dot com;
receiver=<UNKNOWN>

Jul 19 13:40:39 mx31 postfix-p25/smtpd[96635]: NOQUEUE:
client=mail.rosedale.ca[66.135.118.147]

Jul 19 13:40:39 mx31 postfix-p25/smtpd[96635]: lost connection after
DATA (0 bytes) from mail.rosedale.ca[66.135.118.147]

Jul 19 13:40:39 mx31 postfix-p25/smtpd[96635]: disconnect from
mail.rosedale.ca[66.135.118.147] ehlo=1 mail=1 rcpt=1 data=0/1
commands=3/4

Jul 19 13:40:39 mx31 postgrey[85107]: action=pass, reason=triplet
found, client_name=mail.rosedale.ca, client_address=66.135.118.147,
sender= ... at connectrans dot com, recipient=X@HARTE-LYNE.CA

Jul 19 13:40:39 mx31 policyd-spf[34171]: prepend Received-SPF: Pass
(mailfrom) identity=mailfrom; client-ip=66.135.118.147;
helo=mail.rosedale.ca; envelope-from= ... at connectrans dot com;
receiver=<UNKNOWN>

Jul 19 13:40:39 mx31 postfix-p25/smtpd[40715]: NOQUEUE:
client=mail.rosedale.ca[66.135.118.147]

Jul 19 13:40:39 mx31 postfix-p25/smtpd[40715]: lost connection after
DATA (0 bytes) from mail.rosedale.ca[66.135.118.147]

Jul 19 13:40:39 mx31 postfix-p25/smtpd[40715]: disconnect from
mail.rosedale.ca[66.135.118.147] ehlo=1 mail=1 rcpt=1 data=0/1
commands=3/4

Jul 19 13:40:46 mx31 postfix-p25/smtpd[96548]: warning: hostname
host159-253-186-93.serverdedicati.aruba.it does not resolve to address
93.186.253.159: hostname nor servname provided, or not known

Re: Network difficulties with some senders

By Viktor Dukhovni at 07/19/2018 - 21:23

The sending client is encountering networking issues. Unrelated
to Postfix or TLS. Your new systems may have enabled TCP window
scaling or ECN, or other TCP features that are giving the sending
system indigestion. Or just a vanilla MTU issue. You could try
to disable window scaling, which makes data transfer slower for
senders with a high bandwidth delay product, but email is generally
tolerant of a few seconds to minutes of extra delay.

Staring carefully at PCAP files might help.

Much the same, no TLS in sight.

Re: Network difficulties with some senders

By byrnejb at 07/20/2018 - 14:08

On Thu, July 19, 2018 22:23, Viktor Dukhovni wrote:
We have resolved this issue. But the fix is not what one would
describe as intuitive. The local caching DNS service on the MX host
is local_unbound with a reference to 127.0.0.1 in resolv.conf. I set
resolv.conf to check only our forwarding DNS hosts and removed the
reference to 127.0.0.1. And the 'lost connection after DATA (0
bytes)' problem immediately disappeared and did not return.

The clue which unlocked this puzzle I discovered when I attempted to
ssh into the MX host directly rather than to the underlying host and
use a local console connection. There was a noticeable delay logging
on via ssh. Turning off UseDNS in sshd_config allowed immediate ssh
logons. That told me that the issue was with the resolver.

The MX host in question is running in a FreeBSD jail. FreeBSD ships
with a default DNS service named local_unbound. This was set up in
accordance with previous experience and appeared to be working, from
the command line.

A feature of FreeBSD jails is that each requires a dedicated cloned
loopback interface each with its own unique IP address, usually
127.0.0.[cloned index number]. Jails are supposed to automatically
map references to 127.0.0.1 to whatever IP-ADDR is assigned to that
Jail's lo interface. I had previously discovered that this was not
the case with Postfix's inet_interfaces setting. It turns out that
the same thing happens with the resolver (or rather what is expected
does not happen) when 127.0.0.1 is placed in /etc/resolv.conf.

I replaced 127.0.0.1 with the address actually assigned to the cloned
lo interface used by that jail and the problem did not resurface. If I
return it to 127.0.0.1 then we get the problem behaviour.

Evidently the issue is caused by DNS timeouts. Why it affected only a
few senders I cannot explain but we have had no further disconnects
attributable to this cause since the changes were made to resolv.conf.
We changed nothing else so this is either the remedy or we have a
massively improbable coincidence. Not impossible but very unlikely.
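
For anyone wanting to check a host for the same symptom, here is a minimal Python sketch that times a reverse lookup, the same kind of resolver call that made the ssh logons slow. The function name `time_reverse_lookup` and the stub resolver in the demonstration are assumptions made for illustration; on a real host you would call it with the default resolver and a client address of interest.

```python
import socket
import time


def time_reverse_lookup(address, resolver=socket.gethostbyaddr):
    """Return (hostname or None, seconds spent in the resolver call)."""
    start = time.monotonic()
    try:
        name = resolver(address)[0]
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        name = None
    return name, time.monotonic() - start


def slow_stub(address):
    """Stand-in for a resolver that stalls before answering."""
    time.sleep(0.2)
    return ("host.example", [], [address])


# Demonstration with the stub, so that no network access is needed.
name, elapsed = time_reverse_lookup("192.0.2.1", resolver=slow_stub)
print(f"{name} resolved in {elapsed:.2f}s")
```

Calling `time_reverse_lookup("66.135.118.147")` inside the jail, once with each resolv.conf variant, would show whether lookups stall for seconds. That the stalls caused the lost connections remains the inference drawn above, not something this sketch proves.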

Having written that no doubt I have called upon the fates to disabuse
me of my hubris.

Thanks for all the help. It was greatly appreciated.

Re: Network difficulties with some senders

By Jan P. Kessler at 07/19/2018 - 23:33

It could help to allow ICMP type 3 code 4 (Destination Unreachable,
Fragmentation Needed) between the machines.

Best regards
  Jan

Re: STARTTLS / DANE difficulties?

By Viktor Dukhovni at 07/11/2018 - 22:30

Please do, and ask for permission to post the results here or with
me off-list, but I would also need permission to share the logs
with the Exim developers, ideally on the exim-dev or exim-users
lists, so on-list would be best.

Exim linked with OpenSSL is believed to handle DANE reasonably
correctly. Exim linked with GnuTLS is known to exhibit some warts.
In a recent TLS WG discussion Phil Pennock (one of the Exim developers,
with an apparent focus on DANE) wrote in response to me:

That's fine by me. Linking against GnuTLS has long had implications for
mail delivery. It blocked SSLv3 at a time when SSLv3 was still fairly
widespread in corporate circles (Exchange). Folks who care about TLS
interop for real mail-systems use OpenSSL.

It would be good to know which flavour of Exim they have. You don't
appear to be using SHA-1, indeed I see SHA512. So if the issue is
with GnuTLS, it is not the SHA-1 issue.

That has no bearing on traffic from them to you.

Exim should have working DANE-TA(2) support when linked with
OpenSSL; it is doing DANE X.509 verification via code I contributed
that is based on the original implementation of DANE in Postfix.
When linked with GnuTLS it is using the GnuTLS DANE implementation,
whose issues may not as yet have all been uncovered.

There are ways of communicating the message that their MTA's DANE
support is not ready for prime-time that will be gratefully accepted
(thanks for letting us know).

I've not found any issues on your side.

RE: STARTTLS / DANE difficulties?

By Fazzina, Angelo at 07/10/2018 - 12:05

When you test connecting to your servers yourself, do you get any errors?
I am not sure whether seeing "sslv3" in the error is expected when TLS is in use.

Commands to try, just replace with your server name
openssl s_client -connect mta5.uits.uconn.edu:465
openssl s_client -starttls smtp -connect mta5.uits.uconn.edu:587

openssl s_client -connect <yourname>:465
openssl s_client -starttls smtp -connect <yourname>:587

good luck.

-ANGELO FAZZINA

ITS Service Manager:
Spam and Virus Prevention
Mass Mailing
G Suite/Gmail

... at uconn dot edu
University of Connecticut,  ITS, SSG, Server Systems
860-486-9075

RE: STARTTLS / DANE difficulties?

By byrnejb at 07/10/2018 - 13:26

On Tue, July 10, 2018 13:05, Fazzina, Angelo wrote:

I can connect to my services without difficulty:

# openssl s_client -starttls smtp -connect mx31.harte-lyne.ca:587
CONNECTED(00000003)
depth=2 CN = CA_HLL_ROOT_2016, ST = Ontario, O = Harte & Lyne Limited,
OU = Networked Data Services, C = CA, DC = harte-lyne, DC = ca, L =
Hamilton
verify return:1
depth=1 CN = CA_HLL_ISSUER_2016, OU = Networked Data Services, O =
Harte & Lyne Limited, L = Hamilton, ST = Ontario, C = CA, DC =
harte-lyne, DC = ca
verify return:1
depth=0 CN = mx31.harte-lyne.ca, OU = Networked Data Services, O =
Harte & Lyne Limited, L = Hamilton, ST = Ontario, C = CA, DC =
hamilton, DC = harte-lyne, DC = ca
verify return:1
Start Time: 1531246713
Timeout : 300 (sec)
Verify return code: 0 (ok)

[root@inet18 ~]# openssl s_client -starttls smtp -connect
mx32.harte-lyne.ca:587
CONNECTED(00000003)
depth=2 CN = CA_HLL_ROOT_2016, ST = Ontario, O = Harte & Lyne Limited,
OU = Networked Data Services, C = CA, DC = harte-lyne, DC = ca, L =
Hamilton
verify return:1
depth=1 CN = CA_HLL_ISSUER_2016, OU = Networked Data Services, O =
Harte & Lyne Limited, L = Hamilton, ST = Ontario, C = CA, DC =
harte-lyne, DC = ca
verify return:1
depth=0 CN = mx32.harte-lyne.ca, OU = Networked Data Systems, O =
Harte & Lyne Limited, L = Hamilton, ST = Ontario, C = CA, DC =
hamilton, DC = harte-lyne, DC = ca
verify return:1
Start Time: 1531246902
Timeout : 300 (sec)
Verify return code: 0 (ok)

RE: STARTTLS / DANE difficulties?

By Fazzina, Angelo at 07/10/2018 - 12:17

My test of connecting to your server:
openssl s_client -starttls smtp -connect mx31.harte-lyne.ca:587

Start Time: 1531242804
Timeout : 300 (sec)
Verify return code: 19 (self signed certificate in certificate chain)

The same test against my server:

Start Time: 1531242903
Timeout : 300 (sec)
Verify return code: 0 (ok)

-ANGELO FAZZINA

ITS Service Manager:
Spam and Virus Prevention
Mass Mailing
G Suite/Gmail

... at uconn dot edu
University of Connecticut,  ITS, SSG, Server Systems
860-486-9075

When you test connecting to your servers yourself, do you get any errors?
I am not sure whether seeing sslv3 in the log is normal when TLS is in use.

Commands to try; just replace the host with your server name:
openssl s_client -connect mta5.uits.uconn.edu:465
openssl s_client -starttls smtp -connect mta5.uits.uconn.edu:587

openssl s_client -connect <yourname>:465
openssl s_client -starttls smtp -connect <yourname>:587
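
These commands print a long transcript, and the line that matters for comparing results is "Verify return code". The following is a small sketch for pulling that code out of the output; the helper name `verify_code` is made up for illustration, and the sample input is taken from the results quoted in this thread.

```shell
#!/bin/sh
# verify_code: read `openssl s_client` output on stdin and print the numeric
# code from the first "Verify return code:" line (prints nothing if absent).
verify_code() {
    sed -n 's/^[[:space:]]*Verify return code: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example with the two results quoted in this thread.
printf 'Timeout : 300 (sec)\nVerify return code: 19 (self signed certificate in certificate chain)\n' | verify_code
printf 'Verify return code: 0 (ok)\n' | verify_code
```

Against a live server this could be used as `openssl s_client -starttls smtp -connect mx31.harte-lyne.ca:25 </dev/null 2>/dev/null | verify_code`. Port 25 is an assumption here, chosen because the failing log entries come from the MX service rather than from submission on 587. A result of 19 from an outside machine usually means only that the tester does not trust the private root CA, so it may not indicate a server fault by itself.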

good luck.

