Discussion:
[Linux-HA] the maximum message size which bcast can handle
Junko IKEDA
2008-09-05 08:53:53 UTC
Hi,

I encountered the following error.
There are 9 nodes (8 active + 1 standby), and each active node has 16
resources.

heartbeat[23545]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1] packet(len=71866): Message too long
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: write_child: write failure on bcast eth1.: Message too long
heartbeat[23547]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1] packet(len=71866): Message too long
heartbeat[23547]: 2008/09/02_17:12:10 ERROR: write_child: write failure on bcast eth3.: Message too long
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1] packet(len=71866): Message too long
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: write_child: write failure on bcast eth1.: Message too long
heartbeat[23547]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1] packet(len=71866): Message too long
heartbeat[23547]: 2008/09/02_17:12:10 ERROR: write_child: write failure on bcast eth3.: Message too long
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1] packet(len=71866): Message too long
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: write_child: write failure on bcast eth1.: Message too long
heartbeat[23545]: 2008/09/02_17:12:10 WARN: Temporarily Suppressing write error messages
heartbeat[23545]: 2008/09/02_17:12:10 WARN: Is a cable unplugged on bcast eth1?
heartbeat[23547]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1] packet(len=71866): Message too long
heartbeat[23547]: 2008/09/02_17:12:10 ERROR: write_child: write failure on bcast eth3.: Message too long
heartbeat[23547]: 2008/09/02_17:12:10 WARN: Temporarily Suppressing write error messages
heartbeat[23547]: 2008/09/02_17:12:10 WARN: Is a cable unplugged on bcast eth3?
heartbeat[23538]: 2008/09/02_17:12:11 ERROR: Message hist queue is filling up (499 messages in queue)

It seems that heartbeat can compress a message, so the size of the final
message should be less than 256 KB.

/* MAXMSG is the maximum final message size on the wire. */
#define MAXMSG (256*1024)

/* MAXUNCOMPRESSED is the maximum, raw data size prior to compression. */
/* 1:8 compression ratio is to be expected on data such as xml */

But the error I found came from here:

lib/plugins/HBcomm/bcast.c
bcast_write(struct hb_media* mp, void *pkt, int len){
        /* ... */
        if ((rc=sendto(ei->wsocket, pkt, len, 0 /* ... */
        /* ... */
}

The maximum datagram size that sendto() accepts on a UDP socket is about 64 KB.
Is there any limit on the message size that bcast_write() can handle?
Will it send MAXMSG (+ header)?
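
The failure can be reproduced outside heartbeat. The following is a minimal sketch, assuming Linux and an IPv4 UDP socket. `try_udp_send` is a hypothetical helper, not part of heartbeat, and the loopback address and port 9 are arbitrary choices. It sends one zero-filled datagram of the requested length and returns the errno value reported by sendto().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not heartbeat code): send one zero-filled UDP
 * datagram of `len` bytes to 127.0.0.1:9. Returns 0 on success, or the
 * errno value from socket()/sendto() on failure.
 */
static int try_udp_send(size_t len)
{
    struct sockaddr_in dst;
    char *buf;
    int fd;
    int err = 0;

    assert(len > 0);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return errno;

    buf = calloc(1, len);
    if (buf == NULL) {
        close(fd);
        return ENOMEM;
    }

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9);                      /* arbitrary port */
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* 127.0.0.1 */

    /* A single sendto() call maps to a single UDP datagram. */
    if (sendto(fd, buf, len, 0, (struct sockaddr *)&dst, sizeof(dst)) < 0)
        err = errno;

    free(buf);
    close(fd);
    return err;
}
```

Calling `try_udp_send(71866)` with the length from the log above is expected to return EMSGSIZE on Linux, which is the "Message too long" text that heartbeat prints.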


Best Regards,
Junko Ikeda

NTT DATA INTELLILINK CORPORATION
Dejan Muhamedagic
2008-09-05 15:08:10 UTC
Hi Junko-san,
Post by Junko IKEDA
Hi,
I encountered the following error.
There are 9 nodes ( 8 actives + 1 standby) and each active node has 16
resources.
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1]
packet(len=71866): Message too long
It seems that heartbeat can compress a message, so the size of final message
would be less than 256 Kbyte.
Yes, that was fixed about last November, I can very well recall
that.
Post by Junko IKEDA
/* MAXMSG is the maximum final message size on the wire. */
#define MAXMSG (256*1024)
/* MAXUNCOMPRESSED is the maximum, raw data size prior to compression. */
/* 1:8 compression ratio is to be expected on data such as xml */
But the error I found came here;
lib/plugins/HBcomm/bcast.c
bcast_write(struct hb_media* mp, void *pkt, int len){
if ((rc=sendto(ei->wsocket, pkt, len, 0
}
The maximum size for sendto() is about 64Kbyte.
Is there any limitation for the message size which bcast_write() can handle?
Will it send MAXMSG ( + header )?
Hmm. Perhaps this (the maximum packet size) has been checked by
somebody before, then forgotten and it never got into discussion
about the message compression. When I started working on the
compression, the MAXMSG was already temporarily set to 2MB.

Also, I can distinctly recall that Lars was setting the MAXMSG
size to 2MB (before the compression was fixed) to support
his 9-node cluster (or thereabouts). I don't know what kind of hb
media he used.

Cheers,

Dejan
Post by Junko IKEDA
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
_______________________________________________
Linux-HA mailing list
Linux-HA at lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
Andrew Beekhof
2008-09-07 20:38:47 UTC
Alternatively, consider using openais, which is less likely to run into
these sorts of limits.
Post by Dejan Muhamedagic
Hi Junko-san,
Post by Junko IKEDA
Hi,
I encountered the following error.
There are 9 nodes ( 8 actives + 1 standby) and each active node has 16
resources.
heartbeat[23545]: 2008/09/02_17:12:10 ERROR: glib: Unable to send bcast [-1]
packet(len=71866): Message too long
It seems that heartbeat can compress a message, so the size of final message
would be less than 256 Kbyte.
Yes, that was fixed about last November, I can very well recall
that.
Post by Junko IKEDA
/* MAXMSG is the maximum final message size on the wire. */
#define MAXMSG (256*1024)
/* MAXUNCOMPRESSED is the maximum, raw data size prior to compression. */
/* 1:8 compression ratio is to be expected on data such as xml */
But the error I found came here;
lib/plugins/HBcomm/bcast.c
bcast_write(struct hb_media* mp, void *pkt, int len){
if ((rc=sendto(ei->wsocket, pkt, len, 0
}
The maximum size for sendto() is about 64Kbyte.
Is there any limitation for the message size which bcast_write() can handle?
Will it send MAXMSG ( + header )?
Hmm. Perhaps this (the maximum packet size) has been checked by
somebody before, then forgotten and it never got into discussion
about the message compression. When I started working on the
compression, the MAXMSG was already temporarily set to 2MB.
Also, I can distinctly recall that Lars was setting the MAXMSG
size to 2MB (before the compression has been fixed) to support
his 9-node cluster (or thereabouts). I don't know what kind of hb
media he used.
Cheers,
Dejan
Post by Junko IKEDA
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
Junko IKEDA
2008-09-08 06:12:28 UTC
Post by Dejan Muhamedagic
Hmm. Perhaps this (the maximum packet size) has been checked by
somebody before, then forgotten and it never got into discussion
about the message compression. When I started working on the
compression, the MAXMSG was already temporarily set to 2MB.
Also, I can distinctly recall that Lars was setting the MAXMSG
size to 2MB (before the compression has been fixed) to support
his 9-node cluster (or thereabouts). I don't know what kind of hb
media he used.
Hi,

MAXUNCOMPRESSED is 2MB, and MAXMSG is 256kbyte.

#define MAXMSG (256*1024)
#define MAXUNCOMPRESSED (2048*1024)

Does that mean a 2 MB message would be compressed to 256 KB and then sent via
some media?
If a MAXMSG-sized (256 KB) message is sent as it is, sendto() will not be able
to handle it, because the maximum size for sendto() is 64 KB.
A 256 KB message would have to be split into pieces before being sent as packets.
By the way, I set "bcast" as the media in ha.cf.
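
To illustrate the splitting suggested above, here is a minimal sketch of the arithmetic only. Nothing in this thread indicates that heartbeat fragments messages this way; `fragment_count` and `fragment_bounds` are hypothetical names, and the chunk size is left as a parameter. A real implementation would also need a per-fragment header, reassembly, and handling of lost fragments, none of which is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks needed to carry total_len bytes (hypothetical helper). */
static size_t fragment_count(size_t total_len, size_t max_payload)
{
    assert(max_payload > 0);
    return (total_len + max_payload - 1) / max_payload;
}

/*
 * Compute offset and length of chunk number `index` (0-based).
 * Returns 0 on success, -1 if `index` is past the last chunk.
 */
static int fragment_bounds(size_t total_len, size_t max_payload,
                           size_t index, size_t *offset, size_t *length)
{
    size_t off;
    size_t remaining;

    assert(max_payload > 0);
    if (index >= fragment_count(total_len, max_payload))
        return -1;

    off = index * max_payload;
    remaining = total_len - off;

    *offset = off;
    /* Every chunk is full-sized except possibly the last one. */
    *length = remaining < max_payload ? remaining : max_payload;
    return 0;
}
```

With a chunk size of 60000 bytes (an arbitrary value below the UDP limit), a 256 KB message would need five chunks, the last one carrying 22144 bytes.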

As for openais...
Unfortunately, we don't have enough experience to choose it.

Thanks,
Junko
Andrew Beekhof
2008-09-08 07:17:01 UTC
Post by Junko IKEDA
Post by Dejan Muhamedagic
Hmm. Perhaps this (the maximum packet size) has been checked by
somebody before, then forgotten and it never got into discussion
about the message compression. When I started working on the
compression, the MAXMSG was already temporarily set to 2MB.
Also, I can distinctly recall that Lars was setting the MAXMSG
size to 2MB (before the compression has been fixed) to support
his 9-node cluster (or thereabouts). I don't know what kind of hb
media he used.
Hi,
MAXUNCOMPRESSED is 2MB, and MAXMSG is 256kbyte.
#define MAXMSG (256*1024)
#define MAXUNCOMPRESSED (2048*1024)
Does it mean 2MB message would be compressed to 256kbyte, and sent via some
media?
No

It means that heartbeat can't deliver a message if the uncompressed
size is bigger than 2MB.
It also means that heartbeat can't deliver a message if, after
compressing the message, the size is still bigger than 256kB.
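
The two limits described above can be sketched as a pair of checks. This is an illustration under assumptions, not heartbeat's actual code path: the two macros are the values quoted in this thread, while `check_message_size` and the `size_verdict` enum are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in this thread. */
#define MAXMSG          (256*1024)   /* maximum final size on the wire */
#define MAXUNCOMPRESSED (2048*1024)  /* maximum raw size before compression */

/* Hypothetical result type for the sketch. */
enum size_verdict {
    SIZE_OK,
    SIZE_RAW_TOO_BIG,   /* rejected by the first gate (2 MB) */
    SIZE_WIRE_TOO_BIG   /* rejected by the second gate (256 kB) */
};

/* raw_len: size before compression; wire_len: size after compression. */
static enum size_verdict check_message_size(size_t raw_len, size_t wire_len)
{
    if (raw_len > MAXUNCOMPRESSED)
        return SIZE_RAW_TOO_BIG;
    if (wire_len > MAXMSG)
        return SIZE_WIRE_TOO_BIG;
    return SIZE_OK;
}
```

Note that the 71866-byte packet from the log passes both checks, since 71866 is below 256*1024 = 262144. Neither limit accounts for what a single UDP datagram can carry.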
Post by Junko IKEDA
If MAXMSG(256kbyte) is sent as it is, sendto() will not be able to handle
it, because the max size for sendto() is 64kbyte.
256kbyte message should be split into pieces before sending as packet.
by the way, I set "bcast" in ha.cf as media.
I assume you're working on a patch for this?
Post by Junko IKEDA
and, openais...
Unfortunately, we don't have enough experience to select this.
I highly suggest you rectify that.
Junko IKEDA
2008-09-08 08:24:30 UTC
Post by Andrew Beekhof
It means that heartbeat can't deliver a message if the uncompressed
size is bigger than 2MB.
It also means that heartbeat can't deliver a message if, after
compressing the message, the size is still bigger than 256kB.
I see.
The first control gate is 2 MB, the second is 256 kB.
Post by Andrew Beekhof
Post by Junko IKEDA
If MAXMSG(256kbyte) is sent as it is, sendto() will not be able to handle
it, because the max size for sendto() is 64kbyte.
256kbyte message should be split into pieces before sending as packet.
by the way, I set "bcast" in ha.cf as media.
I assume you're working on a patch for this?
That means heartbeat doesn't check the size of a 256 kB message before passing
it to sendto()...
Post by Andrew Beekhof
Post by Junko IKEDA
and, openais...
Unfortunately, we don't have enough experience to select this.
I highly suggest you rectify that.
It's hard to convince so many managers quickly.

Thanks,
Junko
Lars Marowsky-Bree
2008-09-08 11:57:54 UTC
Post by Junko IKEDA
Post by Andrew Beekhof
It also means that heartbeat can't deliver a message if, after
compressing the message, the size is still bigger than 256kB.
I see,
First control gate is 2MB, second is 256kB.
Yes.
Post by Junko IKEDA
Post by Andrew Beekhof
Post by Junko IKEDA
it, because the max size for sendto() is 64kbyte.
256kbyte message should be split into pieces before sending as packet.
by the way, I set "bcast" in ha.cf as media.
I assume you're working on a patch for this?
That means, heartbeat doesn't care for 256kB message before putting it to
sendto()...
Where did you find the 64kb limit here?
Post by Junko IKEDA
Post by Andrew Beekhof
Post by Junko IKEDA
and, openais...
Unfortunately, we don't have enough experience to select this.
I highly suggest you rectify that.
It's hard to convince many managers in a moment.
Then you may have to recompile with larger limits, but UDP is not very
efficient at transmitting such large messages.



Regards,
Lars
--
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
Junko IKEDA
2008-09-09 01:46:01 UTC
Post by Lars Marowsky-Bree
Post by Junko IKEDA
Post by Andrew Beekhof
Post by Junko IKEDA
it, because the max size for sendto() is 64kbyte.
256kbyte message should be split into pieces before sending as packet.
by the way, I set "bcast" in ha.cf as media.
I assume you're working on a patch for this?
That means, heartbeat doesn't care for 256kB message before putting it to
sendto()...
Where did you find the 64kb limit here?
Post by Junko IKEDA
Post by Andrew Beekhof
Post by Junko IKEDA
and, openais...
Unfortunately, we don't have enough experience to select this.
I highly suggest you rectify that.
It's hard to convince many managers in a moment.
Then you may have to recompile with larger limits, but UDP is not very
efficient at transmitting such large messages.
Yes, it's UDP.

net/ipv4/udp.c(2.6.18-92.el5)

495 int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
496                 size_t len)
497 {

511         if (len > 0xFFFF)
512                 return -EMSGSIZE;


At line 511, the limit for a UDP packet is 65535 bytes (0xFFFF).
I got "EMSGSIZE" from here, so I suspect heartbeat is the cause of this.
If heartbeat sends a 256 KB message as it is,
the UDP layer cannot handle it.
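
The arithmetic behind the limit can be written out as follows. This is a sketch assuming IPv4 without IP options. The macro and function names are hypothetical; the header sizes are the standard 20-byte IPv4 header and 8-byte UDP header. The kernel check quoted above is the coarser `len > 0xFFFF` test, so the usable payload is slightly smaller than 65535 bytes once the headers are subtracted.

```c
#include <assert.h>
#include <stddef.h>

#define IPV4_MAX_TOTAL_LEN 65535u  /* 16-bit total-length field, 0xFFFF */
#define IPV4_MIN_HEADER    20u     /* IPv4 header without options */
#define UDP_HEADER         8u      /* fixed UDP header size */

/* Largest payload one IPv4 UDP datagram can carry: 65507 bytes. */
#define UDP_MAX_PAYLOAD (IPV4_MAX_TOTAL_LEN - IPV4_MIN_HEADER - UDP_HEADER)

/* Returns 1 if `len` bytes fit into one datagram, 0 otherwise. */
static int udp_payload_fits(size_t len)
{
    return len <= UDP_MAX_PAYLOAD;
}
```

By this measure the 71866-byte packet from the log does not fit, and neither does a message of the full MAXMSG size.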

Thanks,
Junko
Lars Marowsky-Bree
2008-09-09 08:15:11 UTC
Post by Junko IKEDA
net/ipv4/udp.c(2.6.18-92.el5)
495 int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
496                 size_t len)
497 {
511         if (len > 0xFFFF)
512                 return -EMSGSIZE;
At line 511, the limit for a UDP packet is 65535 bytes (0xFFFF).
I got "EMSGSIZE" from here, so I suspect heartbeat is the cause of this.
If heartbeat sends a 256 KB message as it is,
the UDP layer cannot handle it.
Ah, right. Yes, CIBs which compress to >64kb then won't work.


Regards,
Lars
--
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
Junko IKEDA
2008-09-09 08:37:10 UTC
Post by Lars Marowsky-Bree
Post by Junko IKEDA
net/ipv4/udp.c(2.6.18-92.el5)
495 int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
496                 size_t len)
497 {
511         if (len > 0xFFFF)
512                 return -EMSGSIZE;
At line 511, the limit for a UDP packet is 65535 bytes (0xFFFF).
I got "EMSGSIZE" from here, so I suspect heartbeat is the cause of this.
If heartbeat sends a 256 KB message as it is,
the UDP layer cannot handle it.
Ah, right. Yes, CIBs which compress to >64kb then won't work.
It seems that Heartbeat sends comparatively small messages (< 1500
bytes) when I run 230 Dummy resources,
but "cibadmin -Q" causes an EMSGSIZE error.
It might be a problem in cibadmin.
(But maybe CIBs that are bigger than 64 KB would fail...)

Thanks,
Junko
Andrew Beekhof
2008-09-09 08:41:06 UTC
Post by Junko IKEDA
Post by Lars Marowsky-Bree
Post by Junko IKEDA
net/ipv4/udp.c(2.6.18-92.el5)
495 int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct \
msghdr *msg,
496 size_t len)
497 {
511 if (len > 0xFFFF)
512 return -EMSGSIZE;
At line 511, the limit for a UDP packet is 65535 bytes (0xFFFF).
I got "EMSGSIZE" from here, so I suspect heartbeat is the cause of this.
If heartbeat sends a 256 KB message as it is,
the UDP layer cannot handle it.
Ah, right. Yes, CIBs which compress to >64kb then won't work.
It seems that Heartbeat is sending the comparatively small messages ( < 1500
byte ) when I run 230 Dummy resources,
Because we try to be smart and send only the changes.

If the two nodes ever get out of sync, we'd need to send a full copy,
and that will break.
Post by Junko IKEDA
But "cibadmin -Q" causes EMSGSIZE error.
That's because you're asking for the whole CIB.
Post by Junko IKEDA
It might be the problem for cibadmin.
no
Post by Junko IKEDA
(but maybe, CIBs which is bigger than 64kb would fail...)
Thanks,
Junko