IRC chat logs for #ltsp on irc.freenode.net (webchat)


Channel log from 10 April 2016   (all times are UTC)

00:48vagrantc has left IRC (vagrantc!~vagrant@unaffiliated/vagrantc, Quit: leaving)
01:19GodFather has left IRC (GodFather!~rcc@96-35-101-212.dhcp.bycy.mi.charter.com, Ping timeout: 240 seconds)
01:25BassetFever has left IRC (BassetFever!a2f5d1e5@gateway/web/freenode/ip.162.245.209.229, Quit: Page closed)
02:30andygraybeal has left IRC (andygraybeal!~andy@h44.174.133.40.static.ip.windstream.net, Ping timeout: 276 seconds)
02:44andygraybeal has joined IRC (andygraybeal!~andy@h31.39.30.71.dynamic.ip.windstream.net)
03:53kjackal has joined IRC (kjackal!~kjackal@185.16.164.7)
04:31kjackal has left IRC (kjackal!~kjackal@185.16.164.7, Ping timeout: 276 seconds)
06:26semo163 has joined IRC (semo163!5f43d329@gateway/web/freenode/ip.95.67.211.41)
06:26
<semo163>
hi all
06:27
<alkisg>
Hello
06:28
<semo163>
alkisg: 2 news about nbd disconnects of my ubuntu ltsp5 pnp server and thin/fat clients
06:28
first one is very good for ltsp
06:29
actually none of my disconnects were caused by a misconfiguration of the ltsp stuff, so it works as it's supposed to
06:29
<alkisg>
What was the cause?
06:30
<semo163>
bad news: the cause was the cisco switch
06:30
<alkisg>
Cisco switches manage to cause tcp timeouts? how is that possible?
06:31
<semo163>
i swapped it for a dlink (cheapest smart one) just for those clients and the server, and the clients stay alive over the weekend now
06:32
i don't know the cisco switch in detail, or what exactly in it causes the disconnects
06:32
<alkisg>
Ah ok its hardware might be broken then, it's not necessarily by design...
06:32
Nothing important to the ltsp world :)
06:34
<semo163>
so thank you alkisg for your and your colleagues' hard work to get ltsp going
06:34
<alkisg>
You're welcome
06:35
<semo163>
going to find out what is causing the disconnects in the cisco switch next week
06:36
<alkisg>
It'll probably be easier to do it without ltsp; just an nbd-server / nbd-client connection, an ssh connection etc, between a client and a server
06:36
So that you can see the logs etc without the clients hanging
06:36
if ssh stays up while nbd dies, that will indeed mean some software issue
06:39
<semo163>
to find out i've made 2 experiments: started 2 fat (but diskless) clients, one with autologin, the second with just an nbd connection (without logging in), but both disconnect at the same time, after about 1 hour 50 min
06:39
in cisco i mean
06:43
<alkisg>
man tcp => tcp_keepalive_time (integer; default: 7200; since Linux 2.2)
06:43
The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes. Keep-alives are sent only when the SO_KEEPALIVE socket option is enabled. The default value is 7200 seconds (2 hours). An idle connection is terminated after approximately an additional 11 minutes (9 probes an interval of 75 seconds apart) when keep-alive is enabled.
06:43yanu has left IRC (yanu!~yanu@178-116-58-90.access.telenet.be, Remote host closed the connection)
06:44
<alkisg>
So the disconnection happens in 2 hours, 11 minutes, if it's a keepalive issue
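The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration using the defaults quoted from `man tcp` above (7200 s idle time, 9 probes, 75 s apart); the function name `keepalive_timeout` is made up for this example, and the result only applies when no probe is ever answered.

```python
def keepalive_timeout(idle_s: int = 7200, probes: int = 9, interval_s: int = 75) -> int:
    """Seconds until an idle connection is dropped if no keep-alive probe is answered."""
    return idle_s + probes * interval_s


total = keepalive_timeout()
hours, rest = divmod(total, 3600)
minutes, seconds = divmod(rest, 60)
print(f"{total} s = {hours} h {minutes} min {seconds} s")
```

With the defaults this gives 7875 s, i.e. a little over 2 hours 11 minutes, which is later than the roughly 1 hour 50 minutes reported earlier in the log.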
06:44
<semo163>
when client boots it says block nbd9: Receive control failed (result -32) Apr 9 11:38:26 gnutova kernel: [ 31.696825] block nbd9: queue cleared
06:44
<alkisg>
If cisco has a setting to block that, I don't think it would be disabled by default, it would be something in your setup
06:44
Ignore the nbd9 message, it's normal
06:45
<semo163>
and i think this causes the disconnects 2 hours later, from the cisco, which thinks this is a dead session
06:46
i'm no expert in this though
06:48
i've changed TCP_KEEPALIVE kernel params as one german user suggested (i cannot find it now): net.ipv4.tcp_keepalive_intvl = 60, net.ipv4.tcp_keepalive_probes = 20, net.ipv4.tcp_keepalive_time = 600
06:49
they got cisco switch problems as well
06:50
but after they changed tcp_keepalive params as above problem disappeared
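To see what those three settings mean in practice, here is a small Python sketch. It assumes a Linux host, where the sysctls are exposed as files under `/proc/sys/net/ipv4`; the helper names `read_keepalive` and `summarize` are hypothetical, and the "dropped after" figure only holds when no probe is answered.

```python
from pathlib import Path

KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive(base: str = "/proc/sys/net/ipv4") -> dict:
    """Read the three keepalive sysctls as integers from files under `base`."""
    return {key: int((Path(base) / key).read_text()) for key in KEEPALIVE_KEYS}


def summarize(params: dict) -> str:
    """Describe when probing starts and when an unresponsive peer is given up on."""
    first_probe = params["tcp_keepalive_time"]
    give_up = first_probe + params["tcp_keepalive_intvl"] * params["tcp_keepalive_probes"]
    return f"first probe after {first_probe} s idle, dead peer dropped after {give_up} s"


# The values quoted in the chat, not read from a live system.
suggested = {"tcp_keepalive_time": 600, "tcp_keepalive_intvl": 60, "tcp_keepalive_probes": 20}
print(summarize(suggested))
```

Calling `summarize(read_keepalive())` on a client or on the server would show the values currently in effect there. With the quoted settings, probes start after 10 minutes of idleness instead of 2 hours, so an idle connection generates traffic well before the roughly 1 hour 50 minute mark.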
06:50
<alkisg>
It's probably an unrelated forum message, ignore it
06:50
It doesn't match your symptoms
06:50
<semo163>
some archive of 2008
06:57
switch cisco SG300-52
06:58
switch dlink DES-1210-52
06:59
<alkisg>
Have you tried a factory reset on the cisco switch?
06:59
<semo163>
no difference in logs with 2 different switches
06:59
i mean syslog on clients server (ltsp ones)
07:01
the cisco is actually brand new, with just minor security settings like dhcp snooping (trusted), ip source guard and arp inspection
07:02
all interfaces on the cisco correspond to hosts (i mean if the dhcp server is on, say, int gi01, it is trusted as the ltsp dnsmasq)
07:03
<alkisg>
OK, it's completely beyond ltsp, maybe you should ask in #cisco then
07:03
<semo163>
on the dlink there is no config but the default and it works fine (i actually planned to replace it with the cisco :-( )
07:04ricotz has joined IRC (ricotz!~ricotz@ubuntu/member/ricotz)
07:05
<semo163>
yes I know it looks like a cisco-specific question... just for info for others
07:38
<sebd>
semo163: I guess there is a default rule in Cisco devices that says "disconnect idle connections after n seconds", I have had the same problem for my SSH sessions to a specific remote server. A Cisco device, too.
07:40
<alkisg>
It'd be nice to understand how that happens, since the switch is not an endpoint to the connection so it can't just "drop it"
07:41
<sebd>
it can act as a transparent firewall however
07:42semo163 has left IRC (semo163!5f43d329@gateway/web/freenode/ip.95.67.211.41, Ping timeout: 250 seconds)
07:43semo163 has joined IRC (semo163!5f43ebf9@gateway/web/freenode/ip.95.67.235.249)
07:43
<alkisg>
And it would allow all the keepalive messages, but then block them, and even block reconnection attempts?
07:44
<semo163>
But i cannot find such settings on SG300-52
07:44
<sebd>
mmm no. You are right.
07:44
<alkisg>
The explanation "it completely blocks keepalive messages" sounds more sane...
07:45
<semo163>
actually I use this switch for ssh all day long and there are lots of sessions idle for >5 hours; when I get back they are still alive
07:45
<sebd>
ok, so these are not the same symptoms
07:46
<alkisg>
ssh has a more advanced mechanism for keeping a connection alive afaik
07:47
something like http://unix.stackexchange.com/questions/34004/how-does-tcp-keepalive-work-in-ssh
07:48
semo163, it's also very easy to use aoe if you want, which won't have the keepalive issues as it operates on the ethernet level
07:48
aoe instead of nbd
07:49
<semo163>
please point me to how to configure it
07:49
is there some resource to read on how to do so
07:49
<alkisg>
Hmm I don't have any tutorial link handy, google a bit for aoe ltsp... it only requires 3-4 commands
07:50
<semo163>
ok. Thank you.
07:58
by the way i've got iscsi traffic going through this switch with no problems (no disconnects or so)
08:02
<alkisg>
That too might not rely only on tcp-keepalive, it might use other means itself, like pinging
08:02
You can use a simple `nc` if you want to test tcp keepalive without additional protocols
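As an alternative to `nc` for such a test, a bare TCP client with keepalive switched on can be sketched in Python. This is not what was suggested in the channel, just a stand-in under stated assumptions: a Linux client (the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` socket options are Linux-specific), the keepalive values quoted earlier in the log, and hypothetical helper names `enable_keepalive` and `idle_client`.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 600,
                     interval_s: int = 60, probes: int = 20) -> None:
    """Turn on TCP keepalive for one socket and override the sysctl defaults for it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific per-socket equivalents of the tcp_keepalive_* sysctls.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def idle_client(host: str, port: int, **keepalive) -> socket.socket:
    """Open a plain TCP connection with keepalive enabled and return it unused."""
    sock = socket.create_connection((host, port))
    enable_keepalive(sock, **keepalive)
    return sock
```

One way to use it: start any plain TCP listener on the server, call `idle_client("server-address", 5000)` on a client (address and port are placeholders), leave the connection idle for a few hours, then try `sock.sendall(b"ping")`. If the send fails or the peer never receives the data, the idle connection was broken somewhere along the path, with no nbd or ssh involved.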
10:07kjackal has joined IRC (kjackal!~kjackal@37.205.61.203)
11:57kjackal has left IRC (kjackal!~kjackal@37.205.61.203, Ping timeout: 276 seconds)
12:24schlady has joined IRC (schlady!~schlady@ip1f111304.dynamic.kabel-deutschland.de)
12:30schlady has left IRC (schlady!~schlady@ip1f111304.dynamic.kabel-deutschland.de, Remote host closed the connection)
12:31schlady has joined IRC (schlady!~schlady@ip1f111304.dynamic.kabel-deutschland.de)
14:07andygraybeal has left IRC (andygraybeal!~andy@h31.39.30.71.dynamic.ip.windstream.net, Ping timeout: 252 seconds)
14:08labkomltsp^labko has joined IRC (labkomltsp^labko!73b2d416@gateway/web/freenode/ip.115.178.212.22)
14:09
<labkomltsp^labko>
i have a problem on my epoptes, any body can help mi
14:09
*me
14:16tharkun has left IRC (tharkun!~0@unaffiliated/tharkun, Ping timeout: 250 seconds)
14:17tharkun has joined IRC (tharkun!~0@201.157.71.45)
14:20andygraybeal has joined IRC (andygraybeal!~andy@h26.184.190.173.dynamic.ip.windstream.net)
14:25GodFather has joined IRC (GodFather!~rcc@96-35-101-212.dhcp.bycy.mi.charter.com)
14:35labkomltsp^labko has left IRC (labkomltsp^labko!73b2d416@gateway/web/freenode/ip.115.178.212.22, Ping timeout: 250 seconds)
14:56schlady has left IRC (schlady!~schlady@ip1f111304.dynamic.kabel-deutschland.de, Read error: Connection reset by peer)
16:51andygraybeal has left IRC (andygraybeal!~andy@h26.184.190.173.dynamic.ip.windstream.net, Read error: Connection reset by peer)
17:35vagrantc has joined IRC (vagrantc!~vagrant@unaffiliated/vagrantc)
18:07semo163 has left IRC (semo163!5f43ebf9@gateway/web/freenode/ip.95.67.235.249, Ping timeout: 250 seconds)
18:10gehidore is now known as man
18:10man is now known as gehidore
19:18andygraybeal has joined IRC (andygraybeal!~andy@h26.184.190.173.dynamic.ip.windstream.net)
20:28ricotz has left IRC (ricotz!~ricotz@ubuntu/member/ricotz, Quit: Leaving)
21:17GodFather has left IRC (GodFather!~rcc@96-35-101-212.dhcp.bycy.mi.charter.com, Ping timeout: 240 seconds)
21:40
<alkisg>
vagrantc: I'm thinking of adding a cleanup.d/script that installs ltsp-client if it's not already installed as part of the ltsp-update-image -c process,
21:41
an example usage is to have a stretch VM without ltsp in vbox, and run ltsp-update-image -c /path/to/vdi and create a squashfs out of it
21:41
It's not too intrusive, is it?
21:42
of course it'll only be activated if it operates on a cow system...
21:44alkisg is now known as alkisg_away
21:51
<vagrantc>
alkisg_away: so it would only install it in the resulting image? ... there's a certain elegance to that
22:07GodFather has joined IRC (GodFather!~rcc@96-35-101-212.dhcp.bycy.mi.charter.com)
22:44sutula has left IRC (sutula!~sutula@207-118-151-30.dyn.centurytel.net, Ping timeout: 252 seconds)
22:46sutula has joined IRC (sutula!~sutula@207-118-151-30.dyn.centurytel.net)