IRC chat logs for #ltsp on irc.libera.chat (webchat)


Channel log from 6 September 2023   (all times are UTC)

00:29 sunweaver has left IRC (sunweaver!~sunweaver@fylgja.das-netzwerkteam.de, *.net *.split)
00:29 wyre has left IRC (wyre!~wyre@user/wyre, *.net *.split)
00:29 sunweaver has joined IRC (sunweaver!~sunweaver@fylgja.das-netzwerkteam.de)
00:29 sunweaver is now away: not here ...
00:30 wyre has joined IRC (wyre!~wyre@user/wyre)
00:30 wyre is now away: Auto away at Wed Sep 6 00:25:20 2023 UTC
05:28 woernie has joined IRC (woernie!~werner@p5ddedeef.dip0.t-ipconnect.de)
05:34 wyre is back
06:33 ricotz has joined IRC (ricotz!~ricotz@ubuntu/member/ricotz)
06:40 wyre is now away: Auto away at Wed Sep 6 06:35:20 2023 UTC
08:41 danboid has joined IRC (danboid!~danboid@remote.salford.ac.uk)
08:43
<danboid>
A few months ago, LTSP slowed to a crawl but I don't know why. It takes about 3 minutes to load Chrome under LTSP on an i7 with 32 GB RAM. Has anyone got any suggestions for troubleshooting this?
08:46
Our students have been off since June so it hasn't been a big issue, but we're going to want to use it again shortly and it's not usable. Hopefully I won't have to reinstall it, but I at least want to know why this has happened before I do reinstall, if that's what's required.
08:46
I thought it could be ZFS snapshots but I destroyed them all and it's still slug slow
08:47
There were too many tho. I've lost trust in sanoid for snapshotting after that. I'll still use syncoid tho.
08:49
It seems sanoid can't handle purging snapshots across hundreds of datasets. zfs-auto-snapshot can handle that fine
08:50
alkisg: Any tips to help troubleshoot very slow LTSP clients?
08:50
<alkisg>
danboid: try to login on the server and see if it's slow, it might be unrelated to the network
08:50
E.g. worn out disks can do that
08:51
<danboid>
alkisg: The server is running fine as far as I can tell. It's got 128 GB RAM etc
08:51
<alkisg>
If the disk is slow, RAM won't help
08:51
When the clients work and are slow, at that time login on the server and see if it's slow or not
08:52
<danboid>
The clients are always slow now, even if only one client is logged in
08:53
<alkisg>
OK, login on the server and see if it's slow or not
08:53
If the server appears to be fast after login, then VNC to me so that I take a quick look
08:53
!vnc-edide
08:53
<ltspbot>
vnc-edide: To share your screen with me, open Epoptes → Help menu → Remote support → Host: srv1-dide.ioa.sch.gr, and click the Connect button
08:54
<alkisg>
On another note, to check networking, there's the epoptes lan benchmark function
08:56
<danboid>
Ah, I didn't know about the lan benchmark. That's worth giving a go
09:12
alkisg: Hmm, that's odd. When I open epoptes on my LTSP server, it doesn't list any clients as being connected but I know at least one is booted up.
09:18
<alkisg>
danboid: you probably haven't configured it properly
09:19
Also, the VM takes a whole lot of time to boot, so something is definitely wrong with the server itself
09:19
Login there, open a terminal, and sudo -i
09:21
<danboid>
OK
09:23
alkisg: I've logged in as root to the VM terminal
09:23
<alkisg>
danboid: ok wait, I'm currently multitasking with a lot of tasks...
09:24
<danboid>
alkisg: NP, thanks for looking at this. I've got to go to a meeting now but I'll leave VNC open. Back in 30m
09:27
<alkisg>
@danboid eh what's that, the VM has the same IP as the server?! How are they going to communicate then...
09:32 buringman42 has joined IRC (buringman42!~ident@20.sub-174-215-145.myvzw.com)
09:34
<alkisg>
danboid: additionally, you have snap installed; it might be triggering mass-updates while the clients are booted, making all of them slow
10:02 buringman42 has left IRC (buringman42!~ident@20.sub-174-215-145.myvzw.com, Quit: Quit)
10:09
<danboid>
alkisg: Back. I didn't think the network settings of the VM image mattered? It was running fine for a year with that VM config. I need to give the VM a static IP?
10:10
alkisg: OK, I'll remove snap then, but I don't think that's the main cause of my problems here, is it?
10:12
alkisg: I thought LTSP assigned each client an IP on boot
10:29
<alkisg>
If a few months pass (not sure how many, 90 days?) then snap may force itself to refresh
10:30
danboid: so if you run ltsp image now, since snap is now fresh, it won't update itself so it might solve the client issue
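To see when snap last refreshed before rebuilding the image, something like the following shell sketch may help. It assumes `snap refresh --time` prints a line starting with `last:`; the helper name `snap_last_refresh` is made up for illustration and is not part of snap or LTSP.

```shell
#!/bin/sh
# Hypothetical helper: read `snap refresh --time` output on stdin and
# print only the value of its "last:" line (assumed output format).
snap_last_refresh() {
    sed -n 's/^last:[[:space:]]*//p'
}

# Example use, on the machine the image is built from:
#   snap refresh --time | snap_last_refresh
#   sudo snap refresh     # refresh now, then run ltsp image
```

If the reported time is recent, an image built right afterwards should not need to refresh its snaps as soon as the clients boot.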
10:31
<danboid>
alkisg: I'm asking Jim about the snapshots https://github.com/jimsalterjrs/sanoid/discussions/848
10:32
I can't help but think that's why it's running so slowly
10:32
<alkisg>
Then it would be slow if you logged in on the server too
10:33
<danboid>
Thanks for updating the image. Hopefully this will fix epoptes at least
10:33
<alkisg>
And snap refresh
10:36
<danboid>
alkisg: What were you saying about the IP addresses? I've always presumed ltsp image would scrub/ignore any network settings in the source image, no?
10:36
<alkisg>
danboid: see how slow your disk is. With 48 processors, it should have finished in less than a minute
10:36
I was saying that you had the same IP in the server as in the VM
10:36
You shouldn't have the same IP in two different PCs
10:38
<danboid>
alkisg: but the VM isn't running most of the time. That would only be an issue if the VM was running on a bridge right?
10:38
<alkisg>
The VM has a bridge, yes
10:38
I'm not saying it's what causes the issue. The issue is caused by a worn-out disk that is currently dog-slow
10:38
The VM IP issue is another matter, which you should solve separately
10:42
<danboid>
alkisg: You say the disks are worn out? You spotted something there? mdstat seems happy enough
10:43
<alkisg>
I spotted ltsp image taking a LOT of time, with a monster CPU
10:43
Which either means bad/worn out disk, or sure, a zfs issue
10:43
I didn't have time to look into it more
10:44
<danboid>
Thanks for your help! I'll wait to see what Jim has to say about recovering from too many snapshots.
10:46
I run another server with a similar number of datasets / users and it's not had any such issue with too many snapshots, but that uses zfs-auto-snapshot
10:46
<alkisg>
I've seen many disks start with write speed = 500 mbps and end up with 10 mbps after a couple of years
10:47
After ltsp image finishes, try an hdparm -t /dev/sda to see the speed, if it's under 100 mb/sec that's the issue
10:47
*mbyte, not bit
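The hdparm check above could be wrapped in a small shell sketch like this one. It assumes the usual `Timing buffered disk reads: ... = N MB/sec` line in `hdparm -t` output (with a kB/sec unit on very slow disks); the function names are hypothetical, and the 100 MB/sec threshold is the one mentioned in the conversation. Note that `hdparm -t` measures buffered reads, not writes.

```shell
#!/bin/sh
# Hypothetical helper: extract the speed in MB/sec from `hdparm -t` output
# read on stdin. The line format and the kB/sec fallback are assumptions.
parse_hdparm_speed() {
    awk '/Timing buffered disk reads/ {
        v = $(NF-1)
        if ($NF == "kB/sec") v = v / 1024
        print v
    }'
}

# Succeeds (exit 0) if speed $1 is below threshold $2 (default 100 MB/sec).
is_slow() {
    awk -v s="$1" -v t="${2:-100}" 'BEGIN { exit !(s < t) }'
}

# Run the read test on a device (needs root) and report the result.
check_disk() {
    dev="$1"
    speed=$(hdparm -t "$dev" | parse_hdparm_speed)
    if is_slow "$speed"; then
        echo "$dev: $speed MB/sec, below threshold"
    else
        echo "$dev: $speed MB/sec"
    fi
}
```

As root, `check_disk /dev/sda` would then print the measured speed and flag it if it falls under the threshold. Running it a few times on an otherwise idle server gives a steadier figure.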
10:49
<danboid>
I heard that ZFS performance is supposed to be OK up to about 10K snapshots per dataset. We ended up going well over that.
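To check how many snapshots each dataset actually holds, a shell sketch along these lines could be used. It relies on `zfs list -H -t snapshot -o name` printing one `dataset@snapshot` name per line; the function name `count_snapshots_per_dataset` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read snapshot names (dataset@snap) on stdin and
# print "count dataset" pairs, largest count first.
count_snapshots_per_dataset() {
    awk -F'@' 'NF == 2 { n[$1]++ } END { for (d in n) print n[d], d }' | sort -rn
}

# Example use on the server:
#   zfs list -H -t snapshot -o name | count_snapshots_per_dataset | head
```

The top few lines show which datasets carry the most snapshots, which can be compared against the roughly 10K-per-dataset figure mentioned above.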
10:53
<alkisg>
I'm talking about a hardware issue, unrelated to the file system
11:32 sunweaver is back
11:41 wyre is back
12:10 wyre is now away: Auto away at Wed Sep 6 12:05:17 2023 UTC
14:12 ogra is now away: currently disconnected
14:12 ogra is back
15:07 sunweaver is now away: not here ...
15:40 danboid has left IRC (danboid!~danboid@remote.salford.ac.uk, Quit: Client closed)
16:25 eu^43611162 has joined IRC (eu^43611162!~eu^436111@165.225.63.57)
16:35 vagrantc has joined IRC (vagrantc!~vagrant@2600:3c01:e000:21:7:77:0:50)
17:23 eu^43611162 has left IRC (eu^43611162!~eu^436111@165.225.63.57, Quit: Client closed)
18:46 wyre is back
19:14 woernie has left IRC (woernie!~werner@p5ddedeef.dip0.t-ipconnect.de, Remote host closed the connection)
20:47 ricotz has left IRC (ricotz!~ricotz@ubuntu/member/ricotz, Quit: Leaving)
22:17 wyre is now away: Auto away at Wed Sep 6 22:13:02 2023 UTC