|00:29||sunweaver has left IRC (firstname.lastname@example.org, *.net *.split)|
|00:29||wyre has left IRC (wyre!~wyre@user/wyre, *.net *.split)|
|00:29||sunweaver has joined IRC (email@example.com)|
|00:29||sunweaver is now away: not here ...|
|00:30||wyre has joined IRC (wyre!~wyre@user/wyre)|
|00:30||wyre is now away: Auto away at Wed Sep 6 00:25:20 2023 UTC|
|05:28||woernie has joined IRC (firstname.lastname@example.org)|
|05:34||wyre is back|
|06:33||ricotz has joined IRC (ricotz!~ricotz@ubuntu/member/ricotz)|
|06:40||wyre is now away: Auto away at Wed Sep 6 06:35:20 2023 UTC|
|08:41||danboid has joined IRC (email@example.com)|
A few months ago, LTSP slowed to a crawl but I don't know why. It takes about 3 minutes to load Chrome under LTSP on an i7 with 32 GB RAM. Has anyone got any suggestions for troubleshooting this?
Our students have been off since June so it hasn't been a big issue, but we're going to want to use it again shortly and it's not usable. Hopefully I won't have to reinstall it, but I at least want to know why this has happened before I do reinstall, if that's what's required.
I thought it could be ZFS snapshots but I destroyed them all and it's still slug slow
There were too many tho. I've lost trust in sanoid for snapshotting after that. I'll still use syncoid tho.
It seems sanoid can't handle purging snapshots across hundreds of datasets. zfs-auto-snapshot can handle that fine
alkisg: Any tips to help troubleshoot very slow LTSP clients?
danboid: try to login on the server and see if it's slow, it might be unrelated to the network
E.g. worn out disks can do that
alkisg: The server is running fine as far as I can tell. It's got 128 GB RAM etc
If the disk is slow, RAM won't help
When the clients work and are slow, at that time login on the server and see if it's slow or not
The clients are always slow now, even if only one client is logged in
OK, login on the server and see if it's slow or not
If the server appears to be fast after login, then VNC to me so that I take a quick look
vnc-edide: To share your screen with me, open Epoptes → Help menu → Remote support → Host: srv1-dide.ioa.sch.gr, and click the Connect button
On another note, to check networking, there's the epoptes lan benchmark function
Ah, I didn't know about the lan benchmark. That's worth giving a go
alkisg: Hmm, that's odd. When I open epoptes on my LTSP server, it doesn't list any clients as being connected but I know at least one is booted up.
danboid: you probably haven't configured it properly
Also, the VM takes a whole lot of time to boot, so something is definitely wrong with the server itself
Login there, open a terminal, and sudo -i
alkisg: I've logged in as root to the VM terminal
danboid: ok wait, I'm currently multitasking with a lot of tasks...
alkisg: NP, thanks for looking at this. I've got to go to a meeting now but I'll leave VNC open. Back in 30m
@danboid eh what's that, the VM has the same IP as the server?! How are they going to communicate then...
|09:32||buringman42 has joined IRC (firstname.lastname@example.org)|
danboid: additionally, you have snap installed; it might be triggering mass-updates while the clients are booted, making all of them slow
|10:02||buringman42 has left IRC (email@example.com, Quit: Quit)|
alkisg: Back. I didn't think the network settings of the VM image mattered? It was running fine for a year with that VM config. I need to give the VM a static IP?
alkisg: OK, I'll remove snap then, but I don't think that's the main cause of my problems here, is it?
alkisg: I thought LTSP assigned each client an IP on boot
If a few months pass (not sure how many, 90 days?) then snap may force itself to refresh
danboid: so if you run ltsp image now, since snap is now fresh, it won't update itself so it might solve the client issue
alkisg: I'm asking Jim about the snapshots https://github.com/jimsalterjrs/sanoid/discussions/848
I can't help but think that's why it's running so slowly
Then it would be slow if you logged in on the server too
Thanks for updating the image. Hopefully this will fix epoptes at least
And snap refresh
alkisg: What were you saying about the IP addresses? I've always presumed ltsp image would scrub/ignore any network settings in the source image, no?
danboid: see how slow your disk is. With 48 processors, it should have finished in less than a minute
I was saying that you had the same IP in the server as in the VM
You shouldn't have the same IP in two different PCs
alkisg: but the VM isn't running most of the time. That would only be an issue if the VM was running on a bridge, right?
The VM has a bridge, yes
I'm not saying it's what causes the issue. The issue is caused by a worn-out disk that is currently dog-slow
The VM IP issue is another matter, which you should solve separately
alkisg: You say the disks are worn out? You spotted something there? mdstat seems happy enough
I spotted ltsp image taking a LOT of time, with a monster CPU
Which either means bad/worn out disk, or sure, a zfs issue
I didn't have time to look into it more
Thanks for your help! I'll wait to see what Jim has to say about recovering from too many snapshots.
I run another server with a similar number of datasets / users and it's not had any such issue with too many snapshots, but that uses zfs-auto-snapshot
I've seen many disks start with write speed = 500 mbps and end up with 10 mbps after a couple of years
After ltsp image finishes, try an hdparm -t /dev/sda to see the speed, if it's under 100 mb/sec that's the issue
*mbyte, not bit
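The hdparm check suggested above can be scripted roughly like this. It is only a sketch: it assumes hdparm -t prints its usual "Timing buffered disk reads: ... = N MB/sec" line, and the function names, the unit table and the default /dev/sda device are illustrative choices, not anything from LTSP or hdparm itself. The 100 MB/sec threshold is the one mentioned in the chat.

```python
import re
import subprocess

# Threshold taken from the conversation: below 100 MB/sec the disk is suspect.
SLOW_THRESHOLD_MB_PER_SEC = 100.0

# Assumed hdparm -t output line, e.g.:
#  Timing buffered disk reads: 1234 MB in  3.00 seconds = 411.21 MB/sec
_SPEED_RE = re.compile(r"=\s*([\d.]+)\s*(kB|MB|GB)/sec")
_UNIT_TO_MB = {"kB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def parse_hdparm_speed(output: str) -> float:
    """Return the read throughput in MB/sec found in hdparm -t output."""
    match = _SPEED_RE.search(output)
    if match is None:
        raise ValueError("no throughput figure found in hdparm output")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_MB[unit]


def is_disk_slow(mb_per_sec: float,
                 threshold: float = SLOW_THRESHOLD_MB_PER_SEC) -> bool:
    """True when the measured speed is under the threshold."""
    return mb_per_sec < threshold


def measure(device: str = "/dev/sda") -> float:
    """Run hdparm -t on the device (needs root) and return MB/sec."""
    result = subprocess.run(
        ["hdparm", "-t", device],
        capture_output=True, text=True, check=True,
    )
    return parse_hdparm_speed(result.stdout)
```

As root, something like is_disk_slow(measure("/dev/sda")) gives a quick yes/no. Run it a few times while the server is otherwise idle, since a single reading can be skewed by other I/O such as a running ltsp image.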
I heard that ZFS performance is supposed to be OK up to about 10K snapshots per dataset. We ended up going well over that.
I'm talking about a hardware issue, unrelated to the file system
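Separately from the hardware question, the snapshot count per dataset can be checked against the roughly 10K figure mentioned above. This is a minimal sketch, assuming the snapshot names come from zfs list -H -t snapshot -o name (one dataset@snapshot name per line). The function names and the default limit are illustrative, and the 10K figure itself is only what was reported in the chat, not a documented ZFS limit.

```python
from collections import Counter
import subprocess


def count_snapshots(snapshot_names):
    """Count snapshots per dataset from names of the form dataset@snapshot."""
    counts = Counter()
    for name in snapshot_names:
        name = name.strip()
        if "@" not in name:
            continue  # skip blank or unexpected lines
        dataset, _, _ = name.partition("@")
        counts[dataset] += 1
    return counts


def datasets_over_limit(counts, limit=10_000):
    """Return the datasets whose snapshot count exceeds the limit, sorted."""
    return sorted(ds for ds, n in counts.items() if n > limit)


def list_snapshot_names():
    """Ask zfs for all snapshot names, one per line, without headers."""
    result = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

On the server, datasets_over_limit(count_snapshots(list_snapshot_names())) would list any dataset still above the limit. An empty list after the snapshots have been destroyed suggests the remaining slowness lies elsewhere.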
|11:32||sunweaver is back|
|11:41||wyre is back|
|12:10||wyre is now away: Auto away at Wed Sep 6 12:05:17 2023 UTC|
|14:12||ogra is now away: currently disconnected|
|14:12||ogra is back|
|15:07||sunweaver is now away: not here ...|
|15:40||danboid has left IRC (firstname.lastname@example.org, Quit: Client closed)|
|16:25||eu^43611162 has joined IRC (email@example.com)|
|16:35||vagrantc has joined IRC (vagrantc!~vagrant@2600:3c01:e000:21:7:77:0:50)|
|17:23||eu^43611162 has left IRC (firstname.lastname@example.org, Quit: Client closed)|
|18:46||wyre is back|
|19:14||woernie has left IRC (email@example.com, Remote host closed the connection)|
|20:47||ricotz has left IRC (ricotz!~ricotz@ubuntu/member/ricotz, Quit: Leaving)|
|22:17||wyre is now away: Auto away at Wed Sep 6 22:13:02 2023 UTC|