DefederateLemmyMl

  • Gen𝕏
  • Engineer ⚙
  • Techie 💻
  • Linux user 🐧
  • Ukraine supporter 🇺🇦
  • Pro science 💉
  • Dutch speaker
  • 1 Post
  • 164 Comments
Joined 1 year ago
cake
Cake day: August 8th, 2023

help-circle



  • I’ve always thought of dependencies as equivalent to dlls. Is that right?

    Usually, but not always. Most of the times a dependency is a software library contained within a shared object file (a .so file), and that is indeed analogous to a dll.

    A dependency can be other things as well though, like a specific program that a software package depends on being present. For example, the handbrake program to reencode videos will call ffmpeg under the hood. So naturally ffmpeg is a dependency.

    Why is Linux so fiddley with dependencies?

    I don’t think it is? I mean, software depending on external shared libraries isn’t exactly a Linux only concept, and if anything I think most Linux distros’ ways of handling dependencies are superior.

    The main difference with Windows is that third party software tends to bring their own dlls for anything that’s not a standard part of Windows, which is wasteful because of duplication, and less secure because the included libraries may be out of date and contain known security holes.

    On Linux, distributions usually have every library under the sun in their repositories, managed by the package manager and kept up to date by the maintainers. As long as you stick to software included with your distro, or software packages for your specific distro, dependencies should be resolved automatically by the package manager. For example: if you download the Google Chrome .deb file, and install it with apt-get, it will pull in all the dependencies it needs to run.

    If you go outside of that, for example compiling software yourself, or downloading non-distro specific binaries, you will have to take care of dependencies yourself. Perhaps that’s what you mean with the fiddly bit.








  • You are misunderstanding.

    The file cache is never written out to the swapfile, because files are already on disk, like you say. The file cache is kept in memory and the kernel may decide it’s more advantageous to swap out unused anonymous memory pages to disk than flushing a file from the cache. You can use the vm.swappiness parameter to finetune this behavior to your liking btw. Lower values favor keeping more anonymous memory pages in memory, higher values favor file backed pages.

    To give an extreme example of where this is useful: I have a use case where I process a number of large video files (each 2GiB-10Gib in size). The job I’m doing involves doing several passes over the same file sequentially. You can bet your ass that caching them in memory speeds things up dramatically: the first pass, where it has to read the file on disk is at 200x speed (relative to the video’s duration), the second pass at 15000x speed.

    Even in less extreme circumstances it also helps by keeping frequently accessed files in your home directory in memory, for example your browser profile. Your browser and desktop environment would be much more sluggish if it had to reach out to disk every time for every file it touched.

    And you are free to disagree, but it’s not my opinion but the opinion of kernel developers of just about every operating system built the past 4 decades. So I’d say: take up the argument with them and see how far you get.


  • If he’s processing LLMs or really any non-trivial DB (read: any business DB)

    Actually… as a former DBA on large databases, you typically want to minimize swapping on a dedicated database system. Most database engines do a much better job at keeping useful data in memory than the Linux kernel’s file caching, which is agnostic about what your files contain. There are some exceptions, like elasticsearch which almost entirely relies on the Linux filesystem cache for buffering I/O.

    Anyway, database engines have query optimizers to determine the optimal path to resolve a query, but they rely on it that the buffers that they consider to be “in memory” are actually residing in physical memory, and not sitting in a swapfile somewhere.

    So typically, on a large database system the vendor recommendation will be to set vm.swappiness=0 to minimize memory pressure from filesystem caching, and to set the database buffers as high as the amount of memory you have in your system minus a small amount for the operating system.


  • I have 20GB in my current setup and it was never full. If anything gets swapped in this situation it means it needlessly slows me down.

    Not necessarily. Your memory also contains file backed pages (i.e. “file system cache”). These pages are typically not counted when determining “memory usage”, because they can always be discarded.

    It is often advantageous to keep frequently use files in cache in favor of unfrequently used memory pages.


  • I’ve never understood why GNU/Linux actually needs swap

    It doesn’t. It’s just good to have in most circumstances.

    Also, sidenote: “GNU” doesn’t apply here. Swapping is purely kernel business, no GNU involvement here.

    Okay, I created a 4G partition for it, having 32G of RAM. I never used all that RAM, but even so, stuff regularly ends up in swap. Why does the OS waste write cycles on my SSD if it doesn’t have to?

    Physical memory does not just contain program data, it also contains the filesystem cache, which is also important for performance and responsiveness. The idea is that some of the least recently used memory pages are sometimes evicted to swap in favor of more file caching.

    You can tweak this behavior by setting the vm.swappiness kernel parameter with sysctl. Basically higher values mean higher preference for keeping file backed pages in memory, lower values mean higher preference for keeping regular memory pages in memory.

    By default vm.swappiness = 60. If you have an abundance of memory, like a desktop system with 32G, it can be advantageous to lower the value of this parameter. If you set it to something low like 10 or 1, you will rarely see any of this paradoxical swap usage, but the system will still swap if absolutely necessary. I remember reading somewhere that it’s not a good idea to set it to 0, but I don’t remember the reason for that.

    Alternatively, there is no rule that says you can’t disable swap entirely. I’ve run a 32G desktop system without any swap for years. The downside is that if your 32G does run out, there will be no warning signs and the OOM killer will unceremoniously kill whatever is using the most memory.

    tl;dr just do this:

    sysctl vm.swappiness=10
    echo "vm.swappiness=10" > /etc/sysctl.d/99-swappiness.conf
    

  • I run a lot of VMs; I typically run 2 at the same time in addition to running other programs in the background, my usecase is more eccentric than most users in the Linux space which is already pretty niche

    If what you’re doing involves using close to all of your system memory, it does make sense to add swap. So your use case is a good example actually.

    I also have an old Arch PC that I use to run various VMs on (currently 6 VMs in use). It does have a swapfile, but the most swap I’ve ever seen in use is about 1GB.

    I’m using BTRFS with LUKS-based Full Disk Encryption, the last time I used swapfiles with BTRFS with FDE it was in 2019 and it was painful to say the least, I rememeber spending several weeks scouring Stack and the Arch forums in order to get it to work properly.

    Weird. Sounds like you may have painted yourself a bit into a corner by using BTRFS then. I use trusty old ext4 on top of LUKS FDE, no issues with swapfiles whatsoever.

    That brings me to another downside of swap partitions: encryption. You can leak sensitive data through your swap partition, so it should be encrypted. If you use a plain partition, without LUKS in between, information in your swap is exposed. So you need to do more configuration to setup LUKS on your swap partition.

    If you use a swapfile on an already encrypted filesystem though, you don’t have to worry about it.

    when would I even want to resize swap for a single system

    Maybe your requirements change (e.g. “I want to be able to hibernate”), maybe your memory configuration changes, maybe you’ve underestimated or overestimated how much swap you need.

    Case in point: the Arch PC I mentioned above only uses upto 1GB of swap, but it has a 16GB swapfile. This discussion has brought to my attention that perhaps I should downsize the swapfile a bit and free up disk space.

    you don’t really want to depend on swap if can

    That is my position too. It’s always better to have a properly sized system, or limit what you push on an existing system. High swap usage rarely results in a good experience.


  • 0 swap: which was pretty awful with constant unexpected system freezes/crashes

    I’ve run Arch without swap for many years without issues. The key of course is that you need enough RAM for what you are trying to do with your computer.

    There’s no reason why a 32GB RAM + 0GB swap system should have more problems than a 16GB RAM + 16GB swap system with the same workload. If anything, the former is going to run much better.

    swap file: finicky but doable

    What is finicky about a swap file?

    It’s just this:

    mkswap -U clear --size 4G --file /swapfile
    swapon /swapfile
    

    Done

    If anything it’s way easier to create a file in your filesystem than having to (re-)partition your drive to have a swap partition. Much more flexible too if you want to change your swap configuration in the future.


  • We are talking about addresses, not counters. An inherently hierarchical one at that. If you don’t use the bits you are actually wasting them.

    Bullshit.

    I have a 64-bit computer, it can address up to 18.4 exabytes, but my computer only has 32GB, so I will never use the vast majority that address space. Am I “wasting” it?

    All the 128 bits are used in IPv6. ;)

    Yes they are all “used” but you don’t need them. We are not using 2^128 ip addresses in the world. In your own terminology: you are using 4 registers for a 2 register problem. That is much more wasteful in terms of hardware than using 40 bits to represent an ip address and wasting 24 bits.


  • you are wasting 24 bits of a 64-bit register

    You’re not “wasting” them if you just don’t need the extra bits, Are you wasting a 32-bit integer if your program only ever counts up to 1000000?

    Even so when you do start to need them, you can gradually make the other bits available in the form of more octets. Like you can just define it as a.b.c.d.e = 0.a.b.c.d.e = 0.0.a.b.c.d.e = 0.0.0.a.b.c.d.e

    Recall that IPv6 came out just a year before the Nintendo 64

    If you’re worried about wasting registers it makes even less sense to switch from a 32-bit addressing space to a 128-bit one in one go.

    Anyway, your explanation is a perfect example of “second system effect” at work. You get all caught up in the mistakes of the first system, in casu the lack of addressing bits, and then you go all out to correct those mistakes for your second system, giving it all the bits humanity could ever need before the heat death of the universe, while ignoring the real world implications of your choices. And now you are surprised that nobody wants to use your 128-bit abomination.


  • Hmm, I can’t say that I’ve ever noticed this. I have a 3950x 16-core CPU and I often do video re-encoding with ffmpeg on all cores, and occasionally compile software on all cores too. I don’t notice it in the GUI’s responsiveness at all.

    Are you absolutely sure it’s not I/O related? A compile is usually doing a lot of random IO as well. What kind of drive are you running this on? Is it the same drive as your home directory is on?

    Way back when I still had a much weaker 4-core CPU I had issues with window and mouse lagging when running certain heavy jobs as well, and it turned out that using ionice helped me a lot more than using nice.

    I also remember that fairly recently there was a KDE/plasma stutter bug due to it reading from ~/.cache constantly. Brodie Robertson talked about it: https://www.youtube.com/watch?v=sCoioLCT5_o