The ultimate endpoint for this sort of optimization is to do away with processes and just have threads or coroutines. Or, if you want to go super-optimized, a super-loop.
Hey, Arduino!
Good to C you
confused C++ noises /s
Isn’t that the company behind TikTok?
Yes it is
It was kinda rhetorical; I really don’t want them to have any say or proposals in our kernel.
I’m pretty technical, but this is slightly beyond my grasp. I understand that this is about streamlining efficiency, but it’s written for someone who understands the OS at levels I don’t. Can someone ELI5?
The linked patch actually explains it really well.
If you want two processes to communicate (IPC), normally you would set up some shared memory, so that some pages in each process’s virtual memory map to the same physical pages; when process A writes to that memory it is visible to process B and vice versa. There are other ways to do IPC, but this is the fastest since it doesn’t involve any syscalls (crossing into the kernel, which is slow).
However it still requires each process to copy its private data into and out of the shared memory. Also it still requires a context switch from process A to process B.
A common use of IPC is for RPC (Remote Procedure Calls). Basically you want to run a function in the other process. This solution does that but differently and faster.
-
First, both processes share the same virtual memory. This is normally a recipe for disaster since it completely destroys the normal process memory sandboxing (a memory error in one process can now crash the other one!). However, they use a new Intel hardware feature, MPK (Memory Protection Keys), which assigns each page to one of 16 groups (“keys”) and lets you control read/write access to each group independently.
-
Next, when process A wants to call a function in process B, the RPAL code does a lightweight context switch in user space. I guess this just means changing the MPK key, plus the normal saving of registers for the function call; I’m not sure what else you’d need to change. It’s a little unclear how you pass heap data from process A to process B. I guess both processes just have read access to the entire address space? That doesn’t really change the security model on Unix, since any process can already debug any other process owned by the same user with no extra permissions.
It sounds very similar to running two processes as if they were threads in the same process. Why not just use threads? Probably a better option if you can, but I guess if you are e.g. running processes written in different languages that might be tricky?
Ok, I feel like I kind of understand. I saved this so I can reread it later to make sure. Thanks!
-
Not a kernel expert but this is how it reads to me.
Instead of connecting via a socket through the kernel, this would allow processes to execute functions of other processes already in memory.
This would normally be prevented by processor memory protections, but a new processor feature has functionality to allow this now.
Aha, thank you!
This sounds dangerous.
Their use case is to run their own application(s) on their own servers in their own datacenter, so they’re probably ok with that tradeoff. But yeah, I can imagine this opening up a whole bunch of security issues if this starts getting used widely.
The gist is that a system call is introduced to go into the PCB and change the effective UID of a process. Security is ensured by the processor’s MPK (Memory Protection Keys), a CPU-provided guard, so that a {process, library} pair has only a restricted set of effective UIDs it can switch to. This operation is supposed to take 30 to 50 clock cycles, so entry + exit is supposed to be done in about 100 cycles. This is considered a low-overhead context switch compared to the traditional context switch on Linux for slower IPCs. They don’t do a comparison against io_uring, or against a simple multi-threaded process.
You sound like you’re living in the weeds, friend.
What’s MPK? And by UID I assume you’re not talking about the system level user ID but some kind of processor-level process ID?
So are sockets.
The lengths people will go to to work around Python’s GIL.
So, moving in the opposite direction of microkernels. Great. Just what we need: less reliability and more programs crashing each other.
But Rust will save us from all that, right?