5/25/2010

05-25-10 - Thread Insurance

I just multi-threaded my video test app recently, and it was reasonably easy, but I had a few nagging bugs because the threads were touching shared memory without protection in hidden ways, deep inside functions. Okay, so I found them and fixed them, but I'm left with a problem: any time I touch one of those deep functions, I could screw up the threading without realizing it. And I might not get any indication of what I did for weeks if it's a rare race.

What I would like is a way to make this more robust. I have very strong threading primitives; I want a way to make sure that I use them! In particular, I want to be able to mark certain structs as only touchable when a critsec is locked or whatever.

I think that a lot of this could be done with Win32 memory page protections. So far as I know, there's no way to associate protections per-thread (e.g. to make a page read/write for thread A but no-access for thread B). If I could do that it would be super sweet.

One idea is to make the page no-access and then install my own exception handler that checks what thread it is, but that might be too much overhead (and I'm not sure whether it would fail for other reasons).

The main usage is not for protected crit-sec'ed structs; that is really the easiest case to maintain because it's very obvious right there in the code that you need to take the critsec to touch the variables. The hard case to maintain is the ad hoc "I know this is safe to touch without protection". In particular I have a lot of code that runs like this :


Phase 1 : I know no threads are touching shared data item A
main thread does lots of writing in A

Phase 2 : fire up threads.  They only read from A and do so without protection.  They each write to unique areas B,C,D.

Phase 3 : spin down threads.  Now main thread can write A and read B,C,D.

So what I would really like to do is :

Phase 1 : I know no threads are touching shared data item A
main thread does lots of writing in A

-- set A memory to be read-only !
-- set B,C,D memory to be read/write only for their own thread

Phase 2 : fire up threads.  They only read from A and do so without protection.  They each write to unique areas B,C,D.

-- make A,B,C,D read/write only for main thread !

Phase 3 : spin down threads.  Now main thread can write A and read B,C,D.

The thing that this saves me from is when I'm tinkering in DoComplicatedStuff() which is some function called deep inside Phase 2 somewhere and I change it to no longer follow the memory access rule that it is supposed to be following. This is just my hate for having rules for code correctness that are not enforced by the compiler or at least by run-time asserts.
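A minimal sketch of the "set A memory to be read-only" step, assuming A lives in its own page-aligned allocation. `PhaseBuffer` and its methods are hypothetical names made up for illustration. It uses `VirtualAlloc`/`VirtualProtect` on Win32 and falls back to `mmap`/`mprotect` elsewhere. The protection is process-wide, so this only covers the read-only A case; it does not give the per-thread protection wanted for B,C,D.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#ifdef _WIN32
  #include <windows.h>
#else
  #include <sys/mman.h>
  #include <unistd.h>
#endif

// Hypothetical helper (not from the post): a page-aligned block whose
// protection can be flipped at phase boundaries. The protection applies to
// every thread in the process, not to one thread.
class PhaseBuffer {
public:
    explicit PhaseBuffer(std::size_t bytes) : size_(RoundUpToPage(bytes)) {
#ifdef _WIN32
        ptr_ = VirtualAlloc(nullptr, size_, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
#else
        void* p = mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        ptr_ = (p == MAP_FAILED) ? nullptr : p;
#endif
    }
    ~PhaseBuffer() {
        if (!ptr_) return;
#ifdef _WIN32
        VirtualFree(ptr_, 0, MEM_RELEASE);
#else
        munmap(ptr_, size_);
#endif
    }
    PhaseBuffer(const PhaseBuffer&) = delete;
    PhaseBuffer& operator=(const PhaseBuffer&) = delete;

    // Call before Phase 2: any write into the block now faults.
    bool SetReadOnly()  { return Protect(false); }
    // Call before Phase 3 (and Phase 1): writes are allowed again.
    bool SetReadWrite() { return Protect(true); }

    void* data() const { return ptr_; }
    std::size_t size() const { return size_; }

private:
    static std::size_t PageSize() {
#ifdef _WIN32
        SYSTEM_INFO si;
        GetSystemInfo(&si);
        return si.dwPageSize;
#else
        return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
#endif
    }
    // Protection works on whole pages, so round the size up.
    static std::size_t RoundUpToPage(std::size_t bytes) {
        const std::size_t page = PageSize();
        if (bytes == 0) bytes = 1;
        return (bytes + page - 1) / page * page;
    }
    bool Protect(bool writable) {
        if (!ptr_) return false;
#ifdef _WIN32
        DWORD old = 0;
        return VirtualProtect(ptr_, size_,
                              writable ? PAGE_READWRITE : PAGE_READONLY, &old) != 0;
#else
        return mprotect(ptr_, size_,
                        writable ? (PROT_READ | PROT_WRITE) : PROT_READ) == 0;
#endif
    }

    void* ptr_ = nullptr;
    std::size_t size_ = 0;
};
```

With A stored in such a buffer and `SetReadOnly()` called before Phase 2, a stray write from something like DoComplicatedStuff() should fault right at the offending instruction instead of silently racing.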

5 comments:

Kevin Gadd said...

The exception handler approach would work, but the exceptions would end up being really expensive. Maybe you could use that approach to run automated tests of your app to verify the thread-safety, but leave the exceptions off in actual builds when you use them to crunch on lots of data?

Tom Forsyth said...

> This is just my hate for having rules for code correctness that are not enforced by the compiler or at least by run-time asserts.

Isn't that what "const" is for?

Thank you, folks, I'm here all night.

Brian said...

Catching page faults, I believe, is pretty expensive if you actually take the page faults often. You can typically set up stuff like this on a per-process basis in Linux. You would map shared pages to let each process see the others. I'd imagine that Windows would also support this type of thing.

We're actually playing around with language support for stuff like this and having the type checker in the compiler complain about accessing protected data. It appears to be pretty useful. If we get some time, we might port it to annotations in Java so people can actually use it.

cbloom said...

"Catching page faults I believe is pretty expensive if you actually take the page faults often."

Yeah, it would be a debug-runs only thing, but I wonder if it would be too slow even for debug testing runs.

"You can typically set up stuff like this on a per process basis in Linux. You would map shared pages to let each process see the others. I'd imagine that windows would also support this type of thing."

You can do it per-process in Windows, but I think that's just too much friction. If you use processes rather than threads it is much easier to enforce cleanness because there is no implicitly shared memory; all sharing must be done explicitly by mapping process memory to each other.

"We're actually playing around with language support for stuff like this and having the type checker in the compiler complain about accessing protected data."

Yeah there is a lot of good work in this direction (see Bartosz etc and previous posts on my blog), but it really only applies well to the struct-critsec kind of paradigm, not to the kind of ad-hoc threading I'm talking about here.

Thatcher Ulrich said...

You could implement this with a handle or smart ptr idiom, with methods to grab a mutable or const ptr to the actual data, and those methods assert that the current thread has the right permissions. You still have a code-correctness criterion which is "don't pass bare pointers between threads."
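A rough sketch of that handle idiom, assuming phase changes happen only while the other threads are not running (before they are started or after they are joined), so the bookkeeping itself needs no lock. `ThreadChecked`, `ShareReadOnly`, `GiveTo`, and the other names are hypothetical, invented for this example.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical wrapper (not from the post or the comment): the data is only
// reachable through Read()/Write(), which assert that the calling thread is
// allowed to touch it in the current phase.
template <typename T>
class ThreadChecked {
public:
    // The constructing thread starts out as the exclusive owner.
    explicit ThreadChecked(T value = T())
        : value_(std::move(value)), owner_(std::this_thread::get_id()) {}

    // Phase transitions. Assumption: called only while no other thread is
    // using the handle, so plain (non-atomic) fields are enough.
    void ShareReadOnly() { shared_ = true; }           // everyone may read, nobody writes
    void GiveTo(std::thread::id owner) {               // one thread may read and write
        shared_ = false;
        owner_ = owner;
    }

    bool CanRead() const  { return shared_ || IsOwner(); }
    bool CanWrite() const { return !shared_ && IsOwner(); }

    // Accessors assert the rule; the asserts compile out under NDEBUG.
    const T& Read() const { assert(CanRead());  return value_; }
    T&       Write()      { assert(CanWrite()); return value_; }

private:
    bool IsOwner() const { return owner_ == std::this_thread::get_id(); }

    T value_;
    std::thread::id owner_;
    bool shared_ = false;
};
```

For the three-phase pattern, the main thread would call `Write()` on A in Phase 1, `ShareReadOnly()` before starting the workers, and `GiveTo(std::this_thread::get_id())` after joining them. As the comment says, this only helps as long as nobody stashes the bare reference and hands it to another thread.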
