If you're in trouble and cannot find an answer to a question which goes beyond Stack Overflow...
If you have a not-so-usual solution for your problems but need to justify it to your boss...
If you like to think on your own rather than blindly follow "common wisdom" and "profound truth"...
...then 'No Bugs' Hare on Soft.ware might be the right place for you.
Your mileage may vary. Batteries not included

System Architecture (and it’s subfield Software Architecture) is a discipline which is surprisingly poorly covered. In a sense, it is still more an art than a science, and usually requires somebody intimately familiar with practical systems, to tell what’s to do and what’s to avoid when building a system.

IT Hares have lots of experience in both Software Architecture and more general System Architecture, and are trying to share their knowledge (and more importantly, their feelings) about them.

All Musings on System Architecture, page 4:

Distributed Databases 101: CAP, BASE, and Replication

Quote: “The worst case of inconsistency happens when row X in database A is modified (by a Client connected to datacenter A), and the same row X in database B is independently modified too (by a Client connected to datacenter B).”
Another Quote: “One way to avoid reports affecting operational DB, is via creating a read-only replica of our main operational DB – and running our reports off that replica.”
[]

Databases 101: ACID, MVCC vs Locks, Transaction Isolation Levels, and Concurrency

Quote: “for larger-scale MOGs, functionality-wise it is common to have 4 different types of databases: transactional processing (OLTP) DBs, real-time reporting DBs, archive DBs, and analytical DBs (OLAP)”
Another Quote: “both MVCC-based and Lock-based DBMS issue locks (and therefore, both can cause all kinds of trouble such as deadlocks etc.); however, the difference lies with the number of locks issued by them.”
[]

Deterministic Components for Distributed Systems

Quote: “Then you can recover from any single server failure in a perfectly transparent manner”
Another Quote: “after the program fails in production, we can get the input log and run it in the comfort of a developer’s machine, under a debugger, as many times as we want, and get exactly the same variables at exactly the same points as happened in production”
[]

Implementing Queues for Event-Driven Programs

Quote: “full queues SHOULD NOT happen during normal operation”
Another Quote: “With queues-implemented-over-mutexes like the ones we’ve written above, the most annoying thing performance-wise is that there is a chance that the OS’s scheduler can force the preemptive context switch right when the thread-being-preempted-is-owning-our-mutex.”
[]

Network Programming: Socket Peculiarities, Threads, and Testing

Quote: “I am not saying that this architecture is the only viable one, but it does work for TCP for sure (and performs reasonably well too)”
Another Quote: “The whole task of optimizing performance beyond, say, 20-50K packets/second per box tends to be Quite Elaborated, and involves quite a few things which are platform- and hardware-dependent.”
[]

Avoiding Ugly Afterthoughts. Part a. From Writing for Cross-Platform, to Writing for Debugging and Production Post-Mortem, with Error Handling in between

Quote: “It is strongly recommended to have your build server to compile your game for at least two sufficiently-different platforms from the very beginning”
Another Quote: “If allocation of 50 bytes causes an “out of memory” error, we’re probably already long dead because of unacceptable swapping. And even if we disabled swap file – chances that we will recover from this condition, are infinitesimally small”
[]