On 17th May 2010, we went to a free talk by Simon Brown about architecture. It was very informative. Having worked in teams of self-organising developers, I had never seen the 'architect' role clearly defined, and his talk helped me identify what the role involves.
One thing I learned was the definition of "non-functional requirements". As developers we would often discuss qualities such as portability, scalability and high availability, but without quantified requirements or direction from the business.
His website is at codingthearchitecture.com and contains a lot of useful information.
Processing should continue if a node or instance has died. This includes computers, web containers and databases.
Processing should continue if a non-critical external system is down. For example, the insurance system is not critical to a booking system, so the user should be able to continue booking if insurance is unavailable.
The load balancer should monitor nodes and check that they are healthy.
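To illustrate the point about a non-critical external system, here is a minimal sketch of the booking and insurance example. The names `quote_booking` and `fetch_insurance_quote` are hypothetical, and the sketch assumes the insurance client signals an outage by raising an `OSError` subclass such as `ConnectionError` or `TimeoutError`.

```python
def quote_booking(base_price, fetch_insurance_quote):
    """Build a booking quote; insurance is optional and must not block the booking.

    fetch_insurance_quote is a hypothetical callable that contacts the
    external insurance system and returns a price.
    """
    booking = {"price": base_price, "insurance": None, "insurance_available": True}
    try:
        booking["insurance"] = fetch_insurance_quote()
    except OSError:
        # ConnectionError and TimeoutError are both OSError subclasses.
        # The insurance system is down, so carry on without it.
        booking["insurance_available"] = False
    return booking
```

The caller can use the `insurance_available` flag to hide the insurance option in the user interface instead of failing the whole booking.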
Check if writing of logs is asynchronous and non-blocking.
Logs should be automatically archived / rotated / purged. Check how long it will take before disk space runs out if the archiving job fails.
The application should still function if the disk is full.
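The logging checks above could look something like this in Python, using the standard library's `QueueHandler`, `QueueListener` and `RotatingFileHandler`. This is only a sketch: the function names, the size limits and the logger name are assumptions chosen for illustration, and the disk-space estimate is simple division.

```python
import logging
import logging.handlers
import queue


def build_async_logger(name, path, max_bytes=1_000_000, backups=5):
    """Return (logger, listener). The caller only enqueues records;
    a background thread owned by the listener does the file writes."""
    # Rotation caps log disk use at roughly max_bytes * (backups + 1).
    file_handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )

    log_queue = queue.Queue(-1)  # unbounded, so logging calls never block
    listener = logging.handlers.QueueListener(log_queue, file_handler)
    listener.start()

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.propagate = False
    return logger, listener


def days_until_disk_full(free_bytes, log_bytes_per_day):
    """Estimate how long the disk lasts if archiving or purging stops."""
    if log_bytes_per_day <= 0:
        raise ValueError("log_bytes_per_day must be positive")
    return free_bytes / log_bytes_per_day
```

For example, with 50 GiB free and 2 GiB of logs per day, `days_until_disk_full(50 * 1024**3, 2 * 1024**3)` gives 25 days. Call `listener.stop()` at shutdown so queued records are flushed. Note that an unbounded queue trades disk pressure for memory pressure if the writer falls behind.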
Check if the application can limit requests to protect itself, and what happens if it is flooded with requests. It could 'drop' requests or return busy responses.
Check if there is an alert system in place in case throttling is required.
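One way an application can protect itself is to cap the number of requests it works on at once and return a busy response for the rest. The sketch below is an assumption about how that might look, not a prescribed design: `RequestLimiter` is a hypothetical class, the 503 status code stands in for whatever 'busy' response the application uses, and the `rejected` counter is the kind of figure an alert system could watch.

```python
import threading


class RequestLimiter:
    """Reject work with a busy response once max_in_flight requests are active."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.rejected = 0  # an alerting job could poll this counter

    def handle(self, work):
        # Non-blocking acquire: when no slot is free, fail fast instead of queueing.
        if not self._slots.acquire(blocking=False):
            with self._lock:
                self.rejected += 1
            return 503, "busy"
        try:
            return 200, work()
        finally:
            self._slots.release()
```

Returning a busy response tells the caller to retry later, whereas silently dropping the request leaves it waiting for its own timeout.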
A timeout should be set between the application and the database.
A timeout should be set for incoming transactions into the application.
A timeout should be set for any interaction between tiers.
A timeout should be set for interactions between the application and external systems.
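As a small illustration of the timeout rules, the sketch below puts a time limit on a call to another system over a plain TCP socket. The function name `fetch_with_timeout`, the 2 second default and the 4096 byte read size are assumptions made for the example. Real database drivers and HTTP clients usually expose their own timeout settings, which should be used instead.

```python
import socket


def fetch_with_timeout(host, port, request, timeout_seconds=2.0):
    """Send request and return the first chunk of the reply, or None on timeout.

    The timeout applies to the connection attempt and to each later
    send or receive on the socket.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds) as conn:
            conn.sendall(request)
            return conn.recv(4096)
    except socket.timeout:
        # The other side did not answer in time; the caller decides what to do.
        return None
```

Without the timeout, a remote system that accepts the connection but never replies would hold the calling thread indefinitely. Other failures, such as a refused connection, still raise their usual exceptions.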