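AOSA, short for The Architecture of Open Source Applications, is a good book, and fittingly it was itself written the way open source communities collaborate. Two volumes have been published so far, and the newest book in the series is POSA, The Performance of Open Source Applications, which focuses on the performance of open source software.
I recently found time to read the ZeroMQ chapter in AOSA. I had picked up a basic understanding of ZeroMQ earlier because of work, so this was a chance to read a chapter written by ZeroMQ's founder himself. I got a lot out of it and am recording my notes here for reference in future projects.
Library design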
The lesson here is pretty obvious: Don’t use global state in libraries. If you do, the library is likely to break when it happens to be instantiated twice in the same process.
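After analysis and comparison, ZeroMQ was designed as a library rather than a standalone message server, and the conclusion above came out of designing that library.
In other words, a library should avoid global state and keep its state in a context object instead. This matters most when the library is pulled in by other libraries, so that several instances of it coexist in the same program: per-context state keeps those instances from conflicting with each other.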
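To make the context idea concrete, here is a minimal sketch in Python. The `Context` class and its `socket` method are hypothetical names chosen for illustration, not ZeroMQ's API; the point is only that all state lives on an object the caller owns, so two users of the library in one process cannot interfere with each other.

```python
class Context:
    """Holds the state a library might otherwise keep in module-level globals."""

    def __init__(self, name):
        self.name = name
        self.sockets = []  # per-context registry, deliberately not a global

    def socket(self, label):
        # Register the new "socket" with this context only.
        sock = (self.name, label)
        self.sockets.append(sock)
        return sock


# Two independent users of the same library inside one process.
ctx_a = Context("plugin-a")
ctx_b = Context("plugin-b")
ctx_a.socket("pub")
ctx_a.socket("sub")
ctx_b.socket("req")

print(len(ctx_a.sockets), len(ctx_b.sockets))
```

Had `sockets` been a module-level list, both plugins would have shared one registry, which is the kind of breakage the quoted lesson warns about.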
Understand the real problem
There are many more pitfalls in benchmarking the messaging systems that we won’t go further into. The stress should rather be placed on the lesson learned: Make sure you understand the problem you are solving. Even a problem as simple as “make it fast” can take lot of work to understand properly. What’s more, if you don’t understand the problem, you are likely to build implicit assumptions and popular myths into your code, making the solution either flawed or at least much more complex or much less useful than it could possibly be.
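The original chapter includes a classic diagram of messages exchanged between two endpoints to explain a common fallacy in measuring throughput and latency: users may care more about the throughput and latency seen from a single point than about global figures. It is a reminder to work out exactly what problem we are facing so that we can find a way to solve it.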
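A small calculation shows how the numbers diverge depending on where you measure. The timestamps below are made up for illustration and are not the figures from the chapter's diagram, and `interval_rate` is just one plausible way to define a per-endpoint rate.

```python
def interval_rate(timestamps):
    """Messages per second at one endpoint, from the spacing between its events."""
    span = timestamps[-1] - timestamps[0]
    return (len(timestamps) - 1) / span


# Hypothetical timestamps (seconds) for three messages.
send_times = [0.0, 1.0, 2.0]  # when the sender handed each message over
recv_times = [3.0, 5.0, 7.0]  # when the receiver got each message

latencies = [r - s for s, r in zip(send_times, recv_times)]
sender_rate = interval_rate(send_times)
receiver_rate = interval_rate(recv_times)
# The tempting "global" figure: message count over total elapsed time.
naive_rate = len(send_times) / (recv_times[-1] - send_times[0])

print(latencies, sender_rate, receiver_rate, naive_rate)
```

With these inputs the sender sees one rate, the receiver sees half of it, and the naive global figure matches neither, while the per-message latency keeps growing. Which of these numbers is "the" performance depends on which problem you are actually solving.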
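Writing these notes keeps reminding me of lessons from my own years at work. A line I heard recently puts it well: network equipment vendors sometimes invent a solution first and then go looking for a problem. How can that produce products that truly meet customer needs?
Memory allocation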
Lesson learned: optimize where it makes difference. Optimizing pieces of code that are not on the critical path is wasted effort.
Optimize on the critical path; optimizing blindly off the critical path is a waste of time.
When thinking about performance, don’t assume there’s a single best solution. It may happen that there are several subclasses of the problem (e.g., small messages vs. large messages), each having its own optimal algorithm.
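When thinking about performance, do not assume a single best solution exists. A problem may well split into several subclasses, each with its own optimal algorithm. Small versus large messages is one example: ZeroMQ encodes a small message directly in the message handle, while a large message is referenced through a pointer to avoid copying memory.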
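The small/large split can be sketched roughly as follows. This is an illustration in Python, not ZeroMQ's implementation: the `Message` class is hypothetical, and `SMALL_LIMIT` is an arbitrary threshold rather than the value ZeroMQ uses.

```python
SMALL_LIMIT = 32  # hypothetical threshold in bytes, not ZeroMQ's actual value


class Message:
    def __init__(self, payload):
        self.size = len(payload)
        if self.size <= SMALL_LIMIT:
            # Small message: take a private copy stored in the message itself.
            self.inline = bytes(payload)
            self.shared = None
        else:
            # Large message: keep a zero-copy view of the caller's buffer.
            self.inline = None
            self.shared = memoryview(payload)

    def data(self):
        """Return the payload as a bytes-like object without copying it again."""
        return self.inline if self.inline is not None else self.shared


small_src = bytearray(b"ping")
small = Message(small_src)
small_src[0:4] = b"pong"  # the inline copy is unaffected by this change

large_src = bytearray(b"x" * 1000)
large = Message(large_src)
large_src[0] = ord("y")  # the shared view sees this change

print(small.data(), bytes(large.data()[:1]))
```

Copying a few bytes is cheaper than managing a shared buffer, while for a large payload the copy dominates, so each size class gets the strategy that suits it.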
Batching
Lesson learned: To get optimal throughput combined with optimal response time in an asynchronous system, turn off all the batching algorithms on the low layers of the stack and batch on the topmost level. Batch only when new data are arriving faster than they can be processed.
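In an asynchronous system, it is best to turn off the batching algorithms in the lower layers and let the top layer do the batching. Batching should also be enabled on demand: when there is enough processing capacity to keep up, skip it and avoid the cost it adds.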
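One way to batch only when data arrives faster than it can be processed is to never wait for a batch to fill up: block for a single item, then take whatever has already piled up. The sketch below uses Python's standard `queue` module; `next_batch` and `max_batch` are names I made up for illustration.

```python
import queue


def next_batch(q, max_batch=64):
    """Block for one item, then take whatever else is already queued."""
    batch = [q.get()]  # wait until there is at least some work
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())  # a backlog exists, so batch it
        except queue.Empty:
            break  # caught up with the producer: do not wait for more
    return batch


q = queue.Queue()
q.put("a")
first = next_batch(q)  # consumer is keeping up: a batch of one

for item in ("b", "c", "d"):
    q.put(item)
second = next_batch(q)  # a backlog built up: it is drained in one batch

print(first, second)
```

When the consumer keeps up, every message is handled alone with no added delay; batches appear only under load, which is when they pay off.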
Concurrency
Lesson learned: When striving for extreme performance and scalability, consider the actor model; it’s almost the only game in town in such cases. However, if you are not using a specialised system like Erlang or ØMQ itself, you’ll have to write and debug a lot of infrastructure by hand. Additionally, think, from the very beginning, about the procedure to shut down the system. It’s going to be the most complex part of the codebase and if you have no clear idea how to implement it, you should probably reconsider using the actor model in the first place.
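When pursuing high performance and scalability, consider the actor model. ZeroMQ uses multiple threads that communicate with each other through events (if I remember correctly, this is built on something like libevent), so the threads can scale horizontally across CPU cores and achieve very high concurrent performance.
ZeroMQ's creator also makes a point of reminding us to think about shutdown from the very start of the design, since it is usually the most complex part of a system. In our own experience, many of our service processes cannot shut down cleanly, or do not support shutdown at all.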
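A toy actor with an explicit shutdown path might look like the sketch below. It uses only Python's standard `threading` and `queue` modules; the `Actor` class and the `STOP` sentinel are my own illustrative names, and the real thing in ZeroMQ is far more involved.

```python
import queue
import threading

STOP = object()  # sentinel message that asks an actor to shut down


class Actor(threading.Thread):
    def __init__(self):
        super().__init__()
        self.mailbox = queue.Queue()
        self.handled = []  # modified only by the actor's own thread

    def send(self, message):
        # The only way other threads interact with the actor.
        self.mailbox.put(message)

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is STOP:
                break  # everything queued before STOP has been handled
            self.handled.append(message.upper())

    def shutdown(self):
        # Queue the stop request behind pending work, then wait for the thread.
        self.send(STOP)
        self.join()


actor = Actor()
actor.start()
actor.send("hello")
actor.send("world")
actor.shutdown()

print(actor.handled, actor.is_alive())
```

Even in this toy version, shutdown needs a decision: here, messages queued before `STOP` are still processed. With many actors that hold references to each other, ordering those stop messages is where the complexity the chapter warns about comes from.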
Lock-free algorithms
Lesson learned: Lock-free algorithms are hard to invent, troublesome to implement and almost impossible to debug. If at all possible, use an existing proven algorithm rather than inventing your own. When extreme performance is required, don’t rely solely on lock-free algorithms. While they are fast, the performance can be significantly improved by doing smart batching on top of them.
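Use a known lock-free algorithm wherever possible rather than reinventing the wheel. Earlier, when I was building a software forwarding engine at my company, I based it on a lock-free ring queue algorithm I found online; in practice it was far easier to work with than anything I would have come up with myself, and it performed well. In addition, layering some smart batching on top of a lock-free algorithm yields even larger performance gains.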
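For reference, the index layout of a single-producer/single-consumer ring queue can be sketched as below. This is not a lock-free implementation: Python offers no portable atomics or memory-ordering controls, so the sketch only shows the bookkeeping (one index owned by each side, one slot kept empty) that such algorithms typically build on. The `RingQueue` class is hypothetical and is not the algorithm I used at work.

```python
class RingQueue:
    def __init__(self, capacity):
        # One slot always stays empty so that "full" and "empty" look different.
        self.slots = [None] * (capacity + 1)
        self.head = 0  # next slot to read; written only by the consumer
        self.tail = 0  # next slot to write; written only by the producer

    def push(self, item):
        nxt = (self.tail + 1) % len(self.slots)
        if nxt == self.head:
            return False  # queue is full
        self.slots[self.tail] = item
        self.tail = nxt  # advance the index only after the slot is filled
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # queue is empty
        item = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        return item


rq = RingQueue(2)
pushed = [rq.push(1), rq.push(2), rq.push(3)]  # the third push finds it full
popped = [rq.pop(), rq.pop(), rq.pop()]  # the third pop finds it empty

print(pushed, popped)
```

In a real lock-free version, the stores to `head` and `tail` would be atomic with the right memory ordering, which is exactly the part that is hard to get right and worth borrowing from a proven implementation.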
API design
Lesson learned: While code reuse has been promoted from time immemorial and pattern reuse joined in later on, it’s important to think of reuse in an even more generic way. When designing a product, have a look at similar products. Check which have failed and which have succeeded; learn from the successful projects. Don’t succumb to Not Invented Here syndrome. Reuse the ideas, the APIs, the conceptual frameworks, whatever you find appropriate. By doing so you are allowing users to reuse their existing knowledge. At the same time you may be avoiding technical pitfalls you are not even aware of at the moment.
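This lesson is similar to the previous one: in essence, reinvent as little as possible. ZeroMQ's decision to model its API on BSD sockets was very successful and makes it easy for users to learn. It is much like adopting a Cisco-style CLI in networking products: users clearly find it easier to get started.
Appendix:
[1] The Architecture of Open Source Applications
[2] The Wikipedia article on the actor model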