
config.threadsafe!: Multi-process and Multi-threading in Rails (translation)

yanghao's blog

The original article was written in the Rails 3 era. Rails 4 has since removed config.threadsafe! and is thread-safe by default.

CONFIG.THREADSAFE!: WHAT DOES IT DO?

def threadsafe!
  @preload_frameworks = true
  @cache_classes      = true
  @dependency_loading = false
  @allow_concurrency  = true
  self
end

Preloading Frameworks

The first option, @preload_frameworks, does pretty much what it says: it forces the Rails framework to be eagerly loaded on boot. When this option is not enabled, framework classes are loaded lazily via autoload.

In multi-threaded environments, the framework needs to be eagerly loaded before any threads are created because of thread safety issues with autoload.

We know that loading the framework isn’t threadsafe, so the strategy is to load it all up before any threads are ready to handle requests.
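To make the lazy-versus-eager distinction concrete, here is a minimal sketch using plain Ruby autoload. The LazyComponent constant and its file are made up for the demonstration and stand in for a framework class. This is not how Rails implements @preload_frameworks, only an illustration of the idea of resolving every autoload before any worker thread starts.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a framework component; the file and constant
# names are invented for this demonstration.
dir  = Dir.mktmpdir
path = File.join(dir, 'lazy_component.rb')
File.write(path, "module LazyComponent\n  LOADED_AT = Time.now\nend\n")

# Register the constant for lazy loading: nothing is read from disk yet.
Object.autoload(:LazyComponent, path)
pending_before = Object.autoload?(:LazyComponent) # non-nil while still pending

# "Eager loading" amounts to touching every such constant during boot,
# before any request-handling thread exists.
LazyComponent

pending_after = Object.autoload?(:LazyComponent)  # nil once the file is loaded

# Threads started afterwards only read an already-defined constant.
workers = 3.times.map { Thread.new { LazyComponent::LOADED_AT } }
loaded_values = workers.map(&:value)
```

Because the constant is resolved on the main thread first, the worker threads never trigger a file load themselves.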

Caching classes

The @cache_classes option controls whether or not classes get reloaded.

Remember when you’re doing “TDD” in your application? You modify a controller, then reload the page to “test” it and see that things changed? Ya, that’s what this option controls.

When this option is false, as in development, your classes will be reloaded when they are modified. Without this option, we wouldn’t be able to do our “F5DD” (yes, that’s F5 Driven Development).

In production, we know that classes aren’t going to be modified on the fly, so checking whether or not to reload classes just wastes resources. It makes sense to never reload class definitions there.
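As a rough illustration of the work that reloading implies, the sketch below checks a file's modification time on every "request" and loads it again only when it has changed. TinyReloader and the greeter.rb file are hypothetical names of mine, and Rails' real reloader is considerably more involved. The point is only that each check costs a stat call, which production can skip entirely.

```ruby
require 'tmpdir'

# Hypothetical mini-reloader: remembers each file's mtime and calls `load`
# again only when the mtime differs from the one seen at the last load.
class TinyReloader
  def initialize
    @mtimes = {}
  end

  # Returns true when the file was (re)loaded, false when it was unchanged.
  def reload_if_changed(path)
    mtime = File.mtime(path)       # one stat call per file per check
    return false if @mtimes[path] == mtime

    load path                      # re-evaluates the file, redefining its methods
    @mtimes[path] = mtime
    true
  end
end

dir  = Dir.mktmpdir
path = File.join(dir, 'greeter.rb')
File.write(path, "def greeting; 'hello'; end\n")

reloader = TinyReloader.new
reloader.reload_if_changed(path)   # the first check always loads
first = greeting

# Simulate an edit followed by pressing F5. The mtime is bumped explicitly
# so the example does not depend on filesystem timestamp resolution.
File.write(path, "def greeting; 'hello again'; end\n")
File.utime(Time.now, Time.now + 5, path)
reloader.reload_if_changed(path)
second = greeting
```

With class caching on, none of this bookkeeping runs: the file is loaded once at boot and never checked again.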

Dependency loading

The @dependency_loading option controls code loading when missing constants are encountered. For example, a controller references the User model, but the User constant isn’t defined. In that case, if @dependency_loading is true, Rails will find the file that contains the User constant, and load that file.

We already talked about how code loading is not thread safe, so the idea here is that we should load the framework, then load all user code, then disable dependency loading. Once dependency loading is disabled, all framework code and app code should already be loaded, and any missing constant will just raise an exception rather than attempt to load code.

We justify disabling this option in production because (as was mentioned earlier) code loading is not threadsafe, and we expect to have all code loaded before any threads can handle requests.
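The behavior described above can be imitated with Ruby's const_missing hook. This is a simplified sketch, not Rails' dependency loader: the App module, its DEFINITIONS table (standing in for files on disk), and the dependency_loading switch are all assumptions made for the example.

```ruby
module App
  # Stand-in for files on disk: constant name => code that defines it.
  DEFINITIONS = {
    User: -> { Class.new },
    Post: -> { Class.new }
  }.freeze

  @dependency_loading = true

  class << self
    attr_accessor :dependency_loading

    # Ruby calls this hook when App::Something is referenced but undefined.
    def const_missing(name)
      definition = DEFINITIONS[name]
      # With loading disabled (or nothing to load), fall back to the
      # default behavior, which raises NameError.
      return super unless dependency_loading && definition

      const_set(name, definition.call)
    end
  end
end

user_class = App::User            # missing constant is defined on first use

App.dependency_loading = false    # analogous to turning the option off
post_lookup =
  begin
    App::Post
  rescue NameError => e
    e.class
  end
```

Once the switch is off, a reference to an unloaded constant fails fast instead of running loading code from whichever thread happened to hit it.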

Allow concurrency

@allow_concurrency is my favorite option. This option controls whether or not the Rack::Lock middleware is used in your stack. Rack::Lock wraps a mutex around your request. The idea is that if you have code that is not threadsafe, this mutex will prevent multiple threads from executing your controller code at the same time. When threadsafe! is set, this middleware is removed, and controller code can be executed in parallel.
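A mutex-wrapping middleware can be sketched in a few lines. SimpleLock below is a hypothetical, simplified stand-in that follows the Rack calling convention (call(env) returns status, headers, body). It is not the Rack::Lock source. InFlightApp is a toy application that records how many requests run at once.

```ruby
# Hypothetical, simplified lock middleware; not Rack::Lock itself.
class SimpleLock
  def initialize(app)
    @app   = app
    @mutex = Mutex.new
  end

  def call(env)
    # Only one thread at a time may run the wrapped application.
    @mutex.synchronize { @app.call(env) }
  end
end

# Toy app that tracks the highest number of simultaneous requests.
class InFlightApp
  attr_reader :max_in_flight

  def initialize
    @in_flight     = 0
    @max_in_flight = 0
    @stats         = Mutex.new
  end

  def call(_env)
    @stats.synchronize do
      @in_flight += 1
      @max_in_flight = [@max_in_flight, @in_flight].max
    end
    sleep 0.02                       # pretend to do some work
    @stats.synchronize { @in_flight -= 1 }
    [200, { 'Content-Type' => 'text/plain' }, ['ok']]
  end
end

# Fires `requests` concurrent calls at `app` and reports the peak
# concurrency observed by the inner application.
def max_concurrency(app, inner, requests: 5)
  requests.times.map { Thread.new { app.call({}) } }.each(&:join)
  inner.max_in_flight
end

locked_inner = InFlightApp.new
locked = max_concurrency(SimpleLock.new(locked_inner), locked_inner)

bare_inner = InFlightApp.new
unlocked = max_concurrency(bare_inner, bare_inner)

puts "peak concurrency with lock: #{locked}, without: #{unlocked}"
```

Behind the lock the peak is always 1. Without it the peak depends on thread scheduling, though with five threads and a sleep it will usually be higher.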

MULTI PROCESS VS MULTI THREAD

Whether a multi-process setup or a multi-threaded setup is best for your application is beyond the scope of this article. Instead, let’s look at how the threadsafe! option impacts each configuration (multi-process vs multi-threaded) and compare and contrast the two.

Code loading and caching

I’m going to lump the first three options (@preload_frameworks, @cache_classes, and @dependency_loading) together because they control roughly the same thing: code loading. We know autoload is not threadsafe, so it makes sense that in a threaded environment we should do these things in advance to avoid deadlocks.

@cache_classes is enabled by default regardless of your concurrency model. In production, Rails automatically preloads your application code, so if we were to disable @dependency_loading in either a multi-process model or a multi-threaded model, it would have no impact.

Among these settings, the one whose effect differs most depending on the concurrency model is @preload_frameworks. In a multi-process environment, if @preload_frameworks is enabled, it’s possible that the total memory consumption could go up. But this depends on how much of the framework your application uses. For example, if your Rails application makes no use of Active Record, enabling @preload_frameworks will load Active Record into memory even though it isn’t used.

So the worst case scenario in a multi-process environment is that a process might take up slightly more memory. This is the situation today, but I think that with smarter application loading techniques, we could actually remove the @preload_frameworks option and maintain minimal memory usage.

Rack::Lock and the multi-threaded Bogeyman

Rack::Lock is a middleware that is inserted into the Rails middleware stack in order to protect our applications from the multi-threaded Bogeyman. This middleware is supposed to protect us from nasty race conditions and deadlocks by wrapping our requests with a mutex. The middleware locks a mutex at the beginning of the request, and unlocks the mutex when the request finishes.

To study the impact of this middleware, let’s write a controller that is not threadsafe, and see what happens with different combinations of webservers and config.threadsafe! settings.

Here is the code we’ll use for comparing concurrency models and usage of Rack::Lock:

class UsersController < ApplicationController
  @counter = 0
  class << self
    attr_accessor :counter
  end
  trap(:INFO) {
    $stderr.puts "Count: #{UsersController.counter}"
  }
  def index
    counter = self.class.counter # read
    sleep(0.1)
    counter += 1                 # update
    sleep(0.1)
    self.class.counter = counter # write
    @users = User.all
    respond_to do |format|
      format.html # index.html.erb
      format.json { render json: @users }
    end
  end
end

This controller has a classic read-update-write race condition. Typically, you would see this code in the form of variable += 1, but in this case it’s expanded into separate steps with a sleep between them in order to exacerbate the concurrency problems. Our code increments a counter every time the action is run, and we’ve set a trap so that we can ask the controller what the count is.
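The same race can be reproduced without Rails or a webserver. The sketch below is an illustration under my own naming (Counter, unsafe_increment, safe_increment, run_concurrently are not from the article). It performs the read, sleep, write sequence from five threads, first unprotected and then inside a mutex.

```ruby
class Counter
  attr_accessor :value

  def initialize
    @value = 0
    @lock  = Mutex.new
  end

  # Same steps as the controller action: read, then update and write.
  def unsafe_increment
    current = value            # read
    sleep 0.01                 # widen the window in which other threads read
    self.value = current + 1   # update and write
  end

  # The identical steps, but only one thread may perform them at a time.
  def safe_increment
    @lock.synchronize { unsafe_increment }
  end
end

# Calls `method_name` once from each of `threads` threads and returns
# the counter's final value.
def run_concurrently(counter, method_name, threads: 5)
  threads.times.map { Thread.new { counter.public_send(method_name) } }.each(&:join)
  counter.value
end

unsafe_total = run_concurrently(Counter.new, :unsafe_increment)
safe_total   = run_concurrently(Counter.new, :safe_increment)

puts "without lock: #{unsafe_total}, with lock: #{safe_total}"
```

With the mutex the total is always 5. Without it, threads that read the same stale value overwrite each other's writes, so the total is usually lower, and the exact number depends on scheduling.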

We’ll run the following code to test our controller:

require 'net/http'
uri = URI('http://localhost:9292/users')
100.times {
  5.times.map {
    Thread.new { Net::HTTP.get_response(uri) }
  }.each(&:join)
}

This code generates 500 requests, doing 5 requests simultaneously 100 times.

Rack::Lock and a multi-threaded webserver

First, let’s test against a threaded webserver with threadsafe! disabled. That means we’ll have Rack::Lock in our middleware stack. For the threaded examples, we’re going to use the puma webserver. Puma is set up to handle 16 concurrent requests by default, so we’ll just start the server in one window:

[aaron@higgins omglol]$ RAILS_ENV=production puma 
Puma 1.4.0 starting...
* Min threads: 0, max threads: 16
* Listening on tcp://0.0.0.0:9292
Use Ctrl-C to stop

Then run our test in the other window and send a SIGINFO to the webserver:

[aaron@higgins omglol]$ time ruby multireq.rb 
real  1m46.591s
user  0m0.709s
sys 0m0.369s
[aaron@higgins omglol]$ kill -INFO 59717
[aaron@higgins omglol]$

If we look at the webserver terminal, we see the count is 500, just like we expected:

127.0.0.1 - - [16/Jun/2012 16:25:58] "GET /users HTTP/1.1" 200 - 0.8815
127.0.0.1 - - [16/Jun/2012 16:25:59] "GET /users HTTP/1.1" 200 - 1.0946
Count: 500

Now let’s retry our test, but enable config.threadsafe! so that Rack::Lock is not in our middleware stack:

[aaron@higgins omglol]$ time ruby multireq.rb 
real  0m24.452s
user  0m0.724s
sys 0m0.382s
[aaron@higgins omglol]$ kill -INFO 59753
[aaron@higgins omglol]$

This time the webserver logs report a count of 200, not even close to the 500 we expected:

127.0.0.1 - - [16/Jun/2012 16:30:50] "GET /users HTTP/1.1" 200 - 0.2232
127.0.0.1 - - [16/Jun/2012 16:30:50] "GET /users HTTP/1.1" 200 - 0.4259
Count: 200

So we see that Rack::Lock is ensuring that our requests are running in a thread safe environment. You may be thinking to yourself “This is awesome! I don’t want to think about threading, let’s disable threadsafe! all the time!”, however let’s look at the cost of adding Rack::Lock.

Did you notice the run times of our test program? The first run took 1 min 46 sec, whereas the second run took 24 sec. The reason is that Rack::Lock ensured that we handled only one request at a time. If we can only handle one request at a time, it defeats the purpose of having a threaded webserver in the first place. Hence the option to remove Rack::Lock.
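The two run times line up with a quick lower-bound estimate based only on the two sleep(0.1) calls in the index action. The constant names below are mine, and the estimate deliberately ignores everything else a request does (database query, rendering, network), so the measured times should sit somewhat above these floors.

```ruby
SLEEP_PER_REQUEST = 0.1 + 0.1   # the two sleep(0.1) calls in the index action
BATCHES           = 100         # outer loop of the test script
BATCH_SIZE        = 5           # simultaneous requests per batch

# With Rack::Lock every request waits for the previous one to finish,
# so all 500 sleeps happen back to back.
serialized_floor = BATCHES * BATCH_SIZE * SLEEP_PER_REQUEST

# Without it, the 5 requests of a batch can sleep at the same time,
# so each batch costs roughly one request's worth of sleeping.
concurrent_floor = BATCHES * SLEEP_PER_REQUEST

puts format('serialized: at least %.0f s (measured 106.6 s)', serialized_floor)
puts format('concurrent: at least %.0f s (measured 24.5 s)', concurrent_floor)
```

The floors come out at 100 s and 20 s, against measured times of about 106.6 s and 24.5 s, which is consistent with the lock forcing requests to run one after another.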

Rack::Lock and a multi-process webserver

Now let’s look at the impact Rack::Lock has on a multi-process webserver. For this test, we’re going to use the Unicorn webserver. We’ll use the same test program to generate 5 concurrent requests 100 times.

First let’s test with threadsafe! disabled, so Rack::Lock is in the middleware stack:

[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:45:48.942354 #59827]  INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:45:48.942688 #59827]  INFO -- : worker=0 spawning...
I, [2012-06-16T16:45:48.943922 #59827]  INFO -- : master process ready
I, [2012-06-16T16:45:48.945477 #59829]  INFO -- : worker=0 spawned pid=59829
I, [2012-06-16T16:45:48.946027 #59829]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:45:51.983627 #59829]  INFO -- : worker=0 ready

Unicorn only forks one process by default, so we’ll increase it to 5 processes and run our test program:

[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ time ruby multireq.rb
real  0m23.080s
user  0m0.634s
sys 0m0.320s
[aaron@higgins omglol]$ kill -INFO 59829 59843 59854 59865 59876
[aaron@higgins omglol]$

We have to run kill on multiple pids because we have multiple processes listening for requests. If we look at the logs:

[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:45:48.942354 #59827]  INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:45:48.942688 #59827]  INFO -- : worker=0 spawning...
I, [2012-06-16T16:45:48.943922 #59827]  INFO -- : master process ready
I, [2012-06-16T16:45:48.945477 #59829]  INFO -- : worker=0 spawned pid=59829
I, [2012-06-16T16:45:48.946027 #59829]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:45:51.983627 #59829]  INFO -- : worker=0 ready
I, [2012-06-16T16:46:54.379332 #59827]  INFO -- : worker=1 spawning...
I, [2012-06-16T16:46:54.382832 #59843]  INFO -- : worker=1 spawned pid=59843
I, [2012-06-16T16:46:54.384204 #59843]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:46:56.624781 #59827]  INFO -- : worker=2 spawning...
I, [2012-06-16T16:46:56.635782 #59854]  INFO -- : worker=2 spawned pid=59854
I, [2012-06-16T16:46:56.636441 #59854]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:46:57.703947 #59827]  INFO -- : worker=3 spawning...
I, [2012-06-16T16:46:57.708788 #59865]  INFO -- : worker=3 spawned pid=59865
I, [2012-06-16T16:46:57.709620 #59865]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:46:58.091562 #59843]  INFO -- : worker=1 ready
I, [2012-06-16T16:46:58.799433 #59827]  INFO -- : worker=4 spawning...
I, [2012-06-16T16:46:58.804126 #59876]  INFO -- : worker=4 spawned pid=59876
I, [2012-06-16T16:46:58.804822 #59876]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:47:01.281589 #59854]  INFO -- : worker=2 ready
I, [2012-06-16T16:47:02.292327 #59865]  INFO -- : worker=3 ready
I, [2012-06-16T16:47:02.989091 #59876]  INFO -- : worker=4 ready
Count: 100
Count: 100
Count: 100
Count: 100
Count: 100

We see the count totals to 500. Great! No surprises, we expected a total of 500.

Now let’s run the same test but with threadsafe! enabled. We learned from our previous tests that we’ll get a race condition, so let’s see the race condition in action in a multi-process environment. We enable threadsafe mode to eliminate Rack::Lock, and fire up our webserver:

[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:53:48.480272 #59920]  INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:53:48.480630 #59920]  INFO -- : worker=0 spawning...
I, [2012-06-16T16:53:48.482540 #59920]  INFO -- : master process ready
I, [2012-06-16T16:53:48.484182 #59921]  INFO -- : worker=0 spawned pid=59921
I, [2012-06-16T16:53:48.484672 #59921]  INFO -- : Refreshing Gem list
I, [2012-06-16T16:53:51.666293 #59921]  INFO -- : worker=0 ready

Now increase to 5 processes and run our test:

[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ time ruby multireq.rb
real  0m22.920s
user  0m0.641s
sys 0m0.327s
[aaron@higgins omglol]$ kill -INFO 59932 59921 59943 59953 59958

Finally, we take a look at our webserver output.

Strange. Our counts total 500 again despite the fact that we clearly saw this code has a horrible race condition. The fact of the matter is that we don’t need Rack::Lock in a multi-process environment. We don’t need the lock because the socket is our lock. In a multi-process environment, when one process is handling a request, it cannot listen for another request at the same time (you would need threads to do this). That means that wrapping a mutex around the request is useless overhead.
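The reason no updates are lost can be shown with plain fork, assuming a platform that supports it (so not Windows). In this sketch FakeController mirrors the class-level counter from the article's controller. The worker count, the pipe-based reporting, and the even split of 100 requests per worker are my own simplifications, not Unicorn's behavior.

```ruby
WORKERS             = 5
REQUESTS_PER_WORKER = 100

# Mirrors the class-level counter of the controller in the article.
class FakeController
  @counter = 0
  class << self
    attr_accessor :counter
  end

  def index
    counter = self.class.counter  # read
    counter += 1                  # update
    self.class.counter = counter  # write
  end
end

children = WORKERS.times.map do
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # A worker is single-threaded: it finishes one request before it
    # starts the next, so nothing can interleave with read-update-write.
    REQUESTS_PER_WORKER.times { FakeController.new.index }
    writer.puts FakeController.counter  # report this worker's private count
    writer.close
    exit!(0)                            # skip at_exit hooks inherited from the parent
  end
  writer.close
  [pid, reader]
end

counts = children.map do |pid, reader|
  count = reader.read.to_i
  reader.close
  Process.wait(pid)
  count
end

puts "per worker: #{counts.inspect}, total: #{counts.sum}, parent: #{FakeController.counter}"
```

Each worker increments its own copy of the counter, so the five workers each report 100 and the parent's copy stays at 0. That matches the five "Count: 100" lines that add up to 500.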

CONCLUSION

I think this blurgh post is getting too long, so let’s wrap it up. The first three options that config.threadsafe! controls (@preload_frameworks, @cache_classes, and @dependency_loading) are either already used in a multi-process environment, or would have little to no overhead if used in a multi-process environment. The final configuration option, @allow_concurrency, is completely useless in a multi-process environment.

In a multi-threaded environment, the first three options that config.threadsafe! controls are either already used by default or are absolutely necessary. Rack::Lock cripples a multi-threaded server, so @allow_concurrency should always be enabled in a multi-threaded environment. In other words, if you’re using code that is not thread safe, you should either fix that code, or consider moving to the multi-process model.

Because enabling config.threadsafe! would have little to no impact in a multi-process environment, and is absolutely necessary in a multi-threaded environment, I think that we should enable this flag by default in new Rails applications with the intention of removing the flag in future versions of Rails.
