Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distribute worker threads to all cores #197

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zhanghongzhou
Copy link
Contributor

Signed-off-by: hongzhou zhang zhang.hongzhou@hotmail.com

Signed-off-by: hongzhou zhang <zhang.hongzhou@hotmail.com>
@mitake
Copy link
Contributor

mitake commented Nov 24, 2015

@zhanghongzhou thanks a lot for your PR. Could you provide benchmarking result? Of course it doesn't have to be good for now.

@mitake
Copy link
Contributor

mitake commented Nov 24, 2015

@johnzhang1985 please comment in English :)

@vtolstov
Copy link
Contributor

I can test this patch and compare to old version.

@mitake
Copy link
Contributor

mitake commented Nov 24, 2015

@zhanghongzhou @johnzhang1985 @vtolstov

I created an additional patch for creating per cpu work queue. You can find it here: https://github.com/sheepdog/sheepdog/tree/per-cpu-work-queue

It is not tested well. I'm glad if you can test and share benchmark result.

@tmenjo
Copy link
Contributor

tmenjo commented Mar 30, 2016

@mitake:

We plan to test sheepdog/per-cpu-work-queue next quarter. The branch is based on a-little-old master (04fe301 committed on 13 Nov 2015), so first I try to rebase onto the latest master (8772904 22 days ago).

However, I encountered rebase conflict. Some trial-and-error have done and I found that the branch conflicted with the commit 698e5d2 merging PR #217.

So I made my own two branches:

  • First one, per-cpu-work-queue-1, is the branch rebased onto "another master" which skips the conflicting commit above. This encountered neither conflict nor compile error, but its history is very different from true master.
  • Second one, per-cpu-work-queue-2, is the branch simply rebased onto the latest master. This encountered both conflict and compile error so I resolved them, but I am not so confident that my work is right.

I would be very glad if you review my branches, especially the last some commits in the second one. I think it is a little better than the first one.

@mitake
Copy link
Contributor

mitake commented Apr 1, 2016

@tmenjo I think the second branch would be ok. Could you squash the commits that resolve conflict? Then they can be merged.

@tmenjo
Copy link
Contributor

tmenjo commented Apr 1, 2016

@mitake I force-updated per-cpu-work-queue-2. Past four commits are squashed into three commits; zhanghongzhou's one, mitake's one (which my one resolving conflict is meld into) and my another one resolving compile error. Could you review again?

@sa-kura
Copy link
Contributor

sa-kura commented Apr 26, 2016

I tried https://github.com/tmenjo/sheepdog/tree/per-cpu-work-queue-2 branch.
It was not an expected behavior.

I started sheepdog and executed taskset command.
Cpu affinities of all threads were used all processors, as follows.

# taskset -p -a `pgrep sheep| head -1`
pid 32710's current affinity mask: ffffff
pid 32712's current affinity mask: ffffff
pid 32713's current affinity mask: ffffff
            (snip)

A result of cause analysis,
pthread_setaffinity_np() function isn't called because 'wi->q.wq_state' don't set value after the initialize.
'wi->q.wq_state' and pthread_setaffinity_np() function are used, as follows.

    if (wi->q.wq_state != WQ_ORDERED &&
        pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset))

'wi->q.wq_state' is always 0.
WQ_ORDERED's enum value is 0.
So, pthread_setaffinity_np() function isn't executed.

I think 'wi->q.wq_state' should change to 'wi->tc'.
I modified the source code as follows.

# diff -u work.c.org work.c
--- work.c.org  2016-04-21 17:44:30.889242586 +0900
+++ work.c      2016-04-25 18:56:36.529934803 +0900
@@ -293,7 +293,7 @@
                        return -1;
                }
-               if (wi->q.wq_state != WQ_ORDERED &&
+               if (wi->tc != WQ_ORDERED &&
                    pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset))
                        sd_info("set thread affinity failed");
@@ -312,7 +312,7 @@
        struct wq_info *wi = container_of(q, struct wq_info, q);
        int cpu;
-       if (q->wq_state == WQ_ORDERED)
+       if (wi->tc == WQ_ORDERED)
                cpu = 0;
        else if (is_main_thread()) {
                static __thread int next_core;
@@ -379,7 +379,7 @@
        /* wait pthread_setaffinity_np() in create_worker_threads() */
        eventfd_xread(wi->create_efd);
        /* now I'm running on correct cpu */
-       cpu = wi->q.wq_state == WQ_ORDERED ? 0 : mycpu();
+       cpu = wi->tc == WQ_ORDERED ? 0 : mycpu();
        while (true) {

I started sheepdog by code as above, and executed taskset command.
Cpu affinities of all threads were used dispersed processors, as follows.

# taskset -p -a `pgrep sheep| head -1`
pid 31032's current affinity mask: ffffff
pid 31034's current affinity mask: 1
pid 31035's current affinity mask: 2
pid 31036's current affinity mask: 4
pid 31037's current affinity mask: 8
pid 31038's current affinity mask: 10
pid 31039's current affinity mask: 20
pid 31040's current affinity mask: 40
pid 31041's current affinity mask: 80
            (snip)

@mitake
Copy link
Contributor

mitake commented Apr 26, 2016

@sa-kura thanks a lot for your analysis. Is it possible to evaluate performance after your change?

@sa-kura
Copy link
Contributor

sa-kura commented Apr 26, 2016

@mitake Yes. I will evaluate performance with @tmenjo

tmenjo added a commit to tmenjo/sheepdog that referenced this pull request May 13, 2016
This fixes an issue reported by sa-kura.

See: sheepdog#197 (comment)

Signed-off-by: Takashi Menjo <menjo.takashi@lab.ntt.co.jp>
@vtolstov
Copy link
Contributor

vtolstov commented Mar 6, 2017

@mitake @tmenjo does this useful now after some patches to control workqueue size ?

@mitake
Copy link
Contributor

mitake commented Mar 7, 2017

@vtolstov I'm not sure yet. But if you can benchmark https://github.com/sheepdog/sheepdog/tree/per-cpu-work-queue with rebasing on the master, it will be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants