Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ThreadPool] Run a thread per CPU in SimpleExecutor #104

Merged
merged 1 commit into from
May 17, 2022

Conversation

ChuanqiXu9
Copy link
Collaborator

Why

Now we don't bind threads in ThreadPool to physical CPU cores. So we might not be able to get the expected concurrency.

What is changing

This one should be transparent for users.

@ChuanqiXu9 ChuanqiXu9 changed the title [ThreadPool] Run a thraed per CPU in SimpleExecutor [ThreadPool] Run a thread per CPU in SimpleExecutor May 16, 2022
// Run threads per core.
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(i, &cpuset);
Copy link
Collaborator

@RainMark RainMark May 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1、cpu id maybe not start at 0 or not be sequenced in container environment
2、_threadNum maybe great than available cpus
3、macOS support pthread?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. According to https://linux.die.net/man/3/cpu_set, the cpu id starts at 0 and is continuous.
  2. I refactored it to CPU_SET(i % CPU_SETSIZE, &cpuset); So it wouldn't overflow.
  3. I didn't find related things in https://developer.apple.com/search/?q=CPU_SET. Given the users who would need performance in macOS should be less, I think we could give up to support them.

int rc = pthread_setaffinity_np(_threads[i].native_handle(),
sizeof(cpu_set_t), &cpuset);
if (rc != 0)
std::cerr << "Error calling pthread_setaffinity_np: " << rc << "\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need a log component?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I filed an issue at #105

Copy link
Collaborator

@qicosmos qicosmos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@qicosmos
Copy link
Collaborator

Do you compare the performance using this improvement? @ChuanqiXu9

@ChuanqiXu9 ChuanqiXu9 force-pushed the main branch 2 times, most recently from 05dbd9b to 0fa280a Compare May 17, 2022 02:38
@ChuanqiXu9
Copy link
Collaborator Author

Do you compare the performance using this improvement? @ChuanqiXu9

Yeah, I found this one when looking at #99

@RainMark
Copy link
Collaborator

LGTM

@ChuanqiXu9 ChuanqiXu9 merged commit e24ab38 into alibaba:main May 17, 2022
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(cpu_ids[i % cpu_ids.size()], &cpuset);
int rc = pthread_setaffinity_np(_threads[i].native_handle(),
Copy link
Collaborator

@RainMark RainMark May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use sched_setaffinity syscall to avoid include pthread,because already be linux system, but it is also ok to use pthread

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants