The right number depends on how much time the processes spend blocked on IO.
The book "Programming Concurrency on the JVM" has some good information about this:
"Determining the Number of Threads". For a large problem, we'd want to have at least as many threads as the number of available cores. This will ensure that as many cores as available to the process are put to work to solve our problem...
So the minimum number of threads is equal to the number of available cores. If all tasks are computation intensive, then this is all we need. Having more threads will actually hurt in this case because cores would be context switching between threads when there is still work to do. If tasks are IO intensive, then we should have more threads.
When a task performs an IO operation, its thread gets blocked. The processor immediately context switches to run other eligible threads. If we had only as many threads as the number of available cores, even though we have tasks to perform, they can't run because we haven't scheduled them on threads for the processors to pick up.
If tasks spend 50 percent of the time being blocked, then the number of threads should be twice the number of available cores. If they spend less time being blocked--that is, they're computation intensive--then we should have fewer threads but no less than the number of cores. If they spend more time being blocked--that is, they're IO intensive--then we should have more threads, specifically, several multiples of the number of cores.
So we can compute the total number of threads we'd need as follows:
Number of threads = Number of Available Cores / (1 - Blocking Coefficient)
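(End of the quote.) As a rough sketch of that formula in Java: this is my own illustration, not code from the book. The class name ThreadSizing and the method threadCount are hypothetical, and rounding to the nearest whole thread is my assumption, since the formula itself does not say how to round. The blocking coefficient is the fraction of time a task spends blocked, so it must be at least 0 and strictly less than 1.

```java
final class ThreadSizing {
    private ThreadSizing() {}

    /**
     * Applies: threads = cores / (1 - blockingCoefficient).
     * Rounding to the nearest integer is an assumption of this sketch.
     * Math.toIntExact throws ArithmeticException if the result does not fit in an int,
     * which can happen when the coefficient is extremely close to 1.
     */
    static int threadCount(int cores, double blockingCoefficient) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        if (blockingCoefficient < 0.0 || blockingCoefficient >= 1.0) {
            throw new IllegalArgumentException("blockingCoefficient must be in [0, 1)");
        }
        return Math.toIntExact(Math.round(cores / (1.0 - blockingCoefficient)));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Purely CPU-bound tasks: one thread per core.
        System.out.println("CPU-bound:   " + threadCount(cores, 0.0));
        // Tasks blocked half the time: twice the number of cores.
        System.out.println("50% blocked: " + threadCount(cores, 0.5));
    }
}
```

The hard part in practice is estimating the blocking coefficient, which you would normally get from profiling rather than guess.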
If you need to run several computations at the same time, consider whether you can run them in a single process with an appropriately sized thread pool.
Otherwise, if you have the optimal number of threads for one computation but then run 8 of them at once, you may end up with too many threads.
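A minimal sketch of the single-process approach, assuming the computations can be expressed as Callable tasks. SharedPoolSketch, runAll and sumTo are hypothetical names, and sumTo is only a stand-in for a CPU-bound job. The point is that all computations share one fixed-size pool, so the total thread count stays at poolSize no matter how many computations you submit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SharedPoolSketch {
    private SharedPoolSketch() {}

    /** Runs every computation on one shared fixed-size pool and returns results in input order. */
    static List<Long> runAll(List<Callable<Long>> computations, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Long> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and keeps the input order.
            for (Future<Long> future : pool.invokeAll(computations)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a computation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Stand-in CPU-bound job: sums the integers 1..n. */
    static long sumTo(long n) {
        long sum = 0;
        for (long i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // CPU-bound work, so one thread per core is the assumed starting point.
        int poolSize = Runtime.getRuntime().availableProcessors();
        List<Callable<Long>> jobs = new ArrayList<>();
        for (int i = 1; i <= 8; i++) {
            final long n = i * 1_000_000L;
            jobs.add(() -> sumTo(n));
        }
        System.out.println(runAll(jobs, poolSize));
    }
}
```

Here the 8 jobs queue up behind poolSize threads instead of each bringing its own set of threads.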
The best approach is to benchmark it experimentally.
I'm not sure exactly what you mean by core parking, but the CPU tends to keep running the same thread on a given core for cache reasons, although it will also move it occasionally for various thermal / power reasons. You can investigate this with a tool like htop.