Commit 75224c4c authored by tbbdev

Committing Intel(R) TBB 2017 Update 8 source code

Parent eb6336ad
......@@ -2,6 +2,17 @@
The list of most significant changes made over time in
Intel(R) Threading Building Blocks (Intel(R) TBB).
Intel TBB 2017 Update 8
TBB_INTERFACE_VERSION == 9108
Changes (w.r.t. Intel TBB 2017 Update 7):
Bugs fixed:
- Fixed an assertion failure in debug tbbmalloc binaries when
TBBMALLOC_CLEAN_ALL_BUFFERS is used.
------------------------------------------------------------------------
Intel TBB 2017 Update 7
TBB_INTERFACE_VERSION == 9107
......
# Intel(R) Threading Building Blocks 2017 Update 7
[![Stable release](https://img.shields.io/badge/version-2017_U7-green.svg)](https://github.com/01org/tbb/releases/tag/2017_U7)
# Intel(R) Threading Building Blocks 2017 Update 8
[![Stable release](https://img.shields.io/badge/version-2017_U8-green.svg)](https://github.com/01org/tbb/releases/tag/2017_U8)
[![Apache License Version 2.0](https://img.shields.io/badge/license-Apache_2.0-green.svg)](LICENSE)
Intel(R) Threading Building Blocks (Intel(R) TBB) lets you easily write parallel C++ programs that take
......
......@@ -36,7 +36,7 @@ Use cases of Intel TBB integration into CMake-aware projects
------------------------------------------------------------
There are two types of Intel TBB packages:
* Binary packages with pre-built binaries for Windows* OS, Linux* OS and macOS*. They are available on the releases page of the Github repository: https://github.com/01org/tbb/releases. The main purpose of the binary package integration is the ability to build Intel TBB header files and binaries into your CMake-aware project.
* A source package is also available to download from the release page via the “Source code” link. In addition, it can be cloned from the repository by ``git clone https://github.com/01org/tbb.git``. The main purpose of the source package integration is to allow you to do a custom build of the Intel TBB library from the source files and then build that into your CMake-aware project.
* A source package is also available to download from the release page via the "Source code" link. In addition, it can be cloned from the repository by ``git clone https://github.com/01org/tbb.git``. The main purpose of the source package integration is to allow you to do a custom build of the Intel TBB library from the source files and then build that into your CMake-aware project.
There are four types of CMake modules that can be used to integrate Intel TBB: `TBBConfig`, `TBBGet`, `TBBMakeConfig` and `TBBBuild`. See `Technical documentation for CMake modules`_ section for additional details.
......
# Copyright (c) 2017 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
#
#
function(tbb_conf_gen_print_help)
message("Usage: cmake -DTBB_ROOT=<tbb_root> -DTBB_OS=Linux|Windows|Darwin [-DSAVE_TO=<path>] -P tbb_config_generator.cmake")
endfunction()
if (NOT DEFINED TBB_ROOT)
tbb_conf_gen_print_help()
message(FATAL_ERROR "Required parameter TBB_ROOT is not defined")
endif()
if (NOT EXISTS "${TBB_ROOT}")
tbb_conf_gen_print_help()
message(FATAL_ERROR "TBB_ROOT=${TBB_ROOT} does not exist")
endif()
if (NOT DEFINED TBB_OS)
tbb_conf_gen_print_help()
message(FATAL_ERROR "Required parameter TBB_OS is not defined")
endif()
if (DEFINED SAVE_TO)
set(tbb_conf_gen_save_to_param SAVE_TO ${SAVE_TO})
endif()
include(${CMAKE_CURRENT_LIST_DIR}/TBBMakeConfig.cmake)
tbb_make_config(TBB_ROOT ${TBB_ROOT} CONFIG_DIR tbb_config_dir SYSTEM_NAME ${TBB_OS} ${tbb_conf_gen_save_to_param})
message(STATUS "TBBConfig files were created in ${tbb_config_dir}")
......@@ -26,7 +26,7 @@
#define TBB_VERSION_MINOR 0
// Engineering-focused interface version
#define TBB_INTERFACE_VERSION 9107
#define TBB_INTERFACE_VERSION 9108
#define TBB_INTERFACE_VERSION_MAJOR TBB_INTERFACE_VERSION/1000
// The oldest major interface version still supported
......
......@@ -18,7 +18,6 @@
#
#
# System imports
from __future__ import print_function
import platform
......
......@@ -18,7 +18,6 @@
#
#
# Based on the software developed by:
# Copyright (c) 2008,2016 david decotigny (Pool of threads)
# Copyright (c) 2006-2008, R Oudkerk (multiprocessing.Pool)
......
......@@ -140,7 +140,8 @@ market& market::global_market ( bool is_public, unsigned workers_requested, size
// The requested number of threads is intentionally not considered in
// computation of the hard limit, in order to separate responsibilities
// and avoid complicated interactions between global_control and task_scheduler_init.
const unsigned workers_hard_limit = max(factor*governor::default_num_threads(), app_parallelism_limit());
// The market guarantees that at least 256 threads can be created.
const unsigned workers_hard_limit = max(max(factor*governor::default_num_threads(), 256u), app_parallelism_limit());
const unsigned workers_soft_limit = calc_workers_soft_limit(workers_requested, workers_hard_limit);
// Create the global market instance
size_t size = sizeof(market);
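The effect of this change is easiest to see in isolation. Below is a minimal sketch of the revised computation, with plain parameters standing in for `governor::default_num_threads()`, `app_parallelism_limit()`, and the internal factor:

```cpp
#include <algorithm>

// Sketch only: the hard limit is now clamped from below at 256, so at
// least 256 workers may be created even when the machine's default
// concurrency is small; the app-set limit can still raise it further.
unsigned workers_hard_limit(unsigned default_num_threads,
                            unsigned app_parallelism_limit,
                            unsigned factor) {
    return std::max(std::max(factor * default_num_threads, 256u),
                    app_parallelism_limit);
}
```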
......@@ -695,9 +696,9 @@ void market::process( job& j ) {
// Workers leave market because there is no arena in need. It can happen earlier than
// adjust_job_count_estimate() decreases my_slack and RML can put this thread to sleep.
// It might result in a busy-loop checking for my_slack<0 and calling this method instantly.
// first_interval>0 and the pause refines this spinning.
// first_interval>0 and the yield refines this spinning.
if( i > 0 )
prolonged_pause();
__TBB_Yield();
else
#if !__TBB_SLEEP_PERMISSION
break;
......
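As a rough model of the waiting loop after this change, using `std::this_thread::yield()` as a portable stand-in for `__TBB_Yield()` and a hypothetical readiness flag in place of the `my_slack` check:

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> slack_available{false}; // hypothetical stand-in for my_slack<0 checks

void wait_between_intervals(int first_interval) {
    for (int i = first_interval; ; --i) {
        if (slack_available.load(std::memory_order_relaxed))
            return;                     // work appeared, stop waiting
        if (i > 0)
            std::this_thread::yield();  // yielding refines the spin
        else
            break;                      // intervals exhausted: leave or sleep
    }
}
```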
......@@ -81,6 +81,15 @@ bool RecursiveMallocCallProtector::noRecursion() {
#endif // MALLOC_CHECK_RECURSION
/** Support for handling the special UNUSABLE pointer state **/
const intptr_t UNUSABLE = 0x1;
inline bool isSolidPtr( void* ptr ) {
return (UNUSABLE|(intptr_t)ptr)!=UNUSABLE;
}
inline bool isNotForUse( void* ptr ) {
return (intptr_t)ptr==UNUSABLE;
}
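Together these predicates partition the three possible states of `publicFreeList`: NULL (no public frees yet), the UNUSABLE marker, and a real object pointer. A self-contained sanity check of that partition, restating the definitions above:

```cpp
#include <cassert>
#include <cstdint>

int main() {
    const intptr_t UNUSABLE = 0x1;
    auto isSolidPtr  = [&](void* p) { return (UNUSABLE | (intptr_t)p) != UNUSABLE; };
    auto isNotForUse = [&](void* p) { return (intptr_t)p == UNUSABLE; };

    int object;
    void* valid  = &object;          // real pointer: solid, usable
    void* empty  = nullptr;          // empty list: neither solid nor UNUSABLE
    void* marker = (void*)UNUSABLE;  // the special state: not for use

    assert(  isSolidPtr(valid)  && !isNotForUse(valid)  );
    assert( !isSolidPtr(empty)  && !isNotForUse(empty)  );
    assert( !isSolidPtr(marker) &&  isNotForUse(marker) );
    return 0;
}
```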
/*
* Block::objectSize value used to mark blocks allocated by startupAlloc
*/
......@@ -346,14 +355,14 @@ protected:
class Block : public LocalBlockFields,
Padding<2*blockHeaderAlignment - sizeof(LocalBlockFields)> {
public:
bool empty() const { return allocatedCount==0 && publicFreeList==NULL; }
bool empty() const { return allocatedCount==0 && !isSolidPtr(publicFreeList); }
inline FreeObject* allocate();
inline FreeObject *allocateFromFreeList();
inline bool emptyEnoughToUse();
bool freeListNonNull() { return freeList; }
void freePublicObject(FreeObject *objectToFree);
inline void freeOwnObject(void *object);
void makeEmpty();
void reset();
void privatizePublicFreeList( bool cleanup = false );
void restoreBumpPtr();
void privatizeOrphaned(TLSData *tls, unsigned index);
......@@ -390,7 +399,7 @@ public:
// expected after double free
MALLOC_ASSERT(toFree != freeList, msg);
// check against head of publicFreeList, to detect double free
// involiving foreign thread
// involving foreign thread
MALLOC_ASSERT(toFree != publicFreeList, msg);
}
#else
......@@ -887,16 +896,6 @@ void BootStrapBlocks::reset()
static MallocMutex publicFreeListLock; // lock for changes of publicFreeList
#endif
const uintptr_t UNUSABLE = 0x1;
inline bool isSolidPtr( void* ptr )
{
return (UNUSABLE|(uintptr_t)ptr)!=UNUSABLE;
}
inline bool isNotForUse( void* ptr )
{
return (uintptr_t)ptr==UNUSABLE;
}
/********* End rough utility code **************/
/* LifoList assumes zero initialization so a vector of it can be created
......@@ -1034,7 +1033,7 @@ Block *MemoryPool::getEmptyBlock(size_t size)
void MemoryPool::returnEmptyBlock(Block *block, bool poolTheBlock)
{
block->makeEmpty();
block->reset();
if (poolTheBlock) {
extMemPool.tlsPointerKey.getThreadMallocTLS()->freeSlabBlocks.returnBlock(block);
}
......@@ -1310,7 +1309,7 @@ void Block::freeOwnObject(void *object)
else
STAT_increment(getThreadId(), getIndex(objectSize), freeToActiveBlock);
#endif
if (allocatedCount==0 && publicFreeList==NULL) {
if (empty()) {
// The bump pointer is about to be restored for the block,
// no need to find objectToFree here (this is costly).
......@@ -1373,6 +1372,7 @@ void Block::freePublicObject (FreeObject *objectToFree)
void Block::privatizePublicFreeList( bool cleanup )
{
FreeObject *temp, *localPublicFreeList;
const intptr_t endMarker = cleanup? UNUSABLE : 0;
// During cleanup of orphaned blocks, the calling thread is not registered as the owner
MALLOC_ASSERT( cleanup || isOwnedByCurrentThread(), ASSERT_TEXT );
......@@ -1380,9 +1380,8 @@ void Block::privatizePublicFreeList( bool cleanup )
temp = publicFreeList;
do {
localPublicFreeList = temp;
temp = (FreeObject*)AtomicCompareExchange(
(intptr_t&)publicFreeList,
0, (intptr_t)localPublicFreeList);
temp = (FreeObject*)AtomicCompareExchange( (intptr_t&)publicFreeList,
endMarker, (intptr_t)localPublicFreeList);
// no backoff necessary because trying to make change, not waiting for a change
} while( temp != localPublicFreeList );
#else
......@@ -1390,16 +1389,16 @@ void Block::privatizePublicFreeList( bool cleanup )
{
MallocMutex::scoped_lock scoped_cs(publicFreeListLock);
localPublicFreeList = publicFreeList;
publicFreeList = NULL;
publicFreeList = endMarker;
}
temp = localPublicFreeList;
#endif
MALLOC_ITT_SYNC_ACQUIRED(&publicFreeList);
// there should be something in publicFreeList, unless called by cleanup of orphaned blocks
MALLOC_ASSERT( cleanup || localPublicFreeList, ASSERT_TEXT );
// publicFreeList must have been UNUSABLE (possible for orphaned blocks) or valid, but not NULL
MALLOC_ASSERT( localPublicFreeList!=NULL, ASSERT_TEXT );
MALLOC_ASSERT( localPublicFreeList==temp, ASSERT_TEXT );
if( isSolidPtr(temp) ) { // return/getPartialBlock could set it to UNUSABLE
if( isSolidPtr(temp) ) {
MALLOC_ASSERT( allocatedCount <= (slabSize-sizeof(Block))/objectSize, ASSERT_TEXT );
/* other threads did not change the counter freeing our blocks */
allocatedCount--;
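The core of the privatization above is a single atomic swap of the list head for an end marker: NULL on the normal path, UNUSABLE during cleanup of orphaned blocks. A simplified model with `std::atomic`, where one `exchange` plays the role of the `AtomicCompareExchange` loop:

```cpp
#include <atomic>
#include <cstdint>

struct FreeObject { FreeObject* next; };
const intptr_t UNUSABLE = 0x1;

// Detach the whole public free list in one shot, leaving the head in the
// requested end state; the caller checks isSolidPtr() on the result before
// walking the now-private list.
FreeObject* privatize(std::atomic<intptr_t>& publicFreeList, bool cleanup) {
    const intptr_t endMarker = cleanup ? UNUSABLE : 0;
    return (FreeObject*)publicFreeList.exchange(endMarker);
}
```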
......@@ -1447,13 +1446,13 @@ void Block::shareOrphaned(intptr_t binTag, unsigned index)
if ((intptr_t)nextPrivatizable==binTag) {
void* oldval;
#if FREELIST_NONBLOCKING
oldval = (void*)AtomicCompareExchange((intptr_t&)publicFreeList, (intptr_t)UNUSABLE, 0);
oldval = (void*)AtomicCompareExchange((intptr_t&)publicFreeList, UNUSABLE, 0);
#else
STAT_increment(getThreadId(), ThreadCommonCounters, lockPublicFreeList);
{
MallocMutex::scoped_lock scoped_cs(publicFreeListLock);
if ( (oldval=publicFreeList)==NULL )
(uintptr_t&)(publicFreeList) = UNUSABLE;
(intptr_t&)(publicFreeList) = UNUSABLE;
}
#endif
if ( oldval!=NULL ) {
......@@ -1478,7 +1477,7 @@ void Block::shareOrphaned(intptr_t binTag, unsigned index)
// it is caller responsibility to ensure that the list of blocks
// formed by nextPrivatizable pointers is kept consistent if required.
// if only called from thread shutdown code, it does not matter.
(uintptr_t&)(nextPrivatizable) = UNUSABLE;
(intptr_t&)(nextPrivatizable) = UNUSABLE;
}
void Block::cleanBlockHeader()
......@@ -1549,7 +1548,7 @@ bool OrphanedBlocks::cleanup(Backend* backend)
Block* next = block->next;
block->privatizePublicFreeList( /*cleanup=*/true );
if (block->empty()) {
block->makeEmpty();
block->reset();
// slab blocks in user's pools do not have valid backRefIdx
if (!backend->inUserPool())
removeBackRef(*(block->getBackRefIdx()));
......@@ -1626,12 +1625,12 @@ bool FreeBlockPool::externalCleanup()
return nonEmpty;
}
/* We have a block give it back to the malloc block manager */
void Block::makeEmpty()
/* Prepare the block for returning to FreeBlockPool */
void Block::reset()
{
// it is caller's responsibility to ensure no data is lost before calling this
MALLOC_ASSERT( allocatedCount==0, ASSERT_TEXT );
MALLOC_ASSERT( publicFreeList==NULL, ASSERT_TEXT );
MALLOC_ASSERT( !isSolidPtr(publicFreeList), ASSERT_TEXT );
if (!isStartupAllocObject())
STAT_increment(getThreadId(), getIndex(objectSize), freeBlockBack);
......
......@@ -490,9 +490,9 @@ private:
};
void TestArenaEntryConsistency() {
REMARK("test arena entry consistency\n" );
REMARK("test arena entry consistency\n");
tbb::task_arena a(2,1);
tbb::task_arena a(2, 1);
tbb::atomic<int> c;
ForEachArenaEntryBody body(a, c);
......@@ -500,7 +500,7 @@ void TestArenaEntryConsistency() {
a.initialize(); // capture FP settings to arena
fp_scope.setNextFPMode();
for(int i = 0; i < 100; i++) // not less than 32 = 2^5 of entry types
for (int i = 0; i < 100; i++) // not less than 32 = 2^5 of entry types
body.test(i);
}
......@@ -545,8 +545,8 @@ public:
}
};
void TestArenaConcurrency( int p ) {
for ( int reserved = 0; reserved <= p; ++reserved ) {
void TestArenaConcurrency( int p, int reserved = 0, int step = 1) {
for (; reserved <= p; reserved += step) {
REMARK("TestArenaConcurrency: %d slots, %d reserved\n", p, reserved);
tbb::task_arena a( p, reserved );
{ // Check concurrency with worker & reserved master threads.
......@@ -1139,21 +1139,73 @@ void TestSmallStackSize() {
}
}
//--------------------------------------------------//
void TestConcurrentFunctionality(int min_thread_num = MinThread, int max_thread_num = MaxThread) {
InitializeAndTerminate(max_thread_num);
for (int p = min_thread_num; p <= max_thread_num; ++p) {
REMARK("testing with %d threads\n", p);
TestConcurrentArenas(p);
TestMultipleMasters(p);
TestArenaConcurrency(p);
}
}
//--------------------------------------------------//
struct DefaultCreatedWorkersAmountBody {
int my_threadnum;
DefaultCreatedWorkersAmountBody(int threadnum) : my_threadnum(threadnum) {}
void operator()(int) const {
ASSERT(my_threadnum == tbb::this_task_arena::max_concurrency(), "concurrency level is not equal to the specified threadnum");
ASSERT(tbb::this_task_arena::current_thread_index() < tbb::this_task_arena::max_concurrency(), "the number of created threads exceeds the specified default");
local_id.local() = 1;
Harness::Sleep(1);
}
};
struct NativeParallelForBody {
int my_thread_num;
int iterations;
NativeParallelForBody(int thread_num, int multiplier = 100) : my_thread_num(thread_num), iterations(multiplier * thread_num) {}
void operator()(int idx) const {
ASSERT(idx == 0, "more than 1 thread is going to reset TLS");
ResetTLS();
tbb::parallel_for(0, iterations, DefaultCreatedWorkersAmountBody(my_thread_num), tbb::simple_partitioner());
ASSERT(local_id.size() == size_t(my_thread_num), "the number of created threads is not equal to the default");
}
};
void TestDefaultCreatedWorkersAmount() {
NativeParallelFor(1, NativeParallelForBody(tbb::task_scheduler_init::default_num_threads()));
}
void TestAbilityToCreateWorkers(int thread_num) {
tbb::task_scheduler_init init_market_with_necessary_amount_plus_one(thread_num);
// Checks only a subset of the possible reserved-master thread counts:
// 0 and 1 reserved threads are the important cases, but other counts are
// also sampled to collect statistics without consuming the whole test
// session checking every value
TestArenaConcurrency(thread_num - 1, 0, int(thread_num / 2.72));
TestArenaConcurrency(thread_num, 1, int(thread_num / 3.14));
}
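To see which reserved counts the stepped loops actually sample, here is a hypothetical driver for `thread_num == 256` that prints the visited values only:

```cpp
#include <cstdio>

int main() {
    // Mirrors TestAbilityToCreateWorkers(256): the first call samples
    // reserved = 0, 94, 188; the second samples reserved = 1, 82, 163, 244.
    const int thread_num = 256;
    int p1 = thread_num - 1, step1 = int(thread_num / 2.72); // 94
    for (int r = 0; r <= p1; r += step1) std::printf("p=%d reserved=%d\n", p1, r);
    int p2 = thread_num, step2 = int(thread_num / 3.14);     // 81
    for (int r = 1; r <= p2; r += step2) std::printf("p=%d reserved=%d\n", p2, r);
    return 0;
}
```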
void TestDefaultWorkersLimit() {
TestDefaultCreatedWorkersAmount();
// Shared RML might limit the number of workers even if explicit limits are
// specified, because default_concurrency==max_concurrency for shared RML
#ifndef RML_USE_WCRM
TestAbilityToCreateWorkers(256);
#endif
}
//--------------------------------------------------//
int TestMain () {
#if __TBB_TASK_ISOLATION
TestIsolatedExecute();
#endif /* __TBB_TASK_ISOLATION */
TestSmallStackSize();
TestDefaultWorkersLimit();
// The test uses up to MaxThread workers (in arenas with no master thread),
// so the runtime should be initialized appropriately.
tbb::task_scheduler_init init_market_p_plus_one(MaxThread+1);
InitializeAndTerminate(MaxThread);
for( int p=MinThread; p<=MaxThread; ++p ) {
REMARK("testing with %d threads\n", p );
TestConcurrentArenas( p );
TestMultipleMasters( p );
TestArenaConcurrency( p );
}
tbb::task_scheduler_init init_market_p_plus_one(MaxThread + 1);
TestConcurrentFunctionality();
TestArenaEntryConsistency();
TestAttach(MaxThread);
TestConstantFunctorRequirement();
......
......@@ -240,7 +240,7 @@ int main(int argc, char *argv[] ) {
void initialize_strings_vector(std::vector <string_pair>* vector)
{
vector->push_back(string_pair("TBB: VERSION\t\t2017.0", required)); // check TBB_VERSION
vector->push_back(string_pair("TBB: INTERFACE VERSION\t9107", required)); // check TBB_INTERFACE_VERSION
vector->push_back(string_pair("TBB: INTERFACE VERSION\t9108", required)); // check TBB_INTERFACE_VERSION
vector->push_back(string_pair("TBB: BUILD_DATE", required));
vector->push_back(string_pair("TBB: BUILD_HOST", required));
vector->push_back(string_pair("TBB: BUILD_OS", required));
......