Jedi Survivor was recently released and has suffered from performance problems. EA has responded by saying that some of these issues are due to players “using cutting-edge, multi-threaded chipsets designed for Windows 11 were encountering problems on Windows 10” - presumably referring to the heterogeneous cores of Intel’s Alder Lake and Raptor Lake, used in the 12th and 13th gen Core processors. While Windows 11 does handle them better when it comes to scheduling between P and E cores, 74% of PC players are using Windows 10 at the time of writing according to Steam’s Hardware Survey. Is it acceptable to pass the buck and blame the users (who are paying you $70 of their hard-earned money) for their OS choices, or can something be done about this from the developer’s side?
Yes. Yes it can, and it can all be done in a lunch hour. Let’s start by grabbing Intel’s Hybrid Detect code, which includes a header-only library that wraps the APIs needed to detect P and E cores. We’ll use the GetLogicalProcessors() function to detect cores, and build a list of CPU Set IDs for P and E cores:
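bool CuttingEdgeChipset = false; // set once at least one P core is found
ULONG PCores[64] = {};           // CPU Set IDs of P cores
int cntPCore = 0;
ULONG ECores[64] = {};           // CPU Set IDs of E cores
int cntECore = 0;
PROCESSOR_INFO procInfo = {};
GetLogicalProcessors(procInfo);
for(size_t i = 0; i < procInfo.cores.size(); ++i)
{
    auto& c = procInfo.cores[i];
    // On Alder Lake and Raptor Lake, efficiency class 1 marks a P core and 0 an E core.
    // On non-hybrid CPUs every core reports class 0, so the flag stays false.
    if(c.efficiencyClass == 1)
    {
        CuttingEdgeChipset = true;
        // Bounds check so a machine with more than 64 P cores cannot overrun the array
        if(cntPCore < 64)
        {
            PCores[cntPCore] = c.id;
            ++cntPCore;
        }
    }
    else if(cntECore < 64)
    {
        ECores[cntECore] = c.id;
        ++cntECore;
    }
}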
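The comparison against efficiency class 1 matches Intel’s current hybrid parts. A slightly more general split is sketched below. It assumes, as Microsoft documents for the efficiency class field, that a higher value means a faster but less efficient core, and it treats the highest class present as the P cores. CoreInfo, CoreSplit and SplitByEfficiencyClass are hypothetical names for illustration and are not part of Intel’s library. The sketch is plain C++, so it can be tested without a hybrid machine:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-core record produced by the detection step.
struct CoreInfo
{
    uint32_t id;              // CPU Set ID
    uint8_t  efficiencyClass; // assumption: higher value = faster, less efficient core
};

struct CoreSplit
{
    bool hybrid = false;          // true when more than one efficiency class exists
    std::vector<uint32_t> pCores; // IDs of cores in the highest class
    std::vector<uint32_t> eCores; // IDs of all other cores
};

CoreSplit SplitByEfficiencyClass(const std::vector<CoreInfo>& cores)
{
    CoreSplit out;
    if (cores.empty())
        return out;

    // Find the lowest and highest efficiency class present on this machine.
    uint8_t lo = cores[0].efficiencyClass;
    uint8_t hi = lo;
    for (const CoreInfo& c : cores)
    {
        lo = std::min(lo, c.efficiencyClass);
        hi = std::max(hi, c.efficiencyClass);
    }
    out.hybrid = (lo != hi);

    // Cores in the highest class become P cores, everything else E cores.
    // On a non-hybrid CPU this puts every core in pCores and leaves hybrid false.
    for (const CoreInfo& c : cores)
    {
        if (c.efficiencyClass == hi)
            out.pCores.push_back(c.id);
        else
            out.eCores.push_back(c.id);
    }
    assert(out.pCores.size() + out.eCores.size() == cores.size());
    return out;
}
```

Using std::vector removes the fixed 64-entry limit, and the data() and size() of each vector can be passed straight to SetThreadSelectedCpuSets() once the IDs are stored as ULONG.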
Next, let’s create a couple of helper functions to assign a thread to P or E cores:
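// Restrict a thread to the P-core CPU sets collected above
auto AssignToPCore = [&](HANDLE hThread)
{
    BOOL r = SetThreadSelectedCpuSets(hThread, PCores, cntPCore);
    assert(r);
    (void)r; // avoids an unused-variable warning when assert is compiled out
};
// Restrict a thread to the E-core CPU sets collected above
auto AssignToECore = [&](HANDLE hThread)
{
    BOOL r = SetThreadSelectedCpuSets(hThread, ECores, cntECore);
    assert(r);
    (void)r;
};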
Now let’s figure out which threads should go where. Looking at a trace from Jedi Survivor, some thread names look promising, notably TaskGraphThreadHP and TaskGraphThreadBP.
A search suggests that these are probably High and Background priority threads, going by Unreal Engine’s public documentation.
The plan of action is simple: iterate over all threads in the process, and based on the thread description assign it to P or E cores using the functions above. This has a lot of scaffolding code, but the core is a single string check (the wchar_t version of strstr() basically):
if(CuttingEdgeChipset)
{
    const DWORD pid = GetCurrentProcessId();
    // With TH32CS_SNAPTHREAD the pid argument is ignored and the snapshot covers
    // every thread in the system, so entries are filtered by owner process below.
    HANDLE h = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, pid);
    if (h != INVALID_HANDLE_VALUE)
    {
        THREADENTRY32 te;
        te.dwSize = sizeof(te);
        if (Thread32First(h, &te))
        {
            do
            {
                // Only read th32OwnerProcessID if the entry is large enough to contain it,
                // and skip threads that belong to other processes.
                if (te.dwSize >= FIELD_OFFSET(THREADENTRY32, th32OwnerProcessID) + sizeof(te.th32OwnerProcessID)
                    && te.th32OwnerProcessID == pid)
                {
                    // Request only the rights needed to read the description and set CPU sets
                    HANDLE thread = ::OpenThread(THREAD_QUERY_LIMITED_INFORMATION | THREAD_SET_LIMITED_INFORMATION, FALSE, te.th32ThreadID);
                    if(thread != NULL)
                    {
                        wchar_t* desc = 0;
                        HRESULT hr = GetThreadDescription(thread, &desc);
                        if(SUCCEEDED(hr))
                        {
                            //This is where we detect the thread and assign it to P or E cores
                            if(wcsstr(desc, L"TaskGraphThreadHP"))
                            {
                                AssignToPCore(thread);
                            }
                            else if(wcsstr(desc, L"TaskGraphThreadBP"))
                            {
                                AssignToECore(thread);
                            }
                            LocalFree(desc);
                        }
                        CloseHandle(thread);
                    }
                }
                te.dwSize = sizeof(te);
            } while (Thread32Next(h, &te));
        }
        CloseHandle(h);
    }
}
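The string check at the heart of that loop can also be pulled out into a small table-driven function, which makes it easy to add more thread names later and to test the matching without touching any Windows API. This is only a sketch: CoreClass, NameRule, kRules and ClassifyThread are hypothetical names, and the two rules are simply the ones used in this post.

```cpp
#include <cassert>
#include <cwchar>

// Where a thread should run; Unassigned leaves scheduling to the OS.
enum class CoreClass { Unassigned, Performance, Efficiency };

// One substring rule: if the thread description contains needle, use target.
struct NameRule
{
    const wchar_t* needle;
    CoreClass      target;
};

// The two rules from this post. More thread names can be appended here.
static const NameRule kRules[] = {
    { L"TaskGraphThreadHP", CoreClass::Performance }, // high priority task graph workers
    { L"TaskGraphThreadBP", CoreClass::Efficiency  }, // background priority task graph workers
};

// Returns the core class of the first rule whose needle occurs in desc.
CoreClass ClassifyThread(const wchar_t* desc)
{
    if (desc == nullptr)
        return CoreClass::Unassigned;

    for (const NameRule& rule : kRules)
    {
        assert(rule.needle != nullptr); // every table entry must carry a pattern
        if (std::wcsstr(desc, rule.needle) != nullptr)
            return rule.target;
    }
    return CoreClass::Unassigned;
}
```

Inside the thread loop, the two wcsstr() checks would then become a switch on ClassifyThread(desc) that calls AssignToPCore(thread) or AssignToECore(thread) and does nothing for Unassigned.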
And we’re done! Writing this post took me a little under an hour, which is probably less time than it took EA’s PR department to draft their message passing the buck and blaming the user. I’m not sure whether the executives at EA know how much the lack of an hour of good systems engineering has cost them on their balance sheet, but here’s hoping they consider hiring people with a systems programming background in their studios.