Many Windows PC users, let alone developers, are familiar with the problems that arise when working with long file or directory paths (more than 260 characters, MAX_PATH).
This article covers ways to get rid of this relic when developing applications on various platforms (WinApi, .Net Framework, .Net Core) and how to enable native long-path support in Windows 10 (Anniversary Update).
Win API applications
In applications that use the Win API for file operations, the recipe for escaping the MAX_PATH limit has been known since time immemorial: use the Unicode version of the function (the one ending in "W") for the directory or file operation and start the path with the \\?\ prefix. This made it possible to use paths up to 32767 characters long.
In Windows 10 (1607) the behavior of the file functions changed: it became possible to disable the MAX_PATH checks system-wide.
This affected the following functions:
For working with directories: CreateDirectoryW, CreateDirectoryExW, GetCurrentDirectoryW, RemoveDirectoryW, SetCurrentDirectoryW.
For working with files: CopyFileW, CopyFile2, CopyFileExW, CreateFileW, CreateFile2, CreateHardLinkW, CreateSymbolicLinkW, DeleteFileW, FindFirstFileW, FindFirstFileExW, FindNextFileW, GetFileAttributesW, GetFileAttributesExW, SetFileAttributesW, GetFullPathNameW, GetLongPathNameW, MoveFileW, MoveFileExW, MoveFileWithProgressW, ReplaceFileW, SearchPathW, FindFirstFileNameW, FindNextFileNameW, FindFirstStreamW, FindNextStreamW, GetCompressedFileSizeW, GetFinalPathNameByHandleW.
This removes the need for the \\?\ prefix and potentially gives applications that work with the Win API, directly or indirectly, a chance to get long-path support without being rebuilt. How to enable this feature is described at the end of the article.
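The classic prefix approach described above can be sketched in C as follows. This is only an illustration: the helper names with_long_prefix and create_long_dir are hypothetical, the input is assumed to be an absolute drive-letter path that already uses backslashes (the prefix turns off path normalization), and UNC paths, which need the \\?\UNC\ form, are not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: return a newly allocated copy of abs_path with the
   extended-length prefix \\?\ prepended. Returns NULL on allocation
   failure; the caller releases the result with free(). */
wchar_t *with_long_prefix(const wchar_t *abs_path)
{
    static const wchar_t prefix[] = L"\\\\?\\";
    assert(abs_path != NULL);
    size_t plen = wcslen(prefix);
    size_t len = wcslen(abs_path);
    wchar_t *out = malloc((plen + len + 1) * sizeof(wchar_t));
    if (!out)
        return NULL;
    wmemcpy(out, prefix, plen);
    wmemcpy(out + plen, abs_path, len + 1); /* copies the terminator too */
    return out;
}

#ifdef _WIN32
/* Hypothetical helper: create a directory whose full path may exceed
   MAX_PATH by calling the Unicode function with a prefixed path.
   Returns 1 on success, 0 on failure. */
int create_long_dir(const wchar_t *abs_path)
{
    wchar_t *p = with_long_prefix(abs_path);
    if (!p)
        return 0;
    BOOL ok = CreateDirectoryW(p, NULL);
    free(p);
    return ok ? 1 : 0;
}
#endif
```

With the system-wide setting enabled and a long-path-aware manifest, the same CreateDirectoryW call should accept the unprefixed path.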
.Net Framework
Although .Net Framework uses the Win API for file operations, the previous change alone would not have helped: the BCL code contained preliminary checks on the length of directory and file names, so execution never even reached the Win API calls and the well-known exception was thrown instead. After numerous community requests (more than 4,500 on UserVoice), the path-length checks were removed from the BCL code in version 4.6.2, leaving this to the operating system and the file system.
Here is what this gives us:
- With the “\\?\” prefix we can work with long paths just as in the Win API:
Directory.CreateDirectory("\\\\?\\" + long_dir_name);
- If native long-path support is enabled in Windows 10 (1607), the prefix is not even needed.
How to enable:
- Target .Net Framework 4.6.2 when building the application.
- Or use a configuration file, for example if the application has already been built for .Net 4.0:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
  </startup>
  <runtime>
    <AppContextSwitchOverrides value="Switch.System.IO.UseLegacyPathHandling=false;Switch.System.IO.BlockLongPaths=false" />
  </runtime>
</configuration>
.Net Core
Here long-path support was announced back in November 2015. Apparently the open-source nature of the project and the absence of a strict backward-compatibility requirement played a role.
How to enable:
Everything works out of the box. Unlike the .Net Framework implementation, there is no need to add the “\\?\” prefix: it is added automatically when needed.
An example can be found here.
How to enable long-path support in Windows 10 (1607)
This feature is disabled by default, because it is experimental and various subsystems and applications still need to be reworked for full support.
Built-in long-path support can be enabled by creating or changing the following registry value:
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem
Value: LongPathsEnabled (type: REG_DWORD); 1 means enabled.
Or through Group Policy (Win+R, gpedit.msc): Computer Configuration > Administrative Templates > System > Filesystem > Enable NTFS long paths. In the Russian localization the same policy is named «Включить длинные пути Win32» ("Enable Win32 long paths").
Beyond this point the sources disagree about the manifest (or I misunderstood them, but I cannot verify this at the moment). For example, the MSDN documentation says that the manifest can be used as an alternative way to enable long-path support for individual applications, while the MSDN blog states that it is a second mandatory step after enabling the policy.
They do agree, however, on the format of this option:
<application xmlns="urn:schemas-microsoft-com:asm.v3">
  <windowsSettings xmlns:ws2="http://schemas.microsoft.com/SMI/2016/WindowsSettings">
    <ws2:longPathAware>true</ws2:longPathAware>
  </windowsSettings>
</application>
Unfortunately, this does not currently work with CMD because of the way it handles paths, whereas in PowerShell everything should work.
P.S.
This concludes my short Friday post. It leaves aside the questions of how complete long-path support is in Windows 10 (1607) and how well it works with various combinations of Windows editions, file systems, and APIs. The post will be updated as new facts and experimental results come in.
Thank you for your attention!
Most Windows administrators and users have, one way or another, run into the “path too long” error when working with files. The error occurs when the full path to a file (including its name) exceeds 260 characters. Many applications, including Windows Explorer, handle such long file names incorrectly and refuse to open, move, or delete them. This is a limitation not of the NTFS file system but of the Win32 API library (the problem and its workarounds are described in more detail here).
Windows 10 introduced the ability to disable the limit on the maximum path length.
The MAX_PATH limit can be disabled in two ways: with the Group Policy editor or through the registry. Let us look at both:
- Start the Local Group Policy Editor console by pressing Win+R and running gpedit.msc
- Go to Local Computer Policy -> Computer Configuration -> Administrative Templates -> System -> Filesystem -> NTFS
- Open the policy Enable NTFS long paths
- Turn the policy on by setting it to Enabled
- Save the changes
On the Home edition of Windows 10, which has no GPO editor, the same change can be made with the registry editor.
- Start the registry editor regedit.exe
- Go to the key HKLM\SYSTEM\CurrentControlSet\Control\FileSystem
- In this key, create a new value of type Dword (32-bit) Value named LongPathsEnabled
- To disable the MAX_PATH limit, set the value to 1
You can also enable this feature with a single PowerShell command:
Set-ItemProperty -Path HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem -Name LongPathsEnabled -Value 1
In both cases a reboot is required for the change to take effect. After the reboot, users and programs will be able to work without restriction with files whose path is longer than 260 characters. Only the NTFS file system limit of 32767 characters will then apply.
This functionality is available to all Windows 10 users starting with the Anniversary Update (1607), and in Windows Server 2016.
When testing development versions of Rtools for Windows, I’ve run into
strange failures of several CRAN packages where R could not find, read from
or write to some files. The files should have been in temporary directories
which get automatically deleted, so it took some effort to find out that
actually they existed and were accessible. That didn’t make any sense at
first, but eventually I got to this output:
Warning in gzfile(file, "wb") :
cannot open compressed file 'C:\msys64\home\tomas\ucrt3\svn\ucrt3\r_packages\pkgcheck\CRAN\ADAPTS\tmp\RtmpKWYapj/gList.Mast.cells_T.cells.follicular.helper_T.cells.CD4.memory.activated_T.cells.CD4.memory.resting_T.cells.CD4.naive_T.cells.C_Plasma.cells_B.cells.memory_B.cells.naive.RData.RData', probable reason 'No such file or directory'
Error in gzfile(file, "wb") : cannot open the connection
Calls: remakeLM22p -> save -> gzfile
Execution halted
Such a long file name. The entire path in the warning message takes 265
bytes. Perhaps it is too long and, for some reason it can be created but not
read in a particular way?
To confirm the theory, I’ve created a mapped drive to get rid of the
/msys64/home/tomas/ucrt3/svn/ucrt3/r_packages/pkgcheck prefix of the path.
This package and several others started to pass their checks. Interestingly,
a junction didn’t work that well, because path normalization followed it in
some cases, getting again the long paths. R has been improved since and is
more likely to provide a hint (warning or error) that the path is too long,
so diagnosing the problem is often easier than this, yet the message may
also be too pessimistic (more below).
This text provides some background on path-length limits and offers
recommendations for what to do about them. It reports on recent
improvements in R, which allow R and packages to work with longer paths on
recent Windows 10, where and when the system limit can be overridden.
Following the changes in R, some R packages will have to be updated as
well to work with long paths. Primarily, authors of packages using PATH_MAX
or even MAX_PATH in their code are advised to continue reading.
The changes in R make the updating of packages possible (they can be
tested), but also more important (they could crash when seeing long paths).
It is therefore not yet advised to enable long paths on production systems:
the feature needs to be considered experimental with the R ecosystem.
Background
On Windows, there used to be a limit on the length of the entire path name
imposed by the operating system. It is derived from the constant MAX_PATH
(260, not likely to change) and limits the number of UCS-2 (16-bit, so only
BMP characters) words including the terminator. Depending on the API, it
may in addition be applied directly to the number of bytes accepted as path
names in ANSI functions, e.g. 259 UTF-8 bytes plus a 1-byte terminator.
But it may also be applied only after conversion to UCS-2, and then 259 BMP
characters with a 2-byte terminator may correspond to up to 3*259 UTF-8
bytes with a 1-byte terminator.
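To make the difference between the two units concrete, here is a small sketch that counts how many 16-bit words a UTF-8 string would occupy after conversion. The function name utf16_units is made up for this illustration and the input is assumed to be valid UTF-8; characters outside the BMP (4-byte UTF-8 sequences) are counted as two words, as they become surrogate pairs in UTF-16.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of 16-bit words needed for a valid UTF-8
   string after conversion to UTF-16, excluding the terminator. */
size_t utf16_units(const char *utf8)
{
    size_t units = 0;
    assert(utf8 != NULL);
    for (const unsigned char *p = (const unsigned char *)utf8; *p; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                  /* continuation byte, already counted */
        units += (*p >= 0xF0) ? 2 : 1; /* 4-byte sequence: surrogate pair */
    }
    return units;
}
```

A path made of 259 three-byte BMP characters thus takes 777 bytes in UTF-8 but only 259 words after conversion, which is why a byte-based check and a word-based check can disagree about the same path.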
However, for quite some time, much longer path names can exist on Windows.
The file system normally used (NTFS) allows that. Windows API started
supporting the so-called extended-length path syntax (\\?\D:\long_path) in
some functions, which made it possible to overcome the limit, even though
anecdotally it is not used much. In addition, where it seemed safe with
respect to the applications, Windows API started accepting much longer path
names even with the regular syntax, primarily in Unicode variants of the
functions.
Hence, while some Windows applications are written assuming that no path can
be longer than MAX_PATH, such paths may and do exist in practice. How
come that the old applications making that assumption still seem to be
(mostly) working?
The trick is that Windows hides long paths from old applications in APIs
where it is believed they could cause trouble, which typically means APIs
where the path is being returned to the application. The idea is that long
paths are rare, anyway, and users would be unlikely to try using them,
especially with old applications.
Once an application is updated to work with long paths, it can opt in to see
them by declaring it is long-path-aware in its manifest (so in the .EXE
file, at build time). In addition, this needs to be allowed system-wide.
It is supported since somewhat recent Windows 10 and is not enabled by
default.
The current path length limit imposed by Windows is approximately 32,767
UCS-2 words. An exact single limit does not exist (the documentation says
it is approximate and depends on internal expansions), and that is in
addition to the mentioned uncertainty due to encoding and ANSI vs Unicode
functions described before.
R uses MinGW-W64 on Windows, which defines PATH_MAX to the same value as
MAX_PATH, so 260, to help compiling code written originally for POSIX
systems. The macros have a similar meaning, but the details are different.
Readers interested in the exact wording in POSIX are advised to check the
specification. I didn’t try to find out whether that was the correct
interpretation in the past, but today PATH_MAX is not a limit for the
entire path length that may exist in the system. When PATH_MAX is
defined, it is the maximum number of bytes that will be stored to a user
buffer by functions that do not allow specifying the buffer size. Such
calls are rare today (R uses realpath, for instance) and PATH_MAX is then
explicitly mentioned in their documentation. Also, if PATH_MAX is defined
and the OS limits path lengths, it cannot limit them to a smaller number
than PATH_MAX. But the OS may accept much longer paths and much longer
paths may exist.
In addition, the limit may differ based on the file system. On Unix, all
file systems are mounted to the same tree, so essentially the limit may
depend on a path. If it does, PATH_MAX shall be undefined and instead the
user can use pathconf (or fpathconf) to find the limit for a particular
path. Again, no limit may be given. Also, a limit too large for allocation
may be given. Some applications tend to define PATH_MAX, when not
defined, to a certain large constant, which may complicate reviewing the
code (essentially it then becomes an application-imposed limit).
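As a rough POSIX-only illustration of the run-time query, the sketch below wraps pathconf with _PC_PATH_MAX and separates the "no limit reported" outcome from a real error. The wrapper name path_limit_for is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: ask the system for the path-length limit that
   applies under dir. Returns the limit, or -1 when none is reported or on
   error; *err receives the errno value and is 0 when the -1 merely means
   that the limit is indeterminate. */
long path_limit_for(const char *dir, int *err)
{
    assert(dir != NULL && err != NULL);
    errno = 0;  /* pathconf leaves errno untouched when there is no limit */
    long lim = pathconf(dir, _PC_PATH_MAX);
    *err = (lim == -1) ? errno : 0;
    return lim;
}
```

A caller still has to decide what to do with a missing or very large limit, so this query does not remove the need for dynamically sized buffers.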
The actual value of PATH_MAX is not defined by the standard and differs
between systems; common values are 4096 on Linux and 1024 on macOS.
In summary, there is no (exactly, always, at compile time) known limit on
the entire path name length, neither on Windows nor on Unix (POSIX). The
actual limits imposed definitely differ between main platforms on which R
runs (Linux, macOS, Windows) and there may be some variation even on a
single machine (on Windows there definitely is, on Unix POSIX allows it).
Declaring long-path awareness
For R and packages to work with paths longer than 260 characters on Windows,
when it is allowed in the system, R needs to be made long-path aware and
declare this to Windows. E.g. Python already does that and the Python
installer offers to enable long paths in the system.
To declare it, one sets longPathAware to true in the manifests of all R
front-ends (Rterm.exe, Rgui.exe, etc.), so in the same place where R
opts in for UTF-8 as the system and native encoding. That this is done at
process level means that applications embedding R would have to do it as well
to get the support. Once R does it (R-devel already does), the packages and
libraries it uses will also receive long paths, so they should be made
long-path aware, but that can hardly be done without testing.
To resolve the chicken-and-egg problem, there is the system-wide setting for
long paths in Windows. By default, this is still disabled. It can be enabled
by enthusiast users, people who really need it for specific applications and
choose to take the risk of running into problems in selected
packages/libraries, and by developers of those packages, who could hardly
make them long-path aware without being able to test and debug. The setting
is in the registry, under
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem], field
LongPathsEnabled. It can also be controlled by Group Policy (“Enable Win32
long paths”).
Ensuring long-path awareness
The key part of making an application long-path aware is rewriting the code
without an assumption that there is a fixed maximum length of an entire path
name. Such an assumption may have led to static allocation of buffers for
paths, to limited checking of return values from system functions, to
limited buffer-overflow checking when constructing path names, and to now
possibly unhelpful validity checking of paths given by user (printing
warnings/errors about paths being too long).
In my view, it would make sense to get rid of this assumption in all code,
not only in Windows-specific parts. An obvious result of such a rewrite is
that the code will never or almost never use the PATH_MAX or MAX_PATH
macros.
In addition, on Windows there are a number of system functions and components
which do not support long paths even though their API would allow it. It is
necessary to find them and replace them with modern API. The limitations
are not always documented, so we are stuck with testing.
This also may be a natural opportunity to replace calls to deprecated
Windows API functions by more recent ones, even when the old ones support
long paths, because of the necessity to rewrite the code, anyway. Increased
code complexity coming with this change may require local refactoring.
Figuring out the required buffer size
Most Windows API functions returning paths accept a pointer to a user buffer
and the buffer size. When the size is sufficient, they fill in the buffer
and return the number of bytes used (excluding the terminator). When the
buffer size is too small, they return the number of bytes needed (including
the terminator). Unicode versions of the functions do the same but the unit
is UCS-2 (16-bit) words. So, one can call the function twice, first time to
find out the needed buffer size and second time with sufficiently large
buffer.
Old code like this (excluding error handling):
char cwd[MAX_PATH];
GetCurrentDirectory(MAX_PATH, cwd);
can be changed into:
char *cwd;
DWORD res = GetCurrentDirectory(0, NULL);
cwd = (char *)malloc(res);
GetCurrentDirectory(res, cwd);
One could try to optimize the code by using a non-empty buffer already
during the first call, so that in “most cases” only one call to
GetCurrentDirectory would suffice. The downside would be increased code
complexity and complicated testing: longer paths would be rare, and hence
the code path would rarely be tested. The initial size could indeed even be
MAX_PATH.
While error handling is excluded from the example, calling the function
twice comes with a (theoretical, but still) risk that the external
conditions would change in between, in this case another thread could change
the current working directory of the process to a value requiring a longer
buffer, so in theory even the second call could fail due to insufficient
buffer size.
One needs to be careful when checking the return values of such functions,
because there may be slight variations in semantics. Some functions return
the required buffer size without the terminator, such as DragQueryFile.
This matches the behavior of e.g. the C snprintf function.
Some Windows API calls already return a dynamically allocated result, e.g.
wchar_t *mydocs;
SHGetKnownFolderPath(&FOLDERID_Documents, KF_FLAG_CREATE, NULL, &mydocs);
// copy mydocs
CoTaskMemFree(mydocs);
There finding the result length is easy (e.g. wcslen()). One can allocate
a buffer for the result in the preferred way, copy it, and free the
original using the correct free function following the documentation of the
specific API function (allocation is discussed later below).
There are Windows API calls which do not return the required buffer size,
but only return an error signalling that the provided buffer was not large
enough. One then needs to call the function several times, increasing
the buffer size. This example is for GetModuleFileName:
DWORD size = 1;
char *buf = NULL;
for(;;) {
    buf = (char *)malloc(size);
    if (!buf)
        return NULL;
    DWORD res = GetModuleFileName(NULL, buf, size);
    if (res > 0 && res < size) /* success */
        break;
    free(buf);
    if (res != size) /* error */
        return NULL;
    size *= 2; /* try again with 2x larger buffer */
}
The POSIX getcwd() function is another example where one needs to iterate to
find out the required buffer size, even though some extensions allow
returning a dynamically allocated result.
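The same iteration for getcwd() could look roughly like this on a POSIX system. This is a minimal sketch rather than the code used in R; the function name dyn_getcwd and its initial-size parameter are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return the current working directory in a buffer
   allocated by malloc(), doubling the size until the path fits.
   Returns NULL on error; the caller releases the result with free(). */
char *dyn_getcwd(size_t initial)
{
    assert(initial > 0);             /* getcwd() rejects a zero size */
    size_t size = initial;
    for (;;) {
        char *buf = malloc(size);
        if (!buf)
            return NULL;
        if (getcwd(buf, size))
            return buf;              /* success */
        int saved = errno;           /* free() might modify errno */
        free(buf);
        if (saved != ERANGE)
            return NULL;             /* a real error, not a short buffer */
        if (size > (size_t)-1 / 2)
            return NULL;             /* doubling would overflow */
        size *= 2;                   /* try again with a 2x larger buffer */
    }
}
```

Starting from a deliberately tiny size, for example dyn_getcwd(1), is a cheap way to exercise the retry branch in tests, which would otherwise rarely run.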
Iterating is not a suitable solution in all such cases. For instance, the
GetOpenFileName function opens a dialog asking the user to select a file
to be opened. The caller provides a buffer for the file name and the size.
The function reports an error if the buffer was too small. Right, the
application could increase the buffer size, open the dialog again, and ask
the user again to make the choice. This is unlikely to be practical and
using a hard-coded large limit is probably better for most uses. There is
probably also a limit to how long a path a user would normally be willing to
select manually.
Dynamic allocation
While it is natural to use dynamic allocation for paths given there is no
useful upper limit on their length, introducing dynamic allocation where it
hasn’t been before has to be done with care.
Using malloc()
requires checking for a memory allocation failure and
deciding what to do when it happens: map it to error codes returned by the
function at hand, or throw an R error. Throwing an R error requires
additional care: if this introduces a possible R error in a function where
it wasn’t possible before (so at any call site), it may be introducing also
a resource leak (e.g. some open file or another dynamically allocated
object not arranged to be released on a long jump). If in between the
malloc()
and free()
calls there is any call to R API, there is a risk of
a long jump there, and the buffer allocated by malloc()
hence should be
arranged to be freed if that happens. There is API to do that, both
internally in R and public for packages, but it may be tedious to handle all
cases.
Another problem of introducing malloc() is releasing the memory by the
caller. If a function previously returned a pointer to a statically
allocated buffer and we change it to return memory allocated by malloc(),
the callers will have to know to release it, and will have to have access to
the correct matching function to free it. This is easily possible only for
rarely used or internal functions.
An example of a function changed this way on Windows is getRUser(). It
now returns memory that should be freed using the freeRUser() function by R
front-ends and embedding applications. Older applications would not know to
free the memory, because a statically allocated buffer was used before, but
this function is normally called just once during R startup, so the leak is
not a problem. malloc() was the choice in startup code as the R heap is not
yet available.
However, in typical package code as well as often in base R itself, when R
is already running, it is easier to use R_alloc than malloc for
temporarily created buffers. Introducing R_alloc in these cases usually
doesn’t require the callers to be modified: the memory is automatically
freed at the end of .Call/.External or can be managed explicitly by
vmaxset/vmaxget in a stack-based manner. Care has to be taken when there is
a risk that the modified function will be called a large number of times
before the cleanup takes place. Also, there must not be an undesirable
cleanup using vmaxset before the buffer is to be used.
R_alloc introduces allocation from the R heap, and this means potentially
also a garbage collection. Therefore, care must be taken whether this is
safe to introduce, whether it would not introduce PROTECT errors. In
theory, R_alloc also introduces a possible long jump, because of a
potential allocation error. However, memory allocated by R_alloc gets
cleaned on long jumps (the allocation stack depth is restored at the
corresponding contexts), so one does not have to worry about memory leaks.
In base R, calls to Windows API have been mostly rewritten to dynamic
allocation, using malloc in startup code and R_alloc elsewhere. Despite
the discussion above, deciding which function to use to allocate memory
hasn’t always been hard: often R_alloc had already been used, so it wasn’t
newly introduced. But some static allocation remains.
Static allocation
In some cases, changing existing code for dynamic allocation of paths may
still seem overwhelming or too intrusive. It may be easier, in some cases
at least as a temporary solution, to give up on supporting arbitrarily long
paths, but instead impose an application-specific limit (much larger than
260 bytes on Windows). It is still necessary to handle things that weren’t
handled in code that assumed a length limit on any existing path.
Compared to dynamic allocation, one does not have to worry about introducing
garbage collection (PROTECT errors) and resource leaks (the client not
freeing the memory). But, there is still an issue of introducing error
paths, and hence potential resource leaks.
Unlike dynamic allocation, one needs to carefully protect against buffer
overflows and detect when a too-long path would arise e.g. from
concatenation. One needs to report that as an error rather than corrupting
memory or silently truncating. Also, the code may become complicated by
having to deal with multiple path-length limits when the OS API introduces
one and the application another.
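A minimal sketch of such a guarded concatenation into a fixed-size buffer is shown below. The function name join_path and its error convention are assumptions made for the illustration; the point is only that the return value of snprintf is checked so that truncation is reported rather than silently accepted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build "dir/name" in dst, which holds dstsize bytes.
   Returns 0 on success and -1 when the result would not fit; in that case
   dst is reset to an empty string so a truncated path is never used. */
int join_path(char *dst, size_t dstsize, const char *dir, const char *name)
{
    assert(dst != NULL && dstsize > 0 && dir != NULL && name != NULL);
    int n = snprintf(dst, dstsize, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= dstsize) { /* encoding error or too long */
        dst[0] = '\0';
        return -1;
    }
    return 0;
}
```

The caller then maps the -1 to whatever error reporting fits the call site, which is where the new error paths mentioned above come from.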
In base R, static allocation was still used for a few widely called utility
functions (where changing/reviewing the callers would be too difficult), for
incorporated external code where the change would complicate maintenance,
where one could not find the buffer size, anyway, and in some code used also
on Unix, where PATH_MAX is usually large enough so that it does not cause
trouble.
Functions to be avoided
Some code has to be rewritten to use different API to support long paths.
Only several examples are given here to illustrate the problem.
An old POSIX function getwd() (removed from the standard in 2008) doesn’t
allow specifying the size of the user buffer. The buffer needs to be at
least of size PATH_MAX and the function returns an error if the path is
longer than PATH_MAX. Another example is realpath. These functions in
their old form are broken by design, because in current POSIX, PATH_MAX
may not even be defined, or may be a number too large to allocate a buffer
of that size, etc. Still, such functions are rare on both Unix and Windows.
Unfortunately, even calls which have semantics that would allow supporting
long paths sometimes do not support them on Windows.
For example, to locate the “Documents” folder, R previously used
SHGetSpecialFolderLocation/SHGetPathFromIDList, but to support long paths,
this was changed to SHGetKnownFolderPath, because SHGetPathFromIDList
does not support long paths. This illustrates that such a limitation
sometimes exists even when the API already returns a dynamically allocated
result.
GetFullPathNameA (the ANSI version) does not work with long paths, but
GetFullPathNameW does. Hence, calls to the ANSI version need to be
replaced by a conversion and a call to the Unicode version. This doesn’t make
much sense, because the ANSI version should be doing just that, and because
the API would allow supporting long paths, as the buffer size is accepted
and the real size signalled. Still, at least it is documented.
Many API functions document the limit for the ANSI version and refer to the
Unicode version to overcome it, but that seems surprising (or perhaps
outdated) given the new support for UTF-8 and recommendation to use the ANSI
functions. Often the ANSI functions happen to work with long paths (when
opted in). For example, GetShortPathName does, while it is documented to
have that limitation in the ANSI version as well.
The old dialog for choosing a directory, SHBrowseForFolder, does not
support long paths (it is used in Rgui) and had to be replaced by
IFileOpenDialog, which required more than several lines of code.
Directory traversal
R internally uses the POSIX opendir/readdir/closedir functions for listing
files in a directory. These are not available on Windows directly, but R has
been using MinGW-W64 implementations, both the ANSI and the Unicode
variants.
It should be said here that there is also a limit on the length of an
individual file name. Luckily, this limit is about the same on all systems
where R runs and it hasn’t changed (at least not recently). So, it is not a
problem that these functions allocate a single file name statically
(d_name).
However, the MinGW-W64 (in version 10) implementation of these functions uses
GetFullPathName on a statically allocated buffer of PATH_MAX characters;
it uses it on the input path used to start the search. So, R now has its
own re-implementation of a subset of the functionality of
opendir/readdir/closedir which does support long paths.
The functions for directory traversal also had to be re-factored not to make
assumptions about a limit for the full paths that may exist in the system.
Such functions internally need to keep appending directory names to build
the currently visited path. This previously used a statically allocated
buffer, but now uses a dynamically allocated string buffer, which is
automatically expanded if needed.
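The idea of an automatically expanding path buffer can be sketched as below. This is not the buffer type R uses internally; the pathbuf structure and both function names are hypothetical and only show the append-then-truncate pattern a recursive traversal needs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer holding the currently visited path. */
typedef struct {
    char *data;  /* NUL-terminated contents, NULL before the first append */
    size_t len;  /* length excluding the terminator */
    size_t cap;  /* number of bytes allocated */
} pathbuf;

/* Append s, growing the buffer when needed. Returns 0 on success and -1 on
   allocation failure, in which case the buffer is left unchanged. */
int pathbuf_append(pathbuf *b, const char *s)
{
    assert(b != NULL && s != NULL);
    size_t add = strlen(s);
    size_t need = b->len + add + 1;
    if (need > b->cap) {
        size_t ncap = b->cap ? b->cap : 16;
        while (ncap < need)
            ncap *= 2;               /* geometric growth */
        char *p = realloc(b->data, ncap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = ncap;
    }
    memcpy(b->data + b->len, s, add + 1);
    b->len += add;
    return 0;
}

/* Cut the path back to an earlier length, e.g. after leaving a directory. */
void pathbuf_truncate(pathbuf *b, size_t len)
{
    assert(b != NULL && len <= b->len);
    if (b->data)
        b->data[len] = '\0';
    b->len = len;
}
```

A traversal would remember b.len before descending, append the separator and the entry name, recurse, and then truncate back to the remembered length, so no fixed upper bound on the full path is ever assumed.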
Checking of return values
An example to illustrate the need for reviewing old code which assumed that
no path could be longer than MAX_PATH is from the implementation of
Sys.which:
int iexts = 0;
const char *exts[] = { ".exe" , ".com" , ".cmd" , ".bat" , NULL };
while (exts[iexts]) {
    strcpy(dest, exts[iexts]); // modifies fl
    if ((d = SearchPath(NULL, fl, NULL, MAX_PATH, fn, &f))) break;
    iexts++ ;
}
The loop tries to find an executable on PATH using different suffixes. A
non-zero return value of SearchPath is taken as a success. The
function returns zero on error. It returns a value larger than
nBufferLength (which received the value of MAX_PATH) to indicate that the
buffer wasn’t large enough, but that wasn’t checked in the old code as it
was assumed to be impossible.
So, when there is a very long path on PATH, say at the beginning of it,
Sys.which() would fail for files that in fact were on PATH. It doesn’t
fail in R 4.2.2 and earlier, because Windows hides such long path components
from R and SearchPath skips them. But it would fail in R-devel on a system
with long paths enabled.
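The three-way interpretation of the return value described above can be written down as a tiny portable helper. The enum and the function name are invented for this sketch, and it does not reproduce the actual fix in R; it only encodes the documented convention that zero means failure and a value not smaller than the buffer length means the buffer was too small.

```c
#include <assert.h>

typedef enum { SP_NOT_FOUND, SP_FOUND, SP_BUFFER_TOO_SMALL } sp_result;

/* Hypothetical helper: interpret the value returned by a SearchPath-style
   call. ret is the returned value, buflen the buffer length that was
   passed in (both in characters; DWORD is an unsigned long on Windows). */
sp_result classify_search_result(unsigned long ret, unsigned long buflen)
{
    assert(buflen > 0);
    if (ret == 0)
        return SP_NOT_FOUND;         /* failure; details via GetLastError() */
    if (ret >= buflen)
        return SP_BUFFER_TOO_SMALL;  /* ret is the size that would be needed */
    return SP_FOUND;                 /* ret characters were stored */
}
```

On SP_BUFFER_TOO_SMALL the caller can allocate ret characters and call the function again instead of treating any non-zero value as success.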
Checking of path lengths
Given that there is no known limit on the entire path length in the system,
it is questionable whether preventive checks make sense, and particularly so
with the MAX_PATH limit on Windows. It is true that, unless long
paths are enabled in the system, even R-devel would be prone to this limit,
but as described earlier, it is only some functions in some cases that are
prone to it; other functions work. So, an error may be premature and a
warning may be confusing. Certainly the checks make sense if an application
decides to impose its own limit: they are needed to protect static buffers on
input from overflow.
Limitations
Long path support in Windows is only available in Windows 10 since version
1607 (released in 2016). On older systems, R would still be subject to the
MAX_PATH limit.
Windows applications (“Win32”) cannot be started with the current directory
being a long path, even when long path support is enabled. This quite
significantly restricts potential use of long paths. In R package
development, one would easily run into this when checking or building
packages, which in turn often executes external commands. This also means
that testing the long path support is difficult.
Some Windows components still do not support long paths. Hopefully this will
change over time, but it is already over 6 years since the feature was
released. For example, it is not possible to print a document to a file with
a long path: I ran into this while testing different functions of Rgui
with long paths, and I didn’t find an alternative API. After all, several
Windows applications I tried had the same limit.
Inevitably, a number of existing applications would not support long paths,
and some may be used together with R, so R supporting them would not help.
As noted before, the feature in base R is to be treated as experimental,
particularly because packages have not yet been updated. While it seems
there are no more than 100 CRAN and Bioconductor packages using the PATH_MAX
(or MAX_PATH) constants in their code, it is not clear how many would be
affected in bad ways. It is not easily possible to “run checks” for all
CRAN/Bioconductor packages to test that, because of the limitations in
executing from paths with long names. So, the level of long path
support and testing in packages will be mostly left to manual work.
Recommendations
I offer my recommendations based on reading about this problem and
implementing long-path support in base R.
Work-arounds
Users who run into the problem of long paths when using an R package on
already released versions of R should ideally first check whether the
package allows influencing the length of the path: whether it can be told
where to create files or how to name them.
If not, or if that is already minimal or default, it is worth trying to use
a drive mapping (subst command) to get rid of any directory prefix. After
all, the author of the package probably tested it in some directory,
probably without long paths enabled, so this should create a setup that is
not more limiting.
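As a rough illustration of why a drive mapping helps, the following Python
sketch computes what a path would look like once a long directory prefix has
been mapped to a drive letter. The drive letter X:, the example directories
and the helper name map_prefix_to_drive are hypothetical; the sketch only
manipulates strings and does not run subst itself.

```python
from pathlib import PureWindowsPath


def map_prefix_to_drive(path: str, prefix: str, drive: str = "X:") -> str:
    """Return `path` rewritten as if `prefix` had been mapped to `drive`.

    This mirrors the effect of `subst X: <prefix>` (hypothetical drive
    letter); nothing is changed on the system.
    """
    # relative_to raises ValueError when `path` is not located under `prefix`.
    relative = PureWindowsPath(path).relative_to(PureWindowsPath(prefix))
    return str(PureWindowsPath(drive + "\\") / relative)


# Hypothetical directories, used only to show the saving in characters.
long_prefix = r"C:\Users\someone\Documents\projects\analysis\checks"
original = long_prefix + r"\pkg\data\result.RData"
mapped = map_prefix_to_drive(original, long_prefix)
print(len(original), "->", len(mapped), mapped)
```

The saving equals the length of the mapped prefix minus the two characters of
the drive specification, which is why the mapping pays off most for deep
working directories.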
Finally, if that does not help, try to make sure that 8.3 names are enabled
(to confuse matters, they are sometimes also called “short names”) for the
drive and directories involved (see the dir /x command and
fsutil file setshortname). Try to make the package use the 8.3 name variants;
it is even possible to set them manually, and so influence their length
further. How hard it would be to make the package use them depends on the
situation: it might happen automatically, it might work by specifying those
to the package functions in the short form, or it might not work at all when
the package intentionally normalizes paths or otherwise expands short names.
Use reasonably short names
Path length in practice is a shared resource: different components of the
path are named by different entities and software. In my example,
msys64\home comes from Msys2 conventions, tomas is my user name,
ucrt3\svn was my local decision on the system, ucrt3\r_packages is how a
subversion repository is structured, pkgcheck\CRAN was a design decision
in package check scripts (CRAN is indeed a name of the package
repository), ADAPTS is the name of the package, RtmpKWYapj is named by R
(a temporary session directory), and finally
gList.Mast.cells_T.cells.follicular.helper_T.cells.CD4.memory.activated_T.cells.CD4.memory.resting_T.cells.CD4.naive_T.cells.C_Plasma.cells_B.cells.memory_B.cells.naive.RData.RData
is a name created by the package.
Path length being a shared resource, responsible parties would choose a
reasonably shallow nesting level and particularly reasonably short names of
the components, of the files and directories. This example is an extreme
case where the file name clearly takes an unfairly large share. The length of
the file name should be constant with respect to the size of the input.
Someone might argue that my prefix was also a bit too deep.
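To see who is using up the shared budget, one can list the length of each
path component. The sketch below is one possible way to do it in Python; the
example path is hypothetical and only loosely modelled on the nesting
described above, and component_lengths is an illustrative helper name.

```python
from pathlib import PureWindowsPath


def component_lengths(path: str) -> list[tuple[str, int]]:
    """Return (component, length) pairs, longest first, without the drive root."""
    parsed = PureWindowsPath(path)
    names = [part for part in parsed.parts if part != parsed.anchor]
    return sorted(((name, len(name)) for name in names),
                  key=lambda item: item[1], reverse=True)


# Hypothetical path used for illustration only.
example = (r"C:\msys64\home\user\work\pkgcheck\CRAN\SomePkg"
           r"\a_very_long_generated_file_name.RData")
for name, length in component_lengths(example):
    print(f"{length:4d}  {name}")
print("total:", len(example))
```

Sorting by length puts the component that takes the largest share at the top,
which makes it easy to tell whether the problem is deep nesting or a single
overlong name.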
Despite the long path support in Windows and efforts like this, it will take
a very long time, at the least, before one can rely on paths longer than 260
characters on Windows. Prevention will thus probably remain the key part of
the solution for a long time.
Write code robust to arbitrarily long names
According to the current standards and implementations, there is no (known,
reasonably small) limit on current systems for the length of the entire path
name.
At a minimum, code should make it clear when it is imposing its own limit on
path name length. It should be robust to paths longer than that: report an
error or perhaps skip them, but definitely do not let the code crash or
silently truncate. Any self-imposed limit should ideally be at least what
PATH_MAX is on Linux today (4096).
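A minimal sketch of such a self-imposed limit is shown below in Python. The
names APP_PATH_LIMIT, PathTooLongError and checked_path are made up for this
illustration; the point is only that the limit has its own clearly named
constant and that an overlong path produces an error rather than a truncated
string.

```python
# Application-imposed limit, deliberately not named PATH_MAX or MAX_PATH.
APP_PATH_LIMIT = 4096


class PathTooLongError(ValueError):
    """Raised when a path exceeds the application's own limit."""


def checked_path(path: str, limit: int = APP_PATH_LIMIT) -> str:
    """Return `path` unchanged, or raise instead of truncating it."""
    if len(path) >= limit:
        raise PathTooLongError(
            f"path has {len(path)} characters; this application supports "
            f"fewer than {limit}"
        )
    return path


try:
    checked_path("d" * 5000)
except PathTooLongError as err:
    print("rejected:", err)
```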
Still, in most cases it seems natural to use dynamic allocation and support
path names of arbitrary length. It would probably be a natural
solution for new code.
Make packages long-path aware
It makes sense to first review all uses of MAX_PATH and PATH_MAX in the
code. This identifies places that need to be rewritten to support long
paths. Ideally these constants would only be used with API that explicitly
depends on them (e.g. realpath, very rare). In cases when the limit is
application-imposed, they should be replaced by a different constant to make
that clear.
I would recommend modifying the code such that the same code path is taken
for short and long names. That way, the code would get tested using the
currently available tests and by common regular use. Only optimize if ever
needed, which would probably be rare in file-system operations, but is not
impossible. Ideally there would be a switch to use the long path while
testing, e.g. by setting the initial size to a very small value when
iterating to find the required buffer size.
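The idea of a single code path with a tunable initial buffer size can be
sketched as follows. This is a simulation in Python, not code from R: the
function fake_get_path is a stand-in that only imitates the return
convention of APIs such as GetFullPathNameW (characters written on success,
required size including the terminator otherwise), and all names are
hypothetical.

```python
import ctypes


def fake_get_path(buf, size: int, value: str = "C:\\" + "d" * 300) -> int:
    """Simulated API: fill `buf` if it is large enough.

    Returns the number of characters written (without the terminator) on
    success, or the required size including the terminator when `size`
    is too small.
    """
    needed = len(value) + 1
    if size < needed:
        return needed
    buf.value = value
    return len(value)


def get_path(initial_size: int = 260) -> str:
    """Call the API with a growing buffer; short and long results share one path."""
    size = initial_size
    while True:
        buf = ctypes.create_unicode_buffer(size)
        result = fake_get_path(buf, size)
        if result < size:
            return buf.value
        size = result  # retry with the size the API asked for


# A tiny initial size forces the retry branch on every call, so ordinary
# tests exercise the same code that long paths would.
print(len(get_path(initial_size=4)), len(get_path(initial_size=1024)))
```

Making the initial size a parameter is the "switch" mentioned above: regular
test runs can set it very low without needing any long path on disk.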
Testing is essential to find any remaining problems, including limitations
in the used libraries and in Windows itself. One cannot rely on the
documentation. Also, it is of course easy to overlook problems without
testing, even when the code attempts to check path lengths. I initially
saw a lot of crashes of base R when long paths were enabled.
To check an R package, one may run R CMD check --output=DIR to select an
output directory, and hence avoid running from a long directory. One may start
R in a short directory and then change the current working directory to a
long one when that helps the testing. One should now be able to install
packages into a long directory, both from source and from binary versions.
Bash in Msys2 as well as cmd.exe and Powershell can work with long
directories.
Summary
Updating R to support long paths on Windows took a bit over a month of work
and changed about 4300 lines (added or deleted) in 70 files. So, the investment
was quite large and this comes with a risk of introducing bugs. Bug reports
on suspicious changes in behavior of file-system operations, on Windows as
well as on Unix, are particularly welcome, and sooner is better so that they
could be fixed before the 4.3 release.
Some of the Windows-specific code has been updated on the way to avoid using
deprecated functions, so there may be some maintenance benefit even
regardless of long paths.
Windows Server 2016 was finally released last week, meaning we can finally lift the idiotic 260 characters limitation for NTFS paths. In this post I’ll show you how to configure the «Enable Win32 long paths» setting for the NTFS file system, by a Group Policy Object (a GPO), and «LongPathsEnabled» in the Windows registry. This is still required for Windows Server 2022 and Windows Server 2019.
Maximum Path Length Limitation (MAX_PATH) in Windows Server
Microsoft writes about the Maximum Path Length Limitation on MSDN, and they write:
Maximum Path Length Limitation
In the Windows API (with some exceptions discussed in the following paragraphs), the maximum length for a path is MAX_PATH, which is defined as 260 characters. A local path is structured in the following order: drive letter, colon, backslash, name components separated by backslashes, and a terminating null character. For example, the maximum path on drive D is «D:\some 256-character path string<NUL>» where «<NUL>» represents the invisible terminating null character for the current system codepage. (The characters < > are used here for visual clarity and cannot be part of a valid path string.)
Microsoft Developer Network: Naming Files, Paths, and Namespaces
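The arithmetic behind the «256-character path string» in the quote can be checked in a few lines of Python. This is only a worked example of the numbers quoted above; the variable names are made up for the illustration.

```python
MAX_PATH = 260  # Windows API constant quoted above

# Fixed parts of a local path: drive letter, colon, backslash, terminating NUL.
drive_letter, colon, backslash, terminator = 1, 1, 1, 1

longest_rest = MAX_PATH - (drive_letter + colon + backslash + terminator)
print(longest_rest)  # 256

# Build the longest path that still fits on drive D and count it with the NUL.
path = "D:\\" + "x" * longest_rest
print(len(path) + terminator)  # 260
```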
In the past, this 260-character limitation caused errors and discomfort, especially in the web hosting industry, where you sometimes have to be able to save long file names. This resulted in (too) long paths and thus errors.
For example, have a look at the WordPress Trac ticket #36776 Theme update error, which was a duplicate of #33053 download_url() includes query string in temporary filenames.
Fortunately, this limitation can now be lifted. By default the limitation still exists though, so we need to set up a Local Group Policy. Or a Group Policy (GPO) in my case, since my server is in an Active Directory domain network.
In this post you’ll learn how to set up a GPO to enable NTFS long paths in Windows Server 2016 and Windows Server 2022/2019 using the LongPathsEnabled registry value.
Enabling «Long Paths» doesn’t magically remove the 260 character limit, it enables longer paths in certain situations. Adam Fowler has a bit more information about this. Or are you wondering how to increase the configured maxUrlLength value in Windows Server & IIS? This’ll fix an IIS error «The length of the URL for this request exceeds the configured maxUrlLength value».
But first things first.
You need to be able to set up this GPO using administrative templates (.admx) for Windows Server 2016 because, in my situation, my Active Directory server is Windows Server 2012 R2 and doesn’t provide GPO settings for 2016.
Download Administrative Templates (.admx) for Windows 10 and Windows Server 2016
If you are, like me, on Windows Server 2012 R2 (at the time of this writing), you need administrative templates (.admx files) for Windows Server 2016 to configure 2016 specific Group Policy Objects. The same goes for Windows Server 2022 and 2019.
These few steps help you set them up in your environment.
Download and install administrative templates for Windows Server 2016 in your Windows Server 2012 R2 Active Directory
Follow these steps:
- Download Windows 10 and Windows Server 2016 specific administrative templates, or .admx files.
- Install the downloaded .msi file Windows 10 and Windows Server 2016 ADMX.msi on a supported system: Windows 10, Windows 7, Windows 8.1, Windows Server 2008, Windows Server 2008 R2, Windows Server 2012, Windows Server 2012 R2. You also need user rights to run the Group Policy Management Editor (gpme.msc), or the Group Policy Object Editor (gpedit.msc). But that’s for later use.
- The administrative templates are installed in C:\Program Files (x86)\Microsoft Group Policy\Windows 10 and Windows Server 2016, or whatever directory you provided during the installation. Copy over the entire folder PolicyDefinitions to your Primary Domain Controller’s SYSVOL\domain\Policies directory.
- Verify you’ve copied the folder, and not just the files. The full path is: SYSVOL\domain\Policies\PolicyDefinitions. This is explained in Microsoft’s Technet article Managing Group Policy ADMX Files Step-by-Step Guide.
That’s it, you now have Group Policy Objects available for Windows Server 2016. Let’s enable Win32 long paths support now.
Configure «Enable Win32 long paths» Group Policy
Now that you have your Windows Server 2016 Group Policy Objects available, it’s time to set up a GPO to enable NTFS long path support. Create the GPO in your preferred location, but be sure to target it at Windows Server 2016 only.
Please note that the GPO is called Enable Win32 long paths, not NTFS.
Enabling Win32 long paths will allow manifested win32 applications and Windows Store applications to access paths beyond the normal 260 character limit per node on file systems that support it. Enabling this setting will cause the long paths to be accessible within the process.
Start your Group Policy Management console and click through to the location where you want to add the GPO. Create a new GPO: Create a GPO in this domain, and Link it here..., and provide your GPO with a relevant name.
In the Settings tab, right click and choose Edit…. Now under Computer Configuration in the Group Policy Management Editor, click through to Policies > Administrative Templates > System > Filesystem. Configure and enable the Setting Enable Win32 long paths.
[Figure: this screen configures the GPO «Enable Win32 long paths»]
This is all you have to do to create the Group Policy for long Win32 paths. All that is left is to run gpupdate in an elevated cmd.exe command prompt.
Verify LongPathsEnabled registry value
If needed, you can use the following cmd.exe or PowerShell commands to verify the LongPathsEnabled registry value is set correctly:
C:\>reg query HKLM\System\CurrentControlSet\Control\FileSystem /v LongPathsEnabled
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem
LongPathsEnabled REG_DWORD 0x1
PS C:\> (Get-ItemProperty "HKLM:System\CurrentControlSet\Control\FileSystem").LongPathsEnabled
1
Don’t forget about your Windows Server 2022 and Windows Server 2019 servers, they still require this LongPathsEnabled registry value.
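The same check can also be scripted. Below is a minimal Python sketch using the standard winreg module; the function name long_paths_enabled is made up for this example, and the function simply returns None on systems where the value cannot be read (non-Windows systems, or a missing value).

```python
import sys


def long_paths_enabled():
    """Return True/False for the LongPathsEnabled value, or None if unknown.

    None is returned on non-Windows systems and when the value is absent.
    """
    if sys.platform != "win32":
        return None
    import winreg  # standard library, only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except FileNotFoundError:
        return None
    return value == 1


print(long_paths_enabled())
```

Reading the value needs no elevated rights, so a check like this can run as part of a regular monitoring or inventory script.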
Use PowerShell to configure «Enable Win32 long paths»
If your servers are not in an Active Directory network and you want to enable the «Win32 long paths» (or LongPathsEnabled registry value) you can use PowerShell for the job. Here is a small PowerShell script that adds the registry value if the script is run on a supported Windows Server version.
#
# Add LongPathsEnabled registry value using PowerShell
#
function Test-RegistryValue {
    param (
        [parameter(Mandatory = $true)]
        [ValidateNotNullOrEmpty()]
        [string]$Path,
        [parameter(Mandatory = $true)]
        [ValidateNotNullOrEmpty()]
        [string]$Value,
        [switch]$ShowValue
    )
    try {
        # Select-Object throws (because of -ErrorAction Stop) when the value is missing
        $Values = Get-ItemProperty -Path $Path | Select-Object -ExpandProperty $Value -ErrorAction Stop
        if ($ShowValue) {
            $Values
        }
        else {
            $true
        }
    }
    catch {
        $false
    }
}

# Windows Server 2016 build is 10.0.14393
[version] $winver = $(Get-CimInstance -Namespace root\cimv2 -Query "SELECT Version FROM Win32_OperatingSystem").Version
if ($winver -ge [version]"10.0.14393") {
    # Only add the value when it does not exist yet
    if ($(Test-RegistryValue "HKLM:System\CurrentControlSet\Control\FileSystem" LongPathsEnabled) -eq $false) {
        New-ItemProperty -Path "HKLM:System\CurrentControlSet\Control\FileSystem" -Name LongPathsEnabled -PropertyType DWORD -Value 1 -Force
    }
}
This script contains a function that tries to look up if the requested registry path / value already exists. If it returns false, New-ItemProperty is executed to add LongPathsEnabled with value 1 to the Windows registry. A reboot afterwards is required.
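The version gate in the script compares build numbers component by component rather than as text. A small Python sketch of the same comparison is shown below; the helper names are made up for this example, and the minimum build is the 10.0.14393 value used in the script.

```python
# Minimum build taken from the script above (Windows Server 2016).
MIN_BUILD = (10, 0, 14393)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '10.0.14393' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_long_paths(os_version: str) -> bool:
    """True when the OS version is at least the Windows Server 2016 build."""
    return parse_version(os_version) >= MIN_BUILD


print(supports_long_paths("6.3.9600"))    # Windows Server 2012 R2 -> False
print(supports_long_paths("10.0.17763"))  # Windows Server 2019 -> True
```

Comparing integer tuples avoids the trap of string comparison, where "10.0.9999" would sort after "10.0.14393".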
Enable Win32 long paths in Windows 11 and Windows 10 (Bonus!)
If your client computer running Windows 11 or Windows 10 is not in an Active Directory and/or does not have the above mentioned GPO active, you can enable it yourself in the local Group Policy. You do need administrator privileges to follow the next steps:
- As an administrator, start gpedit.msc, for example in an elevated PowerShell terminal window or through Start (select the as administrator option). Press Enter and the Group Policy Editor opens.
- Go to Local Computer Policy -> Computer Configuration -> Administrative Templates -> System -> Filesystem, then enable the Enable Win32 long paths option.
- Restart your computer.
Of course you can add the registry value manually as well.
Summary
- Before Windows 95, file names could only be 8 characters long, but now Windows has a 260 character limit for the full path of a file.
- The Windows 10 Anniversary Update allowed users to abandon the 260 character limit, but some older 32-bit applications may not support longer paths.
- Home users can remove the path limit by editing the registry, while Pro and Enterprise users can use the Local Group Policy Editor to disable the limit.
Windows Doesn’t Accept Long Paths by Default
Before Windows 95, Windows only allowed file names that were eight characters long, with a three-character file extension — commonly known as an 8.3 filename. Windows 95 abandoned that to allow long file names, but still limited the maximum path length (which includes the full folder path and the file name) to 260 characters. That limit has been in place ever since. If you’ve ever run into this limit, it was probably when you were trying to copy deep folder structures into other folders, such as when copying the contents of a hard drive to a folder on another drive. The Windows 10 Anniversary Update finally added the option to abandon that maximum path length.
There is one caveat. This new setting won’t necessarily work with every application out there, but it will work with most. Specifically, any modern applications should be fine, as should all 64-bit applications. Older 32-bit applications need to be manifested in order to work, which really just means that the developer has indicated in the application’s manifest file that the application supports longer paths. Most popular 32-bit apps should experience no problem. Still, you don’t risk anything by trying the setting out. If an application doesn’t work, the only thing that will happen is that it won’t be able to open or save files that are saved in places where the full path exceeds 260 characters.
Home Users: Remove the 260 Character Path Limit by Editing the Registry
If you have a Windows Home edition, you will have to edit the Windows Registry to make these changes. You can also do it this way if you have Windows Pro or Enterprise, but feel more comfortable working in the Registry than Group Policy Editor. (If you have Pro or Enterprise, though, we recommend using the easier Group Policy Editor, as described in the next section.)
Registry Editor is a powerful tool and misusing it can render your system unstable or even inoperable. This is a pretty simple hack and as long as you stick to the instructions, you shouldn’t have any problems. That said, if you’ve never worked with it before, consider reading about how to use the Registry Editor before you get started. And definitely back up the Registry (and your computer!) before making changes.
To get started, open the Registry Editor by hitting Start and typing «regedit.» Press Enter to open Registry Editor and give it permission to make changes to your PC.
In the Registry Editor, use the left sidebar to navigate to the following key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
On the right, find a value named LongPathsEnabled and double-click it. If you don’t see the value listed, you’ll need to create it by right-clicking the FileSystem key, choosing New > DWORD (32-bit) Value, and then naming the new value LongPathsEnabled.
In the value’s properties window, change the value from 0 to 1 in the «Value data» box and then click OK.
You can now close Registry Editor and restart your computer (or sign out of your account and sign back on). If you ever want to reverse the changes, just head back to the LongPathsEnabled value, and change it from 1 back to 0.
Download Our One-Click Registry Hack
If you don’t feel like diving into the Registry yourself, we’ve created two downloadable registry hacks you can use. One hack removes the 260-character path limit, and the other hack restores the default limit. Both are included in the following ZIP file. Double-click the one you want to use, click through the prompts, and then restart your computer.
Long Path Names Hacks
These hacks are really just the FileSystem key, stripped down to the LongPathsEnabled value we described above, and then exported to a .REG file. Running the «Remove 260 Character Path Limit» hack sets the LongPathsEnabled value to 1. Running the «Restore 260 Character Path Limit (Default)» hack sets the value back to 0. And if you enjoy fiddling with the Registry, it’s worth taking the time to learn how to make your own Registry hacks.
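To give an idea of what such a .REG file contains, here is a small Python sketch that builds the text for either setting. It is not the content of the downloadable ZIP, just an illustration of the usual .REG layout for this one value; the function name reg_file_text is made up for the example.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem"


def reg_file_text(enabled: bool) -> str:
    """Return .REG text that sets LongPathsEnabled to 1 (enabled) or 0."""
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{KEY}]\r\n"
        f'"LongPathsEnabled"=dword:{value:08x}\r\n'
    )


# Show the text that would remove the 260 character limit.
print(reg_file_text(True))
```

The DWORD is written as eight hexadecimal digits, so 1 becomes dword:00000001 and the default becomes dword:00000000.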
Pro and Enterprise Users: Remove the 260 Character Path Limit with the Local Group Policy Editor
If you’re using Windows 10 Pro or Enterprise, the easiest way to remove the 260 character path limit is by using the Local Group Policy Editor. It’s a pretty powerful tool, so if you’ve never used it before, it’s worth taking some time to learn what it can do. Also, if you’re on a company network, do everyone a favor and check with your admin first. If your work computer is part of a domain, it’s also likely that it’s part of a domain group policy that will supersede the local group policy, anyway.
In Windows 10 Pro or Enterprise, hit Start, type gpedit.msc, and press Enter.
In the Local Group Policy Editor, in the left-hand pane, drill down to Computer Configuration > Administrative Templates > System > Filesystem. On the right, find the «Enable win32 long paths» item and double-click it.
Select the «Enabled» option and then click «OK» in the properties window that opens.
You can now exit the Local Group Policy Editor and restart your computer (or sign out and back in) to allow the changes to take effect. If at any time you want to reverse the changes, just follow the same procedure and set that option back to «Disabled» or «Not Configured.»
The maximum path limit may not be something you’ve ever run into, but for some people, it can certainly be the occasional frustration. Windows 10 has finally added the ability to remove that limit. You just have to make a quick change to the Registry or Group Policy to make it happen.