26

I was watching FlyTech's video, and when he made a pair of illegally named folders (one named "<" and the other named ">"), it made Windows think the folder containing them was corrupted. This behavior did not appear to occur for any of the other folders with illegal characters.

I was wondering why this specific combination makes Windows think that the containing folder is corrupted. I searched a little, and could not find why this might be. Could anyone explain?

inyourface3445
  • 413
  • 2
  • 10
  • 3
    Conceivably, imperfectly written script code (for powershell, cmd, if installed also unix-like shells...) getting hold of these files could end up causing these characters to be interpreted as redirections and cause all kinds of chaos.... – rackandboneman Nov 04 '22 at 22:52
  • 2
    if you like these sorts of things, then you might also like to look into what happens when you create folders with reserved names like "prn" and "con" :-) – Aaron F Nov 06 '22 at 01:24

4 Answers

46

The error isn't caused by angle brackets exactly, or by having two of them – instead it occurs when 1) a file name contains wildcard characters in its name, and 2) the wildcard would match a previously seen file, which results in Windows thinking that the folder search doesn't advance forwards like it should.

First, as far as I understand, listing a directory on Windows is done by wildcard expansion (the opposite of how it would be done on Linux). To expand a wildcard pattern, you start by calling FindFirstFile() with the initial pattern, then repeat FindNextFile() while NTFS finds matching files one-by-one. To list the entire directory, you do the same with * as the pattern.

Second, both < and > (as well as ") are actually treated as wildcards in the deeper parts of Windows file-handling code – they behave like the historical MS-DOS wildcard variants of * and ?. (For example, < aka DOS_STAR matches all characters up until the file extension.) The publicly available .NET source code contains a description of the algorithm, which is identical to the one found in leaked Windows NT kernel source.

So it's not just the angle brackets, but also " ? * that could be used to trigger this error – as long as they're used in combination with another file name that would be sorted before the wildcard, if the sorting is done by Unicode value (which is the order enforced by NTFS).

For example, you would also get the "folder corrupted" error if you had items named foo( and foo*. There is nothing special about the ( here, except that it goes before * in Unicode – while a character that sorts after * such as foo+ would not trigger the error. (You can open "Character Map" via charmap.exe if you want to see the Unicode positions of these characters.)

Similarly, a directory containing [foo<, foo=] or [foo?, fooo] would not trigger this situation, but a directory containing [foo=, foo>] or [foo+, foo?] would.
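
The rule behind these examples can be sketched in a few lines of Python. This is only a rough model, not Windows' actual matching code: the names `to_regex` and `triggers_error` are made up for illustration, `*` and `<` are both treated as "any run of characters", `?` and `>` as "exactly one character" (ignoring the special file-extension handling of the DOS variants, and ignoring `"` entirely), and sorting by upper-cased code points is assumed to be close enough to the NTFS order.

```python
import re

# Simplified wildcard model (an assumption, not the real kernel algorithm):
# '*' and '<' match any run of characters, '?' and '>' match one character.
_WILDCARDS = {"*": ".*", "<": ".*", "?": ".", ">": "."}


def to_regex(name):
    """Translate a file name into a regex, treating wildcard characters as patterns."""
    body = "".join(_WILDCARDS.get(ch, re.escape(ch)) for ch in name)
    return re.compile(body, re.IGNORECASE | re.DOTALL)


def triggers_error(names):
    """Return True if some wildcard-containing name matches a name sorted before it."""
    ordered = sorted(names, key=str.upper)  # rough stand-in for NTFS ordering
    for i, name in enumerate(ordered):
        if not any(ch in _WILDCARDS for ch in name):
            continue  # plain names never match anything but themselves
        pattern = to_regex(name)
        if any(pattern.fullmatch(earlier) for earlier in ordered[:i]):
            return True
    return False
```

Under these assumptions, `triggers_error(["foo(", "foo*"])` and `triggers_error(["<", ">"])` come out true, while `triggers_error(["foo+", "foo*"])` and `triggers_error(["foo<", "foo="])` come out false, in line with the cases listed above.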

So if I understand everything correctly, what seems to happen is:

  1. The directory has items [foo(, foo*], with NTFS enforcing this exact order.
  2. Kernel asks NTFS "Get first item, starting at *".
  3. NTFS finds and returns foo(.
  4. Kernel asks NTFS "Get next item, continuing at foo(".
  5. NTFS finds foo( (exact match) and returns the next item foo*.
  6. Kernel asks NTFS "Get next item, continuing at foo*".
  7. NTFS finds foo* – which is recognized as a wildcard and matches foo( first, therefore the next item is foo* again – so an error is raised.
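
The steps above can be replayed with a toy simulation. It only models the hypothesis in this answer, not the real NTFS driver: `matches`, `next_item`, and `enumerate_directory` are hypothetical helpers, the wildcard handling is simplified (`*` and `<` as "any run of characters", `?` and `>` as "one character"), and the "error" is just a Python exception raised when the listing fails to move forward.

```python
import re

# Simplified wildcard model (an assumption for illustration only).
_WILDCARDS = {"*": ".*", "<": ".*", "?": ".", ">": "."}


def matches(pattern_name, candidate):
    """Treat pattern_name as a wildcard pattern and test it against candidate."""
    regex = "".join(_WILDCARDS.get(ch, re.escape(ch)) for ch in pattern_name)
    return re.fullmatch(regex, candidate, re.IGNORECASE | re.DOTALL) is not None


def next_item(ordered, resume_name):
    """'Get next item, continuing at resume_name': find the first entry that
    resume_name matches and return the entry after it (None at the end)."""
    for i, entry in enumerate(ordered):
        if matches(resume_name, entry):
            return ordered[i + 1] if i + 1 < len(ordered) else None
    return None


def enumerate_directory(names):
    """List all entries one by one; raise if the listing stops advancing."""
    ordered = sorted(names, key=str.upper)  # rough stand-in for NTFS ordering
    if not ordered:
        return []
    seen = [ordered[0]]
    while True:
        nxt = next_item(ordered, seen[-1])
        if nxt is None:
            return seen
        if nxt in seen:
            # The resume name matched an earlier entry, so we were handed
            # an item we have already returned.
            raise RuntimeError(f"listing did not advance past {nxt!r}")
        seen.append(nxt)
```

With `["foo(", "foo*"]` the second "continue at" request resolves `foo*` to `foo(`, hands back `foo*` again, and the simulation raises; with `["foo+", "foo*"]` the wildcard name sorts first and the listing completes normally.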

As > is handled similarly to the ? wildcard, a folder named ">" causes the same problem by matching the previous "<" item before itself.

u1686_grawity
  • 426,297
  • 64
  • 894
  • 966
  • 3
    There’s an additional limitation here, the files have to be created via the POSIX filename namespace (which allows any character except `/`, null bytes, or trailing spaces in filenames), as they are explicitly disallowed when creating a file via the Win32 filename namespace (which enforces the usual Windows reserved character list, plus the above restrictions about null bytes or trailing spaces). In practice, this means that tools using the ‘classic’ Win32 API cannot create files that cause this issue. – Austin Hemmelgarn Nov 05 '22 at 14:38
  • @AustinHemmelgarn I don't know if that would be necessary. The question is to know where those limitations are enforced. Calling native API might be enough (for instance, calling native API, in the registry, you could create a key that would contain a \0 in the name, because the strings in native API are not null terminated) – user1532080 Nov 05 '22 at 17:08
  • 4
    @user1686 While your explanation might be correct overall (and I don't know that it is), I very much doubt that it's actually a kernel issue. While the Win32 API has MS-DOS-style FindFirst and FindNext calls, off the top of my head this is not the case in the native and kernel API. See https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-zwquerydirectoryfile – typically, you wouldn't want to keep transitioning in and out of the kernel for every single file. You can also look at NtQueryDirectoryFile, which is the Native API equivalent of ZwQueryDirectoryFile. – user1532080 Nov 05 '22 at 17:18
  • @user1686 Sorry I misread, you say "kernel" and "ntfs", so you don't mean transition between kernel and userspace. However I still doubt that NTFIS (NT Installable File System) has an interface style "find first find next", more likely "find the first XXX many"/"fill this buffer as much as you can". Now if I'm right, that only means it's the NTFS IFS driver that gets puzzled, not the kernel. – user1532080 Nov 05 '22 at 17:26
  • @user1532080: The second half of the post is mostly guesswork after experimenting with various kinds of garbage on a VHDX. I wasn't able to find where the error is raised, exactly – I've mostly skimmed through XP's source for FindNext and bits of the NTFS code, but there are probably several layers in-between that I missed (I didn't look inside NtQueryDirectoryFile itself yet, only saw that FindNext calls it). – u1686_grawity Nov 05 '22 at 18:47
  • 2
    POSIX filename: MacOS X _does_ allow trailing spaces. All it doesn't allow is null bytes and slash characters, and I'm quite sure it needs to be valid UTF-8. And for historical/hysterical reasons, colon and slash are exchanged in the UI. So you can create a folder that is apparently called test/test, but ls lists it as test:test. – gnasher729 Nov 06 '22 at 10:14
  • 2
    This seems very much like a *what the bleep* kind of bug. If I naively implemented `FindFirstFile`, `FindNextFile`, and `FindClose` in terms of `NtQueryDirectoryFile` (which has to work because POSIX APIs) such a bug would have no space to exist. – Joshua Nov 07 '22 at 16:04
  • Shouldn't this answer at least say something about redirection so he can search for it next time? – kmort Nov 07 '22 at 20:01
  • 1
    @kmort: All it could say about that is "Redirection is not involved here." – u1686_grawity Nov 07 '22 at 20:53
  • Lol. :-) I guess you're right, it's not directly involved.... but it's the reason the characters are disallowed. May help further understanding. But not required, of course. :-) – kmort Nov 07 '22 at 21:00
3

The characters < and > belong to the "reserved characters" group, which must not be used to name a file or directory in Windows.

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

The following reserved characters are not valid to name a file or directory:

< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
metablaster
  • 242
  • 1
  • 8
  • 3
    Windows won't allow these characters in file names, so if it finds one, it assumes it got there due to corruption. – gronostaj Nov 04 '22 at 13:25
  • 5
    My question, though, is why those characters break Windows. Also, in Fly's video, the other characters do not show the same effect; only the "<" and ">" characters do. Two folders named "J>" and "L<" will not break Windows. – inyourface3445 Nov 04 '22 at 13:45
1

As noted in another answer, these characters are in the reserved characters group:

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

The following reserved characters are not valid to name a file or directory:

< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)

There are very specific reasons for each of these. There are ways around them, but each has a fundamental function in command processing:

  • < (less than) - Input from a file
  • > (greater than) - Output to a file
  • : (colon) - Drive identifier (e.g., C:)
  • " (double quote) - Quote a file name that includes spaces
  • / (forward slash) - POSIX folder/directory separator
  • \ (backslash) - Windows folder/directory separator
  • | (vertical bar or pipe) - Use output of one process as input to another process
  • ? (question mark) - Single character wildcard
  • * (asterisk) - Multiple character wildcard
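
As a small illustration, here is a sketch of a check for the nine characters listed above. The function name `reserved_chars_in` is made up, and the check deliberately covers only this character list; it does not attempt the other Windows naming rules (reserved device names, trailing dots or spaces, and so on).

```python
# The nine reserved characters from the list above.
RESERVED_CHARS = '<>:"/\\|?*'


def reserved_chars_in(name):
    """Return the reserved characters found in name, in order of appearance."""
    return [ch for ch in name if ch in RESERVED_CHARS]
```

For example, `reserved_chars_in("report.txt")` returns an empty list, while `reserved_chars_in("a<b>c")` returns `["<", ">"]`.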
0

This confusing error may take many shapes, such as:

  • Error code 0x80070570
  • Error code 0x570
  • Error code 1392 (570 in hex)
  • ERROR_FILE_CORRUPT
  • Message "The file or directory is corrupted and unreadable"
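
These codes are all the same error in different wrappers: 0x80070570 is the HRESULT form of Win32 error 1392 (ERROR_FILE_CORRUPT). A minimal Python sketch of the standard HRESULT_FROM_WIN32 mapping shows the relationship; the function name is just a Python rendering of that macro, and only non-negative error codes are handled here.

```python
FACILITY_WIN32 = 7         # facility code used for wrapped Win32 errors
ERROR_FILE_CORRUPT = 1392  # 0x570


def hresult_from_win32(code):
    """Wrap a non-negative Win32 error code as an unsigned 32-bit HRESULT value."""
    if code == 0:
        return 0  # success stays success
    return (code & 0xFFFF) | (FACILITY_WIN32 << 16) | 0x80000000
```

Here `hex(hresult_from_win32(ERROR_FILE_CORRUPT))` gives `0x80070570`, and `hex(ERROR_FILE_CORRUPT)` gives `0x570`.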

A typical error message is:

[Image: screenshot of the error dialog]

In all cases, this means that the file or folder could not be accessed, although it exists in the NTFS Master File Table (MFT).

The other answers have explained well that some characters are illegal within a file name.

If one manages to create a file containing these illegal characters, usually by using Linux, for which these characters are legal, then we have an anomaly: on the one hand the file/folder exists in the file table, but on the other hand any attempt to open it is refused because its name doesn't pass the verification checks.

Faced with this contradiction, the Windows kernel gives up and returns the above error code, saying that something is corrupted, and it is up to the user to fix the bad file-entry.

This error message could have been worded more clearly, but what it means is that there is a contradiction between the contents of the file-table and the data on the disk, which for Windows means "corruption".

The message doesn't necessarily mean that the folder containing a file with such a name is corrupted. This generic message will be issued for a file and also for a folder whose name contains illegal characters.

harrymc
  • 455,459
  • 31
  • 526
  • 924