Violating the limitations of *NIX file systems

It all started when a friend showed me an amazing artifact: a flash drive with two files of the same name in the same directory. The explanation, of course, is simple: the camera is to blame, since it probably performs fewer checks when it records a frame.

This precedent prompted a search for answers to several questions. Is it possible to trick a computer’s operating system and break the restrictions of a file system? And if it works, how will the OS react?

A short tour of file systems and a set of experiments await you under the cut.


Most of the restrictions on file and directory names are self-explanatory.

  • File name length is limited. The structure that stores information about a file is not infinite.
  • The length of the absolute file path is limited. A little less obvious, but the reason is identical: the size of the buffer in which the absolute file path is stored is also not infinite.
  • The delimiter in the path, the symbol “/”, cannot be used.
  • The null character cannot be used – it is a sign of the end of the line.
  • There cannot be two files with the same name inside the same directory.

Overflowing a file name is not that interesting: the length of the name is usually stored in a separate integer field, and if the name overruns it, the structure simply loses its integrity. But what happens if you add a separator to a name or create two files with the same name? Can you guess?
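Before going around the operating system, it is worth seeing how it enforces these restrictions through the normal interface. Below is a minimal sketch, assuming a Linux-like system and Python 3; the helper name try_create is made up for illustration.

```python
import errno
import os


def try_create(directory, name):
    """Try to create an empty file and return "ok" or the reason it failed."""
    try:
        fd = os.open(os.path.join(directory, name), os.O_CREAT | os.O_WRONLY, 0o644)
    except ValueError:
        # Python refuses a NUL byte before the system call is even made.
        return "embedded NUL rejected"
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. ENAMETOOLONG.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"
```

Called on an empty scratch directory, a name with a slash is treated as a path through a directory that does not exist, and an overly long name is rejected by the kernel. None of the forbidden names ever reaches the directory.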


Everything is a file


The *NIX family of operating systems has a good idea that is described by the words “Everything is a file”. Because of this abstraction, virtually all I/O operations use a single interface consisting of the following system calls. This is not a complete list.

  • open(2) – opens a resource for reading or writing.
  • read(2) – reads bytes from the resource.
  • write(2) – writes bytes to the resource.
  • close(2) – closes the resource.

The pioneers of computer science were smart people who solved problems nobody had faced before. They left a great legacy, though it sometimes raises questions. For example, why is the system call that creates a file named creat(2) and not create(2)?

These calls allow access to almost all resources of the operating system.

  • To individual files and directories located on the drive.
  • Directly to the byte representation on the drive (files /dev/sd* and /dev/nvme*).
  • To an external device: printer, mouse, keyboard (files /dev/input/, /dev/mouse*).
  • Even the Berkeley sockets used to access the Internet support the listed operations, except that a socket is opened with the socket(2) system call rather than open(2).
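The uniformity of this interface is easy to see from Python, whose os module exposes thin wrappers over the same system calls. This is a small sketch, assuming a Linux-like system where /dev/zero exists; the helper names are illustrative.

```python
import os


def read_bytes(path, count):
    """Read up to `count` bytes from any path using the raw descriptor calls."""
    fd = os.open(path, os.O_RDONLY)      # open(2)
    try:
        return os.read(fd, count)        # read(2)
    finally:
        os.close(fd)                     # close(2)


def write_bytes(path, data):
    """Create or truncate `path` and write `data` through a descriptor."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        return os.write(fd, data)        # write(2), returns the number of bytes written
    finally:
        os.close(fd)
```

The same read_bytes call works for a regular file and for a device file such as /dev/zero. The caller does not need to know what kind of resource is behind the path.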

But we will not dive into the full depth of this abstraction; let’s start at the very beginning. A directory is also a file. Yes, it can even be read with the cat utility, though not on modern systems.

root@ubuntu-2204:~# cat /tmp
cat: /tmp: Is a directory
root@ubuntu-2204:~#

In Linux, this trick does not work, but in FreeBSD up to version 12, you can display the directory.

root@freebsd-11:~ # cat /tmp
X...X .X11-unix<X .XIM-unix<X  .ICE-unix<X
PERL5_DEFAULT\local.UgDju9Hno\dmail.RsZJCJr4Yt6\Hmail.RsygwgsVOYB4\(write-422194.oZxxhash-737aeb.oZbench-fb2faf.lz4io-a10809.olz4cli-b2a9d6.o
root@freebsd:~ #

The output is not just a set of non-printable characters and file names, but a textual representation of the directory’s content. A directory is an array of dirent structures, which look like this:

struct dirent {
   __uint32_t d_fileno;               /* file number of entry */
   __uint16_t d_reclen;               /* length of this record        */
   __uint8_t  d_type;               /* file type, see below */
   __uint8_t  d_namlen;               /* length of string in d_name */
#ifdef _POSIX_SOURCE
   char    d_name[255 + 1];               /* name must be no longer than this */
#else
#define MAXNAMLEN       255
   char    d_name[MAXNAMLEN        + 1];  /* name must be no longer than this */
#endif
};
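To make the layout concrete, here is a sketch that decodes such records from a raw byte buffer with Python’s struct module. It assumes the field order shown above and a little-endian machine; the function names and the demo record builder are hypothetical and are not part of any system library.

```python
import struct

# Fixed part of a record: d_fileno (u32), d_reclen (u16), d_type (u8), d_namlen (u8).
HEADER = struct.Struct("<IHBB")


def parse_dirents(raw):
    """Yield (fileno, type, name) for each record in a raw directory buffer."""
    offset = 0
    while offset + HEADER.size <= len(raw):
        fileno, reclen, dtype, namlen = HEADER.unpack_from(raw, offset)
        if reclen < HEADER.size:
            break  # a corrupt record length would otherwise stall the loop
        start = offset + HEADER.size
        name = raw[start:start + namlen]
        yield fileno, dtype, name.decode("utf-8", "replace")
        offset += reclen  # d_reclen, not d_namlen, leads to the next record


def make_dirent(fileno, dtype, name, reclen):
    """Build one record for experiments; the padding bytes are zeros."""
    body = HEADER.pack(fileno, reclen, dtype, len(name)) + name
    return body.ljust(reclen, b"\0")
```

Note that the record length and the name length are stored separately. The record can be much longer than the name because of alignment padding or unused space at the end of a block.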

For reading directories, there are separate functions that parse the directory contents into dirent structures: opendir(3), readdir(3) and closedir(3). But for academic purposes, you can do without them:

#include <stdio.h> // printf(3)
#include <sys/types.h> // open(2)
#include <sys/stat.h> // open(2)
#include <fcntl.h>  // open(2)
#include <errno.h>  // errno
#include <string.h> // strerror(3)
#include <unistd.h> // close(2)
#include <dirent.h>
#define BUF_SIZE 4096
int main(int argc, char* argv[])
{
        // Open for reading. Without the O_DIRECTORY flag there will be an error.
        int fd = open(argv[1], O_DIRECTORY | O_RDONLY);
        if(fd < 0) {
                printf("Cannot open %s: %s\n", argv[1], strerror(errno));
                return 1;
        }
        unsigned char buf[BUF_SIZE];
        while(1) {
                struct dirent dir;
                /* Read one field at a time */
                ssize_t count = read(fd, &dir.d_fileno, sizeof(dir.d_fileno));
                /* Errors are handled only once, for readability */
                if(count < 0) {
                        printf("Read error: %s\n",  strerror(errno));
                        return 2;
                }
                if(count == 0) break;
                count += read(fd, &dir.d_reclen, sizeof(dir.d_reclen));
                count += read(fd, &dir.d_type, sizeof(dir.d_type));
                count += read(fd, &dir.d_namlen, sizeof(dir.d_namlen));
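                /* The rest of the record (name plus padding) may be longer than d_name, so read it into buf */
                if(dir.d_reclen < count || (size_t)(dir.d_reclen - count) > sizeof(buf)) break;
                count += read(fd, buf, dir.d_reclen - count);
                memcpy(dir.d_name, buf, dir.d_namlen);
                dir.d_name[dir.d_namlen] = '\0';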
                printf("Entry:\n");
                printf("\td_fileno: %u\n", dir.d_fileno);
                printf("\td_reclen: %u (read: %zd)\n", dir.d_reclen, count);
                printf("\td_type: %u\n", dir.d_type);
                printf("\td_namelen: %u\n", dir.d_namlen);
                printf("\td_name: %s\n", dir.d_name);
        }
        close(fd);
        return 0;
}

Yes, this also works:

root@freebsd:~ # ./a.out /tmp
Entry:
        d_fileno: 11556864
        d_reclen: 12 (read: 12)
        d_type: 4
        d_namelen: 1
        d_name: .
Entry:
        d_fileno: 2
        d_reclen: 12 (read: 12)
        d_type: 4
        d_namelen: 2
        d_name: ..
Entry:
        d_fileno: 11556869
        d_reclen: 20 (read: 20)
        d_type: 4
        d_namelen: 9
        d_name: .X11-unix
Entry:
        d_fileno: 11556870
        d_reclen: 20 (read: 20)
        d_type: 4
        d_namelen: 9
        d_name: .XIM-unix
Entry:
        d_fileno: 11556871
        d_reclen: 20 (read: 20)
        d_type: 4
        d_namelen: 9
        d_name: .ICE-unix
Entry:
        d_fileno: 11556872
        d_reclen: 20 (read: 20)
        d_type: 4
        d_namelen: 10
        d_name: .font-unix
Entry:
        d_fileno: 11556915
        d_reclen: 408 (read: 408)
        d_type: 8
        d_namelen: 13
        d_name: PERL5_DEFAULT

By the way, the functions for working with directories say nothing about writing. But since we have a common interface, let’s try to open a directory for writing and break something.

#include <stdio.h> // printf(3)
#include <sys/types.h> // open(2)
#include <sys/stat.h> // open(2)
#include <fcntl.h>  // open(2)
#include <errno.h>  // errno
#include <string.h> // strerror(3)
#include <unistd.h> // close(2)
int main(int argc, char* argv[])
{
        /* First try opening for writing; maybe it will just work */
        int fd = open(argv[1], O_DIRECTORY | O_WRONLY);
        if(fd < 0) {
                printf("Cannot open for write %s: %s\n", argv[1], strerror(errno));
                return 1;
        }
        return 0;
}

Compile, run.

root@freebsd:~ # cc write.c 
root@freebsd:~ # ./a.out /tmp
Cannot open for write /tmp: Is a directory
root@freebsd:~ #

It didn’t work out. I tested this code on FreeBSD 11, which was released in 2016. Maybe canonical UNIX has no such restriction? Let’s rewrite the code in K&R syntax. There is no O_WRONLY constant there either, so we look into the documentation and pick the number 1.

int main(argc, argv) 
    int argc;
    char* argv[];
{
  int fd;
  fd = open(argv[1], 1);
  if(fd < 0) perror("open");
  return 0;
}

Let’s run it. To test hypotheses like this, you can use the PDP-11/70 JavaScript emulator, which is available in the browser. It also comes with Unix v5 and 2.11BSD images. Let’s run the code on Unix v5.

# cc a.c
# ./a.out /tmp
open: Is a directory

That was expected. Moreover, attempts to find anything resembling the dirent structure on the drive are unsuccessful. The dirent structure is an abstraction, independent of the file system used. The kernel fills in such structures on request, so there is nothing to write to.
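The same behavior can be checked on a modern system from Python. This is a sketch, assuming Linux semantics for open(2); the helper names are illustrative. Opening a directory for writing fails, while the supported route, os.scandir, returns entries that the kernel assembles on request.

```python
import errno
import os


def open_dir_for_write(path):
    """Return the errno name produced by opening a directory with O_WRONLY."""
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError as exc:
        return errno.errorcode[exc.errno]
    os.close(fd)
    return "ok"


def list_names(path):
    """List entries the supported way, without touching the on-disk format."""
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries)
```

On Linux the first function is expected to report EISDIR for any directory, which matches the "Is a directory" message seen in the experiments.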

OK, we can’t change the contents of a directory from user space. Then let’s cheat!

Do not trust user data


If we cannot feed incorrect data through the operating system, we will do it behind its back. For example, let’s open the disk in a HEX editor on Windows. Windows is an exaggeration, of course, but the main idea stands:

  1. create a “disk” in the form of a 50 MB file,
  2. create a file system inside the file,
  3. mount the file to the mnt directory,
  4. create a file named 12345678,
  5. unmount the “disk”,
  6. change the file name using the HEX editor,
  7. mount the “disk” again and check whether the trick worked.

Thanks to the “everything is a file” idea, the operating system sees no difference between a disk device file and a regular file. This lets you create a file system in a regular file instead of a disk partition, and then mount that file as if it were a disk.

We prepare:

# Create the file
dd if=/dev/zero of=img bs=1M count=50
# Create the file system
mkfs.ext4 img
# Mount the disk
mkdir mnt
mount img mnt
# An empty file
touch mnt/12345678
# Unmount
umount mnt
# Rename 12345678 to 87654321
sed -i 's/12345678/87654321/g' img
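A blind global replacement changes every occurrence of the byte string in the image, including any that happen to sit in file contents. As a more cautious alternative, here is a sketch in Python that first locates the matches and only patches when there is exactly one. The function names are hypothetical, and the whole image is assumed to fit in memory.

```python
def find_all(data, pattern):
    """Return every offset at which `pattern` occurs in `data`."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def patch_once(data, old, new):
    """Replace `old` with `new` only if the match is unique and the length is unchanged."""
    if len(old) != len(new):
        raise ValueError("replacement must keep the image layout intact")
    hits = find_all(data, old)
    if len(hits) != 1:
        raise ValueError(f"expected exactly one match, found {len(hits)}")
    return data[:hits[0]] + new + data[hits[0] + len(old):]
```

Keeping the replacement the same length matters: inserting or removing even one byte would shift everything after it and break the offsets the file system relies on.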

For renaming I used the non-interactive text editor sed, which does the job. Let’s mount:

void@ubuntu-2204:~# mount img mnt
void@ubuntu-2204:~# ls -l mnt
ls: reading directory 'mnt': Bad message
total 0

Whoa, bad message. A more detailed explanation can be found in the kernel ring buffer:

[36446.876771] EXT4-fs error (device loop1): htree_dirblock_to_tree:1080: inode #2: comm ls: Directory block failed checksum

That makes sense. File systems work with hardware that, due to its physical nature, can fail. Checksums allow errors to be detected and reported. Let’s use the e2fsck utility to check the file system.

void@ubuntu-2204:~# e2fsck img 
e2fsck 1.46.5 (30-Dec-2021)
img contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 2, block #0: directory passes checks but fails checksum.
Fix<y>? yes
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information
img: ***** FILE SYSTEM WAS MODIFIED *****
img: 13/12800 files (7.7% non-contiguous), 1840/12800 blocks
void@ubuntu-2204:~#
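The idea behind the failed checksum can be shown in a few lines of Python. This is only a sketch: ext4 uses crc32c and mixes in additional file system data, whereas zlib.crc32 is used here as a stand-in, and the function names are made up.

```python
import zlib


def block_checksum(block):
    """Checksum of a directory block; zlib's CRC-32 stands in for ext4's crc32c."""
    return zlib.crc32(block) & 0xFFFFFFFF


def is_intact(block, stored_checksum):
    """True if the block still matches the checksum recorded when it was written."""
    return block_checksum(block) == stored_checksum
```

If the checksum is stored when the block is written and compared on every read, an edit made behind the driver’s back, even a single changed byte in a name, no longer matches the stored value. A tool like e2fsck can then recompute and store a new checksum once the block passes its structural checks.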

Mount and check again:

void@ubuntu-2204:~# ls -l mnt/
total 16
-rw-r--r-- 1 void root     0 Mar  5 18:44 87654321
-rw-r--r-- 1 void root     0 Mar  5 18:31 blablabla
drwx------ 2 void root 16384 Mar  5 18:26 lost+found

It worked. Now for the first test: we create two files and then rename them to the same name.

void@ubuntu-2204:~/mnt# ls -li
total 24
12 -rw-r--r-- 1 void root     2 Mar  5 19:26 1.txt
13 -rw-r--r-- 1 void root    20 Mar  5 19:26 2.txt
11 drwx------ 2 void root 16384 Mar  5 18:26 lost+found
void@ubuntu-2204:~/mnt# cat 1.txt 
1
void@ubuntu-2204:~/mnt# cat 2.txt 
2222222222222222222
void@ubuntu-2204:~/mnt# 

We rename both files to a.txt in the image and run e2fsck to recalculate the checksum:

void@ubuntu-2204:~# e2fsck img
e2fsck 1.46.5 (30-Dec-2021)
img contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Duplicate entry 'a.txt' found.
        Marking ??? (2) to be rebuilt.
Directory inode 2, block #0: directory passes checks but fails checksum.
Fix<y>? yes
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Entry 'a.txt' in / (2) has a non-unique filename.
Rename to a.txt~0<y>? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information
img: ***** FILE SYSTEM WAS MODIFIED *****
img: ********** WARNING: Filesystem still has errors **********
img: 13/12800 files (7.7% non-contiguous), 1842/12800 blocks
void@ubuntu-2204:~# mount img mnt/

The e2fsck utility fixes the checksum and offers a unique name. We, in turn, decline the rename and mount the “disk”. What do we see?

void@ubuntu-2204:~# ls -li mnt
total 24
12 -rw-r--r-- 1 void root     2 Mar  5 19:26 a.txt
12 -rw-r--r-- 1 void root     2 Mar  5 19:26 a.txt
11 drwx------ 2 void root 16384 Mar  5 18:26 lost+found
void@ubuntu-2204:~# cat mnt/a.txt 
1

The file system driver shows a small anomaly: the file with the lower numeric identifier, the inode number, is listed twice. Meanwhile, every request from the shell reaches the first file.

void@ubuntu-2204:~# rm mnt/a.txt 
void@ubuntu-2204:~# ls -l mnt/
ls: cannot access 'mnt/a.txt': No such file or directory
total 16
? -????????? ? ?    ?        ?            ? a.txt
drwx------ 2 void root 16384 Mar  5 18:26 lost+found
void@ubuntu-2204:~# cat mnt/a.txt 
cat: mnt/a.txt: No such file or directory
void@ubuntu-2204:~# umount mnt 
void@ubuntu-2204:~# mount img mnt/
void@ubuntu-2204:~# ls -li mnt/
total 20
13 -rw-r--r-- 1 void root    20 Mar  5 19:26 a.txt
11 drwx------ 2 void root 16384 Mar  5 18:26 lost+found
void@ubuntu-2204:~# cat mnt/a.txt 
2222222222222222222

If you delete the file, its “namesake” enters the state of Schrödinger’s cat: the file is listed and name autocompletion works, but the file cannot be accessed. After the file system is remounted, the second file is found without problems.
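One plausible explanation for the doubled inode number is that ls -l takes the names from the directory listing and then looks each name up again to get its attributes. The toy model below illustrates that idea in Python. It is an assumption about the mechanism, not the ext4 driver’s code, and it deliberately ignores kernel caching, which is presumably why the second file only shows up after a remount.

```python
def lookup(entries, name):
    """Return the inode of the first entry with this name, as a linear scan would."""
    for entry_name, inode in entries:
        if entry_name == name:
            return inode
    return None


def ls_li(entries):
    """Mimic `ls -li`: take names from the listing, then look each name up again."""
    return [(lookup(entries, name), name) for name, _ in entries]
```

In this model, a directory holding two a.txt entries with inodes 12 and 13 is listed as inode 12 twice, and the entry for inode 13 becomes reachable only after the first one is gone.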

Now let’s do the scary thing and add a separator character:

void@ubuntu-2204:~# cat >mnt/split-me.txt 
Hello, Habr!
void@ubuntu-2204:~# umount mnt
void@ubuntu-2204:~# sed -i 's/split-me.txt/split\/me.txt/g' img
void@ubuntu-2204:~# e2fsck img 
e2fsck 1.46.5 (30-Dec-2021)
img contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'split/me.txt' in ??? (2) has illegal characters in its name.
Fix<y>? no
Directory inode 2, block #0: directory passes checks but fails checksum.
Fix<y>? yes
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information
img: ***** FILE SYSTEM WAS MODIFIED *****
img: ********** WARNING: Filesystem still has errors **********
img: 12/12800 files (8.3% non-contiguous), 1841/12800 blocks

The check finds such an error, but does not consider it critical. OK, let’s mount:

void@ubuntu-2204:~# ls -l mnt/
ls: reading directory 'mnt/': Input/output error
total 16
drwx------ 2 void root 16384 Mar  5 18:26 lost+found
void@ubuntu-2204:~# ls -l mnt/split/me.txt
ls: cannot access 'mnt/split/me.txt': No such file or directory
void@ubuntu-2204:~# mkdir mnt/split
void@ubuntu-2204:~# cat mnt/split/me.txt
cat: mnt/split/me.txt: No such file or directory
void@ubuntu-2204:~# echo Wow > mnt/split/me.txt
void@ubuntu-2204:~# cat mnt/split/me.txt 
Wow

The results are not long in coming.

  • The display of directory contents is partially broken. Objects with an inode number greater than that of the “broken” file will not be included in the output.
  • The “broken” file itself is not accessible.
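The reason the file cannot be reached is that path lookup splits the path on the separator before it compares names, so a single component can never contain “/”. Here is a toy illustration in Python, with a nested dictionary standing in for the directory tree; the function name resolve is made up.

```python
def resolve(tree, path):
    """Walk a nested dict the way path lookup walks directories, one component at a time."""
    node = tree
    for component in path.split("/"):
        if not isinstance(node, dict) or component not in node:
            return None
        node = node[component]
    return node
```

With an entry literally named split/me.txt stored in the mnt dictionary, resolving mnt/split/me.txt looks for a directory called split and finds nothing. Once a real split directory with its own me.txt is created, the same path resolves to the new file and the broken entry stays out of reach.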

On the one hand, this is an example of good fault tolerance. On the other hand, this problem is not a day or two old but 45 years old, if we take the release of UNIX v7 as the starting point.

This article is devoted to the *NIX family of operating systems. But while writing it, I became curious about how file systems from the Windows world handle such situations.

exFAT

Taking NTFS seems excessive. But the article started with a glitch on a camera’s flash drive, which uses a Microsoft file system: exFAT. It is the successor to FAT, designed for removable media. A great candidate, although on Ubuntu 22.04 you need to install exfat-fuse.

Now you can create an image as before:

void@ubuntu-2204:~# dd if=/dev/zero of=img bs=1M count=50
50+0 records in
50+0 records out
52428800 bytes (52 MB, 50 MiB) copied, 0.024757 s, 2.1 GB/s
void@ubuntu-2204:~# mkfs.exfat img
exfatprogs version : 1.1.3
Creating exFAT filesystem(img, cluster size=4096)
Writing volume boot record: done
Writing backup volume boot record: done
Fat table creation: done
Allocation bitmap creation: done
Upcase table creation: done
Writing root directory entry: done
Synchronizing...
exFAT format complete!
void@ubuntu-2204:~# mount img mnt
void@ubuntu-2204:~# cat > mnt/1.txt
111111111111111111111111
void@ubuntu-2204:~# cat > mnt/2.txt 
222222222222222222222

sed will fail to replace the file name: exFAT uses UTF-16 encoding for names. OK, let’s use the Python interpreter and write the following script:

import sys
file, pattern, replace = sys.argv[1:4]
pattern = pattern.encode("utf-16le")
replace = replace.encode("utf-16le")
with open(file, "rb") as f:
    raw_data = f.read()
raw_data = raw_data.replace(pattern, replace)
with open(file, "wb") as f:
    f.write(raw_data)

It is memory inefficient, but a 50 MB image is not a problem. Let’s do the replacement:

void@ubuntu-2204:~# python3 replace.py img 2.txt 1.txt
void@ubuntu-2204:~# mount img mnt
void@ubuntu-2204:~# ls -li mnt/
total 8
10 -rwxr-xr-x 1 void root 25 Mar  5 20:48 1.txt
10 -rwxr-xr-x 1 void root 25 Mar  5 20:48 1.txt
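For larger images, the same UTF-16 replacement can be done without reading the whole file into memory. The sketch below uses Python’s mmap module and assumes a non-empty image and names that encode to the same number of bytes. The function name replace_in_place is hypothetical.

```python
import mmap


def replace_in_place(path, old, new, encoding="utf-16le"):
    """Overwrite every occurrence of `old` with `new` without loading the file."""
    old_raw, new_raw = old.encode(encoding), new.encode(encoding)
    if len(old_raw) != len(new_raw):
        raise ValueError("names must encode to the same number of bytes")
    count = 0
    # Map the file read-write; slice assignment writes straight into the image.
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as image:
        position = image.find(old_raw)
        while position != -1:
            image[position:position + len(old_raw)] = new_raw
            count += 1
            position = image.find(old_raw, position + len(old_raw))
    return count
```

The encoding step is also what makes an ASCII pattern miss: in UTF-16LE every ASCII character of the name is followed by a zero byte, so the plain byte string 2.txt does not occur in the directory entry.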

Ubuntu’s behavior has not changed: only the first file is accessible. The same goes for the UNIX separator. The Windows separator, however, causes confusion, although the file is still listed:

void@ubuntu-2204:~# ls -l mnt/
ls: cannot access 'mnt/2\txt': No such file or directory
total 4
-rwxr-xr-x 1 void root 25 Mar  5 20:48  1.txt
-????????? ? ?    ?     ?            ? '2\txt'
void@ubuntu-2204:~# ls -l mnt/2\\txt 
ls: cannot access 'mnt/2\txt': No such file or directory

Let’s write the image to a flash drive and check it on Windows.

dd if=img of=/dev/<my disk here> bs=4M

Notifications in Windows 10.

Error correction cannot be ignored.

Windows immediately detects the drive and offers to fix the errors. Error correction is an offer you can’t refuse, because otherwise the contents of the drive cannot be viewed in the graphical interface.

Console top.

The command line will show the first files, but then hang on the “broken” file. However, it is possible to read files nearby.

Abstractions for *NIX file systems are not only a universal interface, but also excellent resistance to errors. What do you think about the artifacts in question? Share your opinion in the comments!
