Inside FAT: File Recovery
Read about the process of data recovery from a FAT disk and the algorithm used for file recovery. Now when we found the file system, we can start analyzing its records. Our goal is identifying addresses of the physical sectors on the disk that contain data belonging to a deleted file. In order to do that, a data recovery algorithm will scan the file system and enumerate its records.
- Search algorithm and the structure of the FAT partition table
- Recovering Files
- Conclusion
- Questions and answers
- Comments
Search algorithm and the structure of the FAT partition table
In the first part of the article, we took a look at the search algorithm and the structure of the FAT partition table. In the FAT system, each file and directory has a corresponding record in the file system, a so-called directory entry. Directory entries contain information about the file including its name, attributes, initial address and length.
The content of a file or directory is stored in data blocks of equal length. These data blocks are called clusters. Each cluster contains a certain number of disk sectors. This number is a fixed value for each FAT volume. It’s recorded in the corresponding file system structure.
The tricky part is when a file or directory contains more than a single cluster. Subsequent clusters are identified with data structures called FAT (File Allocation Table). These structures are used to identify subsequent clusters that belong to a certain file, and to identify if a particular cluster is occupied or available.
Before analyzing the file system, it is essential to identify the three system areas.
- The first area is reserved; it contains essential information about the file system. In FAT12 and FAT16, this area is one sector long. FAT32 can use more than one sector. The size of this area is specified in the boot sector.
- The second area belongs to the FAT system, and contains primary and secondary structures of the file system. This area immediately follows the reserved area. Its size is defined by the size and number of FAT structures.
- Finally, the last area contains the actual data. The content of files and directories is stored in this particular area.
When analyzing the file system, the FAT area will be of principal interest. It is this area that contains information on files’ physical addresses on the disk.
Figure 1. Physical structure of the FAT file system.
When analyzing the file system, it is essential co correctly determine the three system areas. The reserved area always begins at the very beginning of the file system (sector number 0). The size of this area is specified in the boot sector. In FAT12 and FAT16 the size of this area is exactly one sector. In FAT32, this area may occupy several sectors.
The FAT area immediately follows the reserved area. The FAT area contains one or more FAT structures. The size of this area is calculated by multiplying the number of FAT structures by the size of each structure. These values are also stored in the boot sector.
Recovering Files
We’re finally close to recovering our first file. Let’s assume the file has been just recently deleted, and no part of the file was overwritten with other data. This means that all clusters previously used by this file are now marked as available.
It is important to note that the system also can erases the corresponding FAT records. This means that we can get information about the file’s initial address, its attributes and size, but have no way to obtain data on any subsequent clusters.
At this point, we cannot recover the entire list of clusters that belong to the deleted file. However, we can still try to recover the file’s content by reading the first cluster. If the file is reasonably small and fits into a single cluster, great! We’ve just recovered the file. If, however, the file is larger than the size of a single cluster, we have to develop an algorithm to recover the rest of the file.
The FAT system offers no easy way to determine which clusters belong to a deleted file, so this task is always a bit of a guessing game. The simplest way is just reading the clusters following the initial one, ignoring whether or not these clusters are occupied by other files. However silly it may sound, this is the only method available if no file system is available or if the file system is empty (e.g. after formatting the disk).
The other method is more sophisticated, only reading information from clusters that aren’t occupied with data belonging to other files. This method takes into account information on clusters occupied by other files as specified in the file system.
It is logical to assume that the second method yields better results compared to the first one (assuming that the file system is available and not empty). The second method can even recover some fragmented files.
Figure 2 shows three different scenarios of recovering a file occupying 6 clusters of the file system. The file size is 7094 bytes; cluster size is 2048 bytes. This means that the deleted file initially occupied 4 clusters. In addition, we know the address of the initial cluster (cluster 56). For demonstration purposes, clusters that used to belong to the deleted files are colored blue. Red color marks clusters occupied with other data, while empty clusters are filled white.
Figure 2. Three file recovery scenarios.
- In scenario 2.A, the file occupies 4 subsequent clusters (that is, the file is not fragmented). In this case, the file can be recovered correctly by either algorithm. Both algorithms will correctly read clusters 56 through 59.
- In scenario 2.B, the file was fragmented and stored in 3 fragments. Clusters 57 and 60 are used by other file. In this scenario, the first algorithm will recover clusters 56 through 59, which will return a corrupted file. The second method will correctly recover clusters 56, 58, 59 and 61.
- In the final scenario 2.C, the deleted file was also fragmented (same clusters as in 2.B). However, clusters 57 and 60 are not used by any other file. In this scenario, both algorithms will recover clusters 56 through 59, both returning a corrupted file.
As we can see, neither method is perfect, but the second algorithm offers a higher chance of successful recovery compared to the first one. This method is used in Hetman Partition Recovery.
In our simple scenario we assumed that all parts of the file are still available and not overwritten with other data. In real life, this is not always the case. If some parts of a file are taken by other files, no algorithm will be able to recover the file completely.
Conclusion
As we could see, the FAT file system is pretty simple. This is exactly the reason it is so widely used by manufacturers of CompactFlash, Memory Stick and xD-Picture Card. However, when it comes to data recovery, FAT is not the best file system around. With a certain level of fragmentation, some files may or may not be recovered regardless of which approach is chosen.
Fortunately, Windows no longer uses the FAT system to format internal hard drives. The use of a much newer and more sophisticated file system, the NTFS, is enforced when formatting a system disk, and highly recommended when formatting other partitions. The NTFS is a much more complicated file system. NTFS internals will be discussed in the upcoming article.
What are some common applications and devices that use the FAT file system?
How does the FAT file system compare to other file systems, such as NTFS and exFAT?
FAT (File Allocation Table) is an older file system that was originally designed for small disks with limited storage capacity. It is still used in some embedded systems and USB flash drives, but has been largely replaced by other more modern file systems such as NTFS and exFAT. NTFS (New Technology File System) is a modern file system that provides better support for large disks, security, and data recovery. It is the default file system used by Windows operating systems. exFAT (Extended File Allocation Table) is a newer file system designed to replace FAT32. It is optimized for flash drives and other removable storage devices, and provides improved performance, reliability, and scalability. It is also supported by most modern operating systems.
What is the difference between FAT12, FAT16, and FAT32 file systems?
FAT12 is an older file system used in floppy disks and other small storage devices. It is limited to a maximum volume size of 32MB, and a maximum file size of 2GB. FAT16 was the standard file system for DOS and Windows 3.x operating systems. It supports larger volumes and file sizes than FAT12, up to 2GB and 4GB respectively. FAT32 is an improved version of FAT16 that was introduced with Windows 95 OSR2. It supports larger volumes and file sizes up to 2TB and 4GB respectively. It also has improved cluster sizing and disk space utilization.
How can data be recovered from a corrupted FAT file system?
Data can be recovered from a corrupted FAT file system by using a data recovery software. These programs are designed to scan the file system and extract any recoverable data from it. Additionally, some data recovery programs can repair corrupted file systems, allowing for the recovery of data that would otherwise be lost.
How does the FAT file system handle file fragmentation?
The FAT file system uses a technique called cluster-based file allocation to manage file fragmentation. Clusters are groups of sectors that are used to store data. When a file is created, the FAT file system will allocate a number of clusters to store the file. If the file size is larger than the available clusters, it will be split into multiple parts and stored in different clusters. The FAT file system keeps track of which clusters are used to store each part of the file. When the file is accessed, the FAT file system will read the clusters in order and assemble the file.
Read about recovering deleted files in FAT file system. If you do have any questions, don't hesitate to contact our technical support service - we will be happy to help you.