Alistair Delva | beaee83 | 2021-02-24 11:27:23 -0800 | [diff] [blame] | 1 | % Copyright 2003,2005,2007 Alain Knaff. |
| 2 | |
| 3 | % This documentation is for Mtools which is a collection of tools to |
| 4 | % allow Unix systems to manipulate MS-DOS files. |
| 5 | |
| 6 | % Permission is granted to copy, distribute and/or modify this document |
| 7 | % under the terms of the GNU Free Documentation License, Version 1.3 or |
| 8 | % any later version published by the Free Software Foundation; with no |
| 9 | % Invariant Sections, with no Front-Cover Texts, and with no Back-Cover |
| 10 | %Texts. A copy of the license is included in the section entitled |
| 11 | % ``GNU Free Documentation License''. |
| 12 | |
| 13 | \documentclass[a4paper,12pt]{article} |
| 14 | |
| 15 | \usepackage[T1]{fontenc} |
| 16 | \usepackage[latin1]{inputenc} |
| 17 | \usepackage{pslatex} |
| 18 | \usepackage[pdfpagemode=None,colorlinks]{hyperref} |
| 19 | |
| 20 | \author{Alain Knaff} |
| 21 | \title{How mformat-3.9.10 and above calculates needed FAT size} |
| 22 | |
| 23 | \begin{document} |
| 24 | |
| 25 | \maketitle |
| 26 | |
| 27 | This small document explains the formula used by {\tt mformat.c} to |
| 28 | figure out fat size and number of clusters. Due to the way that |
| 29 | filesystem parameters are stored in the boot sector, we can afford to |
| 30 | have a FAT that is larger than need to be to accommodate the clusters |
| 31 | present on disk, but under no circumstances can we have one that is |
| 32 | too small. |
| 33 | |
| 34 | In this discussion, we use the following variable names: |
| 35 | |
| 36 | \begin{tabular}{|r|p{12cm}|} |
| 37 | |
| 38 | \hline |
| 39 | |
| 40 | $fatNybls$& |
| 41 | Number of nubbles (4 bit unit per FAT). This is 3 for FAT12, 4 for |
| 42 | FAT16, and 8 for FAT16\\ |
| 43 | |
| 44 | \hline |
| 45 | |
| 46 | $numClus$& |
| 47 | Number of clusters on the disk\\ |
| 48 | |
| 49 | \hline |
| 50 | |
| 51 | $clusSiz$& |
| 52 | Size of a cluster, expressed in sectors\\ |
| 53 | |
| 54 | \hline |
| 55 | |
| 56 | $secSiz$& |
| 57 | Size of a sector, in bytes. Size of a sector in nybbles is $secSiz$ * 2\\ |
| 58 | |
| 59 | \hline |
| 60 | |
| 61 | $nfats$& |
| 62 | Number of FAT copies, usually two\\ |
| 63 | |
| 64 | \hline |
| 65 | |
| 66 | $remSects$& |
| 67 | ``Remaining sectors'', after number of boot sectors and root directory |
| 68 | have been accounted for\\ |
| 69 | |
| 70 | \hline |
| 71 | |
| 72 | $fatLen$& |
| 73 | Length of the FAT, in sectors\\ |
| 74 | |
| 75 | \hline |
| 76 | |
| 77 | |
| 78 | \end{tabular} |
| 79 | |
| 80 | \ \\ |
| 81 | |
| 82 | Taking into account both data and fat, each cluster takes up the |
| 83 | following number of nybbles (units of 4 bytes): |
| 84 | |
| 85 | |
| 86 | $$clusSiz * (2*secSiz) + nfats * fatNybls$$ |
| 87 | |
| 88 | This accounts for the data of the cluster ($clusSiz * secSiz$), |
| 89 | as well as for the space taken up by its descriptor. |
| 90 | |
| 91 | The space taken up by all clusters together, plus the space taken by |
| 92 | descriptors for clusters 0 and 1 ($2*nfats*fatNybls$) should be less |
| 93 | than what is available. |
| 94 | |
| 95 | Additional sectors may be lost due to slack (you have to use a full |
| 96 | FAT sector, you also have to use a full cluster in the data |
| 97 | area). Thus, an {\em upper bound} for the number of clusters is as |
| 98 | follows: |
| 99 | |
| 100 | $$ |
| 101 | numClus \le {2*remSect*secSiz - 2*nfats*fatNybls \over |
| 102 | 2*clusSiz*secSiz + nfats*fatNybls} |
| 103 | $$ |
| 104 | |
| 105 | |
| 106 | $$ |
| 107 | numClus \le {(remSect+2*clusSiz)*2*secSiz \over |
| 108 | 2*clusSiz*secSiz + nfats*fatNybls} - 2 |
| 109 | $$ |
| 110 | |
| 111 | |
| 112 | These will take up at most the following space in one copy of the FAT |
| 113 | (we have to round up, because even a half-full fat sector must be |
| 114 | taken in its entirety) |
| 115 | |
| 116 | $$ |
| 117 | fatLen \le \left\lceil { (numClus+2)*fatNybls \over secSiz * 2 } \right\rceil |
| 118 | $$ |
| 119 | |
| 120 | |
| 121 | $$ |
| 122 | fatLen \le \left\lceil { |
| 123 | \left( { 2*(remSect+2*clusSiz)*secSiz \over |
| 124 | 2*clusSiz*secSiz + nfats*fatNybls} \right) * fatNybls \over |
| 125 | 2*secSiz |
| 126 | } \right\rceil |
| 127 | $$ |
| 128 | |
| 129 | |
| 130 | $$ |
| 131 | fatLen \le \left\lceil { |
| 132 | (remSect+2*clusSiz)* fatNybls \over |
| 133 | 2*clusSiz*secSiz + nfats*fatNybls |
| 134 | } \right\rceil |
| 135 | $$ |
| 136 | |
| 137 | The space left after FAT sector has been deduced is now {\em less than |
| 138 | or equal} to what would be needed for the data area of the clusters |
| 139 | (including fractional clusters), which is good, as we may have under |
| 140 | no circumstances {\em more} clusters in the data area than in the FAT. |
| 141 | An important point in this calculation is that we based the needed FAT |
| 142 | size on a {\em fractional} number of clusters, rather than a rounded |
| 143 | down amount of clusters. Indeed, using a rounded down number could |
| 144 | have exposed us to a situation where we had an {\em almost enough} |
| 145 | space for one more cluster (i.e. not enough space for data + FAT, but |
| 146 | enough for data alone). This situation, combined with a situation |
| 147 | where the last FAT sector was flush full could have lead to a |
| 148 | situation where $numClus$ would become too large for the FAT to |
| 149 | accommodate. I think this is clearer with an example: |
| 150 | \begin{itemize} |
| 151 | \item $remSect=4127$, $clusSiz=1$, $nfats=1$ |
| 152 | \item (Non rounded down) $numClus={(4127+2)*(1024) \over 1032} - |
| 153 | 2 =4094.992$ |
| 154 | \item Rounded down: 4094 clusters |
| 155 | \item These fit into 16 FAT sectors, exactly |
| 156 | \item ... leaving us 4095 clusters, which is one to many (because |
| 157 | these 4095 clusters would now need 17 FAT clusters) |
| 158 | \end{itemize} |
| 159 | |
| 160 | Keeping the fractional part (0.992) allows us to round {\em up} the |
| 161 | needed number of FAT sectors to 17, nicely solving this problem. |
| 162 | |
| 163 | The downside of counting the fractional part however is that we quite |
| 164 | often waste a sector in the really common situation where both $nfats$ |
| 165 | and $clusSiz$ are even, while $remSect$ is odd. An easy way to address |
| 166 | this is to subtract one from $remSect$ before application of the |
| 167 | formula, if this case is detected. Such operation carries no risk, as |
| 168 | the odd final sector cannot be used to make a full cluster. |
| 169 | |
| 170 | There is still a case however, where fatLen must be grown manually |
| 171 | after application of the formula: If numClus exceeds the maximum |
| 172 | number of clusters allowable for FAT12 or FAT16), we need to shrink |
| 173 | $numClus$ after application of the formula, and manually make the FAT |
| 174 | larger in order to take up any excess space. |
| 175 | |
| 176 | Also note that as we used upper bounds, we may waste a certain number |
| 177 | of sectors, which an exact calculation may not have wasted. However, |
| 178 | normally, we should not lose more than one sector per FAT copy. |
| 179 | |
| 180 | N.B. In its document at \url{http://www.microsoft.com/hwdev/download/hardware/fatgen103.pdf}, |
| 181 | Microsoft proposes a much simpler formula. However, this formula is |
| 182 | both wrong (i.e. occasionally it produces a smaller FAT than is |
| 183 | needed for the clusters on disk), less generic (only works for sector |
| 184 | sizes equal to 512), and less efficient (in case of FAT32, it may |
| 185 | waste up to 8 sectors!) |
| 186 | |
| 187 | The formula is the following (for FAT16): |
| 188 | $$ |
| 189 | fatLen \le \left\lceil { remSect \over 256 * clusSiz + nfats} \right\rceil |
| 190 | $$ |
| 191 | |
| 192 | Note that it doesn't account for the dummy clusters 0 and 1. Thus, if |
| 193 | we have 258 sectors remaining, with a cluster size of 1, and two FAT |
| 194 | copies, the Microsoft formula mistakenly assumes fatLen = 1. This |
| 195 | leaves 258 - 2 = 256 sectors for clusters, which yields 256 clusters. |
| 196 | However, those clusters do not fit into the FAT, as two clusters are |
| 197 | lost (0 and 1). However, to Micro\$ofts' credit, this is not actually |
| 198 | the formula they're using (tested with $remsect=160056$ and |
| 199 | $clusSize=4$), so this seems to be a documentation issue rather than a |
| 200 | genuine bug. |
| 201 | |
| 202 | In case of FAT32, Microsoft just halves the denominator. This is |
| 203 | wasteful, as only the $256*clusSiz$ part would need to be halved. |
| 204 | |
| 205 | If we assume 16777000, and a cluster size of 8, our formula would give |
| 206 | us: |
| 207 | $$fatLen = \left\lceil (16777000 + 16) * 8 \over 2 * 8 * 512 + 16 |
| 208 | \right\rceil = 16352$$ |
| 209 | This would leave $16777000-2*16352=16744296$ sectors for the clusters, |
| 210 | i.e. 2093037 clusters. The FAT descriptors for those 2093037 clusters |
| 211 | do indeed fit into our 16352 fat sectors. |
| 212 | |
| 213 | Microsoft, on the other hand, would get: $$fatLen = \left\lceil{ |
| 214 | 16777000 \over (256 * 8 + 2)/2} \right\rceil = 16368$$ This would only |
| 215 | leave $16777000-2*16368=16744264$, i.e. 2093033 clusters, thus wasting |
| 216 | 4 clusters. The FAT would be 28 sectors too big, i.e. more than the |
| 217 | mere 8 sectors announced by Microsoft! Unfortunately, I currently |
| 218 | don't have access to any sufficiently recent Windows to check out |
| 219 | whether this really happens in the code, or whether it is only a |
| 220 | documentation issue as well. |
| 221 | |
| 222 | Oh, and before somebody points it out: the formula in this document |
| 223 | occassionnally wastes sectors too, although not as much (Example: With |
| 224 | $remsect=685$, $clusSiz=1$ and $nfats=2$ our formula gives $fatLen=3$, |
| 225 | which leaves 679 clusters. However, we could use $fatLen=2$, leaving |
| 226 | 681 clusters. |
| 227 | |
| 228 | \end{document} |