Blame - fat_size_calculation.tex - platform/external/mtools

blob: f5189a303ee3ef792ec0feefb928c4c0396a22d1 [file] [log] [blame]

Alistair Delva	beaee83	2021-02-24 11:27:23 -0800	[diff] [blame]	1	% Copyright 2003,2005,2007 Alain Knaff.
				2
				3	% This documentation is for Mtools which is a collection of tools to
				4	% allow Unix systems to manipulate MS-DOS files.
				5
				6	% Permission is granted to copy, distribute and/or modify this document
				7	% under the terms of the GNU Free Documentation License, Version 1.3 or
				8	% any later version published by the Free Software Foundation; with no
				9	% Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
				10	%Texts. A copy of the license is included in the section entitled
				11	% ``GNU Free Documentation License''.
				12
				13	\documentclass[a4paper,12pt]{article}
				14
				15	\usepackage[T1]{fontenc}
				16	\usepackage[latin1]{inputenc}
				17	\usepackage{pslatex}
				18	\usepackage[pdfpagemode=None,colorlinks]{hyperref}
				19
				20	\author{Alain Knaff}
				21	\title{How mformat-3.9.10 and above calculates needed FAT size}
				22
				23	\begin{document}
				24
				25	\maketitle
				26
				27	This small document explains the formula used by {\tt mformat.c} to
				28	figure out fat size and number of clusters. Due to the way that
				29	filesystem parameters are stored in the boot sector, we can afford to
				30	have a FAT that is larger than need to be to accommodate the clusters
				31	present on disk, but under no circumstances can we have one that is
				32	too small.
				33
				34	In this discussion, we use the following variable names:
				35
				36	\begin{tabular}{\|r\|p{12cm}\|}
				37
				38	\hline
				39
				40	$fatNybls$&
				41	Number of nubbles (4 bit unit per FAT). This is 3 for FAT12, 4 for
				42	FAT16, and 8 for FAT16\\
				43
				44	\hline
				45
				46	$numClus$&
				47	Number of clusters on the disk\\
				48
				49	\hline
				50
				51	$clusSiz$&
				52	Size of a cluster, expressed in sectors\\
				53
				54	\hline
				55
				56	$secSiz$&
				57	Size of a sector, in bytes. Size of a sector in nybbles is $secSiz$ * 2\\
				58
				59	\hline
				60
				61	$nfats$&
				62	Number of FAT copies, usually two\\
				63
				64	\hline
				65
				66	$remSects$&
				67	``Remaining sectors'', after number of boot sectors and root directory
				68	have been accounted for\\
				69
				70	\hline
				71
				72	$fatLen$&
				73	Length of the FAT, in sectors\\
				74
				75	\hline
				76
				77
				78	\end{tabular}
				79
				80	\ \\
				81
				82	Taking into account both data and fat, each cluster takes up the
				83	following number of nybbles (units of 4 bytes):
				84
				85
				86	$$clusSiz * (2secSiz) + nfats fatNybls$$
				87
				88	This accounts for the data of the cluster ($clusSiz * secSiz$),
				89	as well as for the space taken up by its descriptor.
				90
				91	The space taken up by all clusters together, plus the space taken by
				92	descriptors for clusters 0 and 1 ($2nfatsfatNybls$) should be less
				93	than what is available.
				94
				95	Additional sectors may be lost due to slack (you have to use a full
				96	FAT sector, you also have to use a full cluster in the data
				97	area). Thus, an {\em upper bound} for the number of clusters is as
				98	follows:
				99
				100	$$
				101	numClus \le {2remSectsecSiz - 2nfatsfatNybls \over
				102	2clusSizsecSiz + nfats*fatNybls}
				103	$$
				104
				105
				106	$$
				107	numClus \le {(remSect+2clusSiz)2*secSiz \over
				108	2clusSizsecSiz + nfats*fatNybls} - 2
				109	$$
				110
				111
				112	These will take up at most the following space in one copy of the FAT
				113	(we have to round up, because even a half-full fat sector must be
				114	taken in its entirety)
				115
				116	$$
				117	fatLen \le \left\lceil { (numClus+2)fatNybls \over secSiz 2 } \right\rceil
				118	$$
				119
				120
				121	$$
				122	fatLen \le \left\lceil {
				123	\left( { 2(remSect+2clusSiz)*secSiz \over
				124	2clusSizsecSiz + nfatsfatNybls} \right) fatNybls \over
				125	2*secSiz
				126	} \right\rceil
				127	$$
				128
				129
				130	$$
				131	fatLen \le \left\lceil {
				132	(remSect+2clusSiz) fatNybls \over
				133	2clusSizsecSiz + nfats*fatNybls
				134	} \right\rceil
				135	$$
				136
				137	The space left after FAT sector has been deduced is now {\em less than
				138	or equal} to what would be needed for the data area of the clusters
				139	(including fractional clusters), which is good, as we may have under
				140	no circumstances {\em more} clusters in the data area than in the FAT.
				141	An important point in this calculation is that we based the needed FAT
				142	size on a {\em fractional} number of clusters, rather than a rounded
				143	down amount of clusters. Indeed, using a rounded down number could
				144	have exposed us to a situation where we had an {\em almost enough}
				145	space for one more cluster (i.e. not enough space for data + FAT, but
				146	enough for data alone). This situation, combined with a situation
				147	where the last FAT sector was flush full could have lead to a
				148	situation where $numClus$ would become too large for the FAT to
				149	accommodate. I think this is clearer with an example:
				150	\begin{itemize}
				151	\item $remSect=4127$, $clusSiz=1$, $nfats=1$
				152	\item (Non rounded down) $numClus={(4127+2)*(1024) \over 1032} -
				153	2 =4094.992$
				154	\item Rounded down: 4094 clusters
				155	\item These fit into 16 FAT sectors, exactly
				156	\item ... leaving us 4095 clusters, which is one to many (because
				157	these 4095 clusters would now need 17 FAT clusters)
				158	\end{itemize}
				159
				160	Keeping the fractional part (0.992) allows us to round {\em up} the
				161	needed number of FAT sectors to 17, nicely solving this problem.
				162
				163	The downside of counting the fractional part however is that we quite
				164	often waste a sector in the really common situation where both $nfats$
				165	and $clusSiz$ are even, while $remSect$ is odd. An easy way to address
				166	this is to subtract one from $remSect$ before application of the
				167	formula, if this case is detected. Such operation carries no risk, as
				168	the odd final sector cannot be used to make a full cluster.
				169
				170	There is still a case however, where fatLen must be grown manually
				171	after application of the formula: If numClus exceeds the maximum
				172	number of clusters allowable for FAT12 or FAT16), we need to shrink
				173	$numClus$ after application of the formula, and manually make the FAT
				174	larger in order to take up any excess space.
				175
				176	Also note that as we used upper bounds, we may waste a certain number
				177	of sectors, which an exact calculation may not have wasted. However,
				178	normally, we should not lose more than one sector per FAT copy.
				179
				180	N.B. In its document at \url{http://www.microsoft.com/hwdev/download/hardware/fatgen103.pdf},
				181	Microsoft proposes a much simpler formula. However, this formula is
				182	both wrong (i.e. occasionally it produces a smaller FAT than is
				183	needed for the clusters on disk), less generic (only works for sector
				184	sizes equal to 512), and less efficient (in case of FAT32, it may
				185	waste up to 8 sectors!)
				186
				187	The formula is the following (for FAT16):
				188	$$
				189	fatLen \le \left\lceil { remSect \over 256 * clusSiz + nfats} \right\rceil
				190	$$
				191
				192	Note that it doesn't account for the dummy clusters 0 and 1. Thus, if
				193	we have 258 sectors remaining, with a cluster size of 1, and two FAT
				194	copies, the Microsoft formula mistakenly assumes fatLen = 1. This
				195	leaves 258 - 2 = 256 sectors for clusters, which yields 256 clusters.
				196	However, those clusters do not fit into the FAT, as two clusters are
				197	lost (0 and 1). However, to Micro\$ofts' credit, this is not actually
				198	the formula they're using (tested with $remsect=160056$ and
				199	$clusSize=4$), so this seems to be a documentation issue rather than a
				200	genuine bug.
				201
				202	In case of FAT32, Microsoft just halves the denominator. This is
				203	wasteful, as only the $256*clusSiz$ part would need to be halved.
				204
				205	If we assume 16777000, and a cluster size of 8, our formula would give
				206	us:
				207	$$fatLen = \left\lceil (16777000 + 16) * 8 \over 2 * 8 * 512 + 16
				208	\right\rceil = 16352$$
				209	This would leave $16777000-2*16352=16744296$ sectors for the clusters,
				210	i.e. 2093037 clusters. The FAT descriptors for those 2093037 clusters
				211	do indeed fit into our 16352 fat sectors.
				212
				213	Microsoft, on the other hand, would get: $$fatLen = \left\lceil{
				214	16777000 \over (256 * 8 + 2)/2} \right\rceil = 16368$$ This would only
				215	leave $16777000-2*16368=16744264$, i.e. 2093033 clusters, thus wasting
				216	4 clusters. The FAT would be 28 sectors too big, i.e. more than the
				217	mere 8 sectors announced by Microsoft! Unfortunately, I currently
				218	don't have access to any sufficiently recent Windows to check out
				219	whether this really happens in the code, or whether it is only a
				220	documentation issue as well.
				221
				222	Oh, and before somebody points it out: the formula in this document
				223	occassionnally wastes sectors too, although not as much (Example: With
				224	$remsect=685$, $clusSiz=1$ and $nfats=2$ our formula gives $fatLen=3$,
				225	which leaves 679 clusters. However, we could use $fatLen=2$, leaving
				226	681 clusters.
				227
				228	\end{document}