Notes on Filesystem Layout
--------------------------

These notes describe what mkcramfs generates. Kernel requirements are
a bit looser, e.g. the kernel doesn't care if the <file_data> items are
swapped around (though it does care that directory entries (inodes) in
a given directory are contiguous, as this is used by readdir).

All data is currently in host-endian format; neither mkcramfs nor the
kernel ever does any swabbing. (See section `Block Size' below.)

<filesystem>:
    <superblock>
    <directory_structure>
    <data>

<superblock>: struct cramfs_super (see cramfs_fs.h).

<directory_structure>:
    For each file:
        struct cramfs_inode (see cramfs_fs.h).
        Filename. Not generally null-terminated, but it is
         null-padded to a multiple of 4 bytes.
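
The filename padding rule can be sketched as below. This is only an
illustration: padded_namelen and write_padded_name are hypothetical
helper names, not functions taken from mkcramfs or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes a filename occupies on disk, i.e. its
 * length rounded up to the next multiple of 4. */
static size_t padded_namelen(const char *name)
{
    size_t len = strlen(name);

    return (len + 3) & ~(size_t)3;
}

/* Hypothetical helper: store a name in its on-disk form.  The padding
 * bytes are NUL; when the length is already a multiple of 4, no
 * terminating NUL is written at all. */
static size_t write_padded_name(unsigned char *dst, size_t dstlen,
                                const char *name)
{
    size_t len = strlen(name);
    size_t padded = padded_namelen(name);

    assert(padded <= dstlen);
    memset(dst, 0, padded);
    memcpy(dst, name, len);
    return padded;
}
```

A 5-byte name such as "abcde" therefore takes 8 bytes, while a 4-byte
name takes exactly 4 and has no trailing NUL.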

The order of inode traversal is described as "width-first" (not to be
confused with breadth-first); i.e. like depth-first but listing all of
a directory's entries before recursing down its subdirectories: the
same order as `ls -AUR' (but without the /^\..*:$/ directory header
lines); put another way, the same order as `find -type d -exec
ls -AU1 {} \;'.

Beginning in 2.4.7, directory entries are sorted. This optimization
allows cramfs_lookup to return more quickly when a filename does not
exist, speeds up user-space directory sorts, etc.
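
To see why sorting helps a failed lookup, consider the sketch below.
It is not the kernel's cramfs_lookup: sorted_dir_lookup is a
hypothetical name, the directory is modelled as an array of C strings,
and strcmp ordering is assumed to match the on-disk sort order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Linear scan over names sorted in strcmp order.  Returns the index
 * of `want`, or -1 if it is absent.  Because the names are sorted,
 * the scan can stop at the first entry that sorts after `want`
 * instead of walking the whole directory. */
static int sorted_dir_lookup(const char *const *names, size_t n,
                             const char *want)
{
    size_t i;

    assert(names != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int cmp = strcmp(want, names[i]);

        if (cmp == 0)
            return (int)i;  /* found */
        if (cmp < 0)
            break;          /* passed the slot where it would be */
    }
    return -1;
}
```

Looking up "dev" among "bin", "etc", "lib", "usr" stops at "etc"
rather than scanning to the end of the directory.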

<data>:
    One <file_data> for each file that's either a symlink or a
     regular file of non-zero st_size.

<file_data>:
    nblocks * <block_pointer>
     (where nblocks = (st_size - 1) / blksize + 1)
    nblocks * <block>
    padding to multiple of 4 bytes
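
As a worked example of the nblocks formula: a 10000-byte file with
blksize 4096 gives (10000 - 1) / 4096 + 1 = 2 + 1 = 3 blocks, so its
<file_data> starts with 3 block pointers. A minimal sketch follows;
cramfs_nblocks is a hypothetical helper name, and the zero-size case
is handled separately because such files have no <file_data>.

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks (and block pointers) for a file of st_size bytes.
 * Integer division rounds down, so this is ceil(st_size / blksize)
 * for any non-zero size. */
static uint32_t cramfs_nblocks(uint32_t st_size, uint32_t blksize)
{
    assert(blksize > 0);
    if (st_size == 0)
        return 0;   /* no <file_data> is emitted for empty files */
    return (st_size - 1) / blksize + 1;
}
```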

The i'th <block_pointer> for a file stores the byte offset of the
*end* of the i'th <block> (i.e. one past the last byte, which is the
same as the start of the (i+1)'th <block> if there is one). The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.
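
The rule above is enough to locate any block of a file. The sketch
below assumes plain block pointers (no extended flag bits), a pointer
table already read into memory in host-endian form, and hypothetical
names (struct block_extent, locate_block).

```c
#include <assert.h>
#include <stdint.h>

struct block_extent {
    uint32_t start;     /* byte offset of the first byte of the block */
    uint32_t len;       /* stored (compressed) length in bytes */
};

/* ptrs:        the file's block pointer table
 * nblocks:     number of entries in ptrs
 * ptrs_offset: byte offset of ptrs[0] within the image
 * i:           index of the block to locate */
static struct block_extent locate_block(const uint32_t *ptrs,
                                        uint32_t nblocks,
                                        uint32_t ptrs_offset, uint32_t i)
{
    struct block_extent e;

    assert(i < nblocks);
    /* Block 0 starts right after the pointer table (4 bytes per
     * pointer); every other block starts where the previous ends. */
    e.start = (i == 0) ? ptrs_offset + 4 * nblocks : ptrs[i - 1];
    e.len = ptrs[i] - e.start;
    return e;
}
```

For a 3-entry table at offset 1000 holding 1500, 1800 and 1900, block
0 occupies bytes 1012..1499 and block 2 occupies bytes 1800..1899.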

When the CRAMFS_FLAG_EXT_BLOCK_POINTERS capability bit is set, each
<block_pointer>'s top bits may contain special flags as follows:

CRAMFS_BLK_FLAG_UNCOMPRESSED (bit 31):
    The block data is not compressed and should be copied verbatim.

CRAMFS_BLK_FLAG_DIRECT_PTR (bit 30):
    The <block_pointer> stores the actual block start offset and not
    its end, shifted right by 2 bits. The block must therefore be
    aligned to a 4-byte boundary. The block size is blksize if
    CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified; otherwise the
    compressed data length is stored in the first 2 bytes of the
    block data. This is used to allow discontiguous data layout
    and specific data block alignments, e.g. for XIP applications.
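
A sketch of packing and unpacking an extended block pointer, following
the bit positions given above. The two flag names come from this
document; BLK_FLAGS_MASK, BLK_DIRECT_SHIFT, struct decoded_ptr and the
two functions are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CRAMFS_BLK_FLAG_UNCOMPRESSED (UINT32_C(1) << 31)
#define CRAMFS_BLK_FLAG_DIRECT_PTR   (UINT32_C(1) << 30)
/* Hypothetical helper macros for this sketch. */
#define BLK_FLAGS_MASK \
    (CRAMFS_BLK_FLAG_UNCOMPRESSED | CRAMFS_BLK_FLAG_DIRECT_PTR)
#define BLK_DIRECT_SHIFT 2

struct decoded_ptr {
    bool uncompressed;  /* bit 31 */
    bool direct;        /* bit 30 */
    uint32_t offset;    /* block start if direct, else block end */
};

static struct decoded_ptr decode_block_ptr(uint32_t raw)
{
    struct decoded_ptr d;

    d.uncompressed = (raw & CRAMFS_BLK_FLAG_UNCOMPRESSED) != 0;
    d.direct = (raw & CRAMFS_BLK_FLAG_DIRECT_PTR) != 0;
    d.offset = raw & ~BLK_FLAGS_MASK;
    if (d.direct)
        d.offset <<= BLK_DIRECT_SHIFT;  /* undo the right shift */
    return d;
}

static uint32_t encode_direct_ptr(uint32_t start, bool uncompressed)
{
    uint32_t raw;

    assert(start % 4 == 0); /* direct blocks must be 4-byte aligned */
    raw = (start >> BLK_DIRECT_SHIFT) | CRAMFS_BLK_FLAG_DIRECT_PTR;
    if (uncompressed)
        raw |= CRAMFS_BLK_FLAG_UNCOMPRESSED;
    return raw;
}
```

Because the low 2 bits of an aligned offset are always zero, shifting
them out frees the top 2 bits for the flags without shrinking the
addressable range.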

The order of <file_data>'s is a depth-first descent of the directory
tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
-print'.

<block>: The i'th <block> is the output of zlib's compress function
applied to the i'th blksize-sized chunk of the input data if the
corresponding CRAMFS_BLK_FLAG_UNCOMPRESSED <block_pointer> bit is not set,
otherwise it is the input data directly.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size. (See <block_pointer> above.)

<block>s are merely byte-aligned, not generally u32-aligned.

When CRAMFS_BLK_FLAG_DIRECT_PTR is specified then the corresponding
<block> may be located anywhere and not necessarily contiguous with
the previous/next blocks. In that case it is minimally u32-aligned.
If CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified then the size is always
blksize except for the last block which is limited by the file length.
If CRAMFS_BLK_FLAG_DIRECT_PTR is set and CRAMFS_BLK_FLAG_UNCOMPRESSED
is not set then the first 2 bytes of the block contain the size of the
remaining block data as this cannot be determined from the placement of
logically adjacent blocks.
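
The two size rules for direct blocks can be sketched as follows. Both
function names are hypothetical. The 2-byte length is read in host
byte order, which is an assumption based on the host-endian statement
earlier in these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Direct + uncompressed: every block is blksize bytes except the
 * last one, which holds whatever remains of the file. */
static uint32_t uncompressed_block_len(uint32_t st_size, uint32_t blksize,
                                       uint32_t i)
{
    uint32_t nblocks;

    assert(blksize > 0 && st_size > 0);
    nblocks = (st_size - 1) / blksize + 1;
    assert(i < nblocks);
    return (i == nblocks - 1) ? st_size - i * blksize : blksize;
}

/* Direct + compressed: the block begins with a 2-byte length, and the
 * compressed data follows at block_start + 2. */
static uint16_t direct_compressed_len(const unsigned char *image,
                                      size_t image_size,
                                      uint32_t block_start)
{
    uint16_t len;

    assert(block_start % 4 == 0);
    assert((size_t)block_start + sizeof len <= image_size);
    memcpy(&len, image + block_start, sizeof len);  /* host-endian */
    return len;
}
```

For a 10000-byte file with blksize 4096, blocks 0 and 1 are 4096 bytes
each and block 2 is 10000 - 8192 = 1808 bytes.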


Holes
-----

This kernel supports cramfs holes (i.e. [efficient representation of]
blocks in uncompressed data consisting entirely of NUL bytes), but by
default mkcramfs doesn't test for & create holes, since cramfs in
kernels up to at least 2.3.39 didn't support holes. Run mkcramfs
with -z if you want it to create files that can have holes in them.
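
The test an image writer needs before emitting a hole is simply
whether the input block is all NUL bytes. The sketch below covers only
that check, since these notes do not say how a hole is encoded on
disk. block_is_all_nul is a hypothetical name, not taken from
mkcramfs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the len bytes at buf are all NUL, i.e. the block
 * is a candidate for being stored as a hole. */
static bool block_is_all_nul(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (buf[i] != 0)
            return false;
    return true;
}
```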


Tools
-----

The cramfs user-space tools, including mkcramfs and cramfsck, are
located at <http://sourceforge.net/projects/cramfs/>.


Future Development
==================

Block Size
----------

(Block size in cramfs refers to the size of input data that is
compressed at a time. It's intended to be somewhere around
PAGE_SIZE for cramfs_readpage's convenience.)

The superblock ought to indicate the block size that the fs was
written for, since comments in <linux/pagemap.h> indicate that
PAGE_SIZE may grow in future (if I interpret the comment
correctly).

Currently, mkcramfs #define's PAGE_SIZE as 4096 and uses that
for blksize, whereas Linux-2.3.39 uses the kernel's own PAGE_SIZE
(which can be as large as 32KB on arm).
This discrepancy is a bug, though it's not clear which should be
changed.

One option is to change mkcramfs to take its PAGE_SIZE from
<asm/page.h>. Personally I don't like this option, but it does
require the least amount of change: just change `#define
PAGE_SIZE (4096)' to `#include <asm/page.h>'. The disadvantage
is that the generated cramfs cannot always be shared between different
kernels, not even necessarily kernels of the same architecture if
PAGE_SIZE is subject to change between kernel versions
(currently possible with arm and ia64).

The remaining options try to make cramfs more sharable.

One part of that is addressing endianness. The two options here are
`always use little-endian' (like ext2fs) or `writer chooses
endianness; kernel adapts at runtime'. Little-endian wins because of
code simplicity and little CPU overhead even on big-endian machines.

The cost of swabbing is changing the code to use the le32_to_cpu
etc. macros as used by ext2fs. We don't need to swab the compressed
data, only the superblock, inodes and block pointers.
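
In user space, the effect of le32_to_cpu on a metadata field can be
sketched with plain byte arithmetic that gives the same result on
little-endian and big-endian hosts. get_le32 and read_le32_at are
hypothetical names for this example, not the kernel macros themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit value from 4 bytes stored least significant
 * byte first, independent of the host's byte order. */
static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Read a little-endian 32-bit field (e.g. a block pointer) at byte
 * offset off within an in-memory image of size bytes. */
static uint32_t read_le32_at(const unsigned char *image, size_t size,
                             size_t off)
{
    assert(off <= size && size - off >= 4);
    return get_le32(image + off);
}
```

Only fixed-width metadata fields go through such a helper. The
compressed block contents are byte streams and are passed through
untouched.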

The other part of making cramfs more sharable is choosing a block
size. The options are:

  1. Always 4096 bytes.

  2. Writer chooses blocksize; kernel adapts but rejects blocksize >
     PAGE_SIZE.

  3. Writer chooses blocksize; kernel adapts even to blocksize >
     PAGE_SIZE.

It's easy enough to change the kernel to use a smaller value than
PAGE_SIZE: just make cramfs_readpage read multiple blocks.

The cost of option 1 is that kernels with a larger PAGE_SIZE
value don't get as good compression as they can.

The cost of option 2 relative to option 1 is that the code uses
variables instead of #define'd constants. The gain is that people
with kernels having larger PAGE_SIZE can make use of that if
they don't mind their cramfs being inaccessible to kernels with
smaller PAGE_SIZE values.

Option 3 is easy to implement if we don't mind being CPU-inefficient:
e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
must be no larger than 32KB) and discard what it doesn't need.
Getting readpage to read into all the covered pages is harder.
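
The bookkeeping for the simple form of option 3 is small: find which
block covers the requested page, decompress that block into the
scratch buffer, and copy out one page starting at the right offset.
The sketch below shows only the index arithmetic. struct page_in_block
and map_page are hypothetical names, and blksize is assumed to be a
multiple of the page size.

```c
#include <assert.h>
#include <stdint.h>

struct page_in_block {
    uint32_t block;     /* index of the block covering the page */
    uint32_t offset;    /* byte offset of the page within the
                           decompressed block */
};

/* Map a page index to (block, offset) for blksize >= page_size. */
static struct page_in_block map_page(uint32_t page_index,
                                     uint32_t page_size,
                                     uint32_t blksize)
{
    struct page_in_block m;
    uint64_t byte_off = (uint64_t)page_index * page_size;

    assert(page_size > 0);
    assert(blksize >= page_size && blksize % page_size == 0);
    m.block = (uint32_t)(byte_off / blksize);
    m.offset = (uint32_t)(byte_off % blksize);
    return m;
}
```

With 4096-byte pages and 16384-byte blocks, page 5 lies in block 1 at
offset 4096, so three quarters of the decompressed block are thrown
away for that read. This is the CPU inefficiency mentioned above.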

The main advantage of option 3 over 1, 2, is better compression. The
cost is greater complexity. Probably not worth it, but I hope someone
will disagree. (If it is implemented, then I'll re-use that code in
e2compr.)

Another cost of 2 and 3 over 1 is making mkcramfs use a different
block size, but that just means adding and parsing a -b option.


Inode Size
----------

Given that cramfs will probably be used for CDs etc. as well as just
silicon ROMs, it might make sense to expand the inode a little from
its current 12 bytes. Inodes other than the root inode are followed
by filename, so the expansion doesn't even have to be a multiple of 4
bytes.
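
For reference, the 12 bytes break down into three 32-bit words of
bit-fields. The sketch below mirrors the field widths as I understand
them from cramfs_fs.h (check the header before relying on them).
struct cramfs_inode_sketch and dirent_size are hypothetical names.
Bit-field layout is implementation-defined in C; the 12-byte size
holds with GCC and Clang on common ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the on-disk inode: 3 x 32 bits = 12 bytes. */
struct cramfs_inode_sketch {
    uint32_t mode:16, uid:16;
    uint32_t size:24, gid:8;
    uint32_t namelen:6, offset:26;  /* namelen is in 4-byte units */
};

/* Bytes taken by one directory entry: the inode followed by the
 * filename padded to a multiple of 4 bytes. */
static size_t dirent_size(size_t name_bytes)
{
    size_t padded = (name_bytes + 3) & ~(size_t)3;

    assert(padded / 4 <= 63);   /* must fit the 6-bit namelen field */
    return sizeof(struct cramfs_inode_sketch) + padded;
}
```

A 5-byte filename thus costs 12 + 8 = 20 bytes of directory space.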