Writing in Swift, I have a specific need that might be a candidate for type punning. I'm reading files from a disk image where they exist in contiguous 512-byte sectors. A file comes off the disk as a Data struct, which is easily converted to a byte array and/or 512-byte Data slices without any copying.

So far, so good! A file can be represented quickly as a collection of arrays containing 512 UInt8 units. However, some disk sectors, typically containing metadata, are better treated as 256 UInt16 items. Ideally, I'd like to be able to refer to the sectors in memory either way without any copying.
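As a rough sketch of the slicing described above: the helper name `sectors(of:size:)` and the sample image are made up for illustration. Slicing a Data of this size normally yields values that share the parent's storage, so nothing is copied until one of them is mutated.

```swift
import Foundation

let sectorSize = 512

// Hypothetical helper: split a disk image into sector-sized Data slices.
func sectors(of image: Data, size: Int = sectorSize) -> [Data] {
    stride(from: image.startIndex, to: image.endIndex, by: size).map { start in
        // The last slice may be short if the image is not a whole number of sectors.
        let end = min(start + size, image.endIndex)
        return image[start..<end]
    }
}

// Made-up three-sector image, just to exercise the helper.
let image = Data(repeating: 0xAB, count: 3 * sectorSize)
let slices = sectors(of: image)
print(slices.count)          // 3
print(slices[1].startIndex)  // 512
```

Note that a Data slice keeps the indices of its parent, so the second sector starts at index 512, not 0.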

Right now, I'm taking a simple approach in which I copy a sector (needing two copies) into something like:

struct Sector {
    let bytes: [UInt8]
    let words: [UInt16]
    . . .

init() {
    self.bytes = Array(repeating: 0, count: 512)
    self.words = Array(repeating: 0, count: 256)
    . . .

and then refer to sector.words or sector.bytes, depending on which is appropriate. Luckily, the [UInt16] sectors are far fewer than their byte equivalents, so this isn't too awful. And since any given sector is one or the other, I don't have to worry that changing bytes doesn't change words (although that's an accident waiting to happen).

Everything I read decries punning the [UInt8] and [UInt16] arrays, and Swift's "unsafe" memory functions are somewhat intimidating. I'm trying to think of a way to make .words in the above struct occupy the same memory as .bytes and would appreciate some suggestions.

I'm still working on this. If I find something worthwhile, I'll share.

RamsayCons

2 Answers

You're looking for the technique described in https://stackoverflow.com/a/38024025/341994. Keep the data in a Data and view it as UInt8 or UInt16 values as desired. (A Data is already a collection of UInt8, so you only need to do anything special when you want to see it as an array of UInt16.) Example:

import Foundation

extension Data {
    // Build a Data from the raw bytes of an array.
    init<T>(fromArray values: [T]) {
        self = values.withUnsafeBytes { Data($0) }
    }
    // Reinterpret the bytes as elements of type T.
    // Note: this copies the bytes into a new array.
    func toArray<T>(type: T.Type) -> [T] where T: ExpressibleByIntegerLiteral {
        var array = Array<T>(repeating: 0, count: self.count/MemoryLayout<T>.stride)
        _ = array.withUnsafeMutableBytes { copyBytes(to: $0) }
        return array
    }
}

let eights : [UInt8] = [1,2,3,4]
let data = Data(fromArray:eights)
let sixteens = data.toArray(type: UInt16.self)
print(sixteens) // [513, 1027] on a little-endian machine

That's the right answer, assuming little-endian byte order:

  • 0b00000001 is 1
  • 0b00000010 is 2

So, with the low byte stored first and the high byte second:

  • 0b0000001000000001 is 513
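
The same arithmetic can be written out by hand. This is only an illustrative sketch; the function name `littleEndianWord` is invented here and is not part of Foundation.

```swift
// Combine two bytes into one 16-bit word, low byte first (little-endian).
func littleEndianWord(_ lo: UInt8, _ hi: UInt8) -> UInt16 {
    UInt16(hi) << 8 | UInt16(lo)
}

print(littleEndianWord(1, 2)) // 513  (2 * 256 + 1)
print(littleEndianWord(3, 4)) // 1027 (4 * 256 + 3)

// For contrast, reading the same two bytes high byte first (big-endian):
let bigEndian = UInt16(1) << 8 | UInt16(2)
print(bigEndian)              // 258  (1 * 256 + 2)
```

If the disk format and the host machine disagree on byte order, the words come out swapped like the last line, so it is worth checking which order the disk image uses.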
matt

Move that into the type!

struct Sector<Datum: FixedWidthInteger> {
  let data: [Datum]

  init() {
    // 0x1000 bits = 4096 bits = 512 bytes per sector
    data = .init(repeating: 0, count: 0x1000 / Datum.bitWidth)
  }
}

typealias Byte = UInt8
typealias Word = UInt16

Sector<Byte>().data.count  // 512
Sector<Word>().data.count  // 256
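
If a single sector needs to be readable both ways, one possible sketch is to keep one 512-byte buffer and compute the words from it on demand. Everything here is hypothetical: the `SectorView` name, its methods, and the assumption that the disk stores words low byte first. It avoids the unsafe APIs altogether by assembling each word from two bytes.

```swift
import Foundation

// Hypothetical type: one buffer, readable as bytes or as 16-bit words.
struct SectorView {
    static let byteCount = 512
    private(set) var storage: Data

    init(_ storage: Data) {
        precondition(storage.count == SectorView.byteCount, "a sector is 512 bytes")
        self.storage = storage
    }

    var wordCount: Int { storage.count / 2 }

    func byte(at index: Int) -> UInt8 {
        // Offset by startIndex because a Data slice keeps its parent's indices.
        storage[storage.startIndex + index]
    }

    // Assumes little-endian words on disk (low byte first).
    func word(at index: Int) -> UInt16 {
        let base = storage.startIndex + index * 2
        return UInt16(storage[base]) | UInt16(storage[base + 1]) << 8
    }

    mutating func setWord(_ value: UInt16, at index: Int) {
        let base = storage.startIndex + index * 2
        storage[base] = UInt8(truncatingIfNeeded: value)          // low byte
        storage[base + 1] = UInt8(truncatingIfNeeded: value >> 8) // high byte
    }
}

let raw: [UInt8] = [1, 2, 3, 4] + Array(repeating: 0, count: 508)
var sector = SectorView(Data(raw))
print(sector.word(at: 0))  // 513
print(sector.word(at: 1))  // 1027
sector.setWord(0xBEEF, at: 0)
print(sector.byte(at: 0))  // 239 (0xEF)
print(sector.byte(at: 1))  // 190 (0xBE)
```

Because there is only one buffer, the byte view and the word view cannot drift apart. Reads do not copy the sector; a write may trigger Data's copy-on-write if the storage is shared with the original image.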
  • But then each sector can be seen only one way. The problem is to represent any given sector freely _either_ way. Basically it's a Swift version of a C union. – matt May 11 '20 at 02:34
  • Protocols ! –  May 11 '20 at 02:35