Sorry for this rather simple question, but there is still too little documentation on the usage of Microsoft's open-source AI library CNTK.
I keep seeing people set the reader's features start to 1 while setting the labels start to 0. Shouldn't both of them always be 0, since information in computer science always starts from the zero point? Wouldn't we lose one piece of information this way?
Example from CIFAR-10, 02_BatchNormConv:
features=[
# dimension = 3 (RGB) * 32 (width) * 32 (height)
dim=3072
start=1
]
labels=[
dim=1
start=0
labelDim=10
labelMappingFile=$DataDir$/labelsmap.txt
]
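
To make the column layout concrete, here is a minimal Python sketch of how such a reader could slice one row. It assumes that start is a zero-based column offset and that dim is the number of columns to read. The helper slice_field and the shortened row with 4 feature values are hypothetical and only for illustration; this is not CNTK's actual implementation.

```python
def slice_field(columns, start, dim):
    """Return `dim` columns beginning at the zero-based offset `start`."""
    return columns[start:start + dim]


# One row: column 0 holds the label, the remaining columns hold pixel values.
# Shortened to 4 feature values here; the CIFAR-10 config would use dim=3072.
row = ["7", "0.1", "0.2", "0.3", "0.4"]

labels = slice_field(row, start=0, dim=1)    # just the label column
features = slice_field(row, start=1, dim=4)  # everything after the label

# Assumption check: with start=0 the features would begin at the label column.
shifted = slice_field(row, start=0, dim=4)

print(labels, features, shifted)
```

Under this reading nothing is lost with start=1: the features simply begin one column later because column 0 is already taken by the label.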
Update: New format
Microsoft has recently updated this in order to get rid of this confusion and make the CNTK definition language more readable.
Instead of having to define where the values start within the line, you can now simply name the type of data in the dataset itself:
|labels <tab separated values> |features <tab separated values> [EndOfLine/EOL]
If you want to reverse the order of features and labels, you can simply go for:
|features <tab separated values> |labels <tab separated values> [EndOfLine/EOL]
You still have to define the dim value, which specifies the number of values you want to input.
Note: There's no | at the end of the row. EOL indicates the end of the row.
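
As an illustration of why the order no longer matters, here is a minimal Python sketch that splits such a line into named fields. The function parse_line and the sample values are hypothetical, and the sketch assumes values are separated by whitespace (tabs or spaces); it is not CNTK's own parser.

```python
def parse_line(line):
    """Split a '|name v1 v2 ...' line into a dict of name -> list of floats."""
    fields = {}
    for chunk in line.strip().split("|"):
        tokens = chunk.split()  # splits on tabs and spaces alike
        if not tokens:
            continue  # skip the empty chunk before the first '|'
        name, values = tokens[0], tokens[1:]
        fields[name] = [float(v) for v in values]
    return fields


# Same data, written in both orders (made-up values, dim=3 for each field).
labels_first = parse_line("|labels 0 1 0 |features 0.5 0.25 1.0")
features_first = parse_line("|features 0.5 0.25 1.0 |labels 0 1 0")

print(labels_first)
print(labels_first == features_first)
```

Because every group of values carries its own name, both lines yield the same fields, and no start offset is needed.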