In TensorFlow 2.8.0, using MirroredStrategy:

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    self.model()

yields the following warning:

W tensorflow/core/grappler/optimizers/data/auto_shard.cc:776] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Did not find a shardable source, walked to a node which is not a dataset: name: "FlatMapDataset/_2"
Consider either turning off auto-sharding or switching the auto_shard_policy to DATA to shard this dataset. You can do this by creating a new `tf.data.Options()` object then setting `options.experimental_distribute.auto_shard_policy = AutoShardPolicy.DATA` before applying the options object to the dataset via `dataset.with_options(options)`.

However, I would like to use the FILE sharding policy, since I have multiple GPUs set up. Any ideas on how I can achieve this?

fruitflyy
  • I just saw your other post. When you used `FILE`, you said it ran on only one GPU; how did you tell? In my experience, one GPU is likely to be used more than the other without some configuration. – Djinn Jun 24 '22 at 23:01

1 Answer


If you are using tf.data for your dataset, you set the shard policy by attaching a `tf.data.Options()` object to it:

import tensorflow as tf

dataset = ...  # your tf.data.Dataset
options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.FILE
dataset = dataset.with_options(options)  # use this as input for your model
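
A note on placement: it is easiest to call `.with_options()` as the last step of the input pipeline and pass the resulting dataset straight to `model.fit()`; under `MirroredStrategy`, Keras then takes care of distributing the batches across the visible GPUs.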

FILE vs DATA has nothing to do with using multiple GPUs specifically; both policies feed every GPU. They differ in how the data is split up for them: FILE shards by input file, while DATA shards the records the pipeline produces. The default policy is actually AUTO, which tries FILE first and falls back to DATA, and FILE can only succeed when the pipeline starts from a file-based source with at least as many files as workers. The warning suggests your input pipeline is not built from such a shardable source.
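
To make that concrete, here is a rough end-to-end sketch of a pipeline that FILE sharding can actually split, assuming the training data is stored as several TFRecord files. The file pattern, feature spec, and model below are placeholders, not details from your question:

import tensorflow as tf

# FILE sharding needs a file-based source with at least as many files as workers
files = tf.data.Dataset.list_files("data/train-*.tfrecord", shuffle=True)
dataset = files.interleave(tf.data.TFRecordDataset, num_parallel_calls=tf.data.AUTOTUNE)

def parse_example(serialized):
    # placeholder feature spec; replace with your own
    features = tf.io.parse_single_example(serialized, {
        "x": tf.io.FixedLenFeature([4], tf.float32),
        "y": tf.io.FixedLenFeature([1], tf.float32),
    })
    return features["x"], features["y"]

dataset = dataset.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(32).prefetch(tf.data.AUTOTUNE)

options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.FILE
dataset = dataset.with_options(options)

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # placeholder model; swap in your own architecture
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer="adam", loss="mse")

model.fit(dataset, epochs=1)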

If you're not using tf.data but something like numpy arrays, you can just ignore the warning or stop it from being displayed, since auto-sharding will simply fall back to a policy that works with in-memory data (if everything is working properly, that is; don't disable warnings willy-nilly).

import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"  # this MUST be set before importing tensorflow, otherwise it has no effect

Edit: there may be a way to set the shard policy when starting from numpy arrays; I just haven't found it. I use numpy arrays myself and simply hide the warnings, since I know my setup is working fine; in my case the warning is only about the fallback policy, whereas in your case the issue is the dataset itself.
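
If you do want explicit control when starting from numpy arrays, one option (a sketch on my part, not something I have verified across multiple GPUs) is to wrap the arrays in a tf.data.Dataset yourself; the same `tf.data.Options()` trick then applies. Since in-memory data has no files to shard by, DATA (or OFF) is the policy that makes sense here:

import numpy as np
import tensorflow as tf

x = np.random.rand(1000, 4).astype("float32")  # placeholder features
y = np.random.rand(1000, 1).astype("float32")  # placeholder labels

dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

options = tf.data.Options()
# in-memory data cannot be sharded by file, so request DATA explicitly (or OFF to disable auto-sharding)
options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.DATA
dataset = dataset.with_options(options)  # pass this to model.fit() instead of the raw arrays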

Djinn