For pandas (and Spark), there is a good general-purpose way to take full control over how the data is read: pass an already-loaded dataframe via your BatchKwargs.
So, in your case, you could do the following:
import pandas as pd

# Load the pickled dataframe yourself, then hand it to the context
my_dataset = pd.read_pickle(filename)
batch_kwargs = {"dataset": my_dataset}
batch = context.get_batch("my_datasource/in_memory_generator/my_dataset", "warning", batch_kwargs)
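Since the question starts from a pickled dataframe, here is a minimal pandas-only round trip (the file name `my_data.pkl` is illustrative) showing the object you end up passing as the `"dataset"` batch kwarg:

```python
import pandas as pd

# Build a small dataframe, pickle it, and read it back with
# pd.read_pickle -- the same call used in the snippet above.
df = pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})
df.to_pickle("my_data.pkl")

my_dataset = pd.read_pickle("my_data.pkl")
batch_kwargs = {"dataset": my_dataset}  # same shape as above
```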
Note: this is for the 0.8.x series API, and assumes a data context configuration like the following:
datasources:
  my_datasource:
    class_name: PandasDatasource
    ...
    generators:
      in_memory_generator:
        class_name: InMemoryGenerator
PS: this use case is the primary reason the InMemoryGenerator exists.
EDIT
In Great Expectations >= 0.9.0, the get_batch API has been simplified: you no longer need a generator at all in this case, and the datasource name is specified directly in the batch kwargs. The analogous snippet looks like this:
import pandas as pd
from great_expectations.data_context import DataContext

context = DataContext()
# Load the dataframe yourself; the datasource name goes in the batch kwargs
my_dataset = pd.read_pickle(filename)
batch_kwargs = {"datasource": "my_datasource", "dataset": my_dataset}
batch = context.get_batch(batch_kwargs=batch_kwargs, expectation_suite_name="warning")
(and no generator is needed)