Questions tagged [data-augmentation]

Data augmentation

Data augmentation is the technique of increasing the size of the training data for a model, typically by applying label-preserving transformations to existing samples. It also helps prevent overfitting.

Some of the most common data augmentation techniques for images include:

  • Scaling
  • Cropping
  • Flipping
  • Rotation
  • Color jittering
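As a rough illustration of a few of the techniques listed above, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows of 0-255 integers; the helper names (`flip_horizontal`, `rotate_90`, `random_crop`, `jitter_brightness`, `augment`) are hypothetical, and the brightness shift is only a simplified stand-in for color jittering. Real pipelines would normally rely on a library rather than code like this.

```python
import random


def flip_horizontal(image):
    """Mirror every row left-to-right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def random_crop(image, out_h, out_w, rng):
    """Cut an out_h x out_w window at a random position."""
    top = rng.randint(0, len(image) - out_h)
    left = rng.randint(0, len(image[0]) - out_w)
    return [row[left:left + out_w] for row in image[top:top + out_h]]


def jitter_brightness(image, max_delta, rng):
    """Shift all pixels by one random offset, clamped to 0..255."""
    delta = rng.randint(-max_delta, max_delta)
    return [[min(255, max(0, p + delta)) for p in row] for row in image]


def augment(image, rng):
    """Chain a random flip, a random crop and a brightness jitter."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    image = random_crop(image, len(image) - 1, len(image[0]) - 1, rng)
    return jitter_brightness(image, 20, rng)


rng = random.Random(0)  # fixed seed so the run is repeatable
image = [[0, 50, 100], [150, 200, 250], [10, 20, 30]]
augmented = augment(image, rng)
```

Each call to `augment` with a fresh random state yields a slightly different variant of the same image, which is how one original sample becomes many training samples.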
465 questions
3
votes
1 answer

Tensorflow Object Detection API Data Augmentation Bounding Boxes

For Object Detection via the Tensorflow API using model_main.py, when I use e.g. random_horizontal_flip in the data_augmentation_options in the train_config of my pipeline.config, are my bounding boxes also affected? This is very important, as…
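For context on why this matters, a horizontal flip only stays consistent if the box x-coordinates are mirrored together with the pixels. The sketch below is independent of the Tensorflow Object Detection API and makes no claim about its behavior; it assumes boxes given as normalized [ymin, xmin, ymax, xmax], and the function names are hypothetical.

```python
def flip_image_horizontal(image):
    """Mirror every row of a list-of-rows image left-to-right."""
    return [row[::-1] for row in image]


def flip_boxes_horizontal(boxes):
    """Mirror normalized [ymin, xmin, ymax, xmax] boxes around x = 0.5.

    The old right edge becomes the new left edge, so xmin and xmax swap
    roles; the y-coordinates are unchanged.
    """
    return [[ymin, 1.0 - xmax, ymax, 1.0 - xmin]
            for ymin, xmin, ymax, xmax in boxes]


def flip_example(image, boxes):
    """Flip the image and its boxes together so they stay aligned."""
    return flip_image_horizontal(image), flip_boxes_horizontal(boxes)


image = [[1, 0, 0, 0],
         [1, 0, 0, 0]]
boxes = [[0.0, 0.0, 1.0, 0.25]]  # box around the left-most column
flipped_image, flipped_boxes = flip_example(image, boxes)
```

After the flip the bright column sits on the right, and the box covers x from 0.75 to 1.0 to match it.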
3
votes
2 answers

How to perform data augmentation in Tensorflow Estimator's input_fn

Using Tensorflow's Estimator API, at what point in the pipeline should I perform the data augmentation? According to this official Tensorflow guide, one place to perform the data augmentation is in the input_fn: def parse_fn(example): "Parse…
2
votes
1 answer

Preprocessing layers with seed not producing the same data augmentation for images and masks

I'm trying to create a simple preprocessing augmentation layer, following this Tensorflow tutorial. I created this 'simple' example that shows the problem I'm having. Even though I'm initializing the augmentation class with a seed, operations…
2
votes
1 answer

Tensorflow does not apply data augmentation properly

I'm trying to apply the process of data augmentation to a database. I use the following code: train_generator = keras.utils.image_dataset_from_directory( directory= train_dir, subset = "training", image_size = (50,50), …
Gevezo
2
votes
1 answer

How to modify image in custom Tensorflow layer? (working example provided)

How can I draw a filled rectangle as a custom (data augmentation) layer in Tensorflow 2 on Python 3? [Images: input, expected output] With image_pil = Image.fromarray(image), I get the error: AttributeError: Exception encountered when calling…
2
votes
0 answers

Should we cache the augmented data

I suddenly wonder: in terms of accuracy on new data, should we cache the augmented data or not (in the data pipeline)? I don't think caching the augmented data is a good idea; if we're not caching, it makes the data passed through the model become more…
2
votes
0 answers

How to replace [UNK] tokens with original tokens in BERT nlpaug

I am trying to use nlpaug to swap some words out but am having an issue with it replacing tokens permanently with the [UNK] token. I am using the docs here: https://nlpaug.readthedocs.io/en/latest/augmenter/word/context_word_embs.html My code an…
geds133
2
votes
0 answers

Keras use augmentation with custom image generator

I am using a custom image generator to read my data off disk in batches as described here, https://keras.io/examples/vision/oxford_pets_image_segmentation/ The exact generator looks like this: from tensorflow import keras import numpy as np from…
Stefano Potter
2
votes
2 answers

'Word2VecKeyedVectors' object has no attribute 'index_to_key'

I am trying to use word2vec with the nlpaug library, and the following code: aug = naw.WordEmbsAug( model_type='word2vec', model_path='GoogleNews-vectors-negative300.bin', action="insert") gives me the error: 'Word2VecKeyedVectors'…
2
votes
4 answers

Sampling for large class and augmentation for small classes in each batch

Let's say we have 2 classes: one is small and the other is large. I would like to use data augmentation similar to ImageDataGenerator for the small class, and sampling for the large class, in such a way that each batch is balanced.…
Michael D
2
votes
2 answers

How to loop through array of objects and add new object key based on condition in JavaScript?

I have the following array of objects: [{"id":0,"name":"Katy","age":22}, {"id":2,"name":"Lucy","age":12}, {"id":1,"name":"Jenna","age":45}, {"id":3,"name":"Ellie","age":34}] I need to add another key into the objects (PaymentCategory) where…
2
votes
1 answer

How to set a probability for tf.keras.layers.RandomFlip?

Is it possible to set a probability when doing random flip operations with tf.keras.layers.RandomFlip? For example: def augmentation(): data_augmentation = keras.Sequential([ keras.layers.RandomFlip("horizontal", p=0.5), …
haofeng
2
votes
0 answers

ImageDataGenerator data preparing for semantic segmentation

I am preparing my data with ImageDataGenerator. So far I did the following. For training data: def data_aug(validation_split=0.25, batch_size=32, seed=42): datagen = ImageDataGenerator( rotation_range=10, …
MSI
2
votes
1 answer

Albumentations in Pytorch: Inconsistent Augmentation for multi-target datasets

I'm using Pytorch and want to perform the data augmentation of my images with Albumentations. My dataset object has two different targets: 'blurry' and 'sharp'. Each instance of both targets needs to have identical changes. When I try to perform the…
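The general idea behind keeping paired targets in sync is to draw the random decision once and apply it to both images. The sketch below shows that idea in plain Python only; it does not use or describe the Albumentations API, the `blurry`/`sharp` arguments simply mirror the names in the question, and `paired_random_flip` is a hypothetical helper.

```python
import random


def flip_horizontal(image):
    """Mirror every row of a list-of-rows image left-to-right."""
    return [row[::-1] for row in image]


def paired_random_flip(blurry, sharp, rng, p=0.5):
    """Flip both images or neither, based on one shared random draw."""
    do_flip = rng.random() < p  # sampled once, reused for both targets
    if do_flip:
        return flip_horizontal(blurry), flip_horizontal(sharp)
    return blurry, sharp


rng = random.Random(42)
blurry = [[1, 2, 3], [4, 5, 6]]
sharp = [[10, 20, 30], [40, 50, 60]]
aug_blurry, aug_sharp = paired_random_flip(blurry, sharp, rng)
```

Calling the random transform separately on each image would draw two independent decisions, which is one common way paired targets end up with different augmentations.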
Ruffybeo
2
votes
0 answers

Data augmentation using SMOTE for images

I have tried two ways to apply the SMOTE function to my dataset. However, I can't figure out how to proceed with the SMOTE function. 1st method: I have applied data augmentation and then tried to apply SMOTE train_data_gen = ImageDataGenerator( …
Jenny