Questions tagged [data-generation]

263 questions
4
votes
1 answer

How to generate survival data with time dependent covariates using R

I want to generate survival times from a Cox proportional hazards model that contains a time-dependent covariate. The model is h(t|Xi) = h_0(t) exp(gamma*Xi + alpha*mi(t)), where Xi is generated from Binomial(1, 0.5) and mi(t) is a time-dependent…
Sheikh
  • 57
  • 1
  • 7
4
votes
2 answers

Test data generation framework in python?

Is there any "test-data" generation framework out there, especially for Python? To make it clear, instead of writing scripts from scratch that fill my database with random users and other entities I want to know if there are any tools/frameworks…
Ali
  • 18,665
  • 21
  • 103
  • 138
3
votes
2 answers

How to prepare data in the input format table and metadata for the Synthetic Data Vault (SDV) library

I want to use the synthetic data generation method of the Synthetic Data Vault (SDV) library (reference https://sdv.dev/SDV/index.html), but I can't. I think my problem is how to prepare data in the input format required for the method ".fit()". The…
3
votes
2 answers

How do I generate a rule for more than one option in Bogus?

I have a rule to select an option at random: .RuleFor(c=>field, x=>x.PickRandom("Option1", "Option2", "Option3", "Option4")) With a default value, I can alter the probability of one of the items. I would like to set the probabilities of all…
Cakemeister
  • 191
  • 2
  • 6
3
votes
2 answers

Generate a binary variable with a predefined correlation to an already existing variable

For a simulation study, I want to generate a set of random variables (both continuous and binary) that have predefined associations to an already existing binary variable, denoted here as x. For this post, assume that x is generated following the…
ecl
  • 369
  • 1
  • 15
3
votes
3 answers

Are generators time efficient?

I understand that generators, in Python at least, are memory efficient because they deal with one item at a time, but how does this make them time efficient (if they are)? Specifically, say I'm using a generator function to load one data item at a time for a machine…
black sheep 369
  • 564
  • 8
  • 19
3
votes
1 answer

R: adjusting a given time-series but keeping summary statistics equal

Let's say I have a time-series like this t x 1 100 2 50 3 200 4 210 5 90 6 80 7 300 Is it possible in R to generate a new dataset x1 which has the exact same summary statistics, e.g. mean, variance,…
Jj Blevins
  • 355
  • 1
  • 13
3
votes
1 answer

Keras seeding ImageDataGenerator versus Sequence

I'm currently using tensorflow.keras.preprocessing.image.ImageDataGenerator and flow_from_directory. For example: from tensorflow.keras.preprocessing.image import ImageDataGenerator train_datagen = ImageDataGenerator(rotation_range=20, …
Austin
  • 6,921
  • 12
  • 73
  • 138
3
votes
1 answer

Generate and Test accumulating valid answer for next test

I know how to do a simple generate and test to return each answer individually. In the following example only items that are greater than 1 are returned. item(1). item(1). item(2). item(3). item(1). item(7). item(1). item(4). gen_test(Item) :- …
Guy Coder
  • 24,501
  • 8
  • 71
  • 136
3
votes
1 answer

Trying to generate a large-scale test data set with Bogus

I'm trying to generate a production-quality and -quantity size test data set with Bogus, and this library works extremely well with basic data - simple datatypes like int or string, things like first and last name etc. I'm currently not seeing how I…
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
3
votes
3 answers

Build numbers table on the fly in Oracle

How do I return a rowset consisting of the last four years based on the current date? If this query runs on 12/31/2010 it should return: 2007 2008 2009 2010 But if it is run on 1/1/2011 it should return: 2008 2009 2010 2011 Here's what I started…
ErikE
  • 48,881
  • 23
  • 151
  • 196
3
votes
1 answer

How can I get the decision path to specific class type in classification decision tree

Let's say I have created a classification decision tree as following: HP(1:size(HP), end) = 0; LP(1:size(LP), end) = 1; % the dt's input & target pop x = [HP(:,1:end-1); LP(:,1:end-1)]; t = [HP(:,end); LP(:,end)]; dt =…
dariush
  • 3,191
  • 3
  • 24
  • 43
3
votes
2 answers

A quick SQL query to generate example data

I need to populate a currently empty table with a hundred or so fake records to simulate logins over the past two years to test my code with. The login table schema looks like: CREATE TABLE `Logins` ( `ID` int(11) NOT NULL AUTO_INCREMENT, …
Austin Hyde
  • 26,347
  • 28
  • 96
  • 129
3
votes
4 answers

how to read/parse dynamically generated web content?

I need to find a way to write a program (in any language) that will connect to a website and read dynamically generated data from the website. Note that it's dynamically generated--it's not enough to get the source html, because the data I'm…
djames
  • 31
  • 2
3
votes
4 answers

Generate a string representation of a one-hot encoding

In Python, I need to generate a dict that maps a letter to a pre-defined "one-hot" representation of that letter. By way of illustration, the dict should look like this: { 'A': '1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0', 'B': '0 1 0 0 0…
E.M.
  • 4,498
  • 2
  • 23
  • 30
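
For the one-hot question above, a minimal sketch of one possible approach, assuming the alphabet is the 26 uppercase ASCII letters and that each value is a space-separated string of 0s and 1s, as in the excerpt. The function name `one_hot_strings` is made up for illustration and does not come from the question or its answers.

```python
import string


def one_hot_strings(alphabet=string.ascii_uppercase):
    """Map each symbol to a space-separated one-hot string.

    Assumption: the position of the "1" follows the symbol's index
    in `alphabet` (A -> position 0, B -> position 1, ...).
    """
    n = len(alphabet)
    return {
        letter: " ".join("1" if i == j else "0" for j in range(n))
        for i, letter in enumerate(alphabet)
    }


encoding = one_hot_strings()
print(encoding["A"])
print(encoding["B"])
```

Passing a different `alphabet` (for example `"ACGT"`) would give a shorter encoding with the same layout.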