Questions tagged [sample-data]

Sample data is a term used for publicly available sets of data in a variety of formats.

Sample data is used to get an application started quickly with data for demo purposes, or to load test an application or database platform. The data may accurately represent a real data set, such as a list of countries, or be completely manufactured.

The idea is that it is not used as the basis of a good data sample, but is merely useful as 'data'.

Examples include (but are not limited to) CSV files, database backups, Excel files, and plain text files. The OP usually specifies the format they require in the question.
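
As a rough illustration of the "completely manufactured" case, here is a minimal sketch that builds a small CSV of fake records using only the Python standard library. The function name `make_sample_csv` and the column names `id`, `name`, and `amount` are hypothetical choices for this example, not a standard of any kind.

```python
import csv
import io
import random


def make_sample_csv(n_rows, seed=0):
    """Return CSV text containing n_rows of manufactured (fake) records."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical columns, chosen only for illustration.
    writer.writerow(["id", "name", "amount"])
    for i in range(1, n_rows + 1):
        name = f"customer_{i}"
        amount = round(rng.uniform(1, 500), 2)
        writer.writerow([i, name, amount])
    return buffer.getvalue()


sample = make_sample_csv(5)
print(sample)
```

Raising `n_rows` gives a larger file for rough load testing; the fixed seed keeps the output identical between runs.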

151 questions
1
vote
1 answer

Resampling data to match the population profile using two demographic variables (sex and age) in R

I'm struggling with what I imagine is a multi-level sampling procedure in R. Let's say I have a dataset composed of a very biased sampling method. Therefore, the results obtained with the participants are biased. I would like to adjust the dataset…
Luis
  • 1,388
  • 10
  • 30
1
vote
1 answer

Can I sample sets of data within a dataframe without selecting the same set twice (without replacement)?

I am fairly new to Python and I would like to sample sets of data in the following dataframe by their group, without selecting the same group twice. The code I have written does sample the sets of data correctly, however, it can select the same set…
d.patel
  • 13
  • 3
1
vote
2 answers

Changing the correlation between two variables in an example/fictitious data set

I am trying to create a sample data set (most of the code is from this question). It is almost how I want it to be. However, there are two things I still want to do, but I cannot figure out. I would like to create a higher correlation between y and…
Tom
  • 2,173
  • 1
  • 17
  • 44
1
vote
1 answer

How to sample a small set of images for training a model from a large images folder in R?

I have a very large folder of images (train_dir), as well as a CSV file containing the class labels for each of those images (train_df). Because the data is huge, I'd like to take only a sample of images (say 25%) along with labels (train_df). How…
1
vote
2 answers

Using Replace function in sample with R

I'm trying to use the sample function but am encountering some trouble. My objective is to have 500 samples from a normal distribution and replace any numbers that are less than 5. I tried using the replace function but am not familiar with the syntax and keep…
M.Ustun
  • 289
  • 2
  • 6
  • 16
1
vote
2 answers

Pandas resample frequency within index level

Within Pandas, I would like to resample my dataframe and take the mean within a 5 hour period and within index level. My dataframe looks like: df timestamp width length name 10 2019-08-01…
Jeroen
  • 801
  • 6
  • 20
1
vote
5 answers

Need well formatted data for testing

Sometimes you need data for tests, like Adobe Thermo has prewritten "sets" of data, like 1-word strings, 3-word strings, etc. for use in populating data controls. I need: continuous text, no newlines; CSV numbers, integers; CSV numbers, decimals; URL…
Robin Rodricks
  • 110,798
  • 141
  • 398
  • 607
1
vote
2 answers

How do I write a for-loop so a program reiterates itself for a set of 94 DNA samples?

I have written some code in a bash shell (so I can submit it to my university's supercomputer) to edit out contaminant sequences from a batch of DNA extracts I have. Essentially what this code does is take the sequences from the negative extraction…
Kim L.
  • 13
  • 2
1
vote
2 answers

Generalizing Poisson disk sampling to 3D

I want to create a Poisson disk sampling for a 3D grid. I used the https://github.com/emulbreh/bridson implementation to generalize it to 3D. However, it seems that I'm missing something, and I can't find the problem. This implementation follows the…
1
vote
2 answers

Market basket analysis - Apriori algorithm database sample (MS SQL Server)

I'm looking for a sample database for the Apriori algorithm. I need to find an e-commerce site's database or a supermarket database. It's for my school homework. Can you advise me? Note: sorry for my bad English.
ozkank
  • 1,464
  • 7
  • 32
  • 52
1
vote
0 answers

Working with sample data and getting error message

I am trying to work with my data set to graph a cluster analysis similar to this example: Why is the line of wss-plot (for optimizing the cluster analysis) looks so fluctuated? with a previous related thread (How to draw the plot of within-cluster…
1
vote
1 answer

Sample of Instagram webhook notification data

I am currently doing an integration to the Instagram API and would like my app to receive webhook notifications whenever there is new media on any Instagram account which has authorized my app via OAuth. I've been able to write the code which sets…
Hafiz Adewuyi
  • 360
  • 4
  • 15
1
vote
3 answers

Mysql - Create sample data from existing rows

I have a table with about 50K rows. I need to multiply this data 10 fold to have at least 5M rows for testing the performance. Now, it's taken me several long minutes to import 50K from a CSV file so I don't want to create a 5M record file and then…
Whip
  • 1,891
  • 22
  • 43
1
vote
2 answers

implicit DataTemplates vs. sample data vs. blendability

I have two simple ViewModels, NodeViewModel and LeafViewModel, that can be items in a TreeView. Just like below. Templates are applied implicitly because I don't want a custom template selector.
bitbonk
  • 48,890
  • 37
  • 186
  • 278
1
vote
2 answers

Core Audio AudioFileReadPackets... looking for raw audio

I'm trying to get raw audio data from a file (I'm used to seeing floating point values between -1 and 1). I'm trying to pull this data out of the buffers in real time so that I can provide some type of metering for the app. I'm basically reading the…
Corey
  • 41
  • 3