Questions tagged [data-transform]

Data transformation is the process of converting data from one format or structure into another. It ranges from simple conversions, such as turning a comma-separated list into a line-break-separated list, to complex ones such as speech-to-text. The strategies and technologies used vary widely with the complexity, volume, format, and structure of the data being transformed.
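
As a minimal sketch of the simple case mentioned in the tag description, converting a comma-separated list to a line-break-separated list might look like this in Python. The function name `commas_to_lines` and the sample input are illustrative assumptions, and the sketch does not handle quoted fields that contain commas.

```python
def commas_to_lines(text: str) -> str:
    """Convert a comma-separated list into a line-break-separated list."""
    # Split on commas and trim the whitespace around each item.
    items = [item.strip() for item in text.split(",")]
    # Drop empty items produced by trailing or doubled commas.
    return "\n".join(item for item in items if item)


# Hypothetical sample input, for illustration only.
print(commas_to_lines("alpha, beta,gamma"))
```

For real CSV input with quoting, Python's standard `csv` module would be a safer starting point than `str.split`.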

278 questions
17
votes
3 answers

Rename Azure Storage Table?

Is it not possible to rename an Azure Storage Table? I cannot seem to find anything online (not even cmdlets). There are no options for this in Visual Studio Server Explorer, Cloud Storage Studio or TableXplorer.
Dave New
9
votes
3 answers

Convert various dummy/logical variables into a single categorical variable/factor from their name in R

My question has strong similarities with this one and this other one, but my dataset is a little bit different and I can't seem to make those solutions work. Please excuse me if I misunderstood something and this question is redundant. I have a…
iNyar
7
votes
3 answers

Azure Table Storage - remove columns

I think this is not possible, but I am asking in case I have missed something. Can we add/remove columns from an Azure table? For example, by default we get these columns: PartitionKey, RowKey, Timestamp, ETag. Can I add for example…
user2818430
4
votes
1 answer

Can we call any external REST API inside DBT (Data Build Tool)?

I am working on some analytical work where we need to transform data from one source to another, and we are using DBT for the transformation. One of the data sources is available only via a REST API. So my question is: can we call an external API inside dbt…
MegaBytes
4
votes
2 answers

Use dplyr's _if() functions like mutate_if() with a negative predicate function

According to the documentation of the dplyr package: # The _if() variants apply a predicate function (a function that # returns TRUE or FALSE) to determine the relevant subset of # columns. # mutate_if() is particularly useful for transforming…
MS Berends
3
votes
1 answer

How do I convert spreadsheet data with multiple repeating columns, a grouping variable, and values into long format?

I am a little afraid to ask this question considering all the warnings I see about similarly phrased questions. I do not know how to phrase this question, and I have spent at least the past 5 hours searching for a solution to my particular case.…
Alan Lipps
3
votes
6 answers

Remove duplicates from array of objects but keep one property as an array

I have a collection like this: const data = [ {index: 1, number: 's1', uniqId: '123', city: 'LA'}, {index: 2, number: 's2', uniqId: '321', city: 'NY'}, {index: 3, number: 's3', uniqId: '123', city: 'LA'}, {index: 4, number: 's4', uniqId: '111',…
3
votes
2 answers

Converting columns with date to rows in R

Let's say we have a data.frame in R like this: d = data.frame('2019q1' = 1, '2019q2' = 2, '2019q3' = 3) which looks like this: X2019q1 X2019q2 X2019q3 1 1 2 3 How can I transform it to look like this: Year Quarter …
Mehdi Zare
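
The question above is about R, but the reshaping idea can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `wide_to_long`, the output key `Value`, and the choice to store the quarter as an integer are all hypothetical, since the target layout in the excerpt is cut off after `Year Quarter`.

```python
import re


def wide_to_long(row: dict) -> list:
    """Turn columns named like 'X2019q1' into one record per year-quarter."""
    # R prefixes the names with 'X' because they start with a digit.
    pattern = re.compile(r"^X?(\d{4})q([1-4])$")
    records = []
    for name, value in row.items():
        match = pattern.match(name)
        if match is None:
            continue  # skip columns that are not named like a year-quarter
        records.append(
            {
                "Year": int(match.group(1)),
                "Quarter": int(match.group(2)),
                "Value": value,  # assumed column name, not from the question
            }
        )
    return records


# The single row from the question's data.frame, as a dict.
wide = {"X2019q1": 1, "X2019q2": 2, "X2019q3": 3}
for record in wide_to_long(wide):
    print(record)
```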
3
votes
1 answer

Label day timing into morning, afternoon and evening in R

How can I label the time of day (Morning, Afternoon and Evening) for given timestamps? Initial Data Id Time_stamp 3083188c 2016-08-29 13:10:51 924d500e 2016-08-29 09:22:33 ad4dd7ff 2016-08-25 20:29:35 Final data Id …
ajax
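
The question above asks for R, but the labelling logic can be sketched in Python using the three timestamps from the excerpt. The hour boundaries (morning from 05:00, afternoon from 12:00, evening from 17:00) and the function name `label_time_of_day` are assumptions, because the excerpt does not define where each period starts.

```python
from datetime import datetime


def label_time_of_day(timestamp: str) -> str:
    """Label a 'YYYY-MM-DD HH:MM:SS' timestamp by part of day."""
    hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour
    # Assumed boundaries; adjust them to the definition you need.
    if 5 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    # Everything else, including the hours after midnight, counts as evening here.
    return "Evening"


# Ids and timestamps taken from the question's initial data.
rows = [
    ("3083188c", "2016-08-29 13:10:51"),
    ("924d500e", "2016-08-29 09:22:33"),
    ("ad4dd7ff", "2016-08-25 20:29:35"),
]
for row_id, stamp in rows:
    print(row_id, stamp, label_time_of_day(stamp))
```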
2
votes
3 answers

Python polars dataframe transformation: from flat dataframe to one dataframe per category

I have a flat dataframe representing data in multiple databases, where each database has multiple tables, each table has multiple columns, and each column has multiple values: df = pl.DataFrame( { 'db_id': ["db_1", "db_1", "db_1",…
Qunatized
2
votes
1 answer

Data transformation from Quarterly to Monthly in R

I want to convert my quarterly data to monthly, but I am getting the error: Error in `mutate()`: ! Problem while computing `Month_Num = rep(rep(1:3, each = 2), 4)`. ✖ `Month_Num` must be size 8 or 1, not 24. I know there are many questions already on…
talha asif
2
votes
2 answers

How to transform a dataframe so that values of a column are the column names and column names are rows but unstacked?

I'm comparing two tables to monitor for changes. I need a resulting table that shows a before and after the change per user per attribute changed. I used the pandas .compare method and below is an example of my current result, but I can't figure out…
2
votes
3 answers

R long to wide and covariance matrix

This is my dataset, df1 <- "ID t res 1 1 -1.5 1 2 -1.5 1 3 0.5 1 4 0.5 2 1 -0.5 2 2 -0.5 2 3 -2.0 2 4 …
2
votes
1 answer

SQL query/UDF across columns in GROUP by

I'm working with a table similar to this in bigquery at my job: id | x | y a | 1 | 2 a | 2 | 3 a | 3 | 4 b | 1 | 2 b | 2 | 3 b | 3 | 2 c | 3 | 2 c | 2 | 4 c | 3 | 4 ... We want to take this data and perform the following transformation: For each…
2
votes
2 answers

How to check specific columns for values and assign weighted integer values when checking against variables of lists

I have a dataset containing diagnosis columns (DIAGX1-DIAGX42) for patients and I need to create a variable that sums the values for these based on weights from an external index. df_patients patients = [('pat1', 'Z509', 'M33', 'M32', 'M315'), …
Eoin Vaughan