I need to take the following(CSV):
deviceId,ip,mac,serial,version,platform,vtpDomain,stack num, stack ports, stack model, stack version, stack image, stack MAC, stack Serial AA_MDF_2960X_2,10.10.10.10,abab.abab.abab, 123123123,15.0(2a)EX5,cisco WS-C2960X-48FPS-L,'AAHS',,,,,,,
,,,,,,,1,52,WS-C2960X-48FPD-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,2,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,3,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,*4,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
AA_MDF_2960X_2,10.10.10.10,abab.abab.abab, 123123123,15.0(2a)EX5,cisco WS-C2960X-48FPS-L,'AAES',,,,,,,
,,,,,,,1,52,WS-C2960X-48FPD-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,2,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,3,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,*4,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
To Look like:
The columns kind of have repeated information. There is Name, IP, MAC, Serial number, Version, Model, Location for each of the Masters in a switch stack. There are two masters shown in my example, one for each stack. Same information is shown for each switch in the stack (including the master). Plus there is the number denoting location in the stack (plus an '*' showing master), Switch ports, and stack image. Each line that starts with a Name is the master in a switch stack. The lines that start with blanks are the switches in that stack. I need the information about the switches in the stack (including master) placed into a single cell in column Notes in a row with the Master's information. I can create an excel formula to fill-in missing items and it looks like:
(In R123)=if(A123="",R122,A123)
Which is fine, but still leaves me with removing lines and the data still in the wrong format. It felt like the wrong way to go. In addition, I'm looking for a more automated way of handling is data. I would like to be able to take the CSV, open[Something], refresh, export new CSV in a format like the picture. I can get close using power query, Excel, or Power Pivot. With Power Query I can add an index column. Then I can add a custom column to check if the first column is blank and return a result. I don't know how to make Power Query reference "the cell above" when the column is blank.
Even a suggestion of how to phrase this question to get useful Google results would be helpful. Messy Data, Power Query data reformat, and the like are too vague.
Can anyone tell me what terms to use when referring to the two types of data? I think the latter might be tabular and is definitely not normalized.
edit: I added headers to the CSV example.
I am now using the following in power query:
let
Source = Csv.Document(File.Contents("C:\rawdata.csv"),[Delimiter=",", Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source),
#"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"mac", "serial", "platform"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Notes_StepOne", each Text.Combine({[stack num],[#" stack model"],[#" stack version"],[#" stack image"],[#" stack MAC"],[#" stack Serial"] ,"#(000D)#(000A)"})),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom","",null,Replacer.ReplaceValue,{"deviceId", "ip", "version", "vtpDomain"}),
#"Filled Down" = Table.FillDown(#"Replaced Value",{"deviceId", "ip", "version", "vtpDomain"}),
#"Group By" = Table.Group(#"Filled Down", {"deviceId", "ip", "version", "vtpDomain"}, {{"notes", each Text.Combine([Notes_StepOne]), type text}})
in
#"Group By"
Edit: replace quotes with brackets around "Notes_StepOne" Well done Alejandro!