0

I need to take the following(CSV):

deviceId,ip,mac,serial,version,platform,vtpDomain,stack num, stack ports, stack model, stack version, stack image, stack MAC, stack Serial    AA_MDF_2960X_2,10.10.10.10,abab.abab.abab, 123123123,15.0(2a)EX5,cisco WS-C2960X-48FPS-L,'AAHS',,,,,,,
,,,,,,,1,52,WS-C2960X-48FPD-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,2,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,3,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,*4,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
AA_MDF_2960X_2,10.10.10.10,abab.abab.abab, 123123123,15.0(2a)EX5,cisco WS-C2960X-48FPS-L,'AAES',,,,,,,
,,,,,,,1,52,WS-C2960X-48FPD-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,2,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,3,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123
,,,,,,,*4,52,WS-C2960X-48FPS-L,15.0(2a)EX5,C2960X-UNIVERSALK9-M,abab.abab.abab, 123123123

To Look like:

Excel CSV Ready for import

The columns kind of have repeated information. There is Name, IP, MAC, Serial number, Version, Model, Location for each of the Masters in a switch stack. There are two masters shown in my example, one for each stack. Same information is shown for each switch in the stack (including the master). Plus there is the number denoting location in the stack (plus an '*' showing master), Switch ports, and stack image. Each line that starts with a Name is the master in a switch stack. The lines that start with blanks are the switches in that stack. I need the information about the switches in the stack (including master) placed into a single cell in column Notes in a row with the Master's information. I can create an excel formula to fill-in missing items and it looks like:

(In R123)=if(A123="",R122,A123)

Which is fine, but still leaves me with removing lines and the data still in the wrong format. It felt like the wrong way to go. In addition, I'm looking for a more automated way of handling is data. I would like to be able to take the CSV, open[Something], refresh, export new CSV in a format like the picture. I can get close using power query, Excel, or Power Pivot. With Power Query I can add an index column. Then I can add a custom column to check if the first column is blank and return a result. I don't know how to make Power Query reference "the cell above" when the column is blank.

Even a suggestion of how to phrase this question to get useful Google results would be helpful. Messy Data, Power Query data reformat, and the like are too vague.

Can anyone tell me what terms to use when referring to the two types of data? I think the latter might be tabular and is definitely not normalized.

edit: I added headers to the CSV example.

I am now using the following in power query:

let
    Source = Csv.Document(File.Contents("C:\rawdata.csv"),[Delimiter=",", Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source),
    #"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"mac", "serial", "platform"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Notes_StepOne", each Text.Combine({[stack num],[#" stack model"],[#" stack version"],[#" stack image"],[#" stack MAC"],[#" stack Serial"] ,"#(000D)#(000A)"})),
    #"Replaced Value" = Table.ReplaceValue(#"Added Custom","",null,Replacer.ReplaceValue,{"deviceId", "ip", "version", "vtpDomain"}),
    #"Filled Down" = Table.FillDown(#"Replaced Value",{"deviceId", "ip", "version", "vtpDomain"}),
    #"Group By" = Table.Group(#"Filled Down", {"deviceId", "ip", "version", "vtpDomain"}, {{"notes", each Text.Combine([Notes_StepOne]), type text}})
in
    #"Group By"

Edit: replace quotes with brackets around "Notes_StepOne" Well done Alejandro!

  • You could use the solution found [here](http://stackoverflow.com/questions/29440349/aggregate-collate-and-transpose-rows-into-columns/29441346#29441346) and use `Chr(10)` to delimit the *notes* column. –  Apr 11 '16 at 21:48

1 Answers1

0

Here's one potential solution:

  1. The result you generate can be a combination of the columns, plus a newline. It would look something like this for the custom column formula:

    Text.Combine({[first column], [second column], "#(000D)#(000A)"})
    
  2. Fill Down the values in whichever columns you want to group by. You can fill down values in Power Query by going to the Transform tab and selecting Fill | Down.

  3. Do a Group By. The UI doesn't let you group by adding text columns together, so you may want to add it as a separate step (clicking on the fx button will create a new step). It will look something like this:

    = Table.Group(#"Filled Down", {"first column", "second column"}, {{"notes", each Text.Combine([custom column made in the first part]), type text}})
    
  • When solutions are offered that are very version specific (without any version-specific tags on the question) it is generally considered 'good form' to include a caveat or disclaimer noting this. –  Apr 12 '16 at 00:28
  • Great job! (initially used quotes in the text.combine part, but it's all sorted now) – Michael Fisher Apr 20 '16 at 04:33