-1

I have a table called Students that stores all the basic information of students and trainings that they have attended to(The table has more than 15 columns and more than 5000 records). a sample part of the table is like this:

St_id  St_Name  St_University  SoftSkillTraining  StartDate   EndDate     ComputerTraining  StartDate   EndDate 
---------------------------------------------------------------------------------------------------------------
 1      x        x             True               12/02/2017  12/03/2017  False             -           -       
 2      y        x             True               25/05/2016  25/06/2016  True              01/08/2017          

However, The table is not normalized and I need to split the student table into three specific tables (in the form of many to many relations)

  1. Student table that contains basic information of students like:
 St_id    St_Name     St_University     St_Faculty
--------------------------------------------------
1        X           Some University    Law
2        y           Some University    IT
  1. Training table that store 'Training name', 'start date' and 'end date' columns

Training table should be:

TrainingId TrainingName  StartDate   EndDate     TrainingLocation
-----------------------------------------------------------------
1         SoftSkill      12/02/2017  12/03/2017   Some Location
2         SoftSkill      25/02/2016  25/06/2016   Some Location
3         CMOA           01/08/2017  01//09/2017  some location
  1. An intersection table that stores participants of the trainings and stores only primary keys of Student and Training tables as foreign key like below:
st_id      training_id
-----------------------
1           1
2           2
2           1

How can I transfer data from student into Training Table as you can see data of different column from student table should appear as a row in training table using stored procedure ?

2 Answers2

0

One way of achieving your task:

create table Students (
St_id int primary key,
St_Name varchar(5),  
St_University varchar(5),
SoftSkillTraining varchar(5), 
ST_StartDate  varchar(10),
ST_EndDate varchar(10),
ComputerTraining varchar(5),  
CT_StartDate varchar(10),
CT_EndDate varchar(10),
); 

insert into Students (St_id,  St_Name,  St_University,  SoftSkillTraining,  ST_StartDate ,  ST_EndDate, ComputerTraining, CT_StartDate,   CT_EndDate)
values('1','x', 'x' , 'True' , '12/02/2017', '12/03/2017' , 'False',NULL , NULL)

insert into Students (St_id,  St_Name,  St_University,  SoftSkillTraining,  ST_StartDate ,  ST_EndDate, ComputerTraining, CT_StartDate,   CT_EndDate)
values('2' , 'y' ,'x' , 'True' , '25/05/2016' ,  '25/06/2016' , 'True' , '01/08/2017', NULL)

create table Student (
St_id int primary key,
St_Name varchar(5),  
St_University varchar(5),
);

insert into Student (St_id, St_Name,St_University)
select distinct St_id , St_Name , St_University  from Students; 

create table Training (
Training_Id int identity(1,1) primary key,
Student_Id int foreign key references Students(St_id), 
Training_Name varchar(20),
StartDate  varchar(10),
EndDate varchar(10),
);

insert into Training (Student_Id ,Training_Name , StartDate, EndDate)
values ('1' , 'SoftSkillTraining' ,  '12/02/2017' , '12/03/2017' );

insert into Training (Student_Id ,Training_Name , StartDate, EndDate)
values ('2' , 'SoftSkillTraining' ,  '25/05/2016' , '25/06/2016' );

insert into Training (Student_Id ,Training_Name , StartDate, EndDate)
values ('2' , 'ComputerTraining' ,  '01/08/2017' , NULL );

create table Intersection (
Intersection_Id int  identity(1,1) primary key,
Student_id int foreign key references Students(St_id), 
Training_Id int foreign key references Training(Training_id),
);

insert into Intersection (Student_id,Training_Id)
select St_id, Training_Id   from Student join Training on St_id = Student_Id  


go
create view  Participants
as
select St_Name as Participant, Training_Name  from Intersection join Student on student_id = St_id  join Training on intersection.Training_Id = training.Training_Id   
go
  • the old Student Table contains more than 5000 records and I can not insert into the new table one by one. secondly, 'SoftSkillTraining' and 'ComputerTraining' appears as a column in the old table. how can I insert data from different columns to appear in a single column? – Kabir Panahi Oct 07 '17 at 11:14
  • If using version > SQL Server 2008, then you can go for "CROSS APPLY" with the VALUES clause for unpivoting: select value from DesiredTable cross apply ( values ('I1', I1), ('I2', I2), ('I3', I3) ) c(col, value) where value is not null order by id, col for more on CROSS APPLY, you can visit: https://sqlstudies.com/2013/04/01/unpivot-a-table-using-cross-apply/ – amazing athlete Oct 07 '17 at 17:50
0

You have quite a task to perform, but normalizing that table is the right thing to do. In your sample of the old table I notice you have both [StartDate] & [EndDate] repeating. This isn't possible in SQL Sever, all column names must be unique in a table. I'm hoping this is just a glitch in the sample because it is going to be very important.

Below I use a method to "unpivot" a student row into multiple shorter rows which represents an interim step to reach your goal. This method uses CROSS APPLY and VALUES. Note here that you will need to prepare this VALUES section manually, but you might be able to get the list of fields from a query against your information schema (this query not provided).

See a working model of this at SQL Fiddle

MS SQL Server 2014 Schema Setup:

CREATE TABLE Student
    ([St_id] int, [St_Name] varchar(1), [St_University] varchar(1)
     , [SoftSkillTraining] varchar(4), [StartDate1] datetime, [EndDate1] datetime
     , [ComputerTraining] varchar(5), [StartDate2] datetime, [EndDate2] datetime)
;

INSERT INTO Student
    ([St_id], [St_Name], [St_University]
     , [SoftSkillTraining], [StartDate1], [EndDate1]
     , [ComputerTraining], [StartDate2], [EndDate2])
VALUES
    (1, 'x', 'x', 'True', '2017-02-12 00:00:00', '2017-03-12 00:00:00', 'False', NULL, NULL),
    (2, 'y', 'x', 'True', '2016-05-25 00:00:00', '2016-06-25 00:00:00', 'True', '2017-08-01', NULL)
;

This is the most important Query it "unpivots" source data to multiple rows

Note how it needs an id to be assigned for each training course, and that the column groups such as [SoftSkillTraining], [StartDate1], [EndDate1] must be specified row by row in the values area. Each row here will result in a new row of output, so the "layout" of the values area basically determines what the final output will be. It is in this area you will need to collect all the column names carefully and arrange them accurately.

select
    St_id, ca.TrainingId, ca.TrainingName, ca.isEnrolled, ca.StartDate, ca.EndDate
    into training_setup
from Student
cross apply (
  values
     (1, 'SoftSkillTraining', [SoftSkillTraining], [StartDate1], [EndDate1])
    ,(2, 'ComputerTraining', [ComputerTraining], [StartDate2], [EndDate2])
  ) ca (TrainingId,TrainingName,isEnrolled, StartDate,EndDate)
where ca.isEnrolled = 'True'
;

Query 2:

select
*
from training_setup

Results:

| St_id | TrainingId |      TrainingName | isEnrolled |            StartDate |              EndDate |
|-------|------------|-------------------|------------|----------------------|----------------------|
|     1 |          1 | SoftSkillTraining |       True | 2017-02-12T00:00:00Z | 2017-03-12T00:00:00Z |
|     2 |          1 | SoftSkillTraining |       True | 2016-05-25T00:00:00Z | 2016-06-25T00:00:00Z |
|     2 |          2 |  ComputerTraining |       True | 2017-08-01T00:00:00Z |               (null) |

Query 3:

-- this can be the basis for table [Training]
select distinct TrainingId,TrainingName, StartDate,EndDate
from training_setup

Results:

| TrainingId |      TrainingName |            StartDate |              EndDate |
|------------|-------------------|----------------------|----------------------|
|          1 | SoftSkillTraining | 2016-05-25T00:00:00Z | 2016-06-25T00:00:00Z |
|          1 | SoftSkillTraining | 2017-02-12T00:00:00Z | 2017-03-12T00:00:00Z |
|          2 |  ComputerTraining | 2017-08-01T00:00:00Z |               (null) |

NOTE I have reservations about the consistency of this data, notice the start/end dates differ for one course. I don't have a simple solution for that. You may need to scrub your data to minimize thee discrepancies and/or you may need an additional step that matches by the id we used in the cross apply plus the start/end date pairs to arrive at a better version of the training_id by updating the training_setup staging table before moving on.

Query 4:

-- this can be the basis for table [Student_Training]
select St_id, TrainingId
from training_setup

Results:

| St_id | TrainingId |
|-------|------------|
|     1 |          1 |
|     2 |          1 |
|     2 |          2 |
Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51