Creating concatenative summary data in SAS (PROC SQL)
Creating concatenative summary data in SAS (PROC SQL)
I have a SAS dataset (change_detail) that maintains change variables and looks like this (except there are over 200 change variables in the actual table):
Reference Chg1Code Chg1Desc Chg2Code Chg2Desc Chg3Code Chg3Desc
--------- -------- -------- -------- -------- -------- --------
1234 C Change2
1234 C Change1 C Change2
1234 C Change1 C Change3
2345 C Change1 C Change3
2345 C Change3
3456 C Change2 C Change3
3456 C Change1 C Change2
3456 C Change1
3456 C Change1 C Change3
I need to distill this information into a summary of the changes that have occurred to each reference number. When I used the following PROC SQL in SAS:
proc sql;
create table work.changedata_summary as
select distinct
a.Reference,
catx(',',a.Chg1Desc,a.Chg2Desc,a.Chg3Desc) as Changes length=1000
from
work.change_detail a
;
quit;
It concatenates the individual Change Description variables, but maintains the original number of observations per Reference Number. The results I get are:
Reference Changes
--------- -------
1234 Change2
1234 Change1,Change2
1234 Change1,Change3
2345 Change1,Change3
2345 Change3
3456 Change2,Change3
3456 Change1,Change2
3456 Change1
3456 Change1,Change3
I need some help in flattening this data out (using either PROC SQL or a DATA step) to show just the summary results by distinct Reference number. The expected result is:
Reference Changes
--------- -----------------------
1234 Change1,Change2,Change3
2345 Change1,Change3
3456 Change1,Change2,Change3
Any help provided will be appreciated! Thanks!
This is just a small sample. The actual dataset contains change data on over 200 different variables, so I need a scalable solution.
– Brian S
Aug 20 at 16:43
Why SQL? Would be much easier in a data step.
– Tom
Aug 20 at 17:11
3 Answers
3
if your data is something like this then adding a group by and max to your code should work
proc sql;
create table work.changedata_summary as
select
Reference,
catx(',',max(chg1Desc),max(Chg2Desc), max(Chg3Desc)) as Changes length=1000
from
have
group by Reference
;
quit;
I wouldn't even try to do such a thing in PROC SQL. If you just want to see the last non-empty change for each variable then use a data step with an UPDATE statement to collapse the data to one observation per group.
data want ;
update have (obs=0) have ;
by reference ;
length changes $500 ;
changes=catx(',',of change1-change3);
run;
I would use Data step for this kind of operation
Try this out
data want;
length change_tot $200.;
do until (last.Reference);
set have;
by Reference notsorted;
change_tot=catx(',',change_tot,Chg1Desc, Chg2Desc, Chg3Desc);
end;
Keep reference change_tot;
run;
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Can you have only 3 changes or is this just a small sample of your data?
– Reeza
Aug 20 at 15:30