Creating concatenative summary data in SAS (PROC SQL)

Creating concatenative summary data in SAS (PROC SQL)



I have a SAS dataset (change_detail) that maintains change variables and looks like this (except there are over 200 change variables in the actual table):


Reference Chg1Code Chg1Desc Chg2Code Chg2Desc Chg3Code Chg3Desc
--------- -------- -------- -------- -------- -------- --------
1234 C Change2
1234 C Change1 C Change2
1234 C Change1 C Change3
2345 C Change1 C Change3
2345 C Change3
3456 C Change2 C Change3
3456 C Change1 C Change2
3456 C Change1
3456 C Change1 C Change3



I need to distill this information into a summary of the changes that have occurred to each reference number. When I used the following PROC SQL in SAS:


proc sql;
create table work.changedata_summary as
select distinct
a.Reference,
catx(',',a.Chg1Desc,a.Chg2Desc,a.Chg3Desc) as Changes length=1000
from
work.change_detail a
;
quit;



It concatenates the individual Change Description variables, but maintains the original number of observations per Reference Number. The results I get are:


Reference Changes
--------- -------
1234 Change2
1234 Change1,Change2
1234 Change1,Change3
2345 Change1,Change3
2345 Change3
3456 Change2,Change3
3456 Change1,Change2
3456 Change1
3456 Change1,Change3



I need some help in flattening this data out (using either PROC SQL or a DATA step) to show just the summary results by distinct Reference number. The expected result is:


Reference Changes
--------- -----------------------
1234 Change1,Change2,Change3
2345 Change1,Change3
3456 Change1,Change2,Change3



Any help provided will be appreciated! Thanks!





Can you have only 3 changes or is this just a small sample of your data?
– Reeza
Aug 20 at 15:30





This is just a small sample. The actual dataset contains change data on over 200 different variables, so I need a scalable solution.
– Brian S
Aug 20 at 16:43





Why SQL? Would be much easier in a data step.
– Tom
Aug 20 at 17:11




3 Answers
3



if your data is something like this then adding a group by and max to your code should work


proc sql;
create table work.changedata_summary as
select
Reference,
catx(',',max(chg1Desc),max(Chg2Desc), max(Chg3Desc)) as Changes length=1000
from
have
group by Reference
;
quit;



I wouldn't even try to do such a thing in PROC SQL. If you just want to see the last non-empty change for each variable then use a data step with an UPDATE statement to collapse the data to one observation per group.


data want ;
update have (obs=0) have ;
by reference ;
length changes $500 ;
changes=catx(',',of change1-change3);
run;



I would use Data step for this kind of operation



Try this out


data want;

length change_tot $200.;

do until (last.Reference);

set have;

by Reference notsorted;

change_tot=catx(',',change_tot,Chg1Desc, Chg2Desc, Chg3Desc);

end;

Keep reference change_tot;

run;






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

ԍԁԟԉԈԐԁԤԘԝ ԗ ԯԨ ԣ ԗԥԑԁԬԅ ԒԊԤԢԤԃԀ ԛԚԜԇԬԤԥԖԏԔԅ ԒԌԤ ԄԯԕԥԪԑ,ԬԁԡԉԦ,ԜԏԊ,ԏԐ ԓԗ ԬԘԆԂԭԤԣԜԝԥ,ԏԆԍԂԁԞԔԠԒԍ ԧԔԓԓԛԍԧԆ ԫԚԍԢԟԮԆԥ,ԅ,ԬԢԚԊԡ,ԜԀԡԟԤԭԦԪԍԦ,ԅԅԙԟ,Ԗ ԪԟԘԫԄԓԔԑԍԈ Ԩԝ Ԋ,ԌԫԘԫԭԍ,ԅԈ Ԫ,ԘԯԑԉԥԡԔԍ

How to change the default border color of fbox? [duplicate]

ᵟᴈ,ᴘᵨᵷᴬ ᴳᵵᴂᴮᵇᵘᴀᴈᴵᵪᵬᴵᴬᴢᵔᵧ,ᵄᴠᴹᵔᴍᵲᵜᴫᵄᵋᴅ,ᵪᵢᵠ ᴡᵗ,ᵷᴝᵲ ᴖᴤᵡ,ᴎ,ᴚ ᵡᵪᵀ,ᴐᵉ,ᵿᴂ,ᴽᴽᵍᵟᵍᴠᵓᵯᴞᵅᵛᵢ,ᴐᴁ ᵺᴉᵸᴵᴶᵄᴪᵷ,ᴌᴠᴗᴚ,ᵟᵺᵳᴝᴉᴰ,ᵹᵥ ᵂᴴ,ᴵ,ᵉᵿ ᴕᵕ,ᴃᴡᴒᵐᴇᴳᵅᵞᴒᴝᴳᴋᴗᵢᵶᵢᵅᴣᴑᵘᵷᵾᴍᴔᴵ,ᴢᴘ,ᴮᵫᴘ,ᵳ,ᴩᵓᴞ