WRDS SAS User Guide
West Virginia University

Available Data in WRDS by Subject

Subject                Data                     Contents
Stocks                 CRSP                     Security price, return, and volume data for main indexes
                       Dow Jones                Dow Jones averages and total return
                       CBOE                     Key measure of market expectations of volatility
Fundamentals           Compustat North America  U.S. and Canadian accounting and market information
                       Compustat Global         Financial and accounting data of publicly traded companies
                       SEC                      Information about Disclosure of Order Execution Statistics
Bank                   Bank Regulatory          Financial information about U.S. banking institutions
Bond                   TRACE                    Transaction data for all eligible corporate bonds
                       CRSP Treasury            Historical info and market data including yields and durations
Interest Rate          FRB                      Databases collected from Federal Reserve Banks
Mergers & Acquisition  Bank Regulatory          Merger information concerning U.S. banking institutions
Currency Option        PHLX                     Philadelphia Stock Exchange's United Currency Options Market
Marketing              DMEF                     Customer buying history
Ownership              Blockholders             Standardized data for blockholders

Advantages of using SAS

WRDS is built using SAS data sets, so manipulating data through SAS is easier than with almost any other querying tool. Any combination of two databases can be constructed. The web
interface deletes observations for which the chosen variables have missing values, and there is no simple way of finding out which observations were deleted.

SAS sample programs

For simple SAS code: SAS Sample Programs on WRDS Info Home
For advanced SAS code: Support; WRDS Datasets and Sample program; SAS Support; Research Applications

Connect to WRDS

    %let wrds=wrds.wharton.upenn.edu 4016;
    options comamid=TCP remote=wrds;
    signon username=_prompt_;
    rsubmit;
      * your code here ;
    endrsubmit;

Note: this code is always required because it connects your SAS session to WRDS.

Autoexec.sas

    data _NULL_;
      file 'autoexec.sas';
      put '%include "!SASROOT/wrdslib.sas";';
    run;

WRDS assigns a list of important libnames through this statement. Run this code only when a libname error occurs.

Libname

SAS library names are already defined in all Unix user accounts. For example:

Database             Libname  Path
Bank regulatory      bank     /wrds/bank/sasdata
Compustat            comp     /wrds/compustat/sasdata
CRSP CCM             crsp     /wrds/crsp/sasdata/cc
CRSP Monthly stock   crsp     /wrds/crsp/sasdata/sm
Unix home directory  temp     /home/wvu/min06/temp

Data set

To use a data set in a DATA step, give the libname followed by the data set name. Nothing else is needed.

Example:
    For the CRSP monthly stock file: set crsp.msf;
    For the Compustat Industrial Fundamental file: set comp.ina;

All data sets come from SAS data files stored in Unix. A good way to fix an error is to check the variable names and the directory name of the SAS files in Unix.

Finding variables

Web based: Documents, Tools (searching variables)
Using SAS:

    proc contents data=crsp.dsf;
    run;

Finding identifiers

Web based: Code lookup, Tools
Using SAS: for example, to find identifiers in Compustat:

    data names;
      set comp.namesann;
      where coname contains 'IBM' or smbl contains 'IBM';
    run;

    proc print data=names;
    run;

For CRSP, the file name is "stocknames".
Using a Unix command: grep

a. Merge CRSP/Compustat using CUSIP

When merging two databases, we need a common ID. The best way is to match them on CUSIP:
Names and tickers are problematic since they change through time and can be re-used, and therefore have different entries in different databases.
CUSIPs change through time but are not re-used. The CUSIP used should be the historical one (NCUSIP).

Understanding CUSIP

Example for IBM
    Compustat: CNUM = 459200 (first 6 digits of CUSIP)
    CRSP: CUSIP = 45920010
    Full CUSIP: 459200 10 1 (issuer, issue, check digit)

Matching identifiers

Database   Ticker      CUSIP          GVKEY                  PERMNO
CRSP       YES         CUSIP, NCUSIP  NO                     YES (main identifier)
COMPUSTAT  YES (SMBL)  CNUM           YES (main identifier)  NO

To create a common identifier (cnum), we take the CRSP CUSIP and extract its first 6 digits.

Step 1. Headers from Compustat

From the Compustat header file "namesann":
Find "cnum" and "gvkey" for IBM
Then exclude missing data
Sort data by cnum

    /* keep= is applied on the output so that smbl is still available to the WHERE statement */
    proc sort data=comp.namesann out=comp(keep=gvkey cnum) nodupkey;
      where missing(cnum)=0 and smbl in ('IBM');
      by cnum gvkey;
    run;

Step 2. Headers from CRSP

From the CRSP header file "stocknames":
Find "ncusip" and "permco"
Then exclude missing data
Sort data by permco and ncusip
Define the output as "mse"

    proc sort data=crsp.stocknames(keep=permco ncusip) out=mse nodupkey;
      where missing(ncusip)=0;
      by permco ncusip;
    run;

Step 3. Creating cnum from ncusip

Create a 6-digit identifier (cnum) from ncusip so that CRSP and Compustat can be matched on cnum:
Using the LENGTH statement and the SUBSTR function, create cnum from ncusip in "mse"
Sort data by cnum
Define the output as "mse3"

    data mse2;
      length cnum $6.;
      set mse;                      /* work data set created in Step 2 */
      cnum=substr(ncusip,1,6);
    run;

    proc sort data=mse2 out=mse3(keep=permco cnum) nodupkey;
      by cnum permco;
    run;

Step 4. Merging

Create temporary variables "aa" and "bb" with the in= option to track whether each data set contributed to the current observation.

    data joint2;
      merge comp(in=aa) mse3(in=bb);
      by cnum;
      /* Create dummies to test the source of each merged observation */
      if aa=1 then compustat=1; else compustat=0;
      if bb=1 then crsp=1; else crsp=0;
    run;

b. Extract data from CCM

Concepts needed:
Historical identifier (NPERMNO)
Linking file (cstlink2); see the CCM guide
Most of the SAS procedures on WRDS use SQL; see the SQL references

Step 1. Libname & years

Libname: /wrds/crsp/sasdata/cc
Specify beginning and ending years:

    %let beg_yr=1995;
    %let end_yr=2003;

[Figure: timeline from BEGFYR to ENDFYR]

Step 2. Link file (cstlink2)

Specify link information:
Create a data set (temp1) from "cstlink2"

    data temp1;
      set crsp.CSTLINK2;
    run;

Select link types and link dates:

    ... >= year(LINKDT) or LINKDT = .B) and
    (&beg_yr-1 <= year(LINKENDDT) or LINKENDDT = .E);
    by GVKEY LINKDT;
    run;

[Figure: LINKDT and LINKENDDT relative to BEGFYR and ENDFYR]

Step 3a. Date requirements

    A, B part:     (LINKDT <= FYENDDT or LINKDT = .B) and (FYENDDT <= LINKENDDT or LINKENDDT = .E)
    B part:        (LINKDT <= FYBEGDT or LINKDT = .B) and (LINKENDDT >= FYENDDT or LINKENDDT = .E)
    A, B, C part:  (LINKDT <= FYENDDT or LINKDT = .B) and (LINKENDDT >= FYBEGDT or LINKENDDT = .E)

[Figure: link periods A, B, and C between LINKDT and LINKENDDT, relative to BEGFYR and ENDFYR]

Step 3b. Specify overlapping periods

Create a table (named "mydata") with the following variables from the link data set "lnk"
Name "crsp.CSTANN" as "cst" (CSTANN is a file which contains all Compustat data)
Specify the date requirement (select A, B, or C)
With the GVKEY we found, extract the data we need from "CSTANN" by the corresponding GVKEY (lnk.GVKEY=cst.GVKEY)

    proc sql;
      create table mydata(keep=GVKEY NPERMNO NPERMCO SMBL YEARA LINKDT LINKENDDT LINKTYPE DATA6) as
      select *
      from lnk, crsp.CSTANN as cst
      where lnk.GVKEY=cst.GVKEY
        and (&beg_yr <= YEARA <= &end_yr)
        and (LINKDT <= cst.FYENDDT or LINKDT = .B)
        and (cst.FYENDDT <= LINKENDDT or LINKENDDT = .E);
    quit;

References

To see all references and SAS programs: http://www.be.wvu.edu/wrds/home/index.html
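
The CUSIP merge in section a (Steps 3 and 4) can be tried without a WRDS connection. The following is a minimal sketch on toy data, not the WRDS files themselves: the data set names (comp_toy, crsp_toy, crsp_toy2, joint_toy) and every gvkey, permco, and non-IBM identifier value are made up for illustration. Only the IBM values 459200 and 45920010 come from the example above.

```sas
/* Toy stand-in for the Compustat header extract (gvkey values are made up) */
data comp_toy;
  length cnum $6 gvkey $6;
  input cnum $ gvkey $;
  datalines;
459200 G00001
999999 G00002
;
run;

/* Toy stand-in for the CRSP header extract (permco values are made up) */
data crsp_toy;
  length ncusip $8;
  input ncusip $ permco;
  datalines;
45920010 101
11111110 102
;
run;

/* The first six characters of NCUSIP give the common identifier cnum */
data crsp_toy2;
  length cnum $6;
  set crsp_toy;
  cnum = substr(ncusip, 1, 6);
run;

/* MERGE with BY requires both inputs to be sorted by the BY variable */
proc sort data=comp_toy;  by cnum; run;
proc sort data=crsp_toy2; by cnum; run;

/* in= flags record which input contributed to each merged observation */
data joint_toy;
  merge comp_toy(in=aa) crsp_toy2(in=bb);
  by cnum;
  compustat = aa;
  crsp = bb;
run;

proc print data=joint_toy;
run;
```

In the printed result, the row with cnum 459200 should have both compustat and crsp equal to 1, while each of the other two rows should have only one flag set. Filtering on these flags (for example, keeping rows where both are 1) is one way to restrict the merged file to firms found in both databases.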