harris 43


Published on May 7, 2008

Author: Christian

Source: authorstream.com

Slide1: Access to Confidential Data for Statistical Analysis
Kenneth Harris, Director of Research Data Center

National Center for Health Statistics (NCHS):
Despite the wide dissemination of its data through publications, CD-ROMs, etc., the inability to release files with, for instance, lower levels of geography severely limits the utility of some data for research, policy, and programmatic purposes and sets a boundary on one of the Center’s goals to increase its capacity to provide state and local area estimates.

NCHS (cont.):
In pursuit of this goal and in response to the research community’s interest in restricted data, NCHS established the Research Data Center (RDC), a mechanism whereby researchers can access detailed data files in a secure environment, without jeopardizing the confidentiality of the respondents.

Research Data Center:
The NCHS Research Data Center, established in 1998, is a facility at the NCHS headquarters in Hyattsville, Maryland, where researchers are granted access to restricted data files needed to complete approved projects. Restricted data files may contain information such as lower levels of geography, but do not contain direct identifiers (e.g., name or social security number).

Data Restrictions:
Section 308(d) of the Public Health Service Act and the NCHS Staff Confidentiality Manual do not permit the release of data that are either identified or identifiable to persons outside of NCHS.

Data Restrictions (cont.):
Identifiable data include not only direct identifiers such as name, social security number, etc., but also data that can serve to allow inferential identification of either individual or institutional respondents by a number of means.

Data Restrictions (cont.):
Research indicates that identifiability is greatly enhanced if geographic identifiers for state, county, census tract, block-group or block are released on public use files.

Key Issues for Research Data Availability:
CONFIDENTIALITY
The dissemination of data in a manner that would allow public identification of the respondent or would in any way be harmful to him/her is prohibited, and the data are immune from legal process.

Key Issues for Research Data Availability (cont.):
DISCLOSURE
Disclosure relates to inappropriate attribution of information to a data subject, whether an individual or an organization. Disclosure occurs when a data subject is identified from a released file (identity disclosure), sensitive information about a data subject is revealed through the released file (attribute disclosure), or the released data make it possible to determine the value of some characteristic of an individual more accurately than otherwise would have been possible (inferential disclosure).

Appendix I – Rules for the Release of Micro Data Files:
The data file must not contain any detailed information about the subject that could facilitate identification and that is not essential for research purposes (e.g., exact date of the subject’s birth).
Geographic places that have fewer than 100,000 people are not to be identified on the data file.
Characteristics of an area are not to appear on the data file if they would uniquely identify an area of less than 100,000 people.

Appendix I – Rules for the Release of Micro Data Files (cont.):
Information on the drawing of the sample which might assist in identifying a data subject must not be released outside the Center. Thus, the identities of primary sampling units are not to be made available outside the Center.
Before any new or revised micro data files are published, they, together with their full documentation, must be approved for publication by the NCHS Director or Deputy Director.
A micro data file containing confidential data on unidentified individuals or facilities may not be released to any person or organization outside NCHS until that person, or a responsible representative of that organization, has first signed the statement on the Order Form, whereby he gives assurance that the data provided will be used only for statistical reporting or research purposes.

Why NCHS Does Not Release Files With Lower Levels of Geography:
Research suggests that in the case of personal surveys nine commonly collected variables result in the table below.

Why NCHS Does Not Release Files With Lower Levels of Geography (cont.):
Notes: A geopolitical area may be a county, city, town, or other place with well-defined boundaries. In this case, identification refers to certainty identification.

How Does RDC Operate?:
On-Site Access
Remote Access
Staff Assisted Analytical Session

User Procedures:
To gain access to NCHS restricted data through either method, user must:
Submit a research proposal. An advisory and proposal review committee receives, reviews, and approves researcher proposals. Proposals are evaluated primarily on the confidentiality disclosure risk. Scientific merit is not an evaluation criterion.
Sign an affidavit of confidentiality and promise not to use any method to attempt to identify respondents.

User Procedures (cont.):
Not take any materials or equipment into RDC unless approved by RDC staff.
Submit data files to be merged onto NCHS data ahead of time – all merging is done by RDC staff.
Subject all output and/or materials removed from the RDC to a disclosure limitation review.
May not remove any NCHS restricted data files nor linked data files.

Researcher Affidavit of Confidentiality:
I certify that no confidential data or information viewed or otherwise obtained while I am a researcher in the National Center for Health Statistics (NCHS), Research Data Center (RDC) will be removed from NCHS. Further, I understand that NCHS will perform a disclosure review and must provide approval to me before I remove any data from the RDC, whether it be in electronic or paper form. I acknowledge NCHS Confidentiality Statute, 308(d) of the Public Health Service Act stated below and fully understand my legal obligations to NCHS to protect all confidential data. Further I understand any violation I may perform is punishable under 18 United States Code (USC), 1001 which carries a fine of up to $10,000 or up to 5 years in prison.

Researcher Affidavit of Confidentiality (cont.):
NCHS 308(d) Confidentiality Statute - No information, if an establishment or person supplying the information or described in it is identified, obtained in the course of activities undertaken or supported under section 304, 305, 306, 307, or 309 may be used for any purpose other than the purpose for which it was supplied unless such establishment or person has consented to its use for such other purpose and in the case of information obtained in the course of health statistical or epidemiological activities under section 304 or 306, such information may not be published or released in other form if the particular establishment or person supplying the information or described in it is identifiable unless such establishment or person has consented to its publication or release in other form.

Researcher Affidavit of Confidentiality (cont.):
18 United States Code, 1001 - Deliberately making a false statement in any matter within the jurisdiction of any Department or Agency of the Federal Government violates 18 USC 1001 and is punishable by a fine of up to $10,000 or up to 5 years in prison.

____________________      _______________
Researcher’s Signature    Date

____________________      _______________
NCHS Witness              Date

Can Researcher Merge his/her Data with NCHS?:
Must interact with RDC staff to ensure that their data can be merged with the NCHS data.
User-supplied data will be merged with NCHS data by RDC staff only.
The NCHS RDC policy states that merged and user-supplied data will not be made available for analysis to anyone without the written consent of the user.

The Cost per Project:
On Site
  $200 per day (2 day minimum)
Remote Access
  NSFG-CDF = $500/year
  NHIS-polio = $500/year
  NHIS Linked Mort. File = $250/month
  NHANES Linked Mort. File = $250/month

The Cost per Project (cont.):
  Files <= 130k records = $500 per month
  Files > 130k records = $1000 per month
Staff Assisted Variable File Construction and Setup
  For Mortality Files = $250 per day
  For all Other Files = $500 per day

Do Doctors perform “defensive Cesareans”?:
Overview: This topic re-examined the issues of “defensive medicine” and state reforms designed to limit malpractice risk on the use of cesarean section delivery.
NCHS Data Used: National Hospital Discharge Survey (NHDS)
Years of Data Used: 1980 through 1992, inclusive.
User’s Data Merged with NCHS? Yes
Method of Access to NCHS Data: Remote and On-site Access
Statistical Software Used: SAS

Economic Model to Explain the Incidence of Sexual Activity, Contraceptive Use, STD, and Pregnancy Among Teenage Girls:
Overview: National Survey of Family Growth data provide extensive socio-demographic information and reports of the sexual histories of these women. Researcher focused on the effects of a number of policies measured at the state level. These included:
  Parental notification of consent laws.
  Medicaid funding of abortions.
  Welfare generosity.
NCHS Data Used: National Survey of Family Growth (NSFG)
User’s Data Merged with NCHS? Yes
Method of Access to NCHS Data: Remote Access
Statistical Software Used: SAS

Nursing Home Admission and Payment Source?:
Overview: This project tested if patients with Medicare were being discriminated against because their reimbursement rate was significantly below the private pay rate for nursing homes.
NCHS Data Used: National Nursing Home Survey (NNHS)
Years of Data Used: 1985, 1995, and 1997
User’s Data Merged with NCHS? No
Method of Access to NCHS Data: Remote Access
Statistical Software Used: SAS

Hardware and Software:
All RDC hardware and software are standard.
Hardware
  Pentium IV computers with Windows 2000
Software
  SAS (only language on ANDRE)
  Sudaan
  Fortran
  HLM
  Stata
  Limdep
  text editors/viewers
Onsite workstations do NOT have email or internet access
Only access to printer is through RDC staff

Record Linkage for Epidemiologic Research: Accessing Linked Data at the NCHS Research Data Center:
Christine S. Cox
NCHS Data Users Conference
July 12, 2006

Slide28: What is Record Linkage?
[Diagram labels: Administrative records; NCHS Surveys; Linked Data File]
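
The record linkage idea on Slide28 can be illustrated with a minimal Python sketch. It assumes a deterministic match on a single shared key; the slides do not say how NCHS actually matches survey and administrative records, so the function name `link_records`, the key `link_id`, and the toy fields are all hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def link_records(survey: List[Record], admin: List[Record],
                 key: str = "link_id") -> List[Record]:
    """Attach administrative fields to each survey record sharing the same key.

    Hypothetical illustration only: a simple exact-key join, not the
    matching procedure used by NCHS.
    """
    # Index administrative rows by key for constant-time lookup.
    admin_by_key = {row[key]: row for row in admin}
    linked = []
    for respondent in survey:
        match = admin_by_key.get(respondent[key])
        merged = dict(respondent)  # copy so the survey input is not mutated
        if match is not None:
            merged.update({k: v for k, v in match.items() if k != key})
        merged["matched"] = match is not None  # flag unlinked respondents
        linked.append(merged)
    return linked


# Toy data invented for the example, not NCHS records.
survey = [{"link_id": 1, "age": 54}, {"link_id": 2, "age": 61}]
admin = [{"link_id": 2, "vital_status": "deceased"}]
linked = link_records(survey, admin)
```

Every survey respondent is kept in the result, with a `matched` flag, so that unlinked cases remain visible to the analyst rather than silently dropping out of the file.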

NCHS Linked Data: Major Activities:
Mortality - National Death Index
Health Care Utilization and Costs - Medicare Data
Retirement and Disability - Social Security Data

NCHS Linked Data: Mortality:
Eligibility status
Assigned vital status
Date of death
Age at death
Underlying and multiple causes of death
Adjusted sample weights

Research Potential of Linked Mortality Data:
Living and Dying in the USA: Behavioral, Health, and Social Differentials of Adult Mortality. RG Rogers, CB Nam, RA Hummer
A Semiparametric Analysis of the Body Mass Index’s Relationship to Mortality. JT Gronniger
The Income-Associated Burden of Disease in the United States. P Muennig, P Franks, H Jia, E Lubetkin and MR Gold
Excess Deaths Associated with Underweight, Overweight, and Obesity. KM Flegal, BI Graubard, DF Williamson, MH Gail. JAMA. 2005;293:1861-1867.

NCHS Linked Data: Medicare:
Medicare entitlement and health care utilization and payment data for 1991-2000
  Denominator file
  MEDPAR Inpatient hospitalization
  MEDPAR Skilled nursing facility
  Hospital outpatient
  Home Health Care
  Hospice
  Carrier (physician/supplier Part B file)
  Durable Medical Equipment

Research Potential of Linked Medicare Data:
Examine risk factors for health conditions
Examine reliability of survey data
Examine survey report of disability with program participation eligibility criteria
Compare survey reported health conditions to claims records
Examine disparities in Medicare service utilization

NCHS Linked Data: Retirement/Disability:
Social Security data from Retirement, Survivors, and Disability Insurance (RSDI) and Supplemental Security Income (SSI) programs
  Master Beneficiary Record (MBR) 1962-2003
  Payment History Update System (PHUS) 1984-2003
  Supplemental Security Record (SSR) 1974-2003

Research Potential of Linked Social Security Data:
Examine reliability of survey information for SSA program participation and benefits
Compare the health characteristics of those who take early (age 62) Social Security benefits to those who postpone benefits
Policy analysis using validated survey data
  Predicting the number of people who will become disabled based upon survey reported health conditions
  Determining whether current disability entitlement funding levels will be adequate as the population ages

Summary NCHS Data Linkage:

Slide37: www.cdc.gov/nchs/r&d/nchs_datalinkage/data_linkage_activities.htm

Why can’t you just give me the data?:
NCHS does not “own” the linked administrative data
NCHS data confidentiality rules prohibit the release of potentially identifiable data – special considerations concerning the protection of linked data
The RDC is the only option for access for now….

Overview: Data Access Procedures:
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

Proposal Requirements:
Proposal is evaluated by review committee
Review criteria
  Scientific and technical feasibility
  Availability of RDC resources
  Disclosure risk for restricted information
  The extent to which project is in accordance with the mission of NCHS
Special note: NCHS does not try to determine if proposals are duplicative

Proposal Requirements:
Cover letter
Project title
Abstract (maximum 300 words summarizing project)
Full contact information
  Institutional affiliation
  Mail address, phone, email
Dates of proposed time at RDC (or indication of using remote access)
Source of funding for proposed research

Proposal Requirements:
Study background
Key study questions or hypotheses
Public health benefits
Methods
Analytic approach and statistical methods
Statistical software requirements
Description of intended output for nondisclosure review, e.g.
  Table shells
  Model equations
  Test statistics that researcher plans to remove from RDC

Proposal Requirements:
Explanation of why restricted data are needed, e.g. describe why publicly available data are insufficient
Summary of data requirements to be included in analytic file
  Identification of sample
  Identification of variables
Description of additional data to be supplied by researcher to be merged with NCHS or other data source (must clearly identify source of other data)

Proposal Requirements: Appendices:
Current Curriculum Vitae or resume for each investigator
Data dictionary – complete listing of specific data requested and its source(s), indicating whether variables are public use or restricted access
  specific files and years
  sample
  variables (dependent, independent, matching/linking)

Proposal Requirements: Appendices:
For remote-access applicants
  Description of the computer and email system to be used to receive output
  Security provisions for the computer and email systems
For students
  Letter from department chair or academic advisor stating that student is working under the direction of the department

Overview: RDC Data Access Procedures:
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

Access Methods:
Once approved, three methods to access restricted data
  on-site – use local computing resources in the NCHS RDC, Hyattsville, MD
  remote – submit programs electronically to be executed in the RDC with output returned by email
  staff assisted – RDC staff provide on-site programming for off-site approved researchers
For all methods of access, restricted data files remain in RDC and output is inspected for disclosure violations

On-Site Access:
RDC staff constructs necessary data files, including merged user data
Most statistical packages available with sufficient lead time
Output subject to disclosure review
Open only during normal working hours

Remote Access Method:
RDC staff constructs necessary data files, including merged user data
SAS programs only (certain procedures and functions not allowed) – additional software options expected
Both submitted programs and output undergo a programmed disclosure limitation review

RDC Staff-assisted Programming Method:
Subcontract with the RDC staff to perform programming tasks
Useful for those planning to use statistical software not available for the remote system and who are not able to travel to the RDC facility
Cost is estimated for each research project

Overview: RDC Data Access Procedures:
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

RDC Helpful Tips:
Be clear about research and data requirements (helps to determine feasibility of project)
Clearly identify the sample to be included in the analytic file
Provide data dictionaries for both
  Public use data
  Restricted data
Provide examples of expected output

Overview: RDC Data Access Procedures:
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

Slide54: Visit the RDC at: www.cdc.gov/nchs/r&d/rdc.htm or email: rdca@cdc.gov

Slide55: LINKED DATA, CONTEXTUAL DATA, and GEO-CODING
ON-SITE and STAFF-ASSISTED DATA ACCESS
Christopher Rogers
Research Data Center
cor2@cdc.gov

Why Link Data Sets?:
Improve modeling and make use of existing data.
Compensate for increased difficulties taking surveys.
Open your mind.
Common Example: Economic variables versus Ethnic variables

Historical Trends:
More linking of scientific data sets between government agencies.
Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA).
Confused political and social situation in US.

Quality NCHS Resources:
Linked Birth and Infant Death Data with Fetal Death Data.
Geo-coded NHIS 1986-2003 (2004-2005).
Geo-coded NHANES III.
Cycles 4, 5, and 6 NSFG Contextual Data.
Linked Data Sets described earlier.

Linked Birth and Infant Death:
Designed to study factors in infant death.
Links birth and death certificates for deaths under one year of age.
Includes fetal deaths for 1995-1997
Years: 1983-1991 and 1995-1997
Numerator File (for deceased children): Parental information and behavior, prenatal care, infant health variables, demographics, cause of death.
Denominator File (for control group): Parental information and behavior, prenatal health, infant health, demographics.
Fetal Death Data: 1995-1997
Restricted Data: County/City of mother’s residence or County of child’s birth or death when under 250,000. 100,000 starting 1989.

Data Example:
From the Division of Vital Statistics.
Proposals or questions can go either to the RDC or the DVS.
Fetal Death Data portion. Given 1989-1999.
Linked to county level contextual data.
Goal to model fetal death with emphasis on ground water quality.
Estimates death rates for each county.

Geo-Coded NHIS:
National Health Interview Survey.
RDC has access to files from 1963 to present.
Previously geo-coded households for 1986-1994.
Recently geo-coded by RDC from 1995-2003. 2004-2005 coding in progress.
State (2 digits), County (3 digits), Tract (6 digits), Block Group (1 digit), and Block (3-4 digits) levels.
Households coded to 1990 and 2000 Censuses.

Geo-Coded NHANES III:
NHANES III is also linked to NDI Mortality data.
NHANES III has been geo-coded twice. The RDC has done it at the same level of detail as NHIS.
Continuous NHANES has not been geo-coded yet.
Example: Large project with neighborhood, economic, ethnic, and individual medical and behavioral variables. Multi-level models.

NSFG Contextual Data:
Contextual variables available with Cycles 4, 5, and 6.
Supplied for each individual in sample.
Cycle 6: 1054 contextual variables at the state, county, tract, and block group levels. For respondent addresses in 2000 and 2002.
Contextual data include both economic and demographic characteristics of locations.
Easily merged by case ID to individual characteristics, behaviors, and histories.

Simple NSFG Example:
A simple example relating economics on state level, ethnicity, and behavior, but not using contextual variables.
Treatment States given waiver to offer more family planning services (FPS).
Questions:
  FPS effects on behavior
  FPS effect on pregnancy rates
  Differential impacts across demographic subgroups?

Change of Topic: Accessing Data:
On-site access to data at the RDC in Hyattsville.
Staff-assisted remote access to data via e-mail.
Researchers often use both types of access.
Potential Designated Agent status. (CIPSEA)
The RDC has put many resources into automated remote access.

On-Site Access:
Rules in 24 page file GuidelinesRDC11-8-05.pdf available on-line.
The RDC and NCHS surveys have knowledgeable professional staffs that review proposals carefully.
Clients can only remove what has been approved. Checked by staff.
Exploratory Data Analysis. If needed, ask. Recent example: Checking general shapes of variables for model validity. OKed by survey.
Modeling needs. Recent example: Nested randomized geo-codes.
Estimation problems. Example: Single PSU in a Stratum.

Staff-Assisted Remote Access:
Analysis done through a particular staff member. Usually efficient, but could be very busy.
Staff member determines costs based on time.
Staff usually not asked to do much programming.
Staff creates data, runs e-mailed programs, checks, and returns output to researcher.
Staff can do exploratory analysis, if needed.
Staff can help check modeling problems.
Commonly done after on-site visit.

Our Mission:
The RDC has a professional staff dedicated to helping researchers uncover knowledge and advance understanding.

Slide69: Remote Access System
Vijay Gambhir

Remote Access System:
Envisioned as an integral part of RDC
Pre-onsite usage
Post-onsite usage
Super store/Convenience store

Basics of Remote Access System:
Object oriented, event driven system based upon the principles of distributed computing
About two years of development efforts
Set of applications called in service by resident component
Advanced pattern recognition techniques

Analytic Data Research by Email (ANDRE):
NCHS has been providing remote data access to researchers through ANDRE since April 1998.
In the past five years, ANDRE has served 45 different data analysts and executed over 9,500 SAS programs for their research programs.

Main Features of ANDRE:
Completely automated system
Operates round the clock without any human intervention
Registered subscribers only
  Proposals already reviewed and approved
  Have an agreement with NCHS/RDC
Unlimited access during the subscription period

Data Requests:
Registered user can submit data requests by email from anywhere and at any time.
Results of the data request released to a specified email address that has been certified as secure by the subscriber and approved by NCHS/RDC.

Authentication:
Multi-levels of system security:
  Submission syntax
  User id
  Password
  Email/code word
  Package
  Path info

Data Request Analysis:
Compliance with the disclosure limitation constraints of NCHS
Integrity of the system
Resource constraints (CPU time & storage requirements)
Protection of ANDRE’s work environment

Prevention of Direct Disclosure:
Cleaning up of the Log File
Categorization of SAS commands/words
Forbidden Commands
Modifications to the Commands
Output suppression

Sample: Original Log:
1    options nocenter;
2    Data one;
3    Infile 'd:\nchs\respnd95.dat' lrecl=13064;
4    Input
5    TODAYSPG 6847-6847
6    CONSTAT1 11934-11935
7    CONSTAT2 11936-11937
8    CONSTAT3 11938-11939
9    CONSTAT4 11940-11941
10   SEX1MTHD 11945-11946
11   POST_WT 12350-12359;
12   if constat1 = 'ab' then vjvar=1; else vjvar = 2;
13   WGT1000=POST_WT/1000;
14   title 'NSFG cycle 1995';
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      12:15
NOTE: The infile 'd:\nchs\respnd95.dat' is:
      File Name=d:\nchs\respnd95.dat,
      RECFM=V,LRECL=13064
NOTE: Invalid numeric data, 'ab' , at line 12 column 15.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1 1000000111260837511521 1 1050 12 106921124112411189 101 2 201 19211059110611197 ……

Sample: Original Log (cont.):
…… 12901 11232521101 05267213103033921811931011103 01030000000321120000392702210611511200403 1344 1316 13001 622501001006034
TODAYSPG=1 CONSTAT1=5 CONSTAT2=88 CONSTAT3=88 CONSTAT4=88 SEX1MTHD=1 POST_WT=2545.7569 vjvar=2 WGT1000=2.5457569 _ERROR_=1 _N_=20
NOTE: 10847 records were read from the infile 'd:\nchs\respnd95.dat'.
      The minimum record length was 13064.
      The maximum record length was 13064.
NOTE: The data set WORK.ONE has 10847 observations and 9 variables.
NOTE: DATA statement used:
      real time 39.88 seconds
      cpu time 12.10 seconds
15   proc freq;
16   tables CONSTAT1 vjvar;
17   run;
NOTE: There were 10847 observations read from the data set WORK.ONE.
NOTE: PROCEDURE FREQ used:
      real time 0.49 seconds
      cpu time 0.04 seconds

Sample: Cleaned Log:
1    options nocenter;
2    Data one;
3    Infile 'd:\nchs\respnd95.dat' lrecl=13064;
4    Input
5    TODAYSPG 6847-6847
6    CONSTAT1 11934-11935
7    CONSTAT2 11936-11937
8    CONSTAT3 11938-11939
9    CONSTAT4 11940-11941
10   SEX1MTHD 11945-11946
11   POST_WT 12350-12359;
12   if constat1 = 'ab' then vjvar=1; else vjvar = 2;
13   WGT1000=POST_WT/1000;
14   title 'NSFG cycle 1995';
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      12:15
NOTE: The infile 'd:\nchs\respnd95.dat' is:
      File Name=d:\nchs\respnd95.dat,
      RECFM=V,LRECL=13064
NOTE: Invalid numeric data, 'ab' , at line 12 column 15.

Sample: Cleaned Log (cont.):
NOTE: 10847 records were read from the infile 'd:\nchs\respnd95.dat'.
      The minimum record length was 13064.
      The maximum record length was 13064.
NOTE: The data set WORK.ONE has 10847 observations and 9 variables.
NOTE: DATA statement used:
      real time 39.88 seconds
      cpu time 12.10 seconds
15   proc freq;
16   tables CONSTAT1 vjvar;
17   run;
NOTE: There were 10847 observations read from the data set WORK.ONE.
NOTE: PROCEDURE FREQ used:
      real time 0.49 seconds
      cpu time 0.04 seconds

Forbidden Commands:
Commands that pose unacceptable disclosure risks OR are disallowed to protect the integrity/internal environment of ANDRE

Add         firstobs    report      iml
Print       first.      Pctn        nofreq
Obs         last.       Pctsum      nocum
Firstobs    nocol       tabulate    editor
Browse      summary     list        put

Commands Modification:
Modify user’s program to enforce restrictions on options allowed with certain SAS procedures to prevent objectionable info appearing in the output
PROC MEANS n mean std;

Output Suppression:
Wiping out of extreme values from the output of Proc Univariate.
Suppressing complete output line (Procs Means, Corr, Univariate, etc.) where sample size is less than the minimum acceptable value.

Proc Means Suppression:
The MEANS Procedure

Variable   Label                                        N           Mean        Std Dev
---------------------------------------------------------------------------------------
EXPEND_R   Current expend/pupil in public schl/1000  5424      5.0830820      1.3958710
*** Values Suppressed ***
RPUB87     exp. for contr. serv. and supplies 1997$  5424    23472052.60    18806802.86
RPUB92     exp. for contr. serv. and supplies 1997$  5424    34800922.98    30481634.59
PRGPRO     Coordinated Pregnancy Prevention Program  1708      0.0679157      0.2516749
HIVED      HIV/AIDS Education                        1708      3.5146370      0.8044378
*** Values Suppressed ***
PRGPRO87   Coordinated Pregnancy Prevention Program  5424      0.0540192      0.2260764
HIVED87    HIV/AIDS Education                        5424      3.4968658      0.8008324
WT_PER15   % Wt females aged 15-19/total 15-19       5424      0.7279681      0.1265796
BK_PER15   % Bk females aged 15-19/total 15-19       5424      0.1409869      0.0932332
HS_PER15   % Hs females aged 15-19/total 15-19       5424      0.0962413      0.1055191
TEENMMC2   Teenmom by cohort (1,2,3r)                1201      1.7119067      0.7715351
C18_2_1S   R in C2 (vs 1) at 18-19 endpt (1,2)       1770      1.5248588      0.4995228
TM2_1S18   R tnmm in Coh 2 (vs 1)-age 18 @ ext        358      1.4804469      0.5003168
AGE_12     Date R = 12 in century months             6450    979.5613953     69.3124265
STRTST     IA5 Date R started living in current sta  3870        1132.55    753.2066507
BDAYCENM   R date of birth                           6450    835.5613953     69.3124265
RAVPAY95   real av. an. pay 95 dollars               5424       26933.93        2826.80
PERCAFDC   percent of households receiving AFDC      5424      0.0422254      0.0127307
SALARY     teacher salaries real 96-97$$$            5424       35338.66        5729.11
---------------------------------------------------------------------------------------

Proc Univariate Output Unsuppressed:
The SAS System   9   14:09 Sunday, October 24, 1999
Univariate Procedure
Variable=AVHRATET

Moments                                      Quantiles(Def=5)
N           2283       Sum Wgts   2283       100% Max  -0.25314    99%  -1.62008
Mean        -4.66219   Sum        -10643.8    75% Q3   -3.56179    95%  -2.37588
Std Dev     1.892017   Variance   3.57973     50% Med  -4.50491    90%  -2.79152
Skewness    -2.11919   Kurtosis   6.892929    25% Q1   -5.30374    10%  -6.07639
USS         57792.36   CSS        8168.944     0% Min  -13.5463     5%  -7.19645
CV          -40.5821   Std Mean   0.039598                          1%  -12.7402
T:Mean=0    -117.738   Pr>|T|     0.0001     Range     13.29321
Num ^= 0    2283       Num > 0    0          Q3-Q1     1.741949
M(Sign)     -1141.5    Pr>=|M|    0.0001     Mode      -13.5463
Sgn Rank    -1303593   Pr>=|S|    0.0001

Extremes
Lowest      Obs        Highest     Obs
-13.5463(   1547)      -0.90519(    649)
-13.5397(   1836)      -0.81756(   1094)
-13.4637(   2084)      -0.76928(   1739)
-13.4413(   1127)      -0.5907(      21)
-13.4402(   1088)      -0.25314(    400)

Proc Univariate Output Suppressed:
The SAS System   9   14:09 Sunday, October 24, 1999
Univariate Procedure
Variable=AVHRATET

Moments                                      Quantiles(Def=5)
N           2283       Sum Wgts   2283       100% Max  -0.25314    99%  -1.62008
Mean        -4.66219   Sum        -10643.8    75% Q3   -3.56179    95%  -2.37588
Std Dev     1.892017   Variance   3.57973     50% Med  -4.50491    90%  -2.79152
Skewness    -2.11919   Kurtosis   6.892929    25% Q1   -5.30374    10%  -6.07639
USS         57792.36   CSS        8168.944     0% Min  -13.5463     5%  -7.19645
CV          -40.5821   Std Mean   0.039598                          1%  -12.7402
T:Mean=0    -117.738   Pr>|T|     0.0001     Range     13.29321
Num ^= 0    2283       Num > 0    0          Q3-Q1     1.741949
M(Sign)     -1141.5    Pr>=|M|    0.0001     Mode      -13.5463
Sgn Rank    -1303593   Pr>=|S|    0.0001

Proc Univariate Output Suppressed (sample size = 1):
Univariate Procedure
Variable=FREQ (sum) freq
Moments                                      Quantiles(Def=5)
Serious Disclosure limitation Violations
Values too low to release
Output of Proc Univariate withheld

Proc Freq Suppression (One-Way Tables):
Suppress at least two consecutive rows to prevent derivation of suppressed values from cumulative totals.
Disallow single row output.

One-Way Freq Table Suppressed:
                                         Cumulative   Cumulative
LOGRNTOPAT      Frequency    Percent      Frequency      Percent
-----------------------------------------------------------------
0.2277839309      ?????       ?????         ?????         ?????
0.2277839309      ?????       ?????         ?????         ?????
0.2305236586          5        0.08          6429         97.99
0.231111721           5        0.08          6434         98.06
0.232058915       ?????       ?????         ?????         ?????
0.232058915       ?????       ?????         ?????         ?????
0.2436220827      ?????       ?????         ?????         ?????
0.2436220827      ?????       ?????         ?????         ?????
0.2498117984          6        0.09          6456         98.40
0.2504106777          6        0.09          6462         98.49
0.2513144283         18        0.27          6480         98.77
0.2595111955          6        0.09          6486         98.86
0.2670627852      ?????       ?????         ?????         ?????
0.2670627852      ?????       ?????         ?????         ?????
0.2736958305           5      0.08          6500        99.07
0.2814124594           5      0.08          6505        99.15
0.3022808719           6      0.09          6511        99.24
0.3364722366          10      0.15          6521        99.39

One-Way Freq Table suppressed (cont.):  One-Way Freq Table suppressed (cont.)
                                      Cumulative   Cumulative
  LOGRNTOPAT   Frequency   Percent     Frequency      Percent
-----------------------------------------------------------------
0.3403258059       ?????     ?????         ?????        ?????
0.3403258059       ?????     ?????         ?????        ?????
0.3715635564           6      0.09          6537        99.63
0.3856624808       ?????     ?????         ?????        ?????
0.3856624808       ?????     ?????         ?????        ?????
0.6931471806           6      0.09          6550        99.83
1.2527629685       ?????     ?????         ?????        ?????
1.2527629685       ?????     ?????         ?????        ?????
1.2527629685       ?????     ?????         ?????        ?????

Proc Freq Suppression (Two-way Tables):  Proc Freq Suppression (Two-way Tables)
Row and column totals are preserved.
Cells with values less than the acceptable minimum are suppressed.
Additional suppressions ensure that no row and no column has a single suppression.
Logical stitching of horizontal and vertical splits.

Proc Freq: Two-way Tables Suppression:  Proc Freq: Two-way Tables Suppression
TABLE OF FAMREL BY FAMSIZER

FAMREL     FAMSIZER
Frequency|
Percent  |
Row Pct  |
Col Pct  |       2|       3|       4|       5|  Total
---------+--------+--------+--------+--------+
       3 |     94 |    388 |    792 |    533 |   2206
         |   3.97 |  16.40 |  33.47 |  22.53 |  93.24
         |   4.26 |  17.59 |  35.90 |  24.16 |
         |  98.95 |  96.28 |  96.12 |  94.34 |
---------+--------+--------+--------+--------+
       4 | ?????? |      9 |     22 |     27 |    104
         | ?????? |   0.38 |   0.93 |   1.14 |   4.40
         | ?????? |   8.65 |  21.15 |  25.96 |
         | ?????? |   2.23 |   2.67 |   4.78 |
---------+--------+--------+--------+--------+
       6 | ?????? |      6 |     10 |      5 |     56
         | ?????? |   0.25 |   0.42 |   0.21 |   2.37
         | ?????? |  10.71 |  17.86 |   8.93 |
         | ?????? |   1.49 |   1.21 |   0.88 |
---------+--------+--------+--------+--------+
Total          95      403      824      565     2366
             4.02    17.03    34.83    23.88   100.00
(Continued)

Proc Freq: Two-way Tables Suppression (Cont.):  Proc Freq: Two-way Tables Suppression (Cont.)
checking frequencies    4    12:01 Thursday, May 6, 1999
TABLE OF FAMREL BY FAMSIZER

FAMREL     FAMSIZER
Frequency|
Percent  |
Row Pct  |
Col Pct  |       6|       7|       8|       9|  Total
---------+--------+--------+--------+--------+
       3 |    209 |     98 |     19 |     73 |   2206
         |   8.83 |   4.14 |   0.80 |   3.09 |  93.24
         |   9.47 |   4.44 |   0.86 |   3.31 |
         |  90.48 |  83.05 |  59.38 |  74.49 |
---------+--------+--------+--------+--------+
       4 |     13 |     10 | ?????? |     12 |    104
         |   0.55 |   0.42 | ?????? |   0.51 |   4.40
         |  12.50 |   9.62 | ?????? |  11.54 |
         |   5.63 |   8.47 | ?????? |  12.24 |
---------+--------+--------+--------+--------+
       6 |      9 |     10 | ?????? |     13 |     56
         |   0.38 |   0.42 | ?????? |   0.55 |   2.37
         |  16.07 |  17.86 | ?????? |  23.21 |
         |   3.90 |   8.47 | ?????? |  13.27 |
---------+--------+--------+--------+--------+
Total         231      118       32       98     2366
             9.76     4.99     1.35     4.14   100.00

Fully Automated and Expert system?:  Fully Automated and Expert system?
Fully automated? Reboot to deal with memory leakage.
Confidentiality Expert? How reliable? As good as underlying algorithms. Needs constant monitoring

What is new?:  What is new?
Improved and expanded hardware platform
Two machines dedicated to heavy remote access usage
Three additional machines dedicated to general remote access usage

What is New?:  What is New?
Sudaan now available to remote access users
Proc Crosstab
Proc Rlogist
Proc Regress
Proc Multilog
Proc Survival

What is new:  What is new
Proc Descript
Other new Sudaan procedures will be made available shortly
Plans to make Stata available through remote access

What is new:  What is new
Web Component of ANDRE under construction.
On-line scanning of users’ code
Valuable research tools and information readily available to the users.

Contact Information:  Contact Information
For general Questions/Comments Email: rdca@cdc.gov Phone: (301) 458-4732
For On-site Info: Email: Neb9@cdc.gov Phone: (301) 458-4097
For Remote Access Info: Email: vgambhir@cdc.gov Phone: (301) 458-4226
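
The one-way rule from the "Proc Freq Suppression (One-Way Tables)" slide can be illustrated with a small sketch. It is not the RDC's implementation: the function names, the `MASK` string, and the threshold `min_count=5` are assumptions made for illustration (the slides do not state the minimum cell size). The point it shows is why a lone suppressed row is unsafe: its frequency equals the difference between the cumulative totals printed around it, so a neighbouring row is masked as well.

```python
from itertools import accumulate

MASK = "?????"  # placeholder string, mirroring the suppressed tables in the slides


def suppress_one_way(freqs, min_count=5):
    """Return one boolean per row; True means the row must be masked.

    min_count is an assumed threshold, not a documented NCHS value.
    """
    n = len(freqs)
    if n < 2:
        # "Disallow single row output": withhold the whole table.
        return [True] * n
    masked = [f < min_count for f in freqs]
    i = 0
    while i < n:
        if not masked[i]:
            i += 1
            continue
        # Find the end of the current run of masked rows.
        j = i
        while j + 1 < n and masked[j + 1]:
            j += 1
        if j == i:
            # A lone masked row could be recovered by differencing the
            # cumulative totals around it, so mask a neighbour as well.
            neighbour = i + 1 if i + 1 < n else i - 1
            masked[neighbour] = True
            i = min(i, neighbour)
            continue  # rescan: the run now spans at least two rows
        i = j + 1
    return masked


def render_one_way(values, freqs, min_count=5):
    """Build printable rows: value, frequency, percent, cumulative pair."""
    masked = suppress_one_way(freqs, min_count)
    total = sum(freqs)
    rows = []
    for value, f, cum, hide in zip(values, freqs, accumulate(freqs), masked):
        if hide:
            rows.append((value, MASK, MASK, MASK, MASK))
        else:
            rows.append((value, f, round(100 * f / total, 2),
                         cum, round(100 * cum / total, 2)))
    return rows
```

With frequencies `[5, 5, 3, 6, 6, 18]` the third row falls below the assumed threshold, and the sketch masks the fourth row too, so a reader can recover only the sum of the two masked rows.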

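The two-way rule ("no row and no column has single suppression") can be sketched in the same spirit. In the FAMREL by FAMSIZER table the masked cells form a rectangle (rows 4 and 6, columns 2 and 8), which is the pattern this kind of complementary suppression produces. The function below is a hypothetical greedy version, not the algorithm the RDC system uses: `suppress_two_way`, `min_count=5`, and the tie-breaking choice (prefer a column or row that already holds a masked cell, then the smallest count) are all assumptions.

```python
def suppress_two_way(counts, min_count=5):
    """Return a boolean grid; True marks a cell to be masked.

    counts is a list of rows of cell frequencies (margins excluded).
    min_count is an assumed threshold, not a documented NCHS value.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    if n_rows < 2 or n_cols < 2:
        raise ValueError("complementary suppression needs at least a 2x2 table")

    # Primary suppression: cells below the minimum.
    mask = [[cell < min_count for cell in row] for row in counts]

    def col_has_mask(c):
        return any(mask[r][c] for r in range(n_rows))

    # Complementary suppression: repeat until no row or column
    # contains exactly one masked cell, since a single masked cell
    # can be recovered from the preserved row or column total.
    changed = True
    while changed:
        changed = False
        for r in range(n_rows):
            if sum(mask[r]) == 1:
                open_cols = [c for c in range(n_cols) if not mask[r][c]]
                best = min(open_cols,
                           key=lambda c: (not col_has_mask(c), counts[r][c]))
                mask[r][best] = True
                changed = True
        for c in range(n_cols):
            if sum(mask[r][c] for r in range(n_rows)) == 1:
                open_rows = [r for r in range(n_rows) if not mask[r][c]]
                best = min(open_rows,
                           key=lambda r: (not any(mask[r]), counts[r][c]))
                mask[best][c] = True
                changed = True
    return mask
```

Each pass can only add masked cells, so the loop terminates. A greedy pass like this does not minimise the number of suppressed cells, and it does not check how tightly the preserved totals bound the masked values, which is consistent with the slide's remark that an automated system is only as good as its underlying algorithms.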