ESSNet on Statistical Disclosure Control

Task 3. Output checking

3.a. Guidelines

The success of our OnSite/Datalab facilities and of Remote Execution/Remote Access comes at a price. Researchers greatly appreciate these enhanced facilities, which give them access to sensitive microdata and enable them to perform all kinds of analyses. The price for the NSIs, however, is that all output generated in these facilities has to be checked; otherwise the NSIs face serious confidentiality problems. The expanding use of these facilities makes this burden even heavier.
At present only limited guidelines for output checking are available, although a few first steps have been made. The aim of this task is to develop a framework and guidelines for such output checking.
To facilitate communication within this group, we plan two meetings, one each year, to discuss the guidelines and to consider the potential usability of the newly emerging methods of task 3.b.
Deliverables: We will write an intermediate report after one year and a final version at the end of the project. Ideally this could then be incorporated in the handbook.
Partners: UK, NL, DE, IT.

3.b. Automatic output checking

A quicker and cheaper alternative to output checking would be to provide researchers with safe output generated automatically. This can be achieved by slightly perturbing the output. Methodology for computing a suitable perturbation has been developed at Destatis (cf. Heitzig, 2005). So far, this methodology has been implemented prototypically as a set of macros for the commercial software SAS. Within this project we will develop and publish a free, production-ready version of the method, implemented as a package for the freely available R environment for statistical computing. The package will be licensed under the GNU General Public License and will be able to produce safe results for the most important uni- and bivariate analysis methods. We will write an intermediate report after one year and provide the software along with suitable documentation at the end of the project.
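The actual method (Heitzig, 2005) derives the perturbation from jackknife replicates of the analysis; as a minimal illustrative sketch of the general idea only — releasing a statistic with a small amount of noise instead of its exact value — one might write something like the following (function and variable names are hypothetical, not part of the Destatis implementation):

```python
import random
import statistics

def safe_mean(values, rel_noise=0.01, seed=None):
    """Return a perturbed ('safe') version of the mean.

    Illustrative only: adds small multiplicative noise so the exact
    value cannot be read off the released output. The real method
    (Heitzig, 2005) computes the perturbation from jackknife
    replicates rather than from simple random noise.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    exact = statistics.mean(values)
    noise = rng.uniform(-rel_noise, rel_noise)
    return exact * (1.0 + noise)

# Hypothetical sensitive microdata: released mean differs slightly
# from the exact mean, but remains statistically useful.
incomes = [21_000, 34_500, 28_900, 41_200, 19_800]
print(round(safe_mean(incomes, seed=42), 1))
```

The design trade-off is the same one the task addresses: the noise must be large enough to prevent exact disclosure, yet small enough that the released result stays fit for analysis.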
Partners: NL, DE.
Deliverables: Interim report after year 1 and final report after year 2, with a documented version of the software.
References: Heitzig, J., 2005: The Jackknife Method: Confidentiality Protection for Complex Statistical Analyses. Invited paper, Joint UNECE/Eurostat work session on statistical data confidentiality (Geneva, Switzerland, 9-11 November 2005). URL: http://www.unece.org/stats/documents/ece/ces/ge.46/2005/wp.39.e.pdf.