Job submission instructions
Reviewed December 2007

I.      BASIC INFORMATION

A.    The LIS Microdata

The LIS microdata are housed on a secured data server at the LIS office in Luxembourg-City in the Grand-Duchy of Luxembourg.  The data can be accessed only by submitting an e-mailed job request in SPSS, SAS or Stata.  The resulting output can report aggregated information and results, but will not allow for direct access to the micro-level data.

Please note that the databases cannot be downloaded.  However, direct access to a limited number of data sets can be obtained by users visiting LIS in Luxembourg.  (See http://www.lisproject.org/fellowships/directaccess.htm for information about which data are available to LIS visitors.)

B.     The LISSY System

LISSY, the system that allows users to access the LIS microdata, resides on a separate Windows network at the LIS office.  Users send job instructions (programs) to LISSY via e-mail.  LISSY processes the programs and e-mails the output back to the user.

E-mails sent to LISSY are standard SPSS, SAS, or Stata batch programs (MS‑Windows versions of SPSS 11.5, SAS 9.1.3, and Stata 9.2) with a program‑specific header that allows the LISSY system to process the user request.  In order to ensure data confidentiality, a few exceptions or modifications to the user’s programming style may be needed for the LISSY system to process the user request.  Language‑specific details are described for each of the three programming languages later in this document.

1.      Registration

In order for the LISSY system to accept a job request, a prior registration with LIS is mandatory. A user obtains an account and a password after signing a pledge stating that a LIS user will abide by the rules for accessing data and publishing results. This form (the User Registration Form) is available on the LIS web site (see http://www.lisproject.org/introduction/userform.htm). Registration will not be finalised until LIS receives a signed copy via fax (+352 26 00 30 30) or by regular mail addressed to:

Caroline de Tombeur

Luxembourg Income Study

17, rue des Pommiers

L-2343 Luxembourg

Grand-Duchy of Luxembourg

Please note that as of 2007, registration must be updated annually (at the start of each year).  However, once the initial pledge has been signed, re-registration is simply a matter of updating information and must be done over the Internet (see http://www.lisproject.org/dataccess/renewal.htm).

2.      General information

Job requests must be submitted by e-mailing LISSY at postbox@lisproject.org.  Jobs are processed by LISSY and the output is returned to the user at the e-mail address recorded while registering as a LIS user.  (Please note that output will be returned to the e-mail address registered with LIS, even if you sent your job request from a different mail account.  Upon request, the e‑mail address LISSY uses can be temporarily or permanently updated.)

Processing time varies and depends on the number of job submissions being processed at one time as well as the complexity of the jobs being sent.  As a rule of thumb, a LIS user may expect results within one hour.  If the number of jobs in our system is low, listings can be back within minutes.  Maintenance on the LIS system will cause longer delays.  The LIS team does not guarantee that your job will be returned within any fixed time period, nor do we accept responsibility in case output does not return within the user’s expected time period.  However, please feel free to contact user support (usersupport@lisproject.org) if your job is taking much longer than expected.

3.      Some important information about the LISSY system

a)      Data Security

(1)   LIS Security Review

LISSY is programmed to direct all suspect output (i.e., any output that may potentially reveal individual or household identifying information) to LIS staff for review.  Users should take care to avoid program commands and output that will lead to a review.  As these listings have to be examined manually, this not only means an additional burden for the LIS team, it also causes longer delays for you to receive your results.  In addition, individuals whose programs are routinely directed to LIS security review may have their accounts revoked.  We ask that you adhere to the following programming rules in order to avoid the security review.

(2)   Printing or listing individual data is prohibited

In order to maintain confidentiality of the survey participants in the LIS/LWS/LES microdata, LISSY will not accept any program commands that allow users to examine individual records or small aggregated groups.

Do not try to isolate information on individual households or persons in any way.  Abstain from commands that list individual records (such as PROC PRINT in SAS or LIST in SPSS or Stata).  Do not tabulate variables or attempt to obtain frequencies that will result in small cell sizes.  Using commands such as TABULATE or FREQUENCIES on continuous variables (e.g., casenum, hweight, pweight, or income variables) will be detected by LISSY.  Output from any program that attempts any of the above procedures will not be returned to the user, but will be directed to LIS for security review.
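As a rough illustration of the rule above, the following Python sketch scans a job for the listing commands named in this section before it is sent.  It is not part of LISSY: the function name find_listing_commands and the pattern table are hypothetical, the check is a simple line-based heuristic, and it does not reproduce whatever detection LISSY itself performs (for example, it does not look for small cell sizes).

```python
import re
from typing import List, Tuple

# Commands the text above names as record listings, per package.
# This table is an illustrative assumption, not LISSY's actual rule set.
LISTING_COMMANDS = {
    "sas": [r"\bproc\s+print\b"],
    "spss": [r"^\s*list\b"],
    "stata": [r"^\s*list\b"],
}


def find_listing_commands(program: str, package: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that appear to list individual records."""
    patterns = [re.compile(p, re.IGNORECASE) for p in LISTING_COMMANDS[package.lower()]]
    hits = []
    for number, line in enumerate(program.splitlines(), start=1):
        if any(p.search(line) for p in patterns):
            hits.append((number, line.strip()))
    return hits
```

An empty result only means that none of these specific commands was found; the job can still be sent to security review for other reasons.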

(3)   Long listings

Any output that exceeds the LIS-designated length will cause LISSY to send the output for security review.  LIS therefore advises users, in their own interest, to split such programs into several separate jobs and to use the language-specific options for suppressing output.  (See the suggestions for each statistical package later in this document.  Note that SAS, in particular, tends to generate rather large log files, which are inserted into the output sent to the user.)  Through a few simple commands, users can generate log files that contain only the needed results, thereby making them easier to read and avoiding the LIS security review.

b)     LISSY’s Vacation Schedule

LISSY is a true workaholic and works 24 hours a day, 7 days a week.  Unfortunately, computers are not perfect and LISSY sometimes takes some unscheduled time off.  We try to detect and solve these problems as quickly as possible.  However, when they occur at night (CET), during the weekend, or during other holidays, the LIS team will be unable to correct the problem until the next working day.

4.      Important remarks about e-mail editors

Some e‑mail editors may interfere with the way LISSY performs.  LISSY interprets plain ASCII code and will have difficulty when non‑ASCII characters are introduced into your program.  LISSY will translate non-ASCII characters into ASCII characters (such as '3D', 'E20'), which will make jobs unreadable.

In order that your jobs are completed quickly and error-free, please note the following:

·      Jobs should be written inside the body of an e-mail message and not as an attachment to such a message.

·      It is important that some of the more advanced settings of the e-mail editor used are disabled.  Do not use colour, bold, italics, or any other formatting.  Do not use electronic business cards or auto-confirmation.  Our system needs to receive your job as a plain text/ASCII e-mail message.

·      Do not use special characters, such as:

o        accents (' ^); or

o        country-specific characters (ü ß æ Å ç ñ).

In all cases where a LIS user fails to follow these rules, LISSY will send the error message:

WRONG HEADER

(See the section on Error messages, below, for more detail.)
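The plain-ASCII requirement can be checked before a job is pasted into an e-mail.  The Python sketch below is only an illustration (the function name non_ascii_characters is hypothetical and the script is not provided by LIS); it reports where each non-ASCII character sits so that it can be replaced by hand.

```python
from typing import List, Tuple


def non_ascii_characters(job_text: str) -> List[Tuple[int, int, str]]:
    """Return (line number, column, character) for every non-ASCII character."""
    problems = []
    for line_no, line in enumerate(job_text.splitlines(), start=1):
        for col_no, char in enumerate(line, start=1):
            if ord(char) > 127:  # ASCII covers code points 0 to 127 only
                problems.append((line_no, col_no, char))
    return problems
```

Typographic ("curly") quotation marks introduced by word processors are a typical example of characters this check would flag.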

II.      STRUCTURE OF LIS/LWS/LES JOB SUBMISSIONS

A.    The header

Regardless of the programming language used, each job submitted to LISSY must begin with a 4‑line header in plain ASCII text at the beginning of the e-mail.  These first 4 lines of your job will allow you to be recognised as a LIS‑registered user and will give you access to the data. They are essential for ALL jobs regardless of the statistical package (SPSS, SAS, or Stata) or project (LIS or LWS or LES) you want to use.  The general format of the header is:

* user              =       your username

* password      =       your password

* package        =       your statistical package

* project           =       LIS (or LWS or LES)

Note that you must include the asterisk at the beginning of each of the header lines.  After these first lines you can write your normal program, bearing in mind the language‑specific requirements and warnings described below.
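A minimal Python sketch of how a complete job text could be assembled from the header format above.  The helper build_job, the user name, the password and the package value "stata" are all hypothetical placeholders (this document does not spell out the exact package names LISSY accepts), and the single-space layout around "=" is an assumption.

```python
def build_job(user: str, password: str, package: str, project: str, program: str) -> str:
    """Prefix a program with the 4-line header, each line starting with an asterisk."""
    header = [
        f"* user = {user}",
        f"* password = {password}",
        f"* package = {package}",
        f"* project = {project}",
    ]
    # The header must come first; the normal program follows directly after it.
    return "\n".join(header) + "\n" + program.strip("\n") + "\n"


# Hypothetical credentials and a one-line Stata body, for illustration only.
job = build_job("jdoe", "not-a-real-password", "stata", "LIS", "use $de89h, clear")
print(job)
```

The printed text is what would be pasted into the body of the plain-text e-mail.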

B.     File identification

Rather than specifying the entire file path for each of the individual LIS/LWS/LES files, LISSY relies on aliases or macros that represent the file names.  Each alias is built on the 2‑digit country code followed by the 2‑digit year (e.g., AU01).  Following this 4‑digit country‑/year‑specific code, an additional 1 to 2 letters are used to identify the specific file type you wish to access (e.g., household, person).  The appropriate identification is specified under each language‑specific heading.

The country abbreviations and years can be found in the country reference list that is regularly updated.  Mistyping aliases or using aliases that no longer exist (e.g., GE is now DE for all German files) will result in a “file not found” message in your listing.  We recommend that you regularly check the LIS list of available datasets or the LWS list of available datasets for the correct 4-digit country/year abbreviation.

C.    Language‑specific information

1.      SPSS job syntax

a)      Calling data files

The SPSS “get file =...” statement is used to call the file you wish to access.  Instead of calling the specific file, LISSY requires the following aliases:

LIS
    household file                        ccyyh
    household shadow file                 ccyyhs
    person file                           ccyyp
    person shadow file                    ccyyps

LWS
    household file                        ccyyw

LES
    true labour force file                ccyye

LIS wave 5 release 1
    wave 5 household file release 1       ccyyhr
    wave 5 person file release 1          ccyypr

LES
    LIS/LES integrated files              ccyyl

Where cc is replaced with the 2‑digit country code and yy stands for the 2‑digit year code.  For example:

de89h: selects the LIS household file for Germany 1989

ca94p: selects the LIS person file for Canada 1994

fi94w: selects the LWS household file for Finland 1994

us00l: selects the LIS/LES integrated file for United States 2000

uk89e: selects the LES file for United Kingdom 1989

b)     SPSS Programming Tips

§         The commands LIST and FREQUENCIES should be used with caution as they can produce listings that are too long, especially when used with categorical variables that have a large number of categories.

§      While the dot at the end of each command line is not mandatory when using LIS, it is strongly recommended in order to prevent problems with multi-line commands.

§      Please look at the SPSS self-teaching package on-line at http://www.lisproject.org/dataccess/selfteach.htm for practical examples on how to use LIS microdatabases.  Examples include accessing files, calculating simple statistics, weighting results, using macros, using include files, merging files, and the calculation of some basic indicators in the fields of poverty, inequality, and labour markets.

2.      SAS job syntax

a)      Calling data files

Instead of the specific file, your SAS program references the LIS/LWS/LES data through aliases in LIS-defined macros.

If you are still working with the “wave 5 release 1” datasets (including the LIS/LES integrated files), you must point to a format catalog when accessing release 1 data.  These references are included with the data aliases below:

LIS
    household file                        &ccyyh
    household shadow file                 &ccyyhs
    person file                           &ccyyp
    person shadow file                    &ccyyps

LWS
    household file                        &ccyyw

LES
    true labour force file                &ccyye

LIS wave 5 release 1
    wave 5 household file release 1       &ccyyhr
    wave 5 person file release 1          &ccyypr
    related format files                  &ccyyf

LES
    LIS/LES integrated files              &ccyyl
    related format files                  &ccyyef

Where cc is replaced with the 2‑digit country code and yy stands for the 2‑digit year code.  For example:

&de89h: selects the LIS household file for Germany 1989

&ca94p: selects the LIS person file for Canada 1994

&fi94w: selects the LWS household file for Finland 1994

&us00l: selects the LIS/LES integrated file for United States 2000

&uk89e: selects the LES file for United Kingdom 1989

b)     SAS Programming Tips

§      Do not use any of the commands identified as data security problems.  In SAS, this includes PROC PRINT.

§      Beware of long listings.  SAS tends to generate rather large log files that are inserted into the output sent to the LIS user. These jobs often end up in our manual review queue as this listing is rather long.  In order to avoid this, you may use the SAS programming option: OPTIONS nonotes nosource;

§     Errors due to format syntax often occur in SAS, especially while merging different datasets.  This is especially true of older data sets.  In order to avoid this, you may use the SAS programming option: OPTIONS nofmterr;

3.      Stata job syntax

a)      Calling data files

Instead of the specific file, your Stata program references the LIS/LWS/LES data through aliases in LIS‑defined macros.  For Stata, the aliases are:

LIS
    household file                        $ccyyh
    household shadow file                 $ccyyhs
    person file                           $ccyyp
    person shadow file                    $ccyyps

LWS
    household file                        $ccyyw

LES
    true labour force file                $ccyye

LIS wave 5 release 1
    wave 5 household file release 1       $ccyyhr
    wave 5 person file release 1          $ccyypr

LES
    LIS/LES integrated files              $ccyyl

where cc is replaced with the 2‑digit country code and yy stands for the 2‑digit year code.  For example:

$de89h: selects the LIS household file for Germany 1989

$ca94p: selects the LIS person file for Canada 1994

$fi94w: selects the LWS household file for Finland 1994

$us00l: selects the LIS/LES integrated file for United States 2000

$uk89e: selects the LES file for United Kingdom 1989

Please note that Stata is case‑sensitive.  Therefore, be certain to refer to Stata aliases in lower case.
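The alias convention shared by the three packages (an optional prefix, the country code, the 2-digit year and a file-type suffix) can be sketched in Python as follows.  The helper make_alias and its dictionary keys are hypothetical names chosen for illustration; only the file types used in the worked examples of this document are covered.

```python
# Prefix per package and suffix per file type, as given in the examples above.
PREFIX = {"spss": "", "sas": "&", "stata": "$"}
SUFFIX = {
    "household": "h",           # LIS household file
    "person": "p",              # LIS person file
    "lws household": "w",       # LWS household file
    "les": "e",                 # LES labour force file
    "lis/les integrated": "l",  # LIS/LES integrated file
}


def make_alias(package: str, country: str, year: int, file_type: str) -> str:
    """Build a lower-case alias such as $de89h from its parts."""
    cc = country.lower()
    if len(cc) != 2 or not cc.isalpha():
        raise ValueError("country must be a 2-letter code")
    yy = f"{year % 100:02d}"  # 1989 -> "89", 2000 -> "00"
    return f"{PREFIX[package.lower()]}{cc}{yy}{SUFFIX[file_type]}"
```

For instance, make_alias("stata", "DE", 1989, "household") gives $de89h.  The helper does not check that the dataset exists; the list of available datasets remains the reference for that.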

b)     Stata Programming Tips

§      Do not use any of the commands identified as data security problems.  In Stata, this includes LIST.

§      Beware of long listings that may end up in our manual review queue.

o        Some Stata programming commands result in long output (e.g., logit and probit).  By using the option “, nolog” following these programming commands, you can minimize the amount of text in your log file.

o        To ensure short output for all commands, make use of the “quietly” and “noisily” prefixes.  If used appropriately, your log file will contain only the needed results, which means that you can create longer program files.

§         The semicolon delimiter (i.e., “ #delimit ;”) is not mandatory for LISSY to understand Stata.  However, we strongly recommend its use in order to avoid problems of multiple or long lines.

§         You may need to save intermediate datasets for later use (in the same job), for example to append or merge two datasets. When doing so, please always save your intermediate files with the following command: save $mydata\myfile.dta, replace. The macro $mydata indicates the path where your temporary intermediate datasets will be saved. Substitute myfile with the name you want to give to your dataset.

§         LISSY automatically sets the maximum memory when processing a Stata job request.  For this reason, do not use the command: set mem Xm.

§         LISSY automatically exits every processed job.  However, in some e-mail editors, failure to use the “exit” statement will result in an error message at the end of your output (as LISSY tries to read frames as ASCII characters).  While this will not affect the majority of your program, it may result in long output, which should be avoided.

§         LIS has installed the official (stb) Stata ado files that are commonly used by researchers investigating poverty and income inequality (poverty, inequal, povdeco, ineqdeco, etc).  If additional ado files are needed to perform your job, please send an email to the LIS User Support (usersupport@lisproject.org) and the ado file will be added, if possible.
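Some of the Stata tips above lend themselves to a mechanical check before submission.  The Python sketch below is an assumption-laden illustration, not a LIS tool: the function name check_stata_job and its warning texts are hypothetical, and it only looks at the start of each line, so commands that follow a prefix such as quietly, or that share a line with another command under “#delimit ;”, are not examined.

```python
import re
from typing import List


def check_stata_job(program: str) -> List[str]:
    """Return warnings for a Stata job, based on the tips in this section."""
    warnings = []
    lines = [line.strip() for line in program.splitlines() if line.strip()]
    for line in lines:
        # LISSY sets the maximum memory itself, so "set mem" should not appear.
        if re.match(r"set\s+mem", line):
            warnings.append("remove 'set mem': LISSY sets the memory itself")
        # Intermediate datasets should be saved under the $mydata path.
        if re.match(r"save\b", line) and "$mydata\\" not in line:
            warnings.append("save intermediate files under $mydata\\")
    # An explicit exit at the end avoids a trailing error message in some cases.
    if not lines or not re.match(r"exit\b", lines[-1]):
        warnings.append("end the job with an exit statement")
    return warnings
```

An empty list means only that these three points look fine; it says nothing about output length or confidentiality.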

D.    Self-teaching packages on‑line

Please look at the on‑line self-teaching packages for practical examples on how to use LIS/LWS/LES microdata.  Examples include accessing files, calculating simple statistics, weighting results, using macros, using include files, merging files, and the calculation of some basic indicators in the fields of poverty, inequality, and labour markets.

http://www.lisproject.org/dataccess/selfteach.htm

 

III.    OUTPUT FROM LISSY

Whenever a user sends a message to LISSY, an answer will be sent back that contains one of the following:

A.    The listing with the output of the job execution

If all the LIS rules are followed, LISSY will automatically forward the listing containing the output from the job execution.  Please note that the output will be included within the text of the e-mail message, and NOT as a log file attachment.  (Please do not open log files within your program in any of the three languages.  LISSY will not return the log file, only the output.)  Your output will be returned to the address you provided when registering with the LIS.  (Note that this may not be the address from which your e-mail was sent!)

If your program contains any errors, your output will be returned with the language‑specific error message.

B.     Warning of long listing

One of two situations will occur if your program results in a long listing:

1.      Security review queue

If your program is error‑free and the job is fully executed, but the size of its listing is larger than a given limit, the job is automatically put in the security review queue.  In this case, you will be automatically sent a message stating that the job has to be manually reviewed by the staff.  Generally within one working day, you will either receive the listing with the output of the job execution (if no confidentiality rules were broken), or you will receive no further correspondence (if the listing contained confidential data).  Both the warning and the listing (if applicable) will be returned to the address you provided when registering with the LIS.  (Note that this may not be the address from which your e-mail was sent!)

2.      Trash can

If the job listing is exceptionally long, the job is stopped and is automatically discarded.  LISSY will immediately inform the user accordingly through an e-mail sent to the address provided when registering with the LIS.

C.    LIS error messages

In some cases, the job is never run by LISSY because the header contains errors.  If this happens, you can expect to receive one of three error messages:

1.      Wrong package or project

If the user specifies a non-existent package or project in the first 4 lines (header) of the job sent to LISSY, the job cannot be executed.  In this case, you will receive a message explaining where you went wrong.  The error message will be returned to the address that you provided when you registered with LIS (which may not be the address from which you sent the e-mail!).

2.      Wrong user id or password

If the user specifies an invalid user id or password, LISSY will not recognise you as a registered LIS user and your job will not be executed.  Please note that, in this case, the error message will be sent back to the e-mail address from which the job was sent (and NOT the one that you provided when you registered!)

3.      Wrong header

If LISSY cannot identify the header in the first 4 lines of text, you will not be recognised as a registered LIS user.  LISSY will automatically and immediately inform you accordingly by sending a message to the e‑mail address from which the job was sent (and NOT the one that you provided when you registered!)  Please note that this can happen if:

·      The first 4 lines of your e-mail are omitted, mistyped, or misplaced;

·      A non-ASCII or rich text character is included in the text of your e-mail message; or

·      You forget to include the asterisk (*) at the beginning of each line of the header.

 For more information, see Important Remarks about e‑mail editors (above).
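The three causes of a WRONG HEADER message listed above can be tested locally.  The following Python sketch is a guess at a reasonable pre-check, not a description of how LISSY parses the header: the function name header_problems is hypothetical, and the assumptions that the four keys appear in the documented order, that keys are compared case-insensitively, and that the asterisk must be the very first character are mine.

```python
from typing import List

EXPECTED_KEYS = ["user", "password", "package", "project"]


def header_problems(job_text: str) -> List[str]:
    """List likely reasons why the first 4 lines would not be read as a header."""
    lines = job_text.splitlines()
    if len(lines) < 4:
        return ["fewer than 4 lines"]
    problems = []
    for expected, line in zip(EXPECTED_KEYS, lines[:4]):
        if not line.startswith("*"):
            problems.append(f"{expected}: missing leading asterisk")
            continue
        key, sep, value = line[1:].partition("=")
        if sep != "=" or key.strip().lower() != expected:
            problems.append(f"{expected}: expected '* {expected} = ...'")
        elif not value.strip():
            problems.append(f"{expected}: empty value")
    if not job_text.isascii():
        problems.append("non-ASCII character in message")
    return problems
```

Passing this check does not guarantee that the user name, password, package or project are valid; those are verified only by LISSY.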

IV.    RECOMMENDATIONS AND ADVICE

A.    Use Weights

We strongly recommend that you use the weighting variables when performing any statistical operation using the LIS/LWS/LES databases.

The variables HWEIGHT (in the LIS household file), PWEIGHT (in the LIS person file), WGT (in the LWS household file) and TI04 (in LES) contain the appropriate weights for each data sample.  This weight adjusts the sample to correct for the bias that results from non‑respondents and, thus, ensures that the sample is representative of the population.  (In some countries, the weight inflates the sample to the total population of the country in the given year.)  Not using weights will affect your results; they will be biased towards the answers provided by those who respond to the survey (or the oversampled individuals).

Please see the LIS Variable Definition page for more information on the weights.

Please see the detailed documentation by country for country-specific weights.
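To see why unweighted results can be biased, consider a toy calculation of a weighted mean, i.e. the sum of weight times value divided by the sum of the weights.  The Python sketch below uses made-up incomes and weights, not LIS data, and the function name weighted_mean is hypothetical; in an actual job the weighting is done with the weight options of SPSS, SAS or Stata.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up example: three households; the low-income one represents more
# households in the population than the high-income one.
incomes = [10000, 20000, 60000]
weights = [3.0, 2.0, 1.0]

unweighted = sum(incomes) / len(incomes)    # 30000.0
weighted = weighted_mean(incomes, weights)  # 130000 / 6, about 21666.67
```

In this toy case the unweighted mean overstates average income, because the high-income household counts as much as the low-income one even though it stands for fewer households.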

B.     Use the D5 Filter (LIS only)

We strongly recommend that, if you are trying to access a dataset before Wave V, you should include the D5 filter variable when performing any statistical operation.

The D5 variable (family/unit structure) is a filter variable that identifies the reporting unit (household structure) in LIS.  (This is not a demographic variable.)  In most cases (and in all data from Wave V forward), D5 takes the value 5 ("household unit").  This indicates that the basic unit for the survey is the household, regardless of whether it is a single-person household or a multi-generational unit.  For previous waves, however, the reporting unit may be more complex, and multi-family households as well as single families within multi-family households are both included as different observations.  In these cases, the user should take care to identify only one of the units within the overall reporting structure in order to avoid double counting.

Please see the LIS Variable Definition page for more information on D5.

C.    Check the Documentation

It is strongly recommended that you carefully check the on‑line documentation for each data set you wish to use.

In spite of the LIS harmonisation, not all datasets and variables are exactly the same across countries and years.  For example, not all variables are available in each and every dataset.  If they are available, they may include slightly different contents or have different value codes.  Differences between datasets may arise in the following cases (among others):

·   Some datasets have individual information only for the head and spouse.

·       Some datasets only have information on a limited number of persons.

·       Some datasets have no information on children.

·       Some datasets have been revised since their original release.

·      All income amounts are in national currencies (sometimes in hundreds or thousands).

·       Some datasets have income variables available only in net amounts (net of taxes and contributions).

·        The contents of the weights can differ across datasets.

Carefully check the on-line documentation for each country.

D.    Provide an E‑mail Subject Line

It is advisable to give your e‑mail a subject line.  This way, you can easily distinguish which LISSY response belongs to the program you sent.

E.     Keep a Copy of Your Program

LIS strongly advises that you keep a copy of all important code (jobs) that you send to LIS.  There is no way to receive a copy of your program from LIS.

V.      INCLUDE FILE LIBRARY

LIS provides pre‑written (“include”) programs for SPSS, SAS, and Stata that allow the user to easily perform repeated functions.  For example, LIS provides an include file in all languages that converts the completed education level of individuals into comparable “low”, “medium”, and “high” categories.

A.    SPSS

The include files for SPSS are located in drive "I:\" with the extension .sps.  The syntax used to call an include file in SPSS is:

include file 'I:\filename.sps'.

To look at the contents of the SPSS include files, please refer to http://www.lisproject.org/dataccess/spssinclude.htm.

B.     SAS

The include files for SAS are located in "I:\" with the extension .sas. The syntax used to call an include file in SAS is:

%include "I:\filename.sas";

The only include file available for SAS users is educrecode.sas. To look at the contents of this file, please refer to http://www.lisproject.org/dataccess/sasinclude.htm.  Definitions of equivalence scales using SAS syntax are also available here.

C.    Stata

The syntax used to call an include file in Stata is:

do $myincl\filename.do;

The only include file available for Stata users is educrecode.do. To look at the contents of this file, please refer to http://www.lisproject.org/dataccess/statainclude.htm.

VI.    ADDRESSES FOR FURTHER INFORMATION

If, after reading this document, you have any problems using LIS/LWS/LES data, please contact our user support at usersupport@lisproject.org.

Please note that we assume that you have a basic knowledge of batch coding in the statistical package you are using.  We are not a help desk for SPSS, SAS, or Stata, although we are available to offer advice concerning LIS-specific matters.  For those users who may be unfamiliar with all three packages, we advise that you complete the self-teaching package for the statistical package you choose.  These exercises are available through LISSY and also interactively (if you have the capability of using one of these packages on your PC) using the U.S. sample file that is downloadable from the LIS web site (http://www.lisproject.org/dataccess.htm).

 

Copyright (c) 2000 Luxembourg Income Study all rights reserved
Send mail to Caroline de Tombeur

File current January 28, 2008