LIS Revision Notes
Taiwan 1997 & Taiwan 2000 (Wave IV & Wave V.2), June 2008
Mexico 2002 (Wave V.2), April 2008
Sweden 2000 (Wave V.2), April 2008
Germany 2000 (Wave V.2), March 2008
Finland 2000 (Wave V.2), January 2008
Spain 2000 (Wave V.2), December 2007
Estonia 2000 (Wave V.2), December 2007
Hungary 1999 (Wave V.2), November 2007
Italy 1998/2000 (Wave V.2), October 2007
Belgium 2000 (Wave V.2), October 2007
Switzerland 2000/2002 (Wave V.2), September 2007
Wave 5, release 2, February 2007
Belgium 2000, October 2005
United States 1991, June 2005
United Kingdom 1999, March 2005
Mexico 1984 - 1998, January 2005
Norway 2000, June 2004
Germany 1984, May 2004
Germany 1989, May 2004
Germany 1994, May 2004
Germany 2000, May 2004
Italy 1995, November 2003
Italy 1991, November 2003
Finland 2000, January 2003
Finland 1995, January 2003
Finland 1995, January 2003
Belgium 1997, July 2002
Belgium 1997, November 2001
Finland 1995, September 2001
Canada 1971 - 1975, March 2001
Germany 1973 - 1978, March 2001
United Kingdom 1969 - 1974, March 2001
Israel 1992, November 2000
United States 1997, May 2000
Russia 1995
Netherlands 1994, January 2000
Poland 1992, December 1999
Australia 1994, November 1999
Sweden 1995, October 1999
Australia 1989, May 1998
Finland 1987 - 1991, May 1998
Spain 1980, February 1997
Spain 1990, February 1997
Hungary 1994, January 1997
France 1989, December 1996
France 1984 (Budget), December 1996
Revision notes – Taiwan 1997 & 2000
(June 2008)
TW97 - Revision notes
Following the discovery of a major error in the construction of V13, and hence in the derivation of DPI (employer and government contributions, as well as employee ones, were subtracted from GI, whereas only the employee contributions had been included in GI in the first place, implying a substantial underestimation of DPI), the dataset was reopened and rerun with the new Wave V.2 rules (see Wave V.2 revisions and lissification table for details).
TW00 - Revision notes
LM variables
Following the data provider's LFS-based definition, adults have been defined as individuals aged 15 or over (this has implications for the universe of all LM variables). Furthermore:
PUMAS: The detail about employment status was removed (it is in PACTIV), while detail on full- vs. part-time status was added (it is not clear whether it refers to main job or total employment); information about whether the not employed are marginally employed was also added.
PACTIV: Detail on socio-economic group was added.
PIND: The mapping to NACE was troublesome in 3 cases: in 2 cases the original code spans 2 major NACE groups (codes 65 and 81), and in one case it only includes a subset of a major group (code 71). Those cases were coded with a 20000 flag (see LM table).
PTYPEWK: Codes 201, 202 and 208 were recoded into 211 (Publ; general government), 221 (Publ; public enterprises) and 281 (Publ; soldiers), respectively.
PFULPAR: The labels were changed to Full-time and Part-time.
PSLOT1: The variable, previously empty, was filled with the number of weeks covered by social insurance. Please note that this is positive for the largest part of the employed persons (including the marginally employed) and zero for the largest part of the not employed. Selecting employed persons only (both mainly employed and marginally employed), this can be used as an approximation of number of weeks worked (even though it may include periods of unemployment).
Income variables
An error was uncovered in variable V13: the variable was substantially overestimated, and the final DPI was therefore underestimated. V13 previously contained all insurance contributions: those paid by the employees (20% of the total), but also those paid by the employers (70%) and the government (10%). The last two amounts, however, were not included in the gross income figures, so subtracting V13 from GI produced a substantially underestimated DPI. Those amounts have now been removed from V13, which now contains only the employees' contributions.
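The effect of the V13 correction can be illustrated with a small worked example. This is a minimal sketch rather than the actual LIS routine: the household figures are hypothetical, DPI is simplified to GI minus V13 (income taxes are ignored), and the function names are assumptions. Only the 20/70/10 split comes from the note above.

```python
# Shares of total insurance contributions by payer, in percent (from the note above).
SHARES = {"employee": 20, "employer": 70, "government": 10}


def split_contributions(total):
    """Split a total contribution amount by payer (integer arithmetic)."""
    return {payer: total * pct // 100 for payer, pct in SHARES.items()}


def dpi(gross_income, v13):
    """Simplified DPI: gross income minus contributions (taxes ignored here)."""
    return gross_income - v13


# Hypothetical household, not taken from the Taiwanese data.
gross_income = 1_000_000
parts = split_contributions(100_000)

v13_before = sum(parts.values())   # all three payers, as before the revision
v13_after = parts["employee"]      # employee contributions only, as after it

dpi_before = dpi(gross_income, v13_before)
dpi_after = dpi(gross_income, v13_after)
underestimation = dpi_after - dpi_before

print(dpi_before, dpi_after, underestimation)
```

In this illustration the underestimation equals the employer and government contributions, which had been subtracted from GI without ever being part of it.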
The contents of V25SR were moved to variable V24SR (as it turned out that households were receiving those benefits irrespective of the income quintile to which they belonged).
Retirement pay (a mandatory occupational pension) was moved from V32SR to V32S1a.
Variable V35X, previously empty, was filled with transfers paid to private individuals (excl. charity).
Expenditure variables
Some corrections were carried out to include a few items that had previously been completely left out:
household operations (COICOP 5.6) were added to EQUIPEXP;
medical consumption of NHI (COICOP 6) was added to MEDEXP;
personal accident and medical premiums (COICOP 12.5.3) were added to MISCEXP.
Other corrections concerned the move of misplaced items:
household premiums (COICOP 12.5.2) were moved from HOUSEXP to MISCEXP;
motor vehicle premiums (COICOP 12.5.4) were moved from TRANEXP to MISCEXP.
Revision notes – Mexico 2002
(April 2008)
Socio-Demographic Variables
PMART The universe was redefined to be all adults (excluding all children, but including adult guests, domestic servants and absent heads).
PETHNAT The variable, previously not filled, was filled with information on literacy in Spanish or other languages for all individuals aged 5 or above, with the following labels:
1 can read and write in Spanish
2 cannot read and write in Spanish, but can in another language/dialect
3 cannot read and write in any language
PEDUC / PTOCC Both general education (previously coded in PEDUC) and technical and commercial education (previously coded in PTOCC) were aggregated in PEDUC, using the same structure of education levels used for MX04.
PENROL Codes (and labels) were modified as follows:
111 enrolled in general education, public or official school
112 enrolled in general education, private school
113 enrolled in general education, other
120 enrolled in technical education
200 not enrolled in education
PDISABL Because of the extreme bias associated with the only available information about disability status (it only identifies disabled persons who do not receive any pension at all - not even a disability one - and who do not work at all - not even unpaid family work or occasional work), the variable was emptied (plus the same information can be derived from PCLFS).
D22 The universe was enlarged to include all households (also the additional ones living in same dwelling as principal household, coded with a 0).
Labour Market Variables
PCLFS Code 399 (NILF; n.e.c.) was recoded into 381 and relabeled "NILF; other situation".
PCMAS Code 299 (Not Empl; n.e.c.) was recoded into 281 and relabeled "Not Empl other situation".
PACTIV Armed Forces were separated out (with code 181).
PIND The use of a SCIAN-ISICS conversion table allowed better identification of some ISIC codes.
PFULPAR The universe was reduced to employed persons (i.e. all the missings were recoded into -1).
PCONTRA The codes were changed to comply with the semi-standard ones:
110 base, or indefinite contract
120 fixed-term contract
200 no written contract
PSECJOB Detail about status in second employment was added in 2nd and 3rd digits.
PHOURSU The variable was topcoded at 97.
Expenditure Variables
All expenditure variables were revisited to better comply with the COICOP major groups. More specifically:
- HOUSEXP: imputed rent was added, while mortgage payment was removed (and put into MORTEXP), as well as property tax (which was put in V12);
- photographic and video equipment was moved from EQUIPEXP, and audio-visual articles and equipment from COMMEXP, both into CULTEXP;
- major maintenance works were removed from EQUIPEXP as it is not a current expenditure;
- all expenditures which are not strictly related to (public) education were removed from EDUCEXP and put in the appropriate variables (CULTEXP, EQUIPEXP, RESTOEXP, TRANEXP and MISCEXP);
- contributions towards (private) health insurance were moved from MEDEXP to MISCEXP;
- packages for parties, including catering services, touristic expenditures, including hotels, and non-touristic hotels and accommodation were moved from MISCEXP to RESTOEXP;
- a few items of non-consumption expenditure (such as indemnities paid to others, cash losses and burglaries, transfers to other households, charitable donations and expenditures for gifts) were removed from MISCEXP;
- variable TOTEXP was then modified to take into account expenditures which were removed or added in the above-mentioned variables.
Income Variables
Some income sources were moved across LIS variables:
- profits from cooperatives were moved from V1NET (and PNWTIME) to V5 (as members of cooperatives are considered as self-employed persons by ILO, as well as in PACTIV);
- pensions are no longer split in between V18S1 and V19SR, as the way the split was done was not efficient;
- donations and scholarships from the government were moved from V24S2 to V25SR (as they seem to include mostly donations rather than scholarships);
- benefits from Progresa-Oportunidades were moved from V24SR to V25S4 (as these are benefits towards families with children in extreme poverty).
Some incomes were removed and others added, none of those affecting DPI though:
- to be consistent with the expenditure side, all non current non-monetary expenditures were removed from the non cash incomes (V6, V9 and ALTNCASH);
- an error which caused the imputed rent to record both the amount one would charge for the dwelling and the amount one would pay for it was uncovered and corrected (the latter was used);
- variable V12, previously empty, was filled with the amount of the property taxes paid;
- variable V35X, previously empty, was filled with transfers paid to relatives and other persons outside the household;
- variable V37S1 (previously including incomes from withdrawals or disinvestments) was emptied as those do not represent capital gains;
- all the incomes from sales, credits or loans were removed from V37SR.
Revision notes – Sweden 2000
(April 2008)
File information variables
HSLOT1 The flag for interviewed versus not interviewed households was replaced by the weight recalibrated for the smaller sample of the interviewed households (the non-interviewed ones being set to missing).
Socio-demographic variables
D7 The codes were reorganised to facilitate conversion into NUTS 3 classification.
PETHNAT The variable was further refined (which allowed some missings to be corrected into valid values).
PIMMIGR Sweden-born individuals were better identified and coded with 0.
PEDUC More detail was added in the second digit:
0 pre-primary level of education
10 primary level of education
20 lower secondary level of education
31 upper secondary level of education; less than two years
32 upper secondary level of education; two years
33 upper secondary level of education; three years
41 post-secondary non-tertiary
52 first stage of tertiary; two years
53 first stage of tertiary; three years
54 first stage of tertiary; four years
55 first stage of tertiary; at least five years
60 second stage of tertiary; other research education
62 second stage of tertiary; licenciat
64 second stage of tertiary; doctoral
PDISABL Detail about level of invalidity pension was added.
Income variables
Some benefits were moved across LIS variables:
- Work injury annuities from the collective insurance were moved from V17S2 to V32S1a;
- The compensation for parents taking care of handicapped children was moved from V20SR to V22SR.
V37S1 Capital losses were subtracted from capital gains.
Labour market variables
PSLOT2 The universe was enlarged to include all adults, and information on three main activity degrees was used for those where the actual hours worked were not available (coded with values above 10000 to distinguish them from the individuals for whom the exact working hours were available).
PACTIV Codes 210, 220 and 290 were changed to 291, 292 and 299, respectively.
PIND The universe was enlarged to include all persons included into the employers' register (including children).
PSKILL Code 19 (skills level unknown) was set to missing.
Revision notes – Germany 2000
(March 2008)
PGWTIME / PNWTIME
An error in variables PNWTIME and PGWTIME (all the amounts had been incorrectly multiplied by 1.95583) was corrected.
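As a hedged illustration of what this error means for a stored amount: the factor 1.95583 matches the fixed Deutsche Mark per euro conversion rate, so an affected value was inflated by that factor. The sketch below simply divides it back out. The function name, the example amount and the two-decimal rounding are assumptions, not the procedure actually applied to the dataset.

```python
from decimal import Decimal, ROUND_HALF_UP

FACTOR = Decimal("1.95583")  # erroneous multiplier quoted in the note above


def undo_multiplication(amount):
    """Divide an inflated amount by the factor and round to two decimals."""
    corrected = Decimal(str(amount)) / FACTOR
    return corrected.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


inflated = Decimal("1955.83")  # hypothetical stored value, for illustration only
corrected = undo_multiplication(inflated)
print(corrected)
```

Decimal arithmetic is used here so that the division does not introduce binary floating-point noise into monetary amounts.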
Revision notes – Finland
(January 2008)
D7
The codes were reorganised to facilitate conversion into NUTS 3 classification.
PEDUC
Detail on degree level of first stage tertiary education (ISCED 5A) was added (previous code 52 was further disaggregated into 52 and 53).
PDISABL
Detail on incapacity to work was added on top of percentage of official invalidity; the labels are now as follows:
0 not invalid
1001-1100 invalid (percentage of invalidity in digits 2-4) but not incapable of work
2001-2100 invalid (percentage of invalidity in digits 2-4) and incapable of work
9000 not invalid but incapable of work
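The PDISABL coding above packs two pieces of information into one number. A minimal sketch of how a user might unpack it follows; the function name and the tuple layout are assumptions, and any code outside the four listed ranges is treated as unexpected.

```python
def decode_pdisabl(code):
    """Return (invalid, incapable_of_work, percentage) for a PDISABL code.

    The percentage is None when the person is not officially invalid.
    """
    if code == 0:
        return (False, False, None)
    if 1001 <= code <= 1100:
        # Invalid but not incapable of work; digits 2-4 hold the percentage.
        return (True, False, code - 1000)
    if 2001 <= code <= 2100:
        # Invalid and incapable of work; digits 2-4 hold the percentage.
        return (True, True, code - 2000)
    if code == 9000:
        return (False, True, None)
    raise ValueError(f"unexpected PDISABL code: {code}")


print(decode_pdisabl(1050))
print(decode_pdisabl(2100))
```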
POCC
An error in the code for Armed Forces was corrected (previously wrongly coded as 1000, now as 100, hence complying with the ISCO classification).
PCARE / PWEEKFT
The 1000 categories were used to flag partial information on weeks of care / FT work.
V1 / V6
Options given by the employer to the employee as means of payment were moved from V1 to V6 (hence affecting DPI).
V1 / V4 / V8S4
A few minor errors were uncovered and corrected, all slightly influencing the final DPI figure.
Revision notes – Spain
(December 2007)
Socio-demographic variables
PPNUM
The 50 codes were added for persons with unknown age.
PMART
The universe was restricted to adults only (so that the imputation of children to never married was undone).
PPARSTA
2 children wrongly coded as single other persons (code 14) and 2 adults coded as missing (all 4 because of missing age), were recoded into the right categories.
PIMMIGR
The universe was polished: several adults coded as -1 were recoded into values in the universe (either valid values or missing).
Further detail was added for other modern languages; more specifically, the following codes (previously all lumped in 90) were added:
21 Albanian
22 Bulgarian
23 Czech
24 Slovak
25 Hungarian
26 Icelandic
27 Norwegian
28 Polish
29 Romanian
30 Serb
31 Serb-Croatian
32 Turkish
Labour market variables
Roster information on whether a person works at least 15 hours a week was used for individuals who did not complete the personal interview; this allowed a considerable reduction of the number of missing values for variables PCLFS, PCMAS and PHOURSU, and the inclusion in the universe of non interviewed adults working at least 15 hours (even though they are most often coded with missing) for most job characteristics variables (all except PSUPERV and PCONTRA).
Furthermore, some codes were slightly changed (either to allow for further detail or for better comparability with other LIS datasets), and the new guidelines concerning time variables (see Section D of the Guidelines of Labour Market Variables) were applied to PHOURSU, PWEEKTL and PWEEKUP.
More specifically:
PCLFS
Thanks to the use of a roster variable previously ignored, the status of not interviewed individuals (previously always coded as missing) was identified as in employment (with code 191) for those who normally work at least 15 hours, the others being kept as missing (as no information on work less than 15 hours was available for them).
Also, some codes for the inactives and not employed have been slightly modified.
Code 348 (NILF; main activity is ill/disabled) was recoded as 341, while code 341 (NILF; not seeking work because of illness/injury) was recoded as 342.
A new code 351 (NILF; main activity is rentier) has been identified (previously in 391).
A new code 412 (Not emp; waiting for work season) has been identified (previously in 431).
PCMAS
Thanks to the use of a roster variable previously ignored, the status of not interviewed individuals (previously always coded as missing) was identified as mainly in employment (with code 191) for those who normally work at least 15 hours, the others being set to either 999 (Indist; not working 15+ hours, no other info) or kept as missing, depending on whether it is known if they do not work at least 15 hours, or whether this info was completely missing.
The unemployed (previously all coded 270) were further split in three categories according to whether they are self-defined unemployed (271), seeking work (272) or waiting for new job to start (273).
Code 210 (Not Emp; retired) was recoded as 211.
Code 248 (NILF; main activity is ill/disabled) was recoded as 241, while code 241 (NILF; not seeking work because of illness/injury) was recoded as 242.
A new code 251 (NILF; main activity is rentier) has been identified (previously in 298).
Code 298 (Not Emp; engaged in other activity) was recoded as 289.
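Because code 248 becomes 241 while the old 241 becomes 242, users who need to translate old PCMAS codes should apply the recodes simultaneously rather than one after the other. The following is a minimal sketch under that assumption; the function names are hypothetical, and the splits that require extra source information (270 into 271-273, and the new code 251) are deliberately not modelled.

```python
# Old code -> new code, as listed for PCMAS above.
PCMAS_RECODES = {210: 211, 248: 241, 241: 242, 298: 289}


def recode(values, mapping):
    """Recode every value in one pass; unmapped codes are kept unchanged."""
    return [mapping.get(v, v) for v in values]


def recode_sequentially(values, mapping):
    """Naive variant: apply the rules one after the other (order-dependent)."""
    result = list(values)
    for old, new in mapping.items():
        result = [new if v == old else v for v in result]
    return result


old_codes = [210, 248, 241, 298, 191]
simultaneous = recode(old_codes, PCMAS_RECODES)
sequential = recode_sequentially(old_codes, PCMAS_RECODES)
print(simultaneous)
print(sequential)
```

The sequential variant pushes the former 248 cases through both rules and merges them with the former 241 cases, which is why the single-pass lookup is the safer choice.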
PUMAS
An error which caused persons who were either mainly working less than 15 hours per week or inactives to become all mainly employed (old code 115), was uncovered and 2 new indistinguishable codes were created accordingly (902 for persons with more than 6 months in such uncertain situation, and 903 for persons with not more than 6 months in employment, not employment, and such uncertain situation).
PACTIV, POCC, PIND, PTYPEWK, PNEMP, PSECJOB
Thanks to the use of a roster variable previously ignored, the not interviewed individuals who normally work at least 15 hours were identifiable and thus included in the universe (with missing values), rather than left out of the universe (with -1).
PSKILL
The variable, previously not filled, was filled with information on formal training or education that has given skills needed for present type of employment.
PFULPAR
Thanks to the use of a roster variable previously ignored, the not interviewed individuals who normally work at least 15 hours were identifiable and thus included in the universe (with a specific code 902), rather than left out of the universe (with -1).
PCONTRA
For better comparability across LIS datasets (see BE00), codes 1, 2, 3 and 4 were recoded as 100, 200, 300 and 400.
PTENURE
The variable, previously non-existing, was filled with information on tenure (limited as it may be in this dataset, because information from previous waves is not available).
PHOURSU, PWEEKTL, PWEEKUP
The 1000 categories have been added to flag incomplete information on hours (hence avoiding the loss of partial information and reducing the number of missing values).
In case two or more original variables are added together, the sum is topcoded to the value of the original topcode of the data provider (96 for hours and - implicitly - 52 for weeks).
PWEXPTL
Further detail on work experience of current workers has been added, and all the codes have changed accordingly.
PSLOT1
The variable, previously filled with the number of weekly hours caring for children or other persons, has been replaced by the age at completion of the highest educational level achieved.
PSEARCH
Detail on reason why not looking was added (so that code 200 was split in codes 201 to 209).
PCARE
Detail on whether care affects paid work was added, while information on whether the other person lives with the respondent was suppressed (so that the 100s codes changed accordingly).
PSLOT2
The variable, previously not filled, was filled with information on weekly hours caring for children or other persons (please note that similar information - with a slightly different universe - used to be in variable PSLOT1 before the revision).
Income variables
V32S1b / V32S1r / V32SR
Old-age occupational pensions and survivors occupational pensions were moved respectively from V32S1r and V32SR to V32S1b (in Spain only voluntary occupational pensions).
Revision notes – Estonia
(December 2007)
HSLOT1
The base weight for the total sample of households who took part in the interview and fully completed the Household Picture has been put here, so that users interested in using demographic and/or LM variables only can use all available observations (by appending the regular and shadow file) and use the correct weight (the weight included in HWEIGHT is recalibrated for the subset of households who also completed the Income and Expenditure Diaries).
Socio-demographic variables
D6
The variable, previously always missing because of the unavailability of individual level incomes, has now been filled with the number of persons who are currently employed.
D7
The codes were reorganised to facilitate conversion into NUTS 3 classification.
PAGE
Following the data provider definition, age was redefined as age as of 31 December 2000.
PMART
The universe was enlarged to include everyone, with the not married (which include never married, widowed, divorced, etc.) being given code 3.
Also, detail about cohabitation has been suppressed (it can now be found in PPARSTA).
PETHNAT
Persons with undetermined citizenship (previously coded as missing) were recoded with a specific code (9 undetermined), since these are persons who have an "aliens passport" or who do not have any official document proving their citizenship (most of them define themselves as Russians in variable PIMMIGR).
PIMMIGR
The variable, previously empty, has been filled with ethnic nationality (nationality which the person feels to be the most closely connected to ethnically and culturally).
PEDUC
More detail about national education levels has been added (on top of ISCED).
PENROL
The universe was enlarged to include all individuals aged 4 or over (with a new code for those not enrolled in education). Also, the variable has been reorganised to match with the ISCED level.
PDISABL
The variable was completely changed, and now contains a self-assessment of health status.
Income and expenditure variables
A better effort was made to distinguish zero and missing income and expenditure variables in the shadow file (by construction of the sample, in the regular file there are no missings).
TOTEXP
The variable was restricted to include only consumption expenditure.
MORTEXP
The variable, previously empty, was filled with the amount of private loan repayments.
V8S1 / V8SR
Income from property was moved from V8S1 to V8SR.
V9
The variable, previously empty, was filled with the imputed rent.
V19S1r / V19SR
Pensions were moved from V19S1r to V19SR.
V20S1 / V20SR
Child benefits were moved from V20S1 to V20SR.
V21S1 / V21SR
Unemployment benefits were moved from V21S1 to V21SR.
V25S1 / V25SR
Social assistance was moved from V25S1 to V25SR.
V37S1
The variable, previously containing income from sales of stocks, was emptied.
Revision notes – Hungary 1999
(November 2007)
Socio-demographic variables
D7
The regional classification was recoded according to NUTS (3 digit) classification (the same 20 regions can still be identified, but the codes follow the NUTS).
D20
Detail on type of settlement (village / city / county seat) was added.
D22
Detail on landlord (local government, private property, employer or other) was added for tenants and lodgers.
V10
The value of the residence was added for tenants as well. Also, the value of worth was used in case selling price was missing.
PWEIGHT
The cross-sectional individual weight was used instead of the household level weight already existing in HWEIGHT (note that it is only available for adults who completed the questionnaire - set to 0 for the others).
PMART
The codes were reorganised to follow the partial standardisation of this variable (see LIS Variables Definition List), mostly implying the non consideration of the cohabiting status (included in PPARSTA) and the reordering of other codes. Also, the 4 cases of unknown marital status (which had previously been imputed), were set back to missing.
PCOHABI
Married but separated persons not living with their partner were wrongly coded as having a partner; this has been corrected.
PETHNAT
New variable: the variable was filled with indication of Gypsy descent.
PEDUC
The universe was enlarged to include everybody and code 0 was added for children of primary school age or not of school age yet; this allowed the inclusion of information on "children" who already completed primary school or who already dropped out of primary school, plus information on adults who are still in primary school.
PENROL
The universe was enlarged to include everybody and the coding changed accordingly.
PDISABL
The contents changed to health state as self-assessed by respondent (both in absolute terms and relative to people of the same age).
Labour Market variables
All labour market variables were extensively revised in order to better comply with the new guidelines for Wave V.2 LM variables (both in terms of missing values policy and contents and constructions of the variables themselves). Furthermore four additional variables were added. More specifically:
1) Polishing of the universe
In the original data, missing values and not applicables were indistinguishable. In an attempt to better define the universe of the LIS variables (and hence clearly distinguish between those in the universe but missing (dot), and those not in the universe (-1)), for all variables greater attention was paid to the definition of the universe and subsequent coding of missing versus -1. This implied a mere polishing of the universe for all previously existing variables (where some observations were recoded from -1 to missing - hardly ever the opposite operation was needed).
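The distinction between "in the universe but missing" and "not in the universe" can be sketched as follows. This is only an illustration under stated assumptions: the universe rule (employed persons, for an hours variable) and the names are hypothetical, and Python's None stands in for the dot missing value.

```python
NOT_IN_UNIVERSE = -1
MISSING = None  # stands in for the "dot" missing value


def code_variable(in_universe, raw_value):
    """Code one observation given an explicit universe rule.

    The raw data cannot tell missing from not applicable, so the
    universe membership has to come from other information.
    """
    if not in_universe:
        return NOT_IN_UNIVERSE
    return MISSING if raw_value is None else raw_value


# Hypothetical persons for an hours variable whose universe is the employed.
persons = [
    {"employed": True, "hours": 40},
    {"employed": True, "hours": None},   # in universe, value unknown
    {"employed": False, "hours": None},  # not in universe
]
coded = [code_variable(p["employed"], p["hours"]) for p in persons]
print(coded)
```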
2) Modified variables
PCLFS
The universe was enlarged to include all individuals, and the code was substantially modified.
POCC
The codes for which the data provider had not given any ISCO correspondence (and which had remained unlabeled) were labeled according to information supplied in the FEOR-93 Hungarian Standard Classification of Occupations from the Hungarian Central Statistical Office document on Hungarian Population Census 2001.
PIND
The codes were modified in order to get the best approximation possible of the 5-digit LIS/NACE code; original labels were retained and categories not easily matching the NACE code were assigned values above 20000.
PTYPEWK
Extensive detail on the organisational structure of the company was added.
PSKILL
The codes were reordered in order to get decreasing skill levels.
PSECJOB
Information about the number and type of jobs in addition to main job or pension was added.
PHOURSU / PHOURSA
The variables were modified in order to take better account of the difference between missing and zero hours.
The 1000 categories have been added to flag for incomplete information on hours.
PWEEKUP
The category 1000 was added to flag for incomplete information.
PSEARCH
The codes were modified to add information on whether currently employed or not (as only not employed were asked about the reason why not looking).
3) New variables
PCMAS
The variable, previously non-existing, was filled with the main activity status as derived from hours worked in February 2000.
PACTIV
The variable, previously non existing, was filled with the status in employment.
PFULPAR
The variable, previously non existing, was filled with the number of hours worked in main job in February.
PSUPERV
The variable, previously non-existing, was filled with the number of employees or staff supervised by the worker.
Expenditure variables
In all expenditure variables, missing values, which had previously been overwritten by zeros, were put back to missing.
In addition:
HOUSEXP
A major double count was corrected.
EQUIPEXP - CULTEXP - EDUCEXP - MISCEXP
A few changes were made to better comply with COICOP:
- home-cleaning, babysitting, and washing powder and cleaning fluids were moved from MISCEXP to EQUIPEXP (as they correspond to COICOP category 5.6 - Goods and services for routine household maintenance);
- electronic durables, previously not considered, were put in CULTEXP (as they mostly refer to COICOP categories 9.1 (Audio-visual, photographic and information processing equipment) and 9.2 (Other major durables for recreation and culture), since the household electronic durables are most likely already included together with flat equipment articles in EQUIPEXP);
- culture, education and tutoring was moved from CULTEXP to EDUCEXP as it most likely includes educational expenditures for the larger part.
MORTEXP
New variable: the total amount of the monthly rate of payment (annualised) for all loans towards banks was put here (as this most likely corresponds to mortgage payments for the larger part).
Income variables
The treatment of all income variables changed in this revision, both in terms of repartition of total income among the different LIS variables, and in terms of amount of total household income.
In the previous version of the LIS dataset, the main principle underlying the lissification was that of getting to a measure of DPI almost identical (with the exception of the few income sources which are not considered as belonging to DPI in the LIS definition) to the total household income as estimated by the data provider. It has since been recognised that this approach entailed several problems in terms of comparability to other LIS datasets, among which: possible underestimation of some major incomes; impossibility to break down total household income by income source for a quarter of the dataset (all cases for which total income was previously put in V36); and a break of the LIS internal rule that individual level incomes should always sum up to the household level ones.
As a result, the following rules were adopted for the treatment of income variables in this current version of the LIS dataset.
1. original individual level variables were used for all incomes which were originally collected at the individual level;
2. original variables at the highest detail level (rather than the aggregated ones) were always used, so that better attention could be paid to the distinction between missing and zero incomes;
3. following the methodology used by the data provider, total current income (self-reported by the respondent if available, otherwise imputed by the data provider), annualised and corrected for inflation, was used to estimate missing incomes. This covers both the cases where all incomes are missing (because the questionnaire was not completed, the income questions were not answered, or all answers were zero) and the cases where only a subset of them is missing (in which case the difference between the total current income and whatever was reported at the detailed level is used). The total income (or the difference between total and reported) was then put in the appropriate variable if possible (i.e. in V1NET for persons whose wage is missing, in V8S1/V19S1b/V19S4 for persons whose disability/old-age/widow's pension is missing), otherwise in V36.
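The third rule can be sketched in a few lines. This is an illustration under simplifying assumptions, not the actual lissification code: the function name is hypothetical, the residual is allocated to a specific variable only when exactly one income source is missing, and everything else goes to V36. Annualisation and inflation correction are assumed to have been applied already.

```python
def allocate_residual(total_income, reported, fallback="V36"):
    """Fill a missing income source from total current income.

    `reported` maps LIS variable names (excluding the fallback variable)
    to amounts, with None marking a missing income.
    """
    known = sum(a for a in reported.values() if a is not None)
    residual = total_income - known
    missing = [name for name, a in reported.items() if a is None]

    filled = dict(reported)
    filled.setdefault(fallback, 0)
    if not missing:
        return filled
    if len(missing) == 1:
        # Assumption: a single missing source is identifiable, so it gets the residual.
        filled[missing[0]] = residual
    else:
        # Otherwise the residual cannot be attributed and goes to the fallback.
        filled[fallback] += residual
    return filled


# Hypothetical amounts, for illustration only.
one_missing = allocate_residual(1200, {"V1NET": None, "V20SR": 200})
two_missing = allocate_residual(1200, {"V1NET": None, "V19S1b": None, "V20SR": 200})
print(one_missing)
print(two_missing)
```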
Furthermore, several income sources were moved across LIS variables in order to better comply with the LIS Variables Definitions, and a slightly different method for interest income imputation was adopted. More specifically:
V1NET / V4 / V5
The few self-employment incomes which were separable and had previously been put in V4 (income from sale of farm animals raised by the household and of products from horticultural/viticultural and agricultural work carried out by the household) and V5 (incomes from sales, commerce, selling and buying, commissions, dealing, organising and agency, and the profit/share or dividend from own enterprise) were moved to V1NET in order not to create biased results (since the bulk of self-employment income is not separable from wages and was already in V1NET).
V8S1
For interest income, a slightly different method was used (again with the purpose of trying to better identify true missings, and avoid the large number of cases where they were overwritten by zero).
Namely, following the data provider methodology, interest income was calculated as 10.3% of total interest-yielding savings, but the amount of the total savings was estimated by LIS in the following way:
- first the sum of all the detailed savings (including both the risky and non-risky ones) was used;
- if the above was missing, either the sum of the non missing savings or the total estimated amount as reported by the respondent (either as exact amount or as midpoint of the bracket), depending on which of the two was greater, was used;
- if still missing, an imputation was carried out by LIS because of the high number of missing values, namely the median amount of persons with savings.
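The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, `None` stands for a missing amount, "the above was missing" is read as "at least one detailed savings component is missing", and the median amount among persons with savings is passed in rather than computed.

```python
from typing import Optional, Sequence

INTEREST_RATE = 0.103  # 10.3% of total interest-yielding savings, as stated above


def estimate_savings(
    detailed: Sequence[Optional[float]],
    self_reported_total: Optional[float],
    median_savings: float,
) -> float:
    """Estimate total savings following the three fallback steps (sketch)."""
    # Step 1: sum of all detailed savings, if every component is available.
    if detailed and all(v is not None for v in detailed):
        return sum(detailed)
    # Step 2: the greater of the partial sum and the respondent's own total.
    candidates = []
    valid = [v for v in detailed if v is not None]
    if valid:
        candidates.append(sum(valid))
    if self_reported_total is not None:
        candidates.append(self_reported_total)
    if candidates:
        return max(candidates)
    # Step 3: impute the median amount among persons with savings.
    return median_savings


def interest_income(
    detailed: Sequence[Optional[float]],
    self_reported_total: Optional[float],
    median_savings: float,
) -> float:
    """Interest income as a fixed share of the estimated savings."""
    return INTEREST_RATE * estimate_savings(detailed, self_reported_total, median_savings)
```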
V20S3 / V19S4
Orphan's allowance was moved from V20S3 to V19S4.
V20SR / V22S3
Child raising support (GYET) was moved from V20SR to V22S3.
V22SR / V22S1
Child care fee (a parental leave benefit) was moved from V22SR to V22S1.
V24S4 / V22S3
Child care allowance (GYES) was moved from V24S4 to V22S3.
V26S4 / V25S4
Child protection benefits (GYV) were moved from V26S4 to V25S4.
V35S1 / V35S2 / V35SR
Transfers from non-household members (V35S2 and V35S1) were moved to V35SR.
V36 / all other income variables included in DPI
Where possible, the amount of total household income (or difference between total and sum of the valid subcomponents) was allocated to the actual missing income source.
PSLOT1
The variable, which still contains a flag for estimation / imputation of personal income, was revised in order to reflect the differential treatment of income variables.
Overall, the impact on the final inequality and poverty key figures is very moderate (with a slight decrease in inequality and poverty, with the exception of elderly poverty).
Revision notes – Italy 1998/2000
(October 2007)
Working time variables (PHOURSU, PSLOT1, PWEEKTL, PWEEKFT, PWEEKPT)
In order to make the working time variables even more consistent across datasets, and in order not to lose partial information, the following refinements have been applied to all working time variables:
** in case two or more variables are added together, the sum is topcoded to the value of the original topcode of the data provider (i.e. if the data provider topcodes at 100 hours for each job, the hours of persons with more than one job who have more than 100 hours over all jobs, will be topcoded at 100); similarly annual hours can never be higher than 52 * the original data provider topcode.
** in case we only have partial information (either because the hours/weeks of one of several jobs are missing, or because the nature of one specific original variable made it impossible to calculate the exact number of total hours or weeks, e.g. an occasional job which we know happened, but we do not know for how long or how often during the year), we keep whatever information we have, but clearly flag it to show the potential underestimation (by adding a value of 1,000 for weekly hours and weeks, and 10,000 for annual hours); as a result, all values coded above that flag stand for "at least XX hours/weeks" (e.g. if a person has two jobs, one at 35 hours and a second one for which the hours are not reported, we can at least keep the information about the first one by coding PHOURSU at 1035, i.e. at least 35 hours). Consequently, there will be fewer missing values in those variables.
See the lissification tables on-line for a more precise indication of the current contents of those variables.
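For users who need to handle the flagged values in their own programs, the two refinements can be sketched as follows. This is an illustration only: the function names are hypothetical, `None` stands for missing hours, and applying the topcode before adding the flag is an assumption.

```python
from typing import Optional, Sequence, Tuple

WEEKLY_FLAG = 1000    # added to weekly hours and weeks when information is partial
ANNUAL_FLAG = 10000   # added to annual hours when information is partial


def total_weekly_hours(job_hours: Sequence[Optional[float]], topcode: float) -> Optional[float]:
    """Sum hours over jobs, topcode the sum, and flag partial information."""
    known = [h for h in job_hours if h is not None]
    if not known:
        return None  # nothing to keep: the value stays missing
    total = min(sum(known), topcode)  # the sum never exceeds the provider topcode
    if len(known) < len(job_hours):
        total += WEEKLY_FLAG          # "at least" this many hours
    return total


def cap_annual_hours(annual_hours: float, topcode: float) -> float:
    """Annual hours can never be higher than 52 times the weekly topcode."""
    return min(annual_hours, 52 * topcode)


def decode(value: float, flag: float = WEEKLY_FLAG) -> Tuple[float, bool]:
    """Split a coded value into (amount, is_lower_bound)."""
    if value >= flag:
        return value - flag, True
    return value, False
```

With these definitions, a person reporting 35 hours in one job and no hours for a second job is coded 1035, which `decode` reads back as "at least 35 hours".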
Revision notes – Belgium 2000
(October 2007)
Demographic variables
D22
An additional original variable was used, allowing the inclusion of information on subsidised rent for tenants (previously code 1, now codes 1 to 3, with the subsequent codes changing accordingly).
PPNUM / PREL / PPARSTA
The universe has been polished (some cases which were not in the universe were either recoded/imputed into valid values or coded into missing).
PENROL
The universe has been polished (8 children who did not fill the child questionnaire were recoded from -1 to missing).
Also, information was added for children not in education, allowing the distinction between those not yet in education and those no longer in education (codes 201-203).
Labour market variables
All labour market variables
The universe has been polished: in most cases this involved recoding some persons from -1 into valid (or missing) values (in some variables implying the creation of new codes), and in another few cases it involved the opposite operation.
Roster information on whether a person works at least 15 hours a week was used for individuals who did not complete the personal interview; this allowed a considerable reduction of the number of missing values for variables PCLFS and PCMAS, and the inclusion in the universe of non interviewed adults working at least 15 hours (even though they are all coded with missing) for most job characteristics variables (all except PSUPERV and PCONTRA), as well as for the PHOURSU variable (where all of the not interviewed were recoded from -1 into 1015 or missing).
In addition:
PCLFS
Additional original variables were used, allowing the inclusion of further detail; more specifically:
- information on full-time versus part-time work for persons normally working 15+ hours (codes 111 to 113);
- detailed reason for temporary absence from work (codes 121 to 129);
The use of country-specific codes for the activity allowed the inclusion of further detail for the inactives (codes 300s) and the not employed (400s).
PCMAS
An additional original variable was used, allowing the inclusion of the reason why not at work for the not employed (codes 290s), including the recoding of persons temporarily absent because of irregular work from not employed to employed (code 113).
The use of country-specific codes for the activity allowed the inclusion of further detail for the not employed (200s).
PUMAS
The variable was considerably revised after the discovery of an error which caused all persons in the same activity during all 12 months to become missing.
Code 190 was changed to 191 and relabeled into "Emp; multiple emp activities".
PIND
Country specific code 17990 was changed to 20970 (in order to increase consistency with other datasets).
PSKILL
The variable (previously empty) was filled with information on professional category and on whether formal training was required for present work activities.
PFULPAR
Code 100 was changed to 101 and relabeled "FT; 30 hours or more".
Some cases wrongly coded at 100, were recoded into their correct values.
In cases where a person reported a reason for working part-time, but also reported normal weekly hours greater than 30, the latter indication is used to determine FT versus PT status.
Persons reporting that the reason for working part-time is that the work is not part-time were recoded from part-timers into 101 if they worked 30 hours or more, and into 901 otherwise.
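The PFULPAR rules above can be sketched as follows. This is only an illustration: the function name, the reason label and the `"PT"` placeholder are hypothetical (the actual part-time codes are not given in these notes), the meaning of code 901 is not spelled out here, and the threshold is taken as "30 hours or more" from the label of code 101.

```python
from typing import Optional, Union

FT_CODE = 101            # "FT; 30 hours or more"
CODE_901 = 901           # used for the "otherwise" case; meaning not detailed in these notes
PT_PLACEHOLDER = "PT"    # hypothetical stand-in for the part-time codes
NOT_PART_TIME = "work is not part-time"  # hypothetical label for that reason


def pfulpar(weekly_hours: Optional[float], pt_reason: Optional[str]) -> Union[int, str, None]:
    """Derive full-time / part-time status from hours and the part-time reason."""
    full_time_hours = weekly_hours is not None and weekly_hours >= 30
    if pt_reason == NOT_PART_TIME:
        # Respondent denies working part-time: hours decide between 101 and 901.
        return FT_CODE if full_time_hours else CODE_901
    if full_time_hours:
        # Reported hours take precedence over a reported part-time reason.
        return FT_CODE
    if pt_reason is not None:
        return PT_PLACEHOLDER
    return None  # case not covered by the rules above
```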
PCONTRA
The universe was restricted to only include persons in paid employment, excluding the trainees/apprentices.
An additional original variable was used, allowing the inclusion of a second level of contract detail. As a result, the coding scheme changed from 1-5 into 100s, etc.
PTENURE
The variable was substantially revised, switching from a system of 10 time brackets (ranging from <1 month to >10 years) to the exact number of years (whereby the months were converted into decimals), including a 1000 flag to signal possible underestimation of the duration.
PSECJOB
An additional original variable was used, allowing the inclusion of information on the type of the second job.
PHOURSU
In case of partial information (i.e., missing hours in one of several jobs), the flag 1000 has been used to signal possible underestimation.
Country-specific missing value 99 was recoded into missing.
PWEEKTL
The variable was considerably revised after the discovery of an error which caused all persons in the same activity during all 12 months to become missing. Many of those previously missing are now counted as full-year workers.
PWEEKUP
The variable was considerably revised after the discovery of an error which caused all persons in the same activity during all 12 months to become missing. Many of those previously missing have been reassigned as either full-year unemployed, or as having no unemployment.
PWEXPTL
The variable was considerably revised. Experience is defined as years of work (in 15+ hour jobs) beginning at age 16 or at completion of continuous full-time education, whichever is later. When experience is not available, tenure is used to indicate the minimum level of experience acquired. The flag 1000 has been used to signal possible underestimation.
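The fallback from experience to tenure can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, `None` stands for unavailable information, and the 1000 flag is assumed to be added exactly when tenure is used in place of experience.

```python
from typing import Optional

UNDERESTIMATION_FLAG = 1000


def earliest_start_age(age_left_education: float) -> float:
    """Experience starts at 16 or at the end of full-time education, whichever is later."""
    return max(16, age_left_education)


def pwexptl(experience_years: Optional[float], tenure_years: Optional[float]) -> Optional[float]:
    """Return total work experience, or flagged tenure as a lower bound."""
    if experience_years is not None:
        return experience_years
    if tenure_years is not None:
        # Tenure only gives the minimum experience acquired, hence the flag.
        return UNDERESTIMATION_FLAG + tenure_years
    return None
```

Under these assumptions a value of 1005 reads as "at least 5 years of experience".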
PCARE
The coding has been changed from the previous version. Reference to care in or out of the house was removed. Reference to the effect on paid work was added.
Income variables
PGWAGE / PNWAGE / V1 / V1NET
In the cases where the number of months is imputed to 12 for regular stable workers, the condition for "regular stable worker" has been refined.
PNWAGE / V1NET
A few minor payments from the employer have been added.
PGWTIME / PNWTIME
Monthly wage of the main job was used instead of all jobs.
V16 / V18S2
The original variable "other sickness benefits" was moved from V18S2 to V16 (as it mostly contains sickness benefits).
V17S1 / V17SR
Compensation for occupational accidents and diseases was moved from V17S1 to V17SR (as it also includes long-term benefits).
V25S2 / V25SR
Handicapped persons allowance was moved from V25SR to V25S2 (following the extension of the contents of the V25S2 variable to disabled assistance as well).
V34
In a handful of cases, a monthly amount has been multiplied by 12.
V35S1 / V35SR
In 12 cases, external financial help received uniquely from persons not related was moved from V35S1 to V35SR.
V35X
The amount of external financial help paid to non relatives was removed.
In 3 cases where the amount of the alimony paid coincided exactly with the amount of external financial help to support children, it was assumed that those were double counts and the amount of external financial help was removed from V35X.
V37S2
Severance payments were added.
Revision notes – Switzerland 2000/2002
(September 2007)
PMART
CH02 only: The universe was "polished" (one adult with -1 was recoded into missing).
PIMMIGR
CH00 and CH02: The universe changed from all foreigners to all individuals (nationals were given the code 0).
PEDUC
CH02 only: The universe was "polished" (7 adults with -1 were recoded into missing and a 13-year old with a value was recoded as -1).
PENROL
CH00 and CH02: The universe changed from adults (age>14) currently enrolled in education to all adults and the variable was recoded as follows:
111 enrolled in compulsory school
112 enrolled in 1-2 years of trade school or domestic services
121 enrolled in basic vocational school, dual system
122 enrolled in intermediate level secondary school
123 enrolled in vocational school, dual system
124 enrolled in full-time vocational training
125 enrolled in secondary stage vocational school
126 enrolled in upper secondary general education
131 enrolled in higher vocational diploma
132 enrolled in higher technical school
133 enrolled in higher vocational college
134 enrolled in university, incl. post doctorate
200 not enrolled in education
PCMAS
CH00 and CH02: The universe was polished (some adults previously not in the universe were recoded into the universe) and the variable was slightly restructured (the priority on employment for the non employed was removed and the detail on third and fourth activity for the non employed was removed).
PHOURSU
CH00 and CH02: Some missings were recoded into -9.
POCC
CH02 only: The universe was "polished" (one worker coded as -1 was recoded into missing).
CH00 and CH02: Some missing labels were added.
CH00 and CH02: The universe was enlarged to include also a few unemployed and inactive persons with valid information about (presumably) their previous job.
PACTIV
CH00 and CH02: The universe was enlarged to include also a few unemployed and inactive persons with valid information about (presumably) their previous job.
PSKILL
CH00 and CH02: The universe was enlarged to include also a few unemployed and inactive persons with valid information about (presumably) their previous job.
PSUPERV
CH00 and CH02: The universe was enlarged to include also a few unemployed and inactive persons with valid information about (presumably) their previous job.
D22
CH02 only: Code 6 was recoded into code 5.
V8S3 / V32SR
CH00 and CH02: Private supplementary pensions were moved from V8S3 to V32SR (as they also include foreign pensions and pensions from former employer).
V8X / MORTEXP
CH00: interest paid on mortgages of principal and secondary residences was added to V8X.
CH02: interest paid on mortgages of principal residence was moved from MORTEXP to V8X, and interest paid on mortgages of secondary residences was added to V8X.
V21S1 / V21SR
CH00 and CH02: Benefits from unemployment insurance were moved from V21S1 to V21SR.
V24SR
CH00: Tax reimbursements were divided by 12.
CH02: Tax reimbursements were added to V24SR (without multiplying by 12).
V35S1 / V35SR
CH00 and CH02: Transfers from other households or charity were moved from V35S1 to V35SR.
Revision notes – Belgium 2000
(October 2005)
One minor correction was made to Belgium 2000. One benefit included in variable V25SR was moved into variable V18.
Revision notes – United States 1991
(June 2005)
The U.S. 1991 dataset was re-lissified using the complete sample.
By enlarging the sample from 25% to 100%, the number of households increased from 16052 to 59038.
As a result of re-lissification, there is a slight change in inequality and poverty figures.
The Gini coefficient increased from 0.336 to 0.338. Poverty rates also went up slightly:
Poverty line (percent of median)    40      50      60
Old version                         11.5    17.5    23.5
New version                         12.1    18.1    24.3
Revision notes – United Kingdom 1999 (Family Resources Survey)
(March 2005)
The UK99 dataset was slightly revised mainly because we discovered that taxes for self-employed were underestimated.
At the same time, some other smaller improvements were carried out.
The following LIS variables were involved in the revision.
PTOCC, D12, D13, actual ongoing training
- the scope was enlarged to cover all persons currently in full-time or part-time education.
D6, number of earners
- the scope was slightly extended by now also counting as earners those persons who run their own business, but who made neither profit nor loss during 1999.
V3, non-mandatory employer contributions
- the amounts of rent paid by the employer that were recorded here were moved to variable V6, once we realized that the character of these employer payments is more in line with in-kind earnings. This move has no effect on DPI.
V8, income from capital
- the new subvariables were added, and V8s1, V8s2, V8s3 and V8s4 were filled.
V11, PYTAX, income taxes
- due to the complexity of the original survey files, some income taxes paid by self-employed were previously omitted. After thorough comparison to the derived variables within the FRS we have now also included taxes paid by those who do not prepare business accounts for tax purposes. Also, whenever the difference between the gross and net self-employment income as derived by the FRS team did not coincide with the sum of taxes and contributions as defined by LIS variables, the difference was added to PYTAX and V11.
V7, contributions for self-employed
- contributions were included for those self-employed who do not prepare business accounts for tax purposes.
V13, contributions for employees
- contributions for non-workers were added.
V21s1, unemployment insurance
- used to contain total job seekers allowance. Now the income based component has been taken out (included under V25s3), thus only the contributory component remains. The near-cash DSS direct payments for job seekers allowance were deducted and are now found in V26sr.
V24sr, other social insurance
- income tax refunds have been added.
V25s1, social assistance
- the near-cash DSS direct payments for income support were deducted and are now found in V26sr.
V32sr, other occupational pensions
- personal pensions were taken out and moved to V8s3.
Most of the changes were simply refinements of the LIS income variables. Only the increase in income taxes for the self-employed caused DPI to go down slightly. However, as this is a small subgroup of the total population, the effects on the Gini coefficient and poverty rates remain minimal.
Revision notes - Mexico 1984 - 1998 (Household Income and Expenditure Survey)
(January 2005)
The MX84, MX89, MX92, MX94, MX96 and MX98 datasets were slightly revised in order to have a consistent time-series for the whole range of Mexican datasets in LIS.
The following change was made to the demographics:
- in variable PLFS (as well as LFSHD and LFSSP) all values were shifted up in order to avoid the use of code zero. Code 1 now represents "currently working".
The following changes were made to the income variables:
- V17 : accident pay : this variable now remains empty. The "Indemnizaciones por despido y accidentes de trabajo" turned out to be an allowance granted by the employer, not a government benefit. It has been moved to V37sr.
- V19s1 : basic old-age benefit : due to an error in the previous lissification program some double counting took place. This has been corrected.
- V21sr : other unemployment benefit : this variable now remains empty. The "Indemnizaciones por despido y accidentes de trabajo" turned out to be an allowance granted by the employer, not a government benefit. It has been moved to V37sr.
Due to the changes, DPI went down slightly. However, the effects on the Gini coefficient and poverty rates remain minimal.
Revision notes – Norway 2000 (Income and Property Distribution Survey)
(June 2004)
Thanks to the help of a visiting Norwegian expert (Brynjar Indahl) the Norway 2000 dataset was slightly revised in June 2004. The main correction was that several social security benefits were relocated between the LIS income variables, so that they now more accurately reflect the type of benefit.
The main changes are :
- Rental deficits (k3220) are no longer deducted in V8s2, but are now included in self-employment income (V5)
- Voluntary early retirement income (lto_227) is no longer part of V32s1, but was moved to V19s3
- Unemployment benefits for self-employed have been split off from unemployment insurance (V21s1) and are now given separately under V21sr
- Maternity allowance for non-working mothers (rt_27) was moved from V22s1 to V22sr. As a result, the pay replacement (V22s1) now remains empty. In the original registers, this benefit is included in wages and cannot be separated.
- The rather diverse income variable “other tax-free income” (lto_916) was moved from V24sr to V36. In V24sr, only the transfer called “support, training courses, refugees” (kursstønad, flyktninger) was kept.
- The labels for variables PIMMIGR, IMMIGRHD, IMMIGRSP were slightly improved to better distinguish the different types of immigration backgrounds.
These changes did not have any effect on the value of DPI, nor did indicators like the poverty rate or the Gini coefficient change.
Revision notes – Germany 1989 (Socio-Economic Panel GSOEP)
(May 2004)
The Germany 1989 dataset underwent a major revision. The main changes are :
1. Imputed income sources
The SOEP team at DIW has finalized the full imputation of all income sources. The main result of this improvement is that income is no longer underestimated, and that households no longer end up with missing disposable income (DPI) because one of its components is missing. As a result of this new income distribution the mean DPI increased whereas the Gini coefficient dropped slightly.
2. Extended or improved detail for variables
For several demographic variables like industry and occupation, we are now able to present the categories according to standard international classifications. In order to keep consistency over time for the full set of all LIS datasets coming from GSOEP, we have kept the variable names of LIS Wave 4, even though some of these variables normally do not exist in Wave 3 datasets. The most important changes are :
PEDUC / D10 / D11 : contains highest attained level of GENERAL education.
PTOCC / D12 / D13 : contains highest attained level of VOCATIONAL training.
POCC / D14 / D15 : occupation according to ISCO88.
PIND / D16 / D17 : industry according to ISIC / NACE.
PTYPEWK / D18 / D19 : public versus private sector ; further detail moved to PACTIV.
D20 : reduced to three categories, as in other years for GSOEP datasets.
V17 : new ; previously found in V18.
V21s2 : new .
V23 : new .
3. More sample records
By improving the selection criteria, more households are kept in the LIS sample. The full German sample is taken as input, only deleting refusals to participate in the survey.
File         old version   new version
household    4187          4411
persons      8840          9500
children     2288          2252
*** Note that previously, 15-year-olds were in the children file (246 records).
IN SHORT THIS WAS A MAJOR REVISION AND PERSONS WHO USED THIS DATASET EARLIER SHOULD PROBABLY RERUN THEIR PROGRAMS TO CHECK THE EFFECT ON THEIR RESULTS.
Revision notes – Germany 1984 (Socio-Economic Panel GSOEP)
(May 2004)
The Germany 1984 dataset underwent a major revision. The main changes are :
1. Imputed income sources
The SOEP team at DIW has finalized the full imputation of all income sources. The main result of this improvement is that income is no longer underestimated, and that households no longer end up with missing disposable income (DPI) because one of its components is missing.
2. Extended or improved detail for variables
For several demographic variables like industry and occupation, we are now able to present the categories according to standard international classifications. In order to keep consistency over time for the full set of all LIS datasets coming from GSOEP, we have kept the variable names of LIS Wave 4, even though some of these variables normally do not exist in Wave 2 datasets. The most important changes are :
PEDUC / D10 / D11 : contains highest attained level of GENERAL education.
PTOCC / D12 / D13 : contains highest attained level of VOCATIONAL training.
POCC / D14 / D15 : occupation according to ISCO88.
PIND / D16 / D17 : industry according to ISIC / NACE.
PTYPEWK / D18 / D19 : public versus private sector ; further detail moved to PACTIV.
D7 : now contains Bundesländer (provinces), as in other years for GSOEP datasets.
D20 : reduced to three categories, as in other years for GSOEP datasets.
V17 : new ; previously found in V18.
V21s2 : new .
V23 : new .
3. More sample records
By improving the selection criteria, more households are kept in the LIS sample. The full German sample is taken as input, only deleting refusals to participate in the survey.
File         old version   new version
household    5159          5194
persons      11282         11377
children     2779          2840
IN SHORT THIS WAS A MAJOR REVISION AND PERSONS WHO USED THIS DATASET EARLIER SHOULD PROBABLY RERUN THEIR PROGRAMS TO CHECK THE EFFECT ON THEIR RESULTS.
Revision notes – Germany 1994 (Socio-Economic Panel GSOEP)
(May 2004)
The Germany 1994 dataset underwent a major revision. The main changes are :
1. Imputed income sources
The SOEP team at DIW has finalized the full imputation of all income sources. The main result of this improvement is that income is no longer underestimated, and that households no longer end up with missing disposable income (DPI) because one of its components is missing. As a result of this new income distribution the mean DPI increased whereas the Gini coefficient dropped slightly.
2. Extended or improved detail for variables
For several demographic variables like industry and occupation, we are now able to present the categories according to standard international classifications. The most important changes are :
PETHNAT / D8 / ETHNATSP : now contains nationality instead of EAST-WEST-FOREIGNER (German sample reference).
PEDUC / D10 / D11 : contains highest attained level of GENERAL education.
PTOCC / D12 / D13 : contains highest attained level of VOCATIONAL training.
POCC / D14 / D15 : occupation according to ISCO88.
PIND / D16 / D17 : industry according to ISIC / NACE.
PTYPEWK / D18 / D19 : public versus private sector ; further detail moved to PACTIV.
PMART / D21 / MARTSP : standardized to single, married, etc.
D20 : new ; size of city.
V17 : new ; previously found in V18.
V18 : different contents ; previously was part of V19.
V21s2 : new .
V23 : new .
V37sr : new .
3. More sample records
By improving the selection criteria, more households are kept in the LIS sample. The full German sample is taken as input, only deleting refusals to participate in the survey, and cases where the weighting factor is zero.
File         old version   new version
household    6045          6379
persons      12614         13324
children     3095          3264
IN SHORT THIS WAS A MAJOR REVISION AND PERSONS WHO USED THIS DATASET EARLIER SHOULD PROBABLY RERUN THEIR PROGRAMS TO CHECK THE EFFECT ON THEIR RESULTS.
Revision notes - Germany 2000
(May 2004)
The Germany 2000 dataset underwent a major revision for several reasons :
1. Increased sample size
The SOEP was replenished with a large new sample (called “sample F”). LIS has now added this replenishment in the revised version of our dataset. As a result, the number of records increased in the following way :
File         old version   new version
household    6367          10985
persons      12691         21681
children     2746          4718
From this table it will be clear that the new dataset opens up new possibilities, such as better exploring regional figures or subgroups of the sample without running into cells that are too small.
2. Imputed income sources
The SOEP team at DIW has finalized the full imputation of all income sources. The main result of this improvement is that income is no longer underestimated, and that households no longer end up with missing disposable income (DPI) because one of its components is missing. As a result of this new income distribution the mean DPI increased whereas the Gini coefficient dropped slightly.
3. Extended or improved detail for variables
We were able to fill a few more variables like :
D20 : NEW ; the variable contains number of inhabitants.
V8X : NEW ; the mortgage repayments plus interests are recorded here.
IN SHORT THIS WAS A MAJOR REVISION AND PERSONS WHO USED THIS DATASET EARLIER SHOULD PROBABLY RERUN THEIR PROGRAMS TO CHECK THE EFFECT ON THEIR RESULTS.
Revision notes - Finland 1995 (Income Distribution Survey)
(January 2003)
The FI95 dataset was revised mainly because of incorrect child poverty rates.
The following changes were made to the income variables:
- two original variables previously left out were added to V20S1 (hkotihv, hkotihm).
- the expenses for self-employed in forestry as given in the variables metsvak, metsvkust and lmhm in V4 were deducted instead of being summed up.
- athletes' rewards (tpalv2) were added to V1.
- royalties and copyright fees were corrected from tpalv2 to tpalv2a in V8.
- a tax correction term (siperoik) included in V13 was moved to V11 and deducted instead of being summed up.
Some original income variables changed place in the LIS variables in order to improve the comparability with the 2000 dataset:
- employment allowance for self-employed (ttyoltuk) was moved from V24SR to V21S2.
- study benefits for studying on unpaid leave (hamkou) were moved from V24SR to V24S2.
- financial aid for students (tkopira) was moved from V26S2 to V24S2.
- pensions from abroad (tulkel, muulktu) were moved from V36 to V32.
- basic unemployment allowance (htyotper) was moved from V25S3 to V21S1.
- long