Options retained for further study for the 2016 Census

Based on the analysis in Assessment of census approaches in the Canadian context, it is my conclusion that the only methodological approach that can be implemented by 2016 is a traditional census. Neither a register-based approach nor a continuous measurement approach can be put in place in the time frame required for 2016.

This leaves the question of what potential variants of a traditional census are feasible for 2016. This question can only be answered once some key considerations are more fully examined. For each such consideration, trade-offs among competing factors will need to be made to arrive at a suitable methodological design.

The first consideration is the appropriate use of mandatory and voluntary response methodologies in any future design. The designation of a question or related group of questions as mandatory or voluntary involves a trade-off among the following factors:

  • the importance of the requirement for the data, whether it be for legal or for other reasons
  • the degree of privacy-intrusiveness of the question
  • the accuracy of the resulting data, in particular the risk of non-response bias
  • the relative costs of collecting the data on a voluntary basis compared to collection on a mandatory basis.

Historically, some questions have been considered to be more inherently privacy-intrusive than others (Footnote 1). In theory, designating a question as voluntary makes it less intrusive, because the respondent is not legally required to answer it. However, making a question voluntary increases the risk of non-response bias, because response rates may be lower and less even across the population. Collection costs may also be higher under voluntary response if more effort is needed to convince the public to respond.

The 2011 NHS marks the first time that a voluntary approach has been used to collect data that were previously collected as part of a mandatory census. More information on the accuracy of the NHS data and the costs of data collection will only become available later in 2011 and in 2012. Statistics Canada will be in a better position at that time to assess whether proceeding with a voluntary survey vehicle in 2016 is a desirable option. At this point in the development of the 2016 Census strategy, I recommend that the options for 2016 leave open the question of the boundary between mandatory and voluntary collection vehicles.

In addition, I recommend that the criteria used to decide whether questions should be mandatory or voluntary in 2016 be made, where possible, more explicit and quantifiable than in 2011. A useful starting point could be Statistics Canada's existing criteria, described in The Determination of Mandatory and Voluntary Surveys Guidelines (Statistics Canada 1997).

The second key consideration concerns the role to be played by sampling, i.e., whether questions are asked of every household or of only a sample (Footnote 2). When considering the use of sampling, the trade-offs are primarily among:

  • the accuracy of the resulting data, in particular the magnitude of the sampling error for small areas and population subgroups
  • the costs to collect and process the data (sampling generally reduces costs)
  • operational considerations, i.e., the sample design should be possible to implement in the field
  • the total burden on the population (sampling reduces the average number of questions asked per household).

By combining the type of response (mandatory or voluntary) with the choice of using sampling or not, the potential 2016 content can be grouped into the following three 'building blocks' for a 2016 Census/NHS design (Footnote 3):

Block 1: Content that is required by legislation or where the need for highly accurate data is otherwise sufficiently important that it must be collected on a mandatory basis from 100% of the population. An example of such content might be the questions contained in previous census short forms, e.g., name, address, date of birth, sex, marital status, relationship to reference person in the household and mother tongue. Content in Block 1 would be considered to be part of the census under the Statistics Act.

Block 2: Content that is required by legislation or where the need for highly accurate data is sufficiently important that it needs to be collected on a mandatory basis, but only for a sample of the population. An example of this kind of content is a set of language questions that were previously on the census long form. This building block did not exist for 2011, yet the questions were legally required to be collected as part of the census, so the only solution for 2011 was to move them to Block 1. Because of its mandatory nature, content in Block 2 would also be considered to be part of the census under the Statistics Act.

Block 3: Content that only needs to be collected on a voluntary basis and only for a sample of the population. All of the questions in the 2011 NHS were considered to be of this type. Because this content is not mandatory, it would not be part of the census under the Statistics Act, and a wider variety of design options is therefore possible for this building block, as discussed below.

Any 2016 or future Census would have to include some content in Block 1 to be considered a census. Consequently, there are four basic configurations (options) for a 2016 Census/NHS design, depending on the presence or absence of Blocks 2 and 3:

Option 1: Block 1 only (i.e., a single form census, no voluntary data collection). This is, in effect, the design that was used in every Canadian census from 1871 to 1966, when there was only one census questionnaire and it was mandatory.

Option 2: Block 1 plus Block 2 (i.e., all questions are mandatory, some are asked of 100% of households while others are asked only of a sample). This is the configuration that was used from 1971 to 2006, where a short and a long questionnaire were used, and both were part of the mandatory census.

Option 3: Block 1 plus Block 3 (i.e., some questions are asked on a mandatory basis of 100% of households, while the remaining questions are asked on a voluntary basis of a sample of households). This is the configuration that was used in the 2011 Census/NHS approach.

Option 4: Blocks 1, 2 and 3 (i.e., some questions are asked on a mandatory basis of 100% of households, a second set of questions is asked on a mandatory basis but only of a sample of households, and a third set of questions is asked on a voluntary basis of a sample of households).

To my knowledge, Option 3 has only been used in Canada (in 2011 for the first time), and Option 4 has not been used in any country. Option 4 would provide the most flexibility for the collection of content, but would also be the most complex.
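The four options follow mechanically from the rule that Block 1 is always present while Blocks 2 and 3 are each either present or absent. The short Python sketch below enumerates them in the same order as the text. The names BLOCKS and design_options are illustrative only and do not come from the report.

```python
from itertools import product

# Building blocks as described above: (type of response, coverage).
BLOCKS = {
    1: ("mandatory", "100% of households"),
    2: ("mandatory", "sample of households"),
    3: ("voluntary", "sample of households"),
}


def design_options():
    """Return the designs that contain Block 1, keyed by option number."""
    options = {}
    # Block 1 is always included; Blocks 2 and 3 are toggled independently.
    # Block 3 is the outer toggle so the numbering matches Options 1 to 4.
    toggles = product([False, True], repeat=2)
    for number, (has_block3, has_block2) in enumerate(toggles, start=1):
        blocks = [1]
        if has_block2:
            blocks.append(2)
        if has_block3:
            blocks.append(3)
        options[number] = blocks
    return options


for number, blocks in design_options().items():
    parts = [f"Block {b} ({BLOCKS[b][0]}, {BLOCKS[b][1]})" for b in blocks]
    print(f"Option {number}: " + " + ".join(parts))
```

Enumerating the options this way makes it easy to confirm that no configuration has been overlooked: two independent yes/no choices give exactly four designs.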

In conducting the more detailed assessment of these four basic design options, Statistics Canada could examine several more specific considerations for sample design.

First, for any of the options involving sampling (Options 2, 3 and 4), the basic long-form/short-form methodology could be re-examined (Footnote 4). One drawback of the approach that has been used in Canada to date is that the number of questions to be answered by each household is unevenly spread; a household receives either a short questionnaire containing the minimum number of questions, or a long questionnaire containing the full set of questions. Other sample designs that would spread the respondent burden more evenly could be considered. One possibility, as proposed in Australia, is to have several different forms, with core questions on all forms and different sets of thematic questions on different forms. Another possibility is so-called 'matrix' sampling, where questionnaires consist of various combinations (e.g., AB, AC and BC) of content modules (A, B and C in this example). The drawbacks of such approaches are the inability to cross-tabulate variables that never appear together on the same form, as well as the extra complexity and costs of having several different versions of the questionnaire to print, deliver, capture and process (Footnote 5).
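The following Python sketch illustrates the two alternatives just described, using the modules A, B and C from the example. It lists the forms each design would produce, the number of questions on each form, and which pairs of modules could still be cross-tabulated. The function names and all question counts (10 core questions; 12, 8 and 10 questions in modules A, B and C) are hypothetical values chosen for illustration, not figures from the report.

```python
from itertools import combinations


def matrix_forms(modules, modules_per_form):
    """Build one form for every combination of content modules."""
    return [frozenset(c) for c in combinations(modules, modules_per_form)]


def cross_tabulable_pairs(forms):
    """Return the module pairs that appear together on at least one form."""
    pairs = set()
    for form in forms:
        pairs.update(frozenset(p) for p in combinations(sorted(form), 2))
    return pairs


def form_lengths(core_questions, module_sizes, forms):
    """Return the number of questions on each form (core plus its modules)."""
    return {
        "".join(sorted(form)): core_questions + sum(module_sizes[m] for m in form)
        for form in forms
    }


# Hypothetical question counts, for illustration only.
sizes = {"A": 12, "B": 8, "C": 10}

matrix = matrix_forms("ABC", 2)      # forms AB, AC and BC
thematic = matrix_forms("ABC", 1)    # core plus a single theme per form

print(form_lengths(10, sizes, matrix))
print(len(cross_tabulable_pairs(matrix)), "module pairs can be cross-tabulated")
print(form_lengths(10, sizes, thematic))
print(len(cross_tabulable_pairs(thematic)), "module pairs can be cross-tabulated")
```

Under these assumed counts, the matrix design gives forms of 28 to 32 questions, compared with 10 and 40 for a short form and a long form built from the same content. The single-theme design shortens the forms further, but no two modules ever share a form, which is the cross-tabulation drawback noted above.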

Second, for Option 3 (the 2011 Census/NHS approach), the NHS sample design could be re-examined. As noted above, the 2011 NHS sampling fraction was set at one in three in order to keep the achieved sampling fraction roughly comparable to that of the 2006 Census long form. Sampling is also used for follow-up of non-response in order to mitigate the risk of uneven response rates across important subgroups of the population. Once the actual response rate to the 2011 NHS is known, as well as the effectiveness of using sampling for non-response follow-up, the sample design for a 2016 voluntary survey could be further refined. For example, it might be desirable to lower the overall sampling fraction but to follow up a higher fraction of non-respondents, or to use a higher overall sampling fraction but follow up a lower fraction of non-respondents.
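The trade-off between the overall sampling fraction and the follow-up fraction can be sketched with a simple calculation. The Python example below assumes a deliberately simplified model: a share of sampled households responds without follow-up, a subsample of the remaining households is followed up, and a share of those followed up then responds. The function names and all rates used are hypothetical; actual 2011 NHS rates were not yet known when the report was written.

```python
def achieved_fraction(sampling_fraction, initial_response,
                      followup_fraction, followup_response):
    """Share of all households that end up providing data.

    Simplified model (an assumption, not the NHS design): non-respondents
    are subsampled for follow-up at `followup_fraction`, and a share
    `followup_response` of those followed up then respond.
    """
    responding_share = (
        initial_response
        + (1 - initial_response) * followup_fraction * followup_response
    )
    return sampling_fraction * responding_share


def followup_workload(sampling_fraction, initial_response, followup_fraction):
    """Share of all households that must be followed up in the field."""
    return sampling_fraction * (1 - initial_response) * followup_fraction


# Two hypothetical designs: (sampling fraction, initial response rate,
# follow-up fraction, follow-up response rate).
designs = {
    "higher sampling fraction, partial follow-up": (1 / 3, 0.5, 0.4, 0.6),
    "lower sampling fraction, full follow-up": (1 / 4, 0.5, 1.0, 0.6),
}

for name, (f, r1, g, r2) in designs.items():
    print(
        f"{name}: achieved fraction {achieved_fraction(f, r1, g, r2):.3f}, "
        f"follow-up workload {followup_workload(f, r1, g):.3f}"
    )
```

With these assumed rates the two designs yield similar achieved fractions but very different follow-up workloads, which is why the choice depends on the relative cost of initial contact and follow-up, and on how unevenly non-response is spread across subgroups.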

For Option 4, which would include both mandatory and voluntary questions on a sample basis, there would be the issue of whether the two sets of sample questions should be asked of overlapping samples or of separate samples (known as 'positive coordination' and 'negative coordination' respectively). There are advantages and disadvantages to both approaches (for example, see Royce 2000). The sampling fractions for the mandatory and the voluntary sample questions could also be different. For example, because the voluntary component would presumably have less content than in 2011, consideration could be given to reducing the sampling fraction in order to keep the overall costs the same, at the expense of a higher sampling variance for those data items collected on a voluntary basis.
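The effect of a lower sampling fraction on sampling variance can be illustrated with a textbook approximation. The Python sketch below assumes simple random sampling without replacement and full response, neither of which describes the actual census or NHS design, and uses the approximation Var(p) = (1 - f) p (1 - p) / n with n = f N. The function name, area size and proportion are hypothetical.

```python
import math


def standard_error(proportion, population_size, sampling_fraction):
    """Approximate standard error of an estimated proportion.

    Assumes simple random sampling without replacement and no
    non-response (simplifying assumptions made for this illustration).
    """
    sample_size = sampling_fraction * population_size
    variance = (
        (1 - sampling_fraction) * proportion * (1 - proportion) / sample_size
    )
    return math.sqrt(variance)


# Hypothetical small area of 500 households, 10% of which have the
# characteristic of interest.
for fraction in (1 / 3, 1 / 5, 1 / 10):
    se = standard_error(0.10, 500, fraction)
    print(f"sampling fraction {fraction:.2f}: standard error {se:.4f}")
```

The standard error rises as the sampling fraction falls, and the increase matters most for small areas and small population subgroups, where the sample size is already limited.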

Finally, because any voluntary questions (i.e., Block 3 content) would not be considered to be part of the census under the Statistics Act, there is no legal requirement to collect Block 3 at the same time as the census. Collecting the data in close proximity to the census does have operational and statistical advantages (e.g., the census data can be used to weight the sample data), but the possibility of collecting them at a different time, or of using a continuous collection approach for censuses beyond 2016, could be considered.

If either Option 3 or 4 were chosen, and the mandatory and voluntary content were collected in the same time frame, one could consider the extent to which data collection for the mandatory and voluntary questions should be more closely integrated. In the case of the Internet response channel, this is already the case; if the household is selected for the 2011 NHS, the invitation to complete the NHS questionnaire appears immediately after the respondent has submitted his or her completed census questionnaire. However, for self-responses on paper, separate census and NHS forms are used, and the collection for the NHS starts approximately one month later than for the census.

For 2016, consideration could be given to asking mandatory questions and voluntary questions on the same paper questionnaire, for those households which receive both types of questions. Such an approach is used to a limited extent in some other countries, including Australia, New Zealand, the United Kingdom and Hungary, where a question on religion appears on the census questionnaire but is designated as optional. Closer integration of the census and NHS paper questionnaires would reduce response burden by not asking NHS households to report the basic information (date of birth, gender, language, etc.) on both the census form and again on the NHS form (Footnote 6), as is the case in 2011, and might help to increase the response rates for the voluntary questions. The potential for cost efficiencies by combining the questions onto one form should also be examined.

On the other hand, the effects of mixing mandatory and voluntary questions on a single form need to be carefully considered. Doing so could conceivably put at risk the respondent's willingness to complete the mandatory questions, and could have negative effects on the amount of follow-up required to complete the census. Closer integration might also reduce Statistics Canada's flexibility to shift resources from the voluntary to the mandatory content if this became necessary. Field testing would be needed before proceeding with any integration of mandatory and voluntary questions and other aspects of the data collection operation. The experience of the integration of the census and NHS in the Internet response channel in 2011 would have to be carefully evaluated.

For all options, Statistics Canada should continue to build on the success of the use of income tax records to replace data collection in the 2006 Census by increasing the use of administrative data wherever possible. For example, consideration could be given to removing the income question completely from the questionnaire and simply informing respondents (as required by Statistics Canada's Policy on informing survey respondents) that Statistics Canada will be linking to their tax records, instead of asking them for their consent. In theory, this would permit tax data to be used for 100% of the population, effectively turning it into a short-form variable. In practice, there may be special populations with low rates of tax filing where the question would have to remain on the questionnaire, so further work would be needed to determine whether and exactly how this might be implemented. Statistics Canada would undoubtedly wish to test such an approach to ensure its public acceptability.

Statistics Canada should also consider whether there are administrative sources of data for some of the other content on the census or NHS questionnaire. In this regard, it would be worthwhile to examine in more detail the experiences of countries that use administrative sources for such variables. To date, the experiences of the countries that have been examined suggest that variables such as occupation, educational attainment and place of work are often difficult to obtain from registers, or are not available for particular subgroups of the population (e.g., education of older persons, place of work of the self-employed).

Finally, I recommend that consideration be given to additional possibilities for using administrative records within the traditional census approach, such as further improving the process for updating the Address Register, targeting non-response follow-up, or imputing for non-response. A separate project on the future uses of administrative data is being conducted and is expected to report later in 2011.
