Sampling Procedure
Kenya
Sample size: The sample size for this survey was calculated using the United Nations (UN) formula for estimating sample sizes in prevalence studies based on household surveys (UN, 2008; see Appendix 2). The computation applied a 95% confidence level and a default design effect of 2.0 to account for multistage sampling. A 10% non-response rate was assumed, consistent with other studies in Kenya (KNBS, 2015). The expected prevalence of tobacco use among adolescents was set at 16.2% (Nazir et al., 2019). The adolescent population proportion was estimated at 20.45% and the average household size at 3.9, based on the 2019 Kenya Population and Housing Census (KNBS, 2019). Using these parameters, the calculation yielded a nationally representative sample of 6,061 adolescents in Kenya, which is sufficient for analysis and national-level inferences. To adjust for the 10% non-response rate, a targeted sample size of 6,734 was computed.
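The step from 6,061 to 6,734 is consistent with dividing the required sample by the expected response rate (1 minus the non-response rate) and rounding to the nearest whole number. The short sketch below illustrates that arithmetic only; the function name is hypothetical, and the rounding rule is an assumption inferred from the reported figures rather than taken from Appendix 2.

```python
def inflate_for_nonresponse(n_required: int, nonresponse_rate: float) -> int:
    """Return the sample to target so that about n_required interviews are completed.

    Assumption: the adjustment is n / (1 - non-response rate), rounded to the
    nearest integer. This is inferred from the reported totals.
    """
    if not 0 <= nonresponse_rate < 1:
        raise ValueError("nonresponse_rate must be in [0, 1)")
    return round(n_required / (1 - nonresponse_rate))


# Kenya: 6,061 required adolescents and a 10% non-response rate (from the text)
kenya_target = inflate_for_nonresponse(6061, 0.10)
print(kenya_target)  # 6734
```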
Sampling procedure: The survey used a three-stage stratified cluster sample design. The first stage involved the selection of 16 of Kenya's 47 counties. Prior to sampling, the counties were stratified by grouping them into the eight former provinces, and a representative, proportionate sample was then selected from each province. The number of sampled counties was computed using Taro Yamane's simplified formula for proportions (Tepping, 1968). Nairobi County was included by default because it is the capital city as well as a region and a county in its own right. The remaining 15 counties were randomly selected using a computer-generated sequence in R statistical software. The second stage involved the random selection of EAs within the 16 sampled counties, with probability proportional to the size of the EA. Prior to selection, the EA sampling frame was stratified by residence (rural and urban), and 224 EAs were selected: 81 in urban areas and 143 in rural areas. To generate a household sampling frame and identify households with eligible adolescents, the survey team conducted a household listing operation within the selected EAs. The operation involved visiting each EA to list all eligible households and their addresses. In the third stage, 30 households were randomly selected from each EA. In each selected household, only one adolescent aged 10 to 17 years was interviewed. Where a household had more than one eligible adolescent, the interviewee was selected at random.
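The second-stage selection with probability proportional to size (PPS) can be illustrated with a minimal sketch. The text does not state which PPS algorithm was used, so the systematic PPS method below is an assumption, and the frame of EA household counts is invented purely for illustration. In a stratified design, the same routine would be run separately within each urban and rural stratum.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n_select, rng):
    """Return indices of units chosen by systematic PPS sampling.

    A random start is drawn within the sampling interval, then every
    interval-th point along the cumulative size scale picks a unit.
    A unit larger than the interval can be picked more than once.
    """
    total = sum(sizes)
    interval = total / n_select
    start = rng.random() * interval  # random start in [0, interval)
    cumulative = list(accumulate(sizes))
    chosen, idx = [], 0
    for k in range(n_select):
        point = start + k * interval
        # Advance to the unit whose cumulative range contains the point
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        chosen.append(idx)
    return chosen


# Hypothetical frame: 50 EAs with invented household counts (not survey data)
rng = random.Random(42)
ea_households = [rng.randint(60, 200) for _ in range(50)]
sample = pps_systematic(ea_households, 5, rng)
print(sample)
```

With 224 EAs and 30 households per EA, the design described above implies 224 × 30 = 6,720 selected households, which is close to the targeted sample of 6,734.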
Nigeria
Sample size: The sample size for this study was estimated using the UN formula for estimating sample sizes in prevalence studies (UN, 2008), with a 95% confidence level. A design effect of 2.5 (default value) was applied because sampling was to be conducted at different administrative levels, namely geopolitical zones, states, and EAs. A non-response rate of 20% was factored into the calculations. Although non-response rates for adult populations and previous adolescent studies in Nigeria are typically around 10% (NPC and ICF, 2019), a higher rate was used on the assumption that the target population may be mobile. The global prevalence of tobacco use among adolescents, reported as 19.4% (Itanyi et al., 2018), was used as the estimated prevalence due to a lack of recent national estimates. The adolescent population proportion was estimated at 17.9% and the average household size was set at 4.7, based on national statistics from the 2018 Nigeria Demographic and Health Survey (NDHS) (NPC and ICF, 2019). Using these parameters, the calculation yielded a nationally representative sample of 6,358 adolescents in Nigeria, which is sufficient for analysis and national-level inferences. To adjust for the 20% non-response rate, a targeted sample size of 7,948 was envisaged.
Sampling procedure: The survey employed a multi-stage stratified cluster sampling design to produce a nationally representative sample of adolescents, covering both urban and rural areas. The first sampling stage involved randomly selecting 13 study states (12 states and the FCT, Abuja) from the national sampling frame of 36 states as provided by the NPC. The states were stratified by grouping them into their respective geopolitical zones, and a representative, proportionate sample was then randomly selected from each zone using a computer-generated sequence. The number of sampled states was calculated using Taro Yamane's simplified formula for proportions. The FCT was included by default due to its status as the capital. In the second stage, 265 EAs were selected using probability proportional to the size of the sampled states. Before the EAs were selected, the sampling frame was stratified by residence (urban/rural). Among the selected EAs, 105 were in urban areas and 160 in rural areas. Prior to fieldwork, the survey team carried out a household listing operation in all selected EAs to obtain an updated list of eligible households, which served as the sampling frame for the third stage of sample selection. In the third stage, 30 households per EA were randomly selected to reduce clustering effects. In each selected household, one adolescent aged 10 to 17 years was interviewed, selected at random where multiple adolescents were available. If a selected adolescent was unavailable, interviewers made up to three return visits to complete the interview. If the adolescent remained unavailable after the third visit, the survey was closed and no replacement was made.
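The third stage can be sketched as two simple random draws: 30 households from each EA listing, then one eligible adolescent within each selected household. The example below is a minimal illustration under stated assumptions; the listing data, field names, and function names are all hypothetical and do not come from the survey instruments.

```python
import random

HOUSEHOLDS_PER_EA = 30  # households selected per EA (from the text)


def select_households(listing, rng, k=HOUSEHOLDS_PER_EA):
    """Simple random sample of k households from an EA listing (all if fewer)."""
    return rng.sample(listing, min(k, len(listing)))


def select_respondent(household, rng):
    """Pick one adolescent aged 10 to 17 at random; returns the selected age."""
    eligible = [age for age in household["adolescent_ages"] if 10 <= age <= 17]
    return rng.choice(eligible)


# Hypothetical listing of 120 eligible households in one EA (invented data)
rng = random.Random(7)
listing = [
    {"hh_id": i,
     "adolescent_ages": [rng.randint(10, 17) for _ in range(rng.randint(1, 3))]}
    for i in range(120)
]

households = select_households(listing, rng)
respondents = {hh["hh_id"]: select_respondent(hh, rng) for hh in households}

# Planned workload implied by the design: 265 EAs x 30 households
planned_households = 265 * HOUSEHOLDS_PER_EA
print(len(households), planned_households)  # 30 7950
```

The implied total of 7,950 households is close to the targeted sample size of 7,948.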