# 1 Finding patterns of convergence and divergence within the convergEU package

The convergEU package makes it possible to obtain patterns of change over time for indicators in the European Union (EU) by invoking the ms_pattern_ori function:

help(ms_pattern_ori)

The ms_pattern_ori function allows for obtaining patterns for both lowBest and highBest types of indicators. More specifically, in this function the following patterns are defined through numerical labels and corresponding string labels:

• 1: Catching up
• 2: Flattening
• 3: Inversion
• 4: Outperforming
• 5: Slower pace
• 6: Diving
• 7: Defending better
• 8: Escaping
• 9: Falling away
• 10: Underperforming
• 11: Recovering
• 12: Reacting better
• 13: Paralleling better over
• 14: Paralleling equal over
• 15: Paralleling worse over
• 16: Paralleling worse under
• 17: Paralleling equal under
• 18: Paralleling better under
• 19: Crossing
• 20: Crossing reversed
• 21: Other (Inspection)
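The correspondence between numerical and string labels listed above can be kept in a small lookup vector, which may be convenient when only the numerical labels are at hand. The following is a minimal base R sketch; the object names pattern_labels, codes and decoded are hypothetical and are not part of the convergEU package.

```r
# Hypothetical lookup vector (not a convergEU object):
# the position in the vector is the numerical label.
pattern_labels <- c(
  "Catching up", "Flattening", "Inversion", "Outperforming", "Slower pace",
  "Diving", "Defending better", "Escaping", "Falling away", "Underperforming",
  "Recovering", "Reacting better", "Paralleling better over",
  "Paralleling equal over", "Paralleling worse over", "Paralleling worse under",
  "Paralleling equal under", "Paralleling better under", "Crossing",
  "Crossing reversed", "Other (Inspection)"
)

# Translate a few numerical labels into their string labels.
codes <- c(4, 19, 21)
decoded <- pattern_labels[codes]
print(decoded)
```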

It is important to note that, for indicators of type “low is better”, we assume that the higher the indicator value, the worse the considered socio-economic feature in a given member state (MS). Instead of creating new labels to tag patterns for this class of indicators, we transform the original indicator, since the absolute position of the values is not relevant when judging whether a given pattern is present. Thus, indicators of type “low is better” are transformed by calculating, for each original observation, its distance from the maximum value. If the original index decreases then the transformed value increases, and the pattern recognition scheme applies in the same way as for indicators of type “high is better”.
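The transformation described above can be sketched in a few lines of base R. The values below are made up, and the choice of taking the maximum over the supplied values only is an assumption made for illustration; the package may compute the maximum over a different set of observations.

```r
# Toy "low is better" indicator for one member state over four years (made-up values).
original <- c(9.5, 8.0, 7.2, 6.1)

# Transformed indicator: distance of each observation from the maximum.
# Assumption: the maximum is taken over the values supplied here; the package
# may use a different set (e.g. all member states and years).
transformed <- max(original) - original

# A decreasing original series becomes an increasing transformed series.
print(transformed)
```

Because only the relative position of the values matters for pattern recognition, the shift by the maximum does not affect which pattern is detected.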

The graphical plots for the defined patterns depending on the type of indicators (lowBest or highBest) are available by invoking the patt_legend function:

help(patt_legend)

When considering indicators of type highBest, the graphical representation of the patterns is as follows:

highind<-patt_legend(indiType="highBest")
highind

while for the lowBest type of indicators the plot of the patterns is:

lowhind<-patt_legend(indiType="lowBest")
lowhind

For further details on the defined patterns we refer to the Eurofound report “Upward convergence in the EU: Concepts, measurements and indicators” (Monitoring convergence in the European Union series, 2018, pp. 25-26).

To illustrate the points discussed above in practice, let’s consider a first example based on the emp_20_64_MS dataset, for which the indicator is of type highBest. To obtain the patterns for this type of indicator, we invoke the ms_pattern_ori function as follows:

myemp <-ms_pattern_ori(emp_20_64_MS, "time",type="highBest")

The output of the ms_pattern_ori function consists of the usual three list components: “$res” that contains the results, “$msg” that possibly carries messages for the user and “$err” which is a string containing an error message, if an error occurs:

names(myemp)
#> [1] "res" "msg" "err"

The “$res” component of the output contains the numerical labels for the patterns as well as their string labels:

mypattemp<-myemp$res$mat_label_tags
mypattempn<-myemp$res$mat_without_summaries
mypattempn
#> # A tibble: 28 x 17
#>    Country 2002/2003 2003/2004 2004/2005 2005/2006 2006/2007
#>    <chr>         <int>       <int>       <int>       <int>       <int>
#>  1 AT                4           3           4           4           4
#>  2 BE                6           1           1          21           1
#>  3 BG                1           1           1           1           1
#>  4 CY                4           4           3           4           2
#>  5 CZ                3           3           4           2           2
#>  6 DE                3          19          20           4           4
#>  7 DK                3           4           3           4           3
#>  8 EE                4           4           4           4           2
#>  9 EL                1           1           5           1           5
#> 10 ES                1           1           1           1           5
#> # … with 18 more rows, and 11 more variables: 2007/2008 <int>,
#> #   2008/2009 <int>, 2009/2010 <int>, 2010/2011 <int>, 2011/2012 <int>,
#> #   2012/2013 <int>, 2013/2014 <int>, 2014/2015 <int>, 2015/2016 <int>,
#> #   2016/2017 <int>, 2017/2018 <int>
mypattemp
#> # A tibble: 28 x 17
#>    Country 2002/2003 2003/2004 2004/2005 2005/2006 2006/2007
#>    <chr>   <chr>       <chr>       <chr>       <chr>       <chr>
#>  1 AT      Outperform… Inversion   Outperform… Outperform… Outperform…
#>  2 BE      Diving      Catching up Catching up Other (Ins… Catching up
#>  3 BG      Catching up Catching up Catching up Catching up Catching up
#>  4 CY      Outperform… Outperform… Inversion   Outperform… Flattening
#>  5 CZ      Inversion   Inversion   Outperform… Flattening  Flattening
#>  6 DE      Inversion   Crossing    Crossing r… Outperform… Outperform…
#>  7 DK      Inversion   Outperform… Inversion   Outperform… Inversion
#>  8 EE      Outperform… Outperform… Outperform… Outperform… Flattening
#>  9 EL      Catching up Catching up Slower pace Catching up Slower pace
#> 10 ES      Catching up Catching up Catching up Catching up Slower pace
#> # … with 18 more rows, and 11 more variables: 2007/2008 <chr>,
#> #   2008/2009 <chr>, 2009/2010 <chr>, 2010/2011 <chr>, 2011/2012 <chr>,
#> #   2012/2013 <chr>, 2013/2014 <chr>, 2014/2015 <chr>, 2015/2016 <chr>,
#> #   2016/2017 <chr>, 2017/2018 <chr>
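Once the string labels are available, a quick way to summarise a country is to count how often each pattern occurs. The sketch below uses only the five periods shown for Greece (“EL”) in the output above, typed in by hand; the object names el_patterns and el_counts are hypothetical.

```r
# String labels for Greece ("EL") in the first five periods,
# copied by hand from the printed output above.
el_patterns <- c("2002/2003" = "Catching up", "2003/2004" = "Catching up",
                 "2004/2005" = "Slower pace", "2005/2006" = "Catching up",
                 "2006/2007" = "Slower pace")

# Frequency of each pattern over these periods.
el_counts <- table(el_patterns)
print(el_counts)
```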

Let’s look in more detail at one of the obtained patterns. To this end, we consider the time period 2006-2007 and the country France (“FR”), for which the obtained pattern is “Slower pace”:

mypattemp[["2006/2007"]][12]
#> [1] "Slower pace"

In the graphical plot of the calculated pattern, the dashed blue line refers to France and the black solid line refers to the EU. The interpretation of the “Slower pace” pattern is straightforward, as illustrated in the Eurofound report “Upward convergence in the EU: Concepts, measurements and indicators” (Monitoring convergence in the European Union series, 2018, p. 25).
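Positional indexing such as [12] depends on the row order of the table. Selecting by country code is a more robust alternative; with the full table one could presumably write mypattemp[mypattemp$Country == "FR", "2006/2007"], keeping in mind that on a tibble this returns a one-cell tibble rather than a character vector. The sketch below shows the idea on a small hand-typed excerpt of the output above; patt_excerpt and es_pattern are hypothetical names.

```r
# Excerpt of the string-label table (three countries, two periods),
# copied by hand from the printed output above.
patt_excerpt <- data.frame(
  Country = c("EL", "ES", "BG"),
  "2005/2006" = c("Catching up", "Catching up", "Catching up"),
  "2006/2007" = c("Slower pace", "Slower pace", "Catching up"),
  check.names = FALSE,       # keep column names such as "2006/2007" unchanged
  stringsAsFactors = FALSE
)

# Select the row by country code instead of by row position.
es_pattern <- patt_excerpt[patt_excerpt$Country == "ES", "2006/2007"]
print(es_pattern)
```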

A second example relates to an indicator of type “low is better”. To this end, let’s consider the indicator “Unemployment rate by sex, age and educational attainment - annual averages”, for which the data are stored in the une_educ_a.xls file (Subsection 4.2, Tutorial for analyzing convergence with the convergEU package). First, we import the data from the xls file as explained in detail in the Tutorial (Subsection 4.2):

library(readxl)  # provides read_excel()
file_name <- system.file("vign/une_educ_a.xls", package = "convergEU")
myxls2 <- read_excel(file_name,
                     sheet = "Data", range = "A12:AP22", na = ":")
# Backticks are required because the column name contains "/"
myxls2 <- dplyr::mutate(myxls2, `TIME/GEO` = as.numeric(`TIME/GEO`))

where file_name holds the path (possibly including disk unit or folders) to the une_educ_a.xls file. Then, the cluster EU27_2020 of MS is chosen, the data are checked for unsuited features, and missing values imputation is performed as follows:

EU27estr <- convergEU_glb()$EU27_2020$memberStates$codeMS
myxls <- dplyr::select(myxls2, `TIME/GEO`, all_of(EU27estr))
check_data(myxls)
#> $res
#> NULL
#>
#> $msg
#> NULL
#>
#> $err
#> [1] "Error: one or more missing values in the dataframe."
myxls3 <- dplyr::rename(myxls, time = `TIME/GEO`)
myxlsf <- impute_dataset(myxls3, timeName = "time",
                         countries = convergEU_glb()$EU27_2020$memberStates$codeMS,
                         headMiss = c("cut", "constant")[2],
                         tailMiss = c("cut", "constant")[2])$res
check_data(myxlsf)
#> $res
#> [1] TRUE
#>
#> $msg
#> NULL
#>
#> $err
#> NULL

in order to obtain the final dataset myxlsf for calculating patterns. The indicator une_educ_a is of type “low is better”; thus, the syntax to find the patterns is as follows:

myres <- ms_pattern_ori(myxlsf, "time",type="lowBest")

where the “$res” component of the output contains the numerical labels for the patterns as well as their string labels:

mypattl<-myres$res$mat_label_tags
mypattn<-myres$res$mat_without_summaries
mypattn
#> # A tibble: 27 x 10
#>    Country 2009/2010 2010/2011 2011/2012 2012/2013 2013/2014
#>    <chr>         <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#>  1 BE                7           8           7          10           3
#>  2 DK               10           7           7           8           2
#>  3 FR               20          21          21          21          21
#>  4 DE                8           8           8           8           2
#>  5 EL               10          10          19           9           1
#>  6 IE                9           9           9          11           1
#>  7 IT                7           7          10          10           3
#>  8 LU                8          10           7          10          21
#>  9 NL                7           8          10          10           3
#> 10 PT                7          10          10          10           4
#> # … with 17 more rows, and 4 more variables: 2014/2015 <dbl>,
#> #   2015/2016 <dbl>, 2016/2017 <dbl>, 2017/2018 <dbl>
mypattl
#> # A tibble: 27 x 10
#>    Country 2009/2010 2010/2011 2011/2012 2012/2013 2013/2014
#>    <chr>   <chr>       <chr>       <chr>       <chr>       <chr>
#>  1 BE      Defending … Escaping    Defending … Underperfo… Inversion
#>  2 DK      Underperfo… Defending … Defending … Escaping    Flattening
#>  3 FR      Crossing r… Other (Ins… Other (Ins… Other (Ins… Other (Ins…
#>  4 DE      Escaping    Escaping    Escaping    Escaping    Flattening
#>  5 EL      Underperfo… Underperfo… Crossing    Falling aw… Catching up
#>  6 IE      Falling aw… Falling aw… Falling aw… Recovering  Catching up
#>  7 IT      Defending … Defending … Underperfo… Underperfo… Inversion
#>  8 LU      Escaping    Underperfo… Defending … Underperfo… Other (Ins…
#>  9 NL      Defending … Escaping    Underperfo… Underperfo… Inversion
#> 10 PT      Defending … Underperfo… Underperfo… Underperfo… Outperform…
#> # … with 17 more rows, and 4 more variables: 2014/2015 <chr>,
#> #   2015/2016 <chr>, 2016/2017 <chr>, 2017/2018 <chr>

For this indicator, let’s take the time period 2015-2016 and the MS Finland (“FI”) for which the obtained pattern is again “Slower pace”:

mypattl$`2015/2016`[14]
#> [1] "Slower pace"

In this case, given that the indicator is of type “low is better”, the plot for the “Slower pace” pattern uses the dashed blue line for Finland and the black solid line for the EU. Recall that, differently from the previous indicator of type “high is better”, for this type of indicator the results for the “Slower pace” pattern should be interpreted according to the assumption that “the higher the indicator value, the worse the considered socio-economic feature in a member country”.

To further illustrate other possible patterns, consider again the first example related to the emp_20_64_MS dataset (indicator of type “high is better”). For example, let’s take the time period 2011-2012 and the MS Portugal (“PT”), for which the pattern is “Crossing”:

mypattemp$`2011/2012`[23]
#> [1] "Crossing"

where the corresponding plot for this pattern is:

Similarly, for the member country Lithuania (“LT”) in the same time period, the obtained pattern is now “Crossing reversed” and the plot for this pattern is:

# 2 Types of convergence/ divergence within the convergEU package

Convergence and divergence may be strict or weak, upward or downward. In the convergEU package, the function upDo_CoDi is specifically implemented for assessing the type of convergence/divergence occurring for a given indicator, a collection of member states and a period of time:

help(upDo_CoDi)

The interpretation depends on the type of indicator, that is “highBest” or “lowBest”. Let’s consider a first example for the emp_20_64_MS dataset, in which the indicator “Employment rate” is of type highBest. Suppose that we wish to determine the type of convergence/divergence by taking the year 2008 as the reference time (time_0), the year 2010 as the target time (time_t), and the variance for summarizing dispersion (argument heter_fun):

Empconv<-upDo_CoDi(emp_20_64_MS,
timeName = "time",
indiType = "highBest",
time_0 = 2008,
time_t = 2010,
heter_fun = "var")

The output of the upDo_CoDi function consists of the usual three list components: “$res” that contains the results, “$msg” that possibly carries messages for the user and “$err” which is a string containing an error message, if an error occurs:

names(Empconv)
#> [1] "res" "msg" "err"
Empconv$msg
#> NULL
Empconv$err
#> NULL

Looking in more detail at the “$res” component, it contains for example:

• a statement of whether convergence or divergence has occurred:
Empconv$res$declaration_type
#> [1] "Convergence"
• a statement of the type of convergence/divergence, i.e. strict or weak, downward or upward:
Empconv$res$declaration_strict
#> [1] "none"
Empconv$res$declaration_weak
#> [1] "Weak downward"
• a list of the member state labels for which the differences for a given indicator between the target time and the reference time are greater than zero:
Empconv$res$declaration_split$names_incre
#> [1] "AT" "DE" "LU" "MT" "RO"
• a list of the member state labels for which the differences for a given indicator between the target time and the reference time are smaller than zero:
Empconv$res$declaration_split$names_decre
#>  [1] "BE" "BG" "CY" "CZ" "DK" "EE" "EL" "ES" "FI" "FR" "HR" "HU" "IE" "IT" "LT"
#> [16] "LV" "NL" "PL" "PT" "SE" "SI" "SK" "UK"
• a list of the differences for a given indicator between the target time and the reference time for each member state:
Empconv$res$diffe_MS
#>    AT   BE BG   CY CZ DE   DK    EE   EL   ES   FI   FR   HR   HU IE   IT   LT
#> 1 0.1 -0.4 -6 -1.5 -2  1 -3.8 -10.3 -2.5 -5.7 -2.8 -1.2 -2.8 -1.6 -8 -1.9 -7.7
#>    LU    LV  MT   NL   PL   PT  RO   SE   SI   SK   UK
#> 1 1.9 -11.1 0.9 -0.7 -0.7 -2.8 0.4 -2.3 -2.7 -4.2 -1.7
• the average of such differences:
Empconv$res$diffe_averages
#> [1] -2.860714
• the dispersion for the reference time and the target time respectively, and computed on the basis of the type of dispersion specified in the argument heter_fun (e.g. the variance in this example):
Empconv$res$dispersions
#> Time: 2008 Time: 2010
#>   29.76417   28.44423
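To make the link between these components more concrete, the sketch below recomputes the declarations from the values printed above. The decision rules used here (convergence when dispersion decreases, direction given by the sign of the average change for a highBest indicator, “strict” only when all member states move in the same direction) are assumptions in the spirit of the Eurofound (2018) definitions, not the package’s actual source code; ties are not handled. All object names are hypothetical.

```r
# Differences between 2010 and 2008 per member state, copied from diffe_MS above.
diffe <- c(AT = 0.1, BE = -0.4, BG = -6, CY = -1.5, CZ = -2, DE = 1, DK = -3.8,
           EE = -10.3, EL = -2.5, ES = -5.7, FI = -2.8, FR = -1.2, HR = -2.8,
           HU = -1.6, IE = -8, IT = -1.9, LT = -7.7, LU = 1.9, LV = -11.1,
           MT = 0.9, NL = -0.7, PL = -0.7, PT = -2.8, RO = 0.4, SE = -2.3,
           SI = -2.7, SK = -4.2, UK = -1.7)
# Dispersions (variance) at the reference and target time, copied from above.
disp <- c(t0 = 29.76417, tt = 28.44423)

# Assumed rule: convergence if dispersion decreased, divergence otherwise.
type <- if (disp[["tt"]] < disp[["t0"]]) "Convergence" else "Divergence"
# Assumed rule for a highBest indicator: negative average change = downward.
direction <- if (mean(diffe) > 0) "upward" else "downward"
# Assumed rule: "strict" only when all member states move in the same direction.
strict <- all(diffe > 0) || all(diffe < 0)

# Member states with increasing and decreasing values.
names_incre <- names(diffe)[diffe > 0]
names_decre <- names(diffe)[diffe < 0]
print(c(type, direction, if (strict) "strict" else "weak"))
```

Under these assumed rules the sketch reproduces the declarations shown above (“Convergence”, weak, downward), since dispersion decreases, the average change is negative, and five member states move against the general direction.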

Note that if the argument heter_fun is set to var (as in this example) or sd (i.e. the standard deviation), then those statistics are calculated using $n-1$ as the denominator, i.e. the number of observations decreased by 1. Users who prefer to adopt $n$ as the denominator may use the function pop_var as follows:

Empconvpop<-upDo_CoDi(emp_20_64_MS,
timeName = "time",
indiType = "highBest",
time_0 = 2008,
time_t = 2010,
heter_fun = "pop_var")
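The difference between the two denominators can be checked on a small example in base R. The values are made up, and pop_var_sketch is a hypothetical helper written for this illustration; it is not the package’s pop_var function, whose implementation is not shown here.

```r
# Made-up indicator values for five member states at one time point.
x <- c(70.1, 65.4, 72.3, 68.0, 74.2)
n <- length(x)

# Sample variance: denominator n - 1 (this is what stats::var computes).
sample_var <- var(x)

# Population variance: denominator n. 'pop_var_sketch' is a hypothetical
# helper for this illustration, not the package's pop_var function.
pop_var_sketch <- function(v) mean((v - mean(v))^2)
population_var <- pop_var_sketch(x)

# The two quantities differ by the factor (n - 1) / n.
print(c(sample = sample_var, population = population_var))
```

With 27 or 28 member states the factor (n - 1) / n is close to 1, so the choice matters little numerically, but it should be applied consistently at both time points.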

User-developed functions are also allowed in the argument heter_fun, as illustrated in the following example related to an indicator of type lowBest. To this end, we consider the dataset myxlsf illustrated in the previous Section and related to the indicator “Unemployment rate by sex, age and educational attainment”. We choose the year 2009 as the reference time and the year 2011 as the target time. Moreover, in this case we consider the following user-developed function for summarizing dispersion:

diffQQmu <-  function(vettore){
(quantile(vettore,0.75)-quantile(vettore,0.25))/mean(vettore)
}
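Before passing a user-developed function to upDo_CoDi, it can be useful to try it on a small vector. The sketch below repeats the definition given above and applies it to made-up values; toy and toy_disp are hypothetical names.

```r
# Same definition as above: interquartile range divided by the mean.
diffQQmu <- function(vettore) {
  (quantile(vettore, 0.75) - quantile(vettore, 0.25)) / mean(vettore)
}

# Made-up values: with the default quantile type the quartiles are 2 and 4,
# and the mean is 3, so the result is 2/3.
toy <- c(1, 2, 3, 4, 5)
toy_disp <- unname(diffQQmu(toy))  # unname() drops the "75%" label
print(toy_disp)
```

Note that the function returns a single number for a numeric vector, which is the form of dispersion summary shown for var in the example above.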

This user-developed function diffQQmu is specified in the argument heter_fun of the function upDo_CoDi:

unempconvvar<-upDo_CoDi(myxlsf,
timeName = "time",
indiType = "lowBest",
time_0 = 2009,
time_t = 2011,
heter_fun = "diffQQmu")
unempconvvar
#> $res
#> $res$declaration_type
#> [1] "Divergence"
#>
#> $res$declaration_strict
#> [1] "none"
#>
#> $res$declaration_weak
#> [1] "Weak downward"
#>
#> $res$declaration_split
#> $res$declaration_split$names_incre
#>  [1] "BE" "DK" "FR" "EL" "IE" "IT" "LU" "NL" "PT" "ES" "FI" "SE" "CY" "CZ" "HU"
#> [16] "LT" "MT" "PL" "SK" "SI" "BG" "HR"
#>
#> $res$declaration_split$names_decre
#> [1] "DE" "AT" "EE" "LV" "RO"
#>
#>
#> $res$diffe_MS
#>    BE  DK FR   DE  EL  IE  IT  LU NL  PT  ES   AT  FI  SE  CY  CZ   EE  HU   LV
#> 1 0.3 2.2  0 -2.7 8.6 5.9 1.1 0.2  1 3.6 4.4 -1.6 1.3 0.7 1.4 0.2 -2.4 1.8 -1.9
#>    LT MT  PL SK  SI   BG   RO  HR
#> 1 8.7  0 3.6  1 4.6 10.9 -0.3 7.5
#>
#> $res$diffe_averages
#> [1] 2.225926
#>
#> $res$dispersions
#> Time: 2009 Time: 2011
#>  0.7080638  0.7504565
#>
#>
#> $msg
#> NULL
#>
#> $err
#> NULL

According to the obtained results, for this indicator there is evidence of divergence of type “weak downward” in the period 2009-2011, as reported in the declaration_type and declaration_weak components of the output.

References

The following references may be consulted to find further details on convergence:

• Eurofound (2018), Upward convergence in the EU: Concepts, measurements and indicators, Publications Office of the European Union, Luxembourg; by: Massimiliano Mascherini, Martina Bisello, Hans Dubois and Franz Eiffe.

• Nedka D. Nikiforova, Federico M. Stefanini, Chiara Litardi, Eleonora Peruffo and Massimiliano Mascherini (2020) Tutorial: analysis of convergence with the convergEU package. Package vignette URL https://local.disia.unifi.it/stefanini/RESEARCH/coneu/tutorial-conv.html