OECD PIAAC Study 2013 data available for analysis

The recent OECD PIAAC study covers the skills of adults in 24 countries. You can find the study results together with background information here: http://www.oecd.org/site/piaac/ . The study is very interesting as it is the first to target adult skills in these countries. Fortunately, the OECD has made the case data of the study available for download in the ‘public use files’ section. Here is the direct link to the files: http://vs-web-fs-1.oecd.org/piaac/puf-data

Data is available in SPSS and SAS format. I went for the SPSS format, downloaded all files (all countries except Australia) and processed the files using R. Here is the script I created for this purpose:

    # OECD PIAAC Study 2013
    # Interesting links:
    #   http://www.oecd.org/site/piaac/
    #   http://www.oecd.org/site/piaac/publicdataandanalysis.htm
    # SPSS files:
    #   http://vs-web-fs-1.oecd.org/piaac/puf-data
    library(foreign)
    library(stringr)
    library(reshape2)

    setwd("C:\\temp\\OECD_PIAAC_Study_2013_SPSS")
    file_list <- list.files(path=".")

    # read each country file exactly once and append it to piaac
    for (file in file_list){
      temp_dataset <- read.spss(file, to.data.frame=TRUE)
      if (!exists("piaac")){
        piaac <- temp_dataset
      } else {
        piaac <- rbind(piaac, temp_dataset)
      }
      rm(temp_dataset)
    }

    # only include this set of variables:
    collist_meta <- c('CNTRYID_E','GENDER_R','AGEG5LFS','B_Q01a','YEARLYINCPR')

    # also include all score columns (10 per domain)
    collist_lit <- names(piaac)[str_sub(names(piaac), start=1, end=5)=="PVLIT"]
    collist_num <- names(piaac)[str_sub(names(piaac), start=1, end=5)=="PVNUM"]
    collist_psl <- names(piaac)[str_sub(names(piaac), start=1, end=5)=="PVPSL"]
    collist <- c(collist_meta, collist_lit, collist_num, collist_psl)
    piaac2 <- piaac[collist]

    # combine the score values of each domain into a single value (row mean)
    p <- data.frame(piaac2[collist_meta],
                    "Score_Lit"=rowMeans(piaac2[collist_lit]),
                    "Score_Num"=rowMeans(piaac2[collist_num]),
                    "Score_Psl"=rowMeans(piaac2[collist_psl]))

    # reshape to long format: one row per respondent and score type
    p <- melt(p, id=collist_meta)

    # remove incomplete cases
    p <- p[!is.na(p$value) & !is.na(p$CNTRYID_E),]

    # write output to csv-file
    write.table(p, col.names=TRUE, row.names=FALSE, file="c:\\temp\\piaac.csv", sep=",", dec=".", na="", qmethod="double")
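
The file loop in the script can also be written without the exists() check. The following is only a sketch of that idea, not part of the original script: it uses two tiny CSV files in a temporary folder as a stand-in for the SPSS files, and the helper name combine_files and all values are made up for illustration.

```r
# Stand-in data: two tiny CSV files in a temporary folder (made-up values).
# The real script reads SPSS files with foreign::read.spss instead.
data_dir <- file.path(tempdir(), "piaac_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(CNTRYID_E = "Germany", PVLIT1 = c(270, 281)),
          file.path(data_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(CNTRYID_E = "Austria", PVLIT1 = c(265, 290, 275)),
          file.path(data_dir, "b.csv"), row.names = FALSE)

# Hypothetical helper: read every file once and stack the results in one step.
combine_files <- function(files, reader) {
  do.call(rbind, lapply(files, reader))
}

files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- combine_files(files, read.csv)
print(nrow(combined))
```

For the real data, the reader argument would presumably be something like function(f) read.spss(f, to.data.frame=TRUE). As with the loop, rbind only works if all files share the same columns.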

Of course, we could include additional variables, but for the moment, let’s focus on the variables from the script above. A full list of all variables with details is available via the “International Codebook” link in the “public use files” section.
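
As a rough sketch of how additional variables could be added, the snippet below extends the list of metadata columns and checks that each requested name really exists before subsetting. It runs on a toy data frame; EXTRA_VAR and NOT_IN_DATA are hypothetical names chosen for illustration and are not taken from the codebook.

```r
# Toy stand-in for the combined piaac data frame (made-up values).
# EXTRA_VAR is a hypothetical column name, not a real PIAAC variable.
piaac <- data.frame(
  CNTRYID_E = c("Germany", "Austria"),
  GENDER_R  = c("Male", "Female"),
  EXTRA_VAR = c(1, 2),
  PVLIT1    = c(270, 265),
  PVLIT2    = c(274, 261),
  OTHER     = c("x", "y")
)

# Requested metadata columns; NOT_IN_DATA simulates a misspelled name.
collist_meta <- c("CNTRYID_E", "GENDER_R", "EXTRA_VAR", "NOT_IN_DATA")

# Report requested names that the data does not contain, then drop them.
missing_cols <- setdiff(collist_meta, names(piaac))
if (length(missing_cols) > 0) {
  message("Not found in data: ", paste(missing_cols, collapse = ", "))
}
collist_meta <- intersect(collist_meta, names(piaac))

# Score columns are picked by their name prefix.
collist_lit <- names(piaac)[startsWith(names(piaac), "PVLIT")]

piaac2 <- piaac[c(collist_meta, collist_lit)]
print(names(piaac2))
```

Checking the names first avoids the "undefined columns selected" error that a typo in a variable name would otherwise cause.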

The values can then be imported into Tableau Desktop for further analysis.
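
Before importing, it may help to see the shape of the file the script writes: one row per respondent and score type. The sketch below reproduces the melt step on a toy data frame. All values and the age labels are made up for illustration.

```r
library(reshape2)

# Toy data: two respondents with averaged scores (made-up values and labels).
p_wide <- data.frame(
  CNTRYID_E = c("Germany", "Austria"),
  AGEG5LFS  = c("25-29", "50-54"),
  Score_Lit = c(280, 262),
  Score_Num = c(275, NA)
)
id_cols <- c("CNTRYID_E", "AGEG5LFS")

# melt() stacks the score columns into the columns 'variable' and 'value'.
p_long <- melt(p_wide, id.vars = id_cols)

# Drop rows without a score, as the script does.
p_long <- p_long[!is.na(p_long$value), ]
print(p_long)
```

With this layout, the column variable (Score_Lit, Score_Num, Score_Psl) can be used like any other dimension to filter or split a chart.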


The official charts look more dramatic because the value axis does not start at zero. This is a common way of making results look more impressive or, as Prof. Hichert says, it adds a certain “lie-factor” to the visualization, where it’s up to the creator of the chart to scale this factor to whatever effect is desired.
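
To put a number on this effect, here is a small sketch. It assumes Tufte's definition of the lie factor (size of the effect shown in the graphic divided by the size of the effect in the data), which may not be exactly what Prof. Hichert has in mind. The function name lie_factor and the two scores are made up for illustration.

```r
# Lie factor for two bars on a truncated value axis,
# assuming Tufte's definition: effect in graphic / effect in data.
lie_factor <- function(a, b, axis_start = 0) {
  # Relative difference between the two data values.
  effect_data <- abs(b - a) / a
  # Relative difference between the two drawn bar lengths;
  # a bar is drawn from axis_start up to its value.
  effect_graphic <- abs(b - a) / (a - axis_start)
  effect_graphic / effect_data
}

# Made-up scores for two groups, for illustration only.
a <- 250
b <- 275

print(lie_factor(a, b, axis_start = 0))    # 1
print(lie_factor(a, b, axis_start = 200))  # 5
```

In this made-up case, a 10% difference in the data is drawn as a 50% difference in bar length once the axis starts at 200.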

One result that somewhat surprised me was that skills decrease with age, as shown below for the reading skills (Lit) in Germany:
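
The same kind of breakdown can be checked directly in R before building the chart. This is a minimal sketch on a toy version of the long-format data frame p. The scores and age labels are made up, and the mean is a plain unweighted average.

```r
# Toy stand-in for the long-format data frame p (made-up values and labels).
p <- data.frame(
  CNTRYID_E = "Germany",
  AGEG5LFS  = c("16-19", "16-19", "60-65", "60-65", "60-65"),
  variable  = c("Score_Lit", "Score_Lit", "Score_Lit", "Score_Lit", "Score_Num"),
  value     = c(280, 290, 250, 260, 255)
)

# Keep the literacy scores for one country, then average per age group.
lit_de <- p[p$variable == "Score_Lit" & p$CNTRYID_E == "Germany", ]
mean_by_age <- aggregate(value ~ AGEG5LFS, data = lit_de, FUN = mean)
print(mean_by_age)
```

Swapping AGEG5LFS for another column from collist_meta gives the corresponding breakdown for that variable.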


Here is the map visualization for the European region (from red through grey to green, again for the reading skills). Tableau automatically maps country names to geographic regions.


Another interesting chart shows the relationship between income (shown as quantiles here) and test score (again for the reading skills, for some selected countries):


And finally, the comparison between highest level of education (according to the International Standard Classification of Education) and score (again shown for reading skills) for Germany:


I’m not going into interpretations here. There is a lot of material available on the OECD website. The study is also a great source of demographic data. So if you would like to discover more, maybe the R script above can help you get started.
