GGIR is an R package primarily designed to process multi-day raw accelerometer data for physical activity, sleep, and circadian rhythm research. The term raw refers to data being expressed in m/s² or in units of gravitational acceleration (g), as opposed to the previous generation of accelerometers, which stored data in brand-specific units. Despite the focus on raw data, GGIR also offers functionality to process previous-generation accelerometer data. The signal processing for raw data includes automatic calibration to gravity, detection of abnormally high values, imputation of raw-level time gaps (only for specific sensor brands), and calculation of orientation angle and average magnitude of acceleration based on a variety of metrics. Next, the signal processing for both raw and previous-generation data continues with the detection of non-wear and epoch-level imputation. Finally, GGIR uses all this information to describe the data at multiple time resolutions, in terms of data quality and summary metrics that can be interpreted as estimates of physical activity, inactivity, sleep, and circadian rhythm. The time resolutions of GGIR output are:
maxRecordingInterval
and documentation on how to use
parameters as explained further down.

An elaborate reflection on GGIR’s first 10 years of existence can be found in this blog post. In short, GGIR evolved from a series of R scripts used in research around 2010-2012 to a first release in 2013. A key factor in the growth of GGIR has been its adoption by the research community and the willingness of a variety of researchers to invest in GGIR, either with their time or financially. GGIR would not be what it is without all their efforts.
The research field is highly heterogeneous in its choice of sensor brand, data format, and study protocol, and in the research questions it tries to answer. At the same time, many within the field lack the time or skills to write their own custom data processing software. GGIR aims to be flexible enough to handle all these different scenarios while remaining accessible to those who lack the time or skills to write software. Further, we hope GGIR is of use to those without the financial resources for commercial software, although we would like to stress that we are not a charity and depend on paid or unpaid voluntary efforts from contributors.
The philosophy behind the algorithms as implemented in GGIR is that biologically explainable (heuristic) approaches to data analysis are preferable to purely data-driven approaches, unless there is no other way. The idea is that, to advance knowledge in this field of research, it is essential to understand the causal relation between the phenomena being observed (e.g. body movement), the way an (acceleration) sensor works, what we do with the data produced, and how we interpret the data. In contrast, data-driven methods look by design for optimal correlation with data and are not, or much less, concerned with such causal links. Just as correlation does not necessarily imply causation in health research, relying on correlation alone can also confound the process of measurement. For example, we may be able to capture differences in acceleration that correlate with different activity types or different levels of energy expenditure, but that does not mean that we actually measure the activity type or the energy expenditure level. Ignoring such aspects can easily lead to overestimating the value of accelerometers for measuring those constructs, and to underestimating the value of an accelerometer for capturing acceleration as a useful measure of behaviour, if appropriately used and interpreted.
A second problem with data-driven methods is that they heavily depend on the availability of reliable criterion methods. It is argued that such reliable criterion methods do not exist:
As a result, it is essential to put strong emphasis on algorithms that have descriptive value on their own regardless of whether they offer a high correlation with supposed criterion methods.
It may sound obvious to some that research software should be open-source, but in the fields of physical activity and sleep research this is far from the accepted approach. GGIR is one of the very few research tools in this field that has a permissive license aimed at maximising its potential for re-use and collaboration.
We have structured the chapters in line with the GGIR training course we have been organising in recent years. The documentation that existed before was a collection of paragraphs written ad hoc that lacked a clear overarching structure and narrative. As a result, it was difficult to use the documentation in the training course. Further, we also wanted to provide a good level of documentation for those who do not follow the course or who want to refresh their understanding of GGIR.
The documentation is mainly written in a narrative style where we have tried to explain both the theory and practice of all GGIR functionalities. Everything you need to type in your R script is highlighted `like this`.
This documentation is not intended as an academic review: we only cite publications to clarify the origin of algorithms, and we only discuss what is part of GGIR.
Finally, the first version of this documentation was sponsored by Accelting with the commitment that it will remain available as free open-access documentation. However, documentation like this is much easier to maintain as a community: we would be grateful for your help to improve it, whether through feedback, pull requests (for those who know how to make them), or financial support.