GitHub - FloorBuschenhenke/NonLinearityMethod: This code accompanies a methodological paper proposing a new non-linearity analysis for keystroke logged writing process data.

Introduction

This repository presents the code used to create a non-linearity analysis on the basis of keystroke data from the logger Inputlog, and accompanies the paper Measuring non-linearity of long-term writing processes (written by Floor Buschenhenke, Rianne Conijn, Luuk Van Waes, submitted to Reading & Writing special issue on methodology). For context, the abstract of our paper follows below.

Abstract

Non-linearity in writing provides important insight into the dynamics of writing and writing disfluencies. Currently, a range of linearity measures are available. These metrics are calculated based upon the leading edge, and are mostly used for short texts and single writing sessions. However, for longer, multi-session writing processes, the concept of the leading edge, as the singular outer boundary of the text-in-progress, is not enough to distinguish between linear production and non-linear text alterations. Therefore, in the current study, we propose a novel automatized non-linearity analysis. Within this approach, all backwards and forwards cursor and mouse operations from the point of utterance are extracted from keystroke data, and characterized both based on duration and distance. We illustrate this approach by analyzing the writing process of a complete novel based on more than 400 writing sessions totaling 276 hours of writing. The results show that this approach allows us to successfully cluster these writing sessions using the non-linearity characteristics.

Overview of the steps we have taken

We are using Inputlog-General Analysis files as our source material. You could apply the steps to the output of other keystroke loggers as well. The General Analysis is a table in which each keystroke action has its own row. It is quite similar to the raw output from other academic keystroke loggers.

Steps:

load all General Analysis files (one for each writing session) into R and merge them into one table, preserving the original session-numbering. (not including in this notebook)
aggregate/annotate the keystroke events in this table into larger segments of typing, deleting, jumping/navigating and focus (activities outside of the work document).
Calculate characteristics for each jump event and each typing event
Create a summary table with descriptive statistics for each session
Running a correlation matrix, removing highly correlation variables, then performing a cluster analysis to explore similarities between sessions. This step is not included in this notebook.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
data		data
.gitignore		.gitignore
LICENSE		LICENSE
NonlinearityCodeshare.Rproj		NonlinearityCodeshare.Rproj
README.md		README.md
demo_data_nonlinearity.csv		demo_data_nonlinearity.csv
load_data.R		load_data.R
nonlinearitycodeshare.Rmd		nonlinearitycodeshare.Rmd
nonlinearitycodeshare.pdf		nonlinearitycodeshare.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

.gitignore

.gitignore

LICENSE

LICENSE

NonlinearityCodeshare.Rproj

NonlinearityCodeshare.Rproj

README.md

README.md

demo_data_nonlinearity.csv

demo_data_nonlinearity.csv

load_data.R

load_data.R

nonlinearitycodeshare.Rmd

nonlinearitycodeshare.Rmd

nonlinearitycodeshare.pdf

nonlinearitycodeshare.pdf

Repository files navigation

Introduction

Abstract

Overview of the steps we have taken

About

Releases 1

Packages

Languages

License

FloorBuschenhenke/NonLinearityMethod

Folders and files

Latest commit

History

Repository files navigation

Introduction

Abstract

Overview of the steps we have taken

About

Resources

License

Stars

Watchers

Forks

Languages