Image schemas were introduced by Johnson and Lakoff in 1987 and have received attention from various disciplines of the cognitive sciences, e.g., cognitive linguistics, developmental psychology, and neuroscience. They describe cognitive building blocks, often called spatio-temporal relations, which are learned during infancy through physical interactions with the environment. These building blocks not only help us generalize to new and unseen situations but are also hypothesized to shape our abstract thinking and reasoning, as well as the language through which we express it. Automatically extracting image schemas from natural language is still an unsolved problem. Gromann and Hedblom propose a semi-automated method that identifies verb-preposition occurrences, which serve as indicators of spatio-temporal structures in language. The extracted verb-preposition pairs are then grouped by a cluster analysis based on their co-occurring nouns. In this way, spatial and non-spatial structures are separated, while text based on the same image schema is clustered together; e.g., “continue along road” should be clustered with other instances of the image schema source-path-goal. The goal of the present study is to contribute to the enhancement of this method.
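The idea of grouping verb-preposition pairs by their co-occurring nouns can be illustrated with a minimal sketch. It is not the implementation of Gromann and Hedblom: the text above does not specify the clustering algorithm, so the sketch assumes cosine similarity over noun counts and a simple threshold-based single-link grouping. The triples in `TRIPLES`, the function names, and the threshold of 0.5 are hypothetical and chosen only for illustration; the sketch also assumes the (verb, preposition, noun) triples have already been extracted from text.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy triples (verb, preposition, noun); not data from the study.
TRIPLES = [
    ("continue", "along", "road"),
    ("continue", "along", "path"),
    ("walk", "along", "road"),
    ("walk", "along", "river"),
    ("walk", "along", "path"),
    ("depend", "on", "factor"),
    ("depend", "on", "result"),
    ("rely", "on", "result"),
    ("rely", "on", "factor"),
]


def noun_profiles(triples):
    """Map each (verb, preposition) pair to a count of its co-occurring nouns."""
    profiles = defaultdict(Counter)
    for verb, prep, noun in triples:
        profiles[(verb, prep)][noun] += 1
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two noun-count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(profiles, threshold=0.5):
    """Single-link grouping (an assumed stand-in for the cluster analysis):
    a pair joins every existing group that contains a sufficiently similar
    member, and all such groups are merged into one."""
    clusters = []
    for pair in sorted(profiles):
        matching = [
            c for c in clusters
            if any(cosine(profiles[pair], profiles[m]) >= threshold for m in c)
        ]
        merged = {pair}
        for c in matching:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters


for group in cluster(noun_profiles(TRIPLES)):
    print(sorted(group))
```

On this toy input, the two "along" pairs share path-like nouns and end up in one group, while the two "on" pairs, which co-occur with abstract nouns, form another. With real corpus data, the choice of similarity measure, threshold, and clustering algorithm would need to be evaluated rather than assumed.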