Motus Project


The Motus (Latin: movement) Project was a research project developed by J.E. Sandoval in 2012, aiming "to catalog the entirety of all human motions and mannerisms" and to "understand what relationship exists between a person's psychology and their facial expressions."[1] The project was open source, followed an atheoretical methodology, and documented thousands of body movements as looping GIFs in an online data bank. It was discontinued in 2013 due to funding and labor costs.


Screenshot demonstrating two bits matching

A summary of the project was introduced in a pilot video on August 18, 2012.[2] The following text is an import of the transcript of the video, provided with the author's consent. Headings are added for organization.


Hi my name is Auburn and I'm a developer at In this video I'll be giving you a short preview of what our project is about. We're not quite ready yet for a full presentation so this will just be a brief overview. In this video I'll be covering three main points. Number one, the nature of our project. Number two, exactly how it works and can accomplish its aim, and lastly I'll be giving our personal address to Pod'lair's challenge.
So first I'd like to mention that is a site where many projects will emerge in the future, but all of them have the same goal in mind: understanding what exactly defines the diversity we see in humanity, whether or not that diversity has some innate configuration that's more than just arbitrary environmental shaping, and then developing ever more empirical ways of testing for that from different angles. One such project, which is the focus of this video, is called Motus. Motus is a project that focuses on the visual dimension, and the project's aim is to catalog the entirety of all human motions and mannerisms and to understand what relationship exists between a person's psychology and their facial expressions. Now, the concept is simple enough, but to do this in a way that can yield empirical results a rigorous process has to be followed, one that can weed out as much ambiguity as possible and give us data that can be interpreted the same way regardless of who's looking at it.


And that leads us to our second point, which is the methodology. First I'm going to explain the process from start to finish, and then I'll explain some key points afterwards.
First a volunteer is selected, and a video of the volunteer is selected. The video is usually between 5 and 10 minutes long and is in interview format, preferably showing the upper body and the hands. Once the video is selected, it is broken down into a list of dozens of individual clips according to the natural starting and ending points of each individual motion that the person exhibits. Each clip is approximately 1 to 3 seconds long, resulting in about 100-200 clips per video. Each clip is given a name consisting of the initials of the person, an alphabetic character that serves as an ID of that video (since more than one video of the same person may be processed in the future), and then four digits to note the starting time and four digits to note the ending time of the clip.
These clips are called bits, and they're converted into animated GIFs, which makes it much easier for the developers to observe the same motion repeatedly and to use it as a tangible hard reference for any motion they wish to discuss. This is very important because it creates the raw material and the common ground from which anybody within the project can communicate. It eliminates ambiguity and makes it possible to reach a consensus about any specific motion and what it means, rather than using only abstract verbal descriptions, which can be interpreted in multiple ways by multiple people.
After the video is bitted we move on to the next stage, which is identifying recurring instances of identical motions and clustering them together. In other words, if within those 100-plus bits there are any that are identical, those are grouped together and that motion is called an X, and it's given a name that describes what the motion is. This is done with the whole video until only those bits that don't have a pair are left over, and those bits are placed in the miscellaneous X pile, where all other miscellaneous bits from other videos of other people are. Those bits are then analyzed to see whether or not they correlate to the bits of other videos, in which case another X will be formed.
So as you can see, the main goal is to find motion correlations both within the same video and across other videos, and to give those correlations an exact name by which we can discuss them. This basic cycle of bitting and clustering is repeated each time, giving us over time a very rich database of motions and their interconnections to one another.
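
The naming and clustering steps described above can be illustrated with a short Python sketch. It is not the project's actual tooling. The transcript does not specify the exact layout of a bit name, so the format used here (initials, one video letter, then start and end times as zero-padded seconds) is an assumption, as are the names `Bit`, `parse_bit_name` and `cluster_bits`. In the project, bits were matched by people comparing the GIFs visually; the sketch stands in for that judgment with a hand-assigned label per bit.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Bit:
    initials: str  # initials of the person in the video
    video_id: str  # single letter identifying which video of that person
    start: int     # clip start, assumed to be seconds from the start of the video
    end: int       # clip end, same assumed unit

    @property
    def name(self) -> str:
        # Assumed layout: initials + video letter + 4-digit start + 4-digit end.
        return f"{self.initials}{self.video_id}{self.start:04d}{self.end:04d}"


def parse_bit_name(name: str) -> Bit:
    """Invert Bit.name: the last eight characters are the two time fields."""
    head, start, end = name[:-8], name[-8:-4], name[-4:]
    return Bit(head[:-1], head[-1], int(start), int(end))


def cluster_bits(labels: Dict[str, str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group bit names by a hand-assigned motion label.

    Labels shared by two or more bits become X's; bits whose label is
    unique are returned separately as the miscellaneous pile.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for bit_name, label in labels.items():
        groups[label].append(bit_name)
    xs = {label: names for label, names in groups.items() if len(names) >= 2}
    misc = [names[0] for names in groups.values() if len(names) == 1]
    return xs, misc
```

Under these assumptions, a bit running from second 12 to second 14 of video A of a person with the initials JS would be named `JSA00120014`, and running `cluster_bits` again over the pooled miscellaneous bits of several videos would correspond to the cross-video matching step.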

Statistical Tracking

Now we not only catalog all of these individual animated GIFs but we also track statistics, which leads us to the third stage. When statistical analysis of the data is made we can see whether or not certain X's or cues appear together all the time, some of the time, or none of the time, and also see whether or not certain clusters of X's appear together. For example, a certain type of head nod with a certain type of shoulder shrug might always be seen together or never seen together. And this is the point where the project starts to get interesting and gain predictive power. If, for example, we observe that a person demonstrates X005, and X005 has been shown to have a strong tie with X012, then we can predict that this person will also display X012. And the final stage is identifying patterns, represented by a P. Now patterns are whole-person body signatures, or in other words two or more people that share the whole package-- the entire rhythm of their body motions is identical. And this can be called the equivalent of a "type" or motus type.
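
The "all of the time, some of the time, none of the time" tracking can be sketched as a simple conditional rate over samples. This is a minimal illustration, not the project's own statistics. It assumes each sample is recorded as the set of X identifiers observed in it, and the function names `conditional_rate` and `classify` are hypothetical.

```python
from typing import Dict, Optional, Set


def conditional_rate(samples: Dict[str, Set[str]], given: str, target: str) -> Optional[float]:
    """Fraction of the samples showing `given` that also show `target`.

    Returns None when no sample shows `given`, since no rate can be computed.
    """
    with_given = [xs for xs in samples.values() if given in xs]
    if not with_given:
        return None
    return sum(target in xs for xs in with_given) / len(with_given)


def classify(rate: Optional[float]) -> str:
    """Map a rate onto the three cases named in the transcript."""
    if rate is None:
        return "no data"
    if rate == 1.0:
        return "always"
    if rate == 0.0:
        return "never"
    return "sometimes"


# Hypothetical data: which X's were observed in each sample.
samples = {
    "s1": {"X005", "X012"},
    "s2": {"X005", "X012"},
    "s3": {"X012"},
    "s4": {"X007"},
}
print(classify(conditional_rate(samples, "X005", "X012")))  # always
print(classify(conditional_rate(samples, "X012", "X005")))  # sometimes
```

The rate is directional: in this made-up data every sample with X005 also shows X012, but not the reverse. With only a handful of samples such rates are weak evidence, so any prediction drawn from them would need far more data than the example uses.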

Open source and atheoretical

Now the beauty of creating a typology using this methodology is that it can actually be debated and perfected on a platform where everybody is free to discuss it. Motus is an open source project, which means all the data is available for anyone to critique or add to, and it's a potentially infinite project. The data is displayed on where all the bits are uploaded in GIF format, as well as a full listing of all the X's and all the groupings and all the patterns. Now this might seem like a very time-consuming project, and actually it is, but it's capable of a legitimacy that other models cannot have. For example, this project does not have any predefined laws or assumptions to dictate where the data goes or how to interpret it, so creating a typology using this method allows patterns to emerge holistically, should they exist in nature. This project doesn't start out with the assumption that there are only 4, 12 or 16 motus types. It allows the data itself to dictate how many clusters of body motions humanity falls into, whether that number is 18, 34 or 79, without being tainted by previous systems' conceptualizations. Instead it is an empirical base point from which more complex conceptualizations and principles can eventually be extracted.

Relationship to psychology

Now another key point I want to discuss, which I think is a question many of you might have, is-- does motus type correlate at all with psychology, or in other words, do two people that have the same motus type also have similar cognition? This is actually a very interesting question and one which this project alone cannot satisfactorily answer; it is a question which will be followed through by our neuroscience project. As I mentioned in the beginning, Physiognomy is a site from which many projects will emerge, all of which are designed to test a different dimension of humanity, and then these projects can be compared side-by-side to get a fuller picture. So let me give you a taste of just how this would work using neuroscience plus the Motus project. When our neuroscience project begins we will be looking to record brain activity using EEG head scans. Now, before the volunteer goes through the EEG head scan, their motus type would be identified; then during the EEG brain scan they will also be visually recorded alongside the neural recording of the brain activity. Once the EEG scan is completed, both recordings will be examined side-by-side to see what correlations, if any, exist between a certain body motion and the brain activity that occurs during the exact instance of that motion. In this way we can understand whether or not a link exists between motions and cognition and, by extension, between clusters of motion and clusters or patterns of cognition. And so, if we see that there is indeed a noteworthy correlation between body movements and neural activity, then we can make predictions about what type of brain activity a person would have if placed under a brain scan, just by looking at them visually -- essentially creating the most empirical and strongest typological model in existence.

An address to Pod'lair

Now, pertaining to Pod'lair's challenge-- first I'd like to say that the first point of your challenge already takes for granted that you are correct in your assumption that there are 16 cognitive configurations, and thus 32 when gender is factored in. I want to note that this is not a proper challenge, because the very parameters of the challenge carry assumptions which are exactly what we should be investigating. As you yourself noted about the work of Carl Jung, inventing conjectures and rules of how the psyche operates without an empirical basis provides no way of knowing whether such conjectures carry more weight than blind guesswork into the operations of the mind. Yet I see this very pitfall mimicked in your own model. From where did Pod'lair extrapolate the rules of the psyche that it has, and where is the evidence for this? You make the claim that Pod'lair's interpretation of the psyche is true, yet you've shown no empirical evidence of your own for why this is. Handing out a PDF list of what you think people's configurations are ultimately says nothing if there is no explanation of why their configuration is what it is. It does not count as empirical evidence, and neither does giving percentages of accuracy of the reads of Lenore Thompson and Keirsey actually validate your own reads any more than theirs. All that is really saying is how compatible their reads are with your own. There's a lot of talk about how Pod'lair is so much more empirical than any other model, yet I see very little to show for this. Thus far Xylenne's mojo reading revealed series is leaving enormous gaps of ambiguity with each signal she describes. Using words like "aware" and "unaware" to describe a visual phenomenon is no better than using the phrase "a spotted beast with four legs and sharp teeth" to describe a cheetah, which you criticize.
Now if Pod'lair truly wanted to take this seriously and establish their model in the realm of empirical evidence they should be developing a methodology not too dissimilar to the one outlined in this video by which every visual cue and signal can be unambiguously quantified and also open for review.
It is not enough to just give the samples and give vague interpretations of what the signals look like. You would actually have to take each video within your listing and break it down cue by cue, second by second, and explain the deduction. Until this is actually done Pod'lair cannot claim to be an empirical model. Your proof of concept is actually more just a concept of proof, as in it is a concept of how, theoretically, your model could be scientifically proven but until it actually happens there is no proof, there's only a concept that it might be provable. Nonetheless, will be happy to take a look at your evidence once it's properly compiled and you're more than welcome to take a look at the work we have so far. So far we have around ten samples, 1,200 bits and 170 X's. And that's all for the short preview, we hope you enjoyed it and thanks for watching.


The Motus Project, or a project with a similar methodology, is slated to re-appear given sufficient research funding. Computer vision software has been proposed as a potential means to identify body motions and expressions, allowing for the possibility of an automated video analysis program to be written.[3] Such visual analysis software is proposed as a way to reduce the labor costs of the project and to allow the analysis of tens of thousands of samples in rapid timeframes.