# Activity Recognition using Neblina

The Neblina™ motion tracking platform can be used as a wearable device for tracking human body movements and activities. In this blog post, we go through a number of activity recognition methods that can be applied to the motion data extracted from Neblina.

#### Activity Recognition Using Body-Joint Orientation Information

If the Neblina motion tracking device is placed on a body joint, e.g., an arm, it tracks the 3D orientation of that joint using its built-in sensor fusion library. This allows us to characterize several basic human activities.

Let us assume that the device is placed in a front pocket or attached to the thigh. Under these conditions, the thigh's elevation angle is provided directly by the built-in orientation filter on Neblina. This information can help us detect sitting, standing, walking, and running. Here are the results of an experiment conducted by placing Neblina in a front pocket.

As shown in the figure above, the elevation angle of the thigh distinguishes between sitting and standing, while the periodic gait cycles reveal the swing of the thigh (steps) as well as the cadence (steps per minute). As the cadence increases, one can distinguish between walking and running.
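The logic described above can be sketched in a few lines of Python. This is only an illustration, not the code running on Neblina: the function names, the angle convention (elevation near 0 degrees when the thigh hangs vertically and near 90 degrees when it is horizontal), and both thresholds are assumptions chosen for the example.

```python
def cadence_spm(step_times_s):
    """Steps per minute from a list of step timestamps (seconds)."""
    if len(step_times_s) < 2:
        return 0.0
    duration = step_times_s[-1] - step_times_s[0]
    if duration <= 0:
        return 0.0
    # n timestamps span n - 1 step intervals
    return 60.0 * (len(step_times_s) - 1) / duration


def classify_activity(mean_elevation_deg, step_times_s,
                      sit_threshold_deg=60.0,    # assumed threshold
                      run_threshold_spm=140.0):  # assumed threshold
    """Label a time window as sitting, standing, walking or running."""
    cadence = cadence_spm(step_times_s)
    if cadence == 0.0:
        # No gait cycles detected: use the thigh elevation for posture.
        if mean_elevation_deg > sit_threshold_deg:
            return "sitting"
        return "standing"
    # Gait cycles detected: use the cadence to separate walking and running.
    return "running" if cadence >= run_threshold_spm else "walking"
```

In practice the thresholds would have to be tuned per user and per mounting position.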

The orientation information of the thigh can help us detect certain abnormal walking patterns as well. This is particularly helpful for medical diagnostics. There are several abnormal walking patterns, which can each be associated with a certain disease/condition. Typically, doctors watch patients walk, and they subjectively diagnose the abnormal gait pattern. Using a motion tracking device like Neblina would help to objectively characterize these walking patterns and improve the diagnostic procedure.

As an example, one of the most common abnormal walking (gait) patterns is the Hemiplegic gait, which can be caused by genetics, a stroke, etc. In this gait pattern, each step is rotated away from the body, then towards it, forming a semicircle:

Looking at the pattern of the Hemiplegic gait, one can verify that the leg's side tilt angle is the key factor in characterizing this particular gait. Namely, over one gait cycle, we observe a higher variation in the side tilt angle compared to a normal walk. Here are the results of an experiment comparing the normal walk against the Hemiplegic gait, where Neblina has been attached to the front of the thigh:

The Hemiplegic gait exhibits almost twice the variation in the thigh's side tilt angle (~20 degrees peak-to-peak) compared to the normal walk, which stays within ~10 degrees of variation. This is a key factor for objectively characterizing the Hemiplegic gait.
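As a rough illustration of this measurement, the sketch below computes the peak-to-peak side tilt over each gait cycle and flags cycles with a wide swing. The function names are hypothetical, and the 15 degree threshold is simply an assumed midpoint between the ~10 and ~20 degree figures above, not a clinically validated value.

```python
def peak_to_peak(angles_deg):
    """Range of the side tilt angle (degrees) over one gait cycle."""
    return max(angles_deg) - min(angles_deg)


def flag_wide_swing(cycles, threshold_deg=15.0):  # assumed threshold
    """Return one boolean per gait cycle: True if the side tilt range is wide.

    cycles: list of gait cycles, each a list of side tilt samples in degrees.
    """
    return [peak_to_peak(cycle) > threshold_deg for cycle in cycles]
```

For example, a cycle swinging between -5 and 5 degrees would not be flagged, while one swinging between -10 and 10 degrees would be.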

One can also extend the body orientation analysis to multiple joints. For example, the bent-over lateral raise is a workout exercise that can be characterized by three motion tracking devices, one placed on the chest and one on each wrist:

The orientation information of these three nodes together will determine the proper execution of this workout exercise. Namely, the chest's pitch angle should first be correct to indicate the "bent over" part. Furthermore, the arms should reach a high enough elevation as well as ~180 degrees difference in heading direction to showcase a full stretch. All these characteristics are derived from the 3D orientation of each joint.
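A minimal sketch of such a check is given below. All names, angle conventions, and thresholds are assumptions made for illustration: the chest pitch is taken as 0 degrees when upright and 90 degrees when horizontal, and the arm elevation is taken relative to the horizontal plane, so that a hanging arm is near -90 degrees and an arm level with the shoulder is near 0.

```python
def heading_difference_deg(a_deg, b_deg):
    """Smallest absolute difference between two headings, in [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return 360.0 - d if d > 180.0 else d


def is_proper_rep(chest_pitch_deg, left_elev_deg, right_elev_deg,
                  left_heading_deg, right_heading_deg,
                  min_bend_deg=45.0,        # assumed threshold
                  min_arm_elev_deg=-20.0,   # assumed threshold
                  heading_tol_deg=25.0):    # assumed tolerance
    """Check the three conditions at the top of one repetition."""
    # 1) The chest must be pitched forward enough ("bent over").
    bent_over = chest_pitch_deg >= min_bend_deg
    # 2) Both arms must reach a high enough elevation.
    arms_raised = min(left_elev_deg, right_elev_deg) >= min_arm_elev_deg
    # 3) The arms must point in roughly opposite heading directions.
    diff = heading_difference_deg(left_heading_deg, right_heading_deg)
    arms_opposed = abs(180.0 - diff) <= heading_tol_deg
    return bent_over and arms_raised and arms_opposed
```

The heading difference is wrapped into [0, 180] so that headings such as 350 and 170 degrees are correctly treated as opposite.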

#### Activity Recognition Using Supervised Learning

Even though many human body activities can be characterized directly by the orientation trajectory of the body joints, certain activities require more advanced machine learning solutions. An effective way to characterize such activities is supervised learning. Namely, we collect as much data as we can and build a large training data set in which every recording is labeled with the type of activity that was performed. These labels are what the machine learning tool learns from.

As an example, we aimed to distinguish between three activities: 1) walking on flat ground, 2) going upstairs, and 3) going downstairs. Four subjects took part, and over 4000 steps covering different walking patterns were recorded using our Neblina device. Neblina is capable of segmenting the gait cycles using its pedometer. The segmented gait cycle data, including raw acceleration, raw angular velocity, raw magnetic field, the fused external force vector, and the fused orientation information, were all recorded on the device and then exported to CSV files (which can be opened in Excel) on a PC using our open-source Python scripts (available here):

We then used R for the offline data analysis and the supervised learning process. We wrote an R script that performs the following steps:

###### 1- Build a feature list

In this step, we read the segmented raw/fused data (CSV files) that have been extracted from Neblina, and generate statistical characteristics over each individual gait cycle. The raw/fused data includes: accelerometer raw data (ax, ay, az), gyroscope raw data (gx, gy, gz), magnetic raw data (mx, my, mz), the fused external force vector (fx, fy, fz), and the sine and cosine of the three fused Euler angles. The statistical characteristics include: mean, median, mean absolute deviation, median absolute deviation, root-mean-square, and variance. These characteristics are calculated for every raw/fused data channel within each gait segment and together form a large feature list.
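To make the feature construction concrete, here is a small Python sketch of the per-cycle statistics (the original analysis was done in R; Python is used here only for illustration). The function names, the input layout (a dictionary from channel name to samples), and the `meanad` label for the mean absolute deviation are assumptions.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation around the median (unscaled).

    Note: R's mad() multiplies by 1.4826 by default; this sketch does not.
    """
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def cycle_features(cycle):
    """Compute the six statistics for every channel of one gait cycle.

    cycle: dict mapping a channel name (e.g. "ax") to its list of samples.
    Returns a dict such as {"mean(ax)": ..., "mad(ax)": ..., ...}.
    """
    feats = {}
    for name, x in cycle.items():
        mu = statistics.fmean(x)
        feats[f"mean({name})"] = mu
        feats[f"median({name})"] = statistics.median(x)
        feats[f"meanad({name})"] = statistics.fmean([abs(v - mu) for v in x])
        feats[f"mad({name})"] = mad(x)
        feats[f"rms({name})"] = math.sqrt(statistics.fmean([v * v for v in x]))
        # Sample variance (n - 1 denominator), matching R's var().
        feats[f"var({name})"] = statistics.variance(x)
    return feats
```

Calling `cycle_features` on every segmented gait cycle yields one row of the feature table per step.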

###### 2- Remove the redundant features

After building the feature list, we perform a redundancy analysis in R to remove the highly correlated features. Namely, a feature that can most likely be found using a linear combination of other features is eliminated from the list. The procedure removes 76 features from the list and keeps the following features:

mad(rollSine), mad(pitch), mad(yawSine), mad(yawCosine), mad(fx), mad(fy), mad(fz), mad(gx), mad(gy), mad(gz), mad(mx), mad(my), mad(mz), mean(az), mean(rollSine), mean(yawSine), mean(yawCosine), mean(fz), mean(gx), mean(gy), median(ay), median(az), median(rollCosine), median(pitch), median(fx), median(fy), median(fz), median(gy), median(gz), median(my), median(mz), rms(mx), var(ax), var(ay), var(rollCosine), var(yawSine), var(yawCosine), var(fz), var(gx), var(gz), var(mx), var(my), var(mz),

where mad() is the median absolute deviation function, var() is the variance function, and rms() is the root-mean-square.
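The idea behind the redundancy step can be illustrated with a much simpler stand-in: a greedy filter that drops any feature whose pairwise Pearson correlation with an already kept feature is too high. This is not the procedure used in our R script, which considers linear combinations of several features; the function names and the 0.95 threshold are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.95):  # assumed threshold
    """Greedy pairwise filter.

    columns: dict mapping a feature name to its values (one per gait cycle).
    Returns the names of the kept features, in their original order.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

A pairwise filter misses features that are redundant only in combination with several others, which is why a regression-based redundancy analysis is the stronger choice.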

###### 3- Build a generalized linear model for each activity

The glm() function in R is used next to build a generalized linear model for each activity. The activity is known per experiment (supervised learning), and thus, the glm() function aims to find the most suitable linear fit between the feature list and the type of activity.
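For readers who want to see what one model per activity means in code, the sketch below fits one logistic regression per activity in a one-vs-rest fashion. It assumes a binomial GLM, which this post does not state explicitly, and it uses plain gradient descent instead of the iteratively reweighted least squares that glm() uses. All function names and hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """Probability of the positive class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def fit_logistic(rows, labels, lr=0.5, epochs=3000):
    """Fit logistic regression weights by batch gradient descent.

    rows: list of feature lists (ideally scaled to a similar range).
    labels: list of 0/1 values, 1 meaning "this activity".
    """
    n = len(rows)
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = predict_proba(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def fit_one_vs_rest(rows, activities):
    """Return one weight vector per activity name."""
    return {
        a: fit_logistic(rows, [1 if b == a else 0 for b in activities])
        for a in sorted(set(activities))
    }
```

Each fitted model answers a yes/no question ("is this gait cycle an upstairs step?"), which is why the later steps compare the confidence values of the three models.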

###### 4- Remove the insignificant features from the list

After building a model for each activity, the features will all be tagged with a significance value, which will determine the statistical significance they have towards detecting the particular activity. Using these values, we can remove the features with low statistical significance. Namely, if the significance value of a feature is lower than a threshold, it is dropped from the list. Here are the final refined features for each activity:

###### Upstairs:

The analysis finds 10 statistically significant metrics to detect this activity as follows:

mad(fz), mad(gy), mad(mx), mean(rollSine), mean(gx), mean(gy), median(fx), median(gy), var(ax), var(gx)

###### Downstairs:

The analysis finds 19 statistically significant metrics to detect this activity as follows:

mad(pitch), mad(gx), mad(gy), mad(gz), mad(mx), mad(mz), mean(gx), mean(gy), median(ay), median(rollCosine), median(pitch), median(fx), median(fy), median(fz), median(gy), median(mz), var(gx), var(gz), var(mz)

###### Flat Walk:

The analysis finds 7 statistically significant metrics to detect this activity as follows:

mad(gy), median(ay), median(rollCosine), median(pitch), median(fy), var(fz), var(gx)

It is evident from the above results that the fused information from the pitch and roll angles, as well as the external force vector (fx, fy, fz), is indeed significant for the overall recognition of the activities. Furthermore, downstairs recognition has been the most challenging scenario, as it requires the highest number of features, i.e., 19.
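The filtering rule of step 4 can be sketched as follows, assuming that significance is judged from the coefficient estimate and standard error of each feature, as reported in the coefficient table of a fitted GLM. The two-sided p-value is derived from the Wald z statistic, and a feature is kept when its p-value is below a chosen level. The function names, the 0.05 level, and the numbers in the usage example are illustrative assumptions, not values from our experiment.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value of the Wald z statistic (normal approximation)."""
    z = estimate / std_error
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return math.erfc(abs(z) / math.sqrt(2.0))


def significant_features(coef_table, alpha=0.05):  # assumed level
    """Keep the features whose p-value is below alpha.

    coef_table: dict mapping a feature name to (estimate, std_error).
    """
    return [
        name
        for name, (estimate, std_error) in coef_table.items()
        if wald_p_value(estimate, std_error) < alpha
    ]


# Made-up numbers, for illustration only:
example = {"mad(gy)": (2.0, 0.5), "rms(mx)": (0.1, 0.5)}
kept = significant_features(example)  # only "mad(gy)" passes
```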

###### 5- Build a new model for each activity using the final refined feature list

After removing the statistically insignificant features from the list, we can re-build a model for each activity using the reduced number of features. Each model is built again using the glm() function in R.

The final results are visualized in the figure below. Each point in this image represents a full gait cycle. The correct type of activity (the supervised learning label) is written on top of each segment. The classification engine calculates a confidence level for each gait cycle, showing the probability of the gait cycle belonging to a certain activity. The confidence values for the "flat walk" class are shown in black, while the upstairs and downstairs confidence values are shown in blue and green, respectively. One can easily see that the classification engine performs the activity recognition with high accuracy. In fact, the activities have been successfully classified in 97% of the cases.
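One simple way to turn the three confidence values into a decision, and to score the result against the known labels, is sketched below. Picking the activity with the highest confidence is an assumption about the decision rule, and the function names are hypothetical.

```python
def classify_cycle(confidences):
    """Pick the activity with the highest confidence for one gait cycle.

    confidences: dict mapping an activity name to its confidence value.
    """
    return max(confidences, key=confidences.get)


def accuracy(confidence_rows, true_labels):
    """Fraction of gait cycles whose predicted activity matches the label."""
    hits = sum(
        classify_cycle(conf) == label
        for conf, label in zip(confidence_rows, true_labels)
    )
    return hits / len(true_labels)
```

Applied to every gait cycle in a labeled recording, `accuracy` gives the kind of success rate quoted above.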

Our open-source experimental script in R for data analysis and activity recognition is available here.