I”m trying to reproduce an algorithm designed in a paper. And everything is going well except one thing:

It says we considered the lengths zero-meaned accelerometer vectors and created a feature for the mean and standard deviation of this value. and I do not understand what is it zero-meaned vectors?

Example dataset:

Can any body help me?

I found only this information https://www.quora.com/What-does-it-mean-when-a-vector-is-zero-mean but I”m not sure about it.

Đang xem: Zero mean là gì, forum machine learning cơ bản

Thank you.

machine-learning pandas
Share
Improve this question
Follow
asked Sep 25 “16 at 13:59

*

ArsenArsen
11311 silver badge44 bronze badges
$endgroup$
Add a comment |

2 Answers 2

Active Oldest Votes
3
$egingroup$
“Zero-meaned” means the vector has been transformed so that its mean is 0.

Typically, you would do this by subtracting the mean of each column from that column. (This is for dimensional as well as algorithmic reasons; you don”t want to subtract a person”s weight from their height.)

It sounds like here they”re actually talking about the row mean–that is, $(-0.6946377, 12.680544, 0.50395286)$ would be transformed to $(-4.857924, 8.5172577, -3.65933344, 4.1632863, 7.40047)$, where the first three are the original features minus the row mean, the fourth is the row mean, and the fifth is the standard deviation of the original features.

Xem thêm: Chỉ Số Octan Là Gì – Nghĩa Của Từ Chỉ Số Octan Trong Tiếng Việt

This would make sense if the three have the same units (if they”re all accelerations at the same scale, this works), and so you want a separate measure of how much it”s being accelerated at all and how much it”s being accelerated in a particular direction.

Share
Improve this answer
Follow
edited Sep 25 “16 at 16:12
answered Sep 25 “16 at 16:10

*

Matthew GravesMatthew Graves
73544 silver badges1010 bronze badges
$endgroup$
2
Add a comment |
2
$egingroup$
Mean centering is one of many related techniques to preprocess data for downstream analysis in multivariate methods.

Xem thêm: Tổng Quan Thông Tin Ngành Địa Chất Là Gì ? Địa Chất Học

It might sound odd at first, but it means exactly what it says: the vector has a mean of zero. In pseudocode, (sum(vector) / len(vector)) == 0.

In multivariate data, this typically is applied along each column in a dataset so each column can be more easily compared to another within a similar range of data. After mean centering, each row only includes how it differs from the average sample from that variable in the original data. Typically, samples are also scaled to have unit variance as well, allowing you to more readily compare the data across continuous variables with different ranges.

For example, if you had a dataset of patients with variables height, weight, age, household_income, despite each variable being a continuous value each of these variables will be in different range. Height might be between 60 — 75 inches, weight between 100 — 300 lbs, and so on.

Why do all this? Removing the mean and standardizing the variance will help downstream methods not “learn” the mean and variance of your data, making it easier to find relationships between variables. Many assume that your data is centered / scaled / normalized in some way and will behave poorly if you don”t do so.

Leave a Reply

Your email address will not be published. Required fields are marked *