### Introduction

NMF (non-negative matrix factorization) approximates a matrix X as the product of two smaller, lower-rank matrices, at the cost of a small approximation error. The purpose is to reduce storage and computation while preserving the characteristics of the data (the model's properties).

As above, we have a data matrix X (p × n). The purpose of NMF is to reduce the data dimension from p to r by finding two smaller matrices whose product approximates X. To do so, we work through the following steps:

Initialize the two smaller matrices with random values.

Compare their product with the original matrix X.

Adjust the parameters (the entries of the two matrices) to reduce the error function.

Repeat the above steps until the error is small enough (good enough).
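The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not a reference implementation: the factor names `W` and `H`, the learning rate, the iteration count, and the toy matrix are all assumptions, and the update rule used here (a gradient step followed by clipping negative values to zero) is just one simple way to keep the factors non-negative.

```python
import numpy as np

def nmf_gradient_descent(X, r, lr=0.001, n_iter=5000, tol=1e-4, seed=0):
    """Approximate X (p x n) by W (p x r) @ H (r x n) with non-negative entries."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    # Step 1: random (non-negative) initial values.
    W = rng.random((p, r))
    H = rng.random((r, n))
    errors = []
    for _ in range(n_iter):
        # Step 2: compare the product with the original matrix.
        R = W @ H - X
        err = float(np.sum(R ** 2))
        errors.append(err)
        # Step 4: stop once the error is small enough.
        if err < tol:
            break
        # Step 3: adjust the parameters along the negative gradient,
        # then clip at zero so the factors stay non-negative.
        grad_W = 2 * R @ H.T
        grad_H = 2 * W.T @ R
        W = np.maximum(W - lr * grad_W, 0.0)
        H = np.maximum(H - lr * grad_H, 0.0)
    return W, H, errors

# Hypothetical 4 x 5 matrix of ratings, used only for illustration.
X = np.array([
    [5.0, 4.0, 1.0, 1.0, 2.0],
    [4.0, 5.0, 1.0, 2.0, 1.0],
    [1.0, 1.0, 5.0, 4.0, 4.0],
    [2.0, 1.0, 4.0, 5.0, 5.0],
])
W, H, errors = nmf_gradient_descent(X, r=2)
print("first error:", errors[0], "last error:", errors[-1])
```

The list `errors` records the error at every iteration, which makes it easy to check that the loop is actually improving the approximation.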

### Example: predicting Netflix movie ratings with NMF

We have 4 people and 2 movie genres, "comedy" and "action". X is the large matrix whose entries are the movie ratings, and the two smaller matrices are the factors we want to find by splitting X.

As you can see, storing the large matrix takes 2 million entries, while storing the two smaller matrices takes only 300,000 entries.
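The storage comparison is simple arithmetic: a p × n matrix has p·n entries, while the two factors together have p·r + r·n. The dimensions below are hypothetical; they are merely one choice that reproduces the counts quoted above and are not taken from the original example.

```python
# Hypothetical sizes chosen only because they reproduce the counts in the text.
p, n, r = 2000, 1000, 100

full_entries = p * n              # entries in the large matrix X
factor_entries = p * r + r * n    # entries in the two smaller matrices combined

print(full_entries)     # 2000000
print(factor_entries)   # 300000
```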

To find these two matrices, we initialize them with random values, multiply them, and compare the product with the original matrix.

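Here the computed value 1.44 is less than the actual rating 3, so we need to increase the corresponding parameters.

When the computed value is larger than the actual rating (as for the next value), we decrease them instead.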
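To make the "raise or lower" idea concrete, here is a small sketch of one adjustment step for a single rating. The vectors `w` and `h`, the learning rate, and the second target rating of 1 are hypothetical values picked so that the starting prediction is about 1.44; they are not the numbers behind the original example.

```python
def predict(w, h):
    """Predicted rating: dot product of a person's row and a movie's column."""
    return sum(wk * hk for wk, hk in zip(w, h))

def update_entry(x, w, h, lr=0.05):
    """One gradient step on the squared error (x - w.h)**2 for a single rating."""
    diff = x - predict(w, h)  # positive: prediction too low, negative: too high
    # Move each parameter in the direction that shrinks the difference,
    # clipping at zero to keep the factors non-negative.
    new_w = [max(wk + lr * 2 * diff * hk, 0.0) for wk, hk in zip(w, h)]
    new_h = [max(hk + lr * 2 * diff * wk, 0.0) for wk, hk in zip(w, h)]
    return new_w, new_h

w = [0.8, 1.0]  # hypothetical parameters for one person
h = [0.8, 0.8]  # hypothetical parameters for one movie
before = predict(w, h)  # about 1.44

# Actual rating 3 is above the prediction, so the parameters are raised.
w_up, h_up = update_entry(3.0, w, h)
after_up = predict(w_up, h_up)

# Actual rating 1 is below the prediction, so the parameters are lowered.
w_down, h_down = update_entry(1.0, w, h)
after_down = predict(w_down, h_down)

print(before, after_up, after_down)
```

In both cases a single step moves the prediction toward the actual rating without overshooting it, which is why the step is repeated many times.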

The error function is the sum of the squared differences between the entries of X and the entries of the product. Its derivative tells us how to adjust each parameter in order to reduce the error.
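Written out, the error function and its derivatives look as follows. The names W (p × r) and H (r × n) for the two smaller matrices and the step size η are notational assumptions, and the clipped gradient step shown last is only one simple update rule among several that are used in practice.

```latex
E(W, H) = \sum_{i=1}^{p} \sum_{j=1}^{n} \bigl( X_{ij} - (WH)_{ij} \bigr)^2

\frac{\partial E}{\partial W} = -2\,(X - WH)\,H^{\top},
\qquad
\frac{\partial E}{\partial H} = -2\,W^{\top}(X - WH)

W \leftarrow \max\!\Bigl(0,\; W - \eta\,\frac{\partial E}{\partial W}\Bigr),
\qquad
H \leftarrow \max\!\Bigl(0,\; H - \eta\,\frac{\partial E}{\partial H}\Bigr)
```

Where an entry of WH is smaller than the corresponding entry of X, the gradient is negative and the step raises the parameters; where it is larger, the step lowers them.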

After finding the two matrices, we can predict missing data points by multiplying them. Real data is often incomplete and sparse, so once suitable matrices have been found, their product can be used to fill in the missing values.
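A rough sketch of this idea is shown below: the factors are fitted on the known ratings only, and their product supplies the missing ones. Everything specific here is an assumption made for illustration, including the toy ratings matrix, the use of `NaN` to mark missing entries, the names `W` and `H`, and the clipped gradient update.

```python
import numpy as np

def fill_missing(X, r=2, lr=0.002, n_iter=5000, seed=0):
    """Fit non-negative factors on the observed entries of X and fill the NaNs."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)            # True where a rating is known
    X0 = np.where(observed, X, 0.0)    # placeholder zeros for the missing entries
    p, n = X.shape
    W = rng.random((p, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Residual on the known entries only; missing entries contribute nothing.
        R = (W @ H - X0) * observed
        W_new = np.maximum(W - lr * 2 * R @ H.T, 0.0)
        H_new = np.maximum(H - lr * 2 * W.T @ R, 0.0)
        W, H = W_new, H_new
    approx = W @ H
    # Keep the known ratings and use the product only where data is missing.
    return np.where(observed, X, approx), approx

# Hypothetical ratings: 4 people x 4 movies, NaN marks a missing rating.
X = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 1.0, 4.0, 5.0],
])
filled, approx = fill_missing(X)
print(np.round(filled, 2))
```

How good the filled-in values are depends on the data and on the chosen r, so in practice they should be checked against ratings that were held out.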

I hope this helps you understand NMF a little better. Thank you, everyone (bow).