diff options
Diffstat (limited to 'doc/fitmle.1')
-rw-r--r-- | doc/fitmle.1 | 136 |
1 files changed, 136 insertions, 0 deletions
diff --git a/doc/fitmle.1 b/doc/fitmle.1 new file mode 100644 index 0000000..75a43d1 --- /dev/null +++ b/doc/fitmle.1 @@ -0,0 +1,136 @@ +.\" generated with Ronn/v0.7.3 +.\" http://github.com/rtomayko/ronn/tree/0.7.3 +. +.TH "FITMLE" "1" "September 2017" "www.complex-networks.net" "www.complex-networks.net" +. +.SH "NAME" +\fBfitmle\fR \- Fit a set of values with a power\-law distribution +. +.SH "SYNOPSIS" +\fBfitmle\fR \fIdata_in\fR [\fItol\fR [TEST [\fInum_test\fR]]] +. +.SH "DESCRIPTION" +\fBfitmle\fR fits the data points contained in the file \fIdata_in\fR with a power\-law function P(k) ~ k, using the Maximum\-Likelihood Estimator (MLE)\. In particular, \fBfitmle\fR finds the exponent \fBgamma\fR and the minimum of the values provided on input for which the power\-law behaviour holds\. The second (optional) argument \fItol\fR sets the acceptable statistical error on the estimate of the exponent\. +. +.P +If \fBTEST\fR is provided, the program associates a p\-value to the goodness of the fit, based on the Kolmogorov\-Smirnov statistics computed on \fInum_test\fR sampled distributions from the theoretical power\-law function\. If \fInum_test\fR is not provided, the test is based on 100 sampled distributions\. +. +.SH "PARAMETERS" +. +.TP +\fIdata_in\fR +Set of values to fit\. If is equal to \fB\-\fR (dash), read the set from STDIN\. +. +.TP +\fItol\fR +The acceptable statistical error on the estimation of the exponent\. If omitted, it is set to 0\.1\. +. +.TP +TEST +If the third parameter is \fBTEST\fR, the program computes an estimate of the p\-value associated to the best\-fit, based on \fInum_test\fR synthetic samples of the same size of the input set\. +. +.TP +\fInum_test\fR +Number of synthetic samples to use for the estimation of the p\-value of the best fit\. +. +.SH "OUTPUT" +If \fBfitmle\fR is given less than three parameters (i\.e\., if \fBTEST\fR is not specified), the output is a line in the format: +. +.IP "" 4 +. +.nf + + gamma k_min ks +. +.fi +. +.IP "" 0 +. +.P +where \fBgamma\fR is the estimate for the exponent, \fBk_min\fR is the smallest of the input values for which the power\-law behaviour holds, and \fBks\fR is the value of the Kolmogorov\-Smirnov statistics of the best\-fit\. +. +.P +If \fBTEST\fR is specified, the output line contains also the estimate of the p\-value of the best fit: +. +.IP "" 4 +. +.nf + + gamma k_min ks p\-value +. +.fi +. +.IP "" 0 +. +.P +where \fBp\-value\fR is based on \fInum_test\fR samples (or just 100, if \fInum_test\fR is not specified) of the same size of the input, obtained from the theoretical power\-law function computed as a best fit\. +. +.SH "EXAMPLES" +Let us assume that the file \fBAS\-20010316\.net_degs\fR contains the degree sequence of the data set \fBAS\-20010316\.net\fR (the graph of the Internet at the AS level in March 2001)\. The exponent of the best\-fit power\-law distribution can be obtained by using: +. +.IP "" 4 +. +.nf + + $ fitmle AS\-20010316\.net_degs + Using discrete fit + 2\.06165 6 0\.031626 0\.17 + $ +. +.fi +. +.IP "" 0 +. +.P +where \fB2\.06165\fR is the estimated value of the exponent \fBgamma\fR, \fB6\fR is the minimum degree value for which the power\-law behaviour holds, and \fB0\.031626\fR is the value of the Kolmogorov\-Smirnov statistics of the best\-fit\. The program is also telling us that it decided to use the discrete fitting procedure, since all the values in \fBAS\-20010316\.net_degs\fR are integers\. The latter information is printed to STDERR\. +. +.P +It is possible to compute the p\-value of the estimate by running: +. +.IP "" 4 +. +.nf + + $ fitmle AS\-20010316\.net_degs 0\.1 TEST + Using discrete fit + 2\.06165 6 0\.031626 0\.17 + $ +. +.fi +. +.IP "" 0 +. +.P +which provides a p\-value equal to 0\.17, meaning that 17% of the synthetic samples showed a value of the KS statistics larger than that of the best\-fit\. The estimation of the p\-value here is based on 100 synthetic samples, since \fInum_test\fR was not provided\. If we allow a slightly larger value of the statistical error on the estimate of the exponent \fBgamma\fR, we obtain different values of \fBgamma\fR and \fBk_min\fR, and a much higher p\-value: +. +.IP "" 4 +. +.nf + + $ fitmle AS\-20010316\.net_degs 0\.15 TEST 1000 + Using discrete fit + 2\.0585 19 0\.0253754 0\.924 + $ +. +.fi +. +.IP "" 0 +. +.P +Notice that in this case, the p\-value of the estimate is equal to 0\.924, and is based on 1000 synthetic samples\. +. +.SH "SEE ALSO" +deg_seq(1), power_law(1) +. +.SH "REFERENCES" +. +.IP "\(bu" 4 +A\. Clauset, C\. R\. Shalizi, and M\. E\. J\. Newman\. "Power\-law distributions in empirical data"\. SIAM Rev\. 51, (2007), 661\-703\. +. +.IP "\(bu" 4 +V\. Latora, V\. Nicosia, G\. Russo, "Complex Networks: Principles, Methods and Applications", Chapter 5, Cambridge University Press (2017) +. +.IP "" 0 +. +.SH "AUTHORS" +(c) Vincenzo \'KatolaZ\' Nicosia 2009\-2017 \fB<v\.nicosia@qmul\.ac\.uk>\fR\. |